Responsibilities:
- Design, develop, and implement big data engineering projects in the Hadoop ecosystem.
- Engineer high-quality solutions on Cloudera, MapR, or HDP for both batch and streaming data, with a sense of urgency.
- Develop applications and custom integration solutions using Spark Streaming and Hive.
- Understand specifications; plan, design, and develop software solutions adhering to process, either individually or within a project team.
- Work in state-of-the-art programming languages and use object-oriented approaches in designing, coding, testing, and debugging programs.
- Work with support teams to resolve operational and performance issues.
- Select and integrate the big data tools and frameworks required to provide requested capabilities.
- Integrate data from multiple sources, implementing ETL processes using Apache NiFi.
- Monitor performance and advise on any necessary infrastructure changes.
- Manage the Hadoop cluster and its services, such as Hive, HBase, MapReduce, and Sqoop.
- Clean data per business requirements using streaming APIs or user-defined functions.
- Build distributed, reliable, and scalable data pipelines to ingest and process data in real time, defining Hadoop job flows.
- Manage Hadoop jobs using a scheduler.
- Apply HDFS file formats and structures such as Parquet and Avro to speed up analytics.
- Work with Hadoop ecosystem tools such as Hive, Pig, HBase, and Spark.
- Review and manage Hadoop log files.
- Assess the quality of datasets for a Hadoop data lake.
- Fine-tune Hadoop applications for high performance and throughput.
- Troubleshoot and debug any Hadoop ecosystem runtime issues.
- Participate in POC efforts to help build new Hadoop clusters.
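The data-cleaning responsibility above is often implemented as a user-defined function. A minimal pure-Python sketch is shown below; the field names and cleaning rules are hypothetical, and in a real pipeline this logic would be registered as a Hive or Spark UDF rather than run standalone:

```python
import re

def clean_record(record: dict) -> dict:
    """Normalize one raw record per hypothetical business rules:
    trim whitespace from names, lowercase emails, and keep only
    digits in phone numbers."""
    return {
        "name": record.get("name", "").strip(),
        "email": record.get("email", "").strip().lower(),
        "phone": re.sub(r"\D", "", record.get("phone", "")),
    }

# Example usage with a made-up raw record:
raw = {"name": "  Ada Lovelace ", "email": "ADA@Example.COM", "phone": "(904) 555-0101"}
print(clean_record(raw))
```

Keeping the function a plain dict-to-dict transform makes it easy to unit-test before wiring it into a streaming job.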
Education:
Bachelor's degree or higher in Computer Science, Information Systems, or a related engineering discipline
General Knowledge, Skills & Abilities:
- Detail-oriented data engineer.
- Strong systematic and organizational skills.
- Committed to completing deliverables on time.
Preferred Qualifications:
- Must have experience with Spark, Hive, and Scala or PySpark.
- Experience with NiFi, Kafka, or other streaming technologies preferred.
- 3+ years of data engineering experience building ETL pipelines using Java, Python, or Scala.
- Proficient in Pig and Hive scripting.
- Solid understanding of HDFS.
- Work experience in a data warehousing, business intelligence, or data analytics group, with hands-on Hadoop experience.
- Ability to create tables/views in Hive or another relevant scripting language.
- Experience with Agile development methodologies.
- Experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
Experience architecting solutions using any of the following:
- Java, Python, or Scala programming languages
- NiFi, Kafka, or other streaming technologies
- Parquet, Avro, ORC, XML, JSON, CSV, or TXT formats
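As a toy illustration of moving between two of the text formats listed above (using a made-up two-column dataset), CSV rows can be rewritten as JSON lines with only the Python standard library; columnar formats like Parquet or ORC would instead require a library such as PySpark:

```python
import csv
import io
import json

# Hypothetical sample data: a small CSV document held in memory.
csv_text = "order_id,amount\n1,9.99\n2,14.50\n"

# Parse each CSV row into a dict, then serialize it as one JSON line.
reader = csv.DictReader(io.StringIO(csv_text))
json_lines = [json.dumps(row) for row in reader]

for line in json_lines:
    print(line)
```

Note that `csv.DictReader` yields all values as strings; a real pipeline would also cast types before writing downstream.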
Location: Jacksonville, FL