Big Data Engineer (Cloudera)
Posted on Mar 19, 2021 by TechnogenInc
TECHNOGEN, Inc. has been a proven leader in providing full IT services, software development and solutions for 15 years. TECHNOGEN is a small, woman-owned minority business with GSA Advantage certification. We have offices in VA and MD, and offshore development centers in India. We have successfully executed 100+ projects for clients ranging from small businesses and non-profits to Fortune 50 companies and federal, state and local agencies.

Please send me the resumes to and call me at

Position: Big Data Engineer (Cloudera, Databricks, Azure)
Location: Chicago, IL
Duration: 12 Months

Overview
We are looking for strong Hadoop data engineering talent to join the company's data integration team. We are building a new application integration platform that will allow bidirectional real-time integration between SAP, MDM and Salesforce and will bring new data into the current Enterprise Data Lake. The role offers the opportunity to be a key part of a challenging, fast-paced environment, build core data integration offerings from the ground up, and shape the technology roadmap in our high-growth, knowledge-driven team.

Role Description
Translates functional and technical requirements into detailed design.
Analyzes large data sets to develop custom data pipelines that drive business solutions.
Works with team leads on regular updates, requirement understanding and design discussions.
Develops data pipelines for ingesting and cleansing data, applies data validation and transformation, and stores data in various zones using Python, Hive and PySpark.
Hands-on experience designing near-real-time data processing solutions using various data streaming technologies, both on-prem and in Azure cloud.
Implements solutions to stream data from various sources to Azure using cloud-native services.
Hands-on experience implementing data pipelines with the Zaloni data management tool.
Gathers data from various third-party sources into cloud and on-prem data platforms, optimizing it and joining it with internal datasets to extract meaningful information.
Stores data in various zones in the right file format using appropriate compression techniques (ORC, Parquet and Avro).
Plans the closure of solutions and sees their implementation through.
Prepares and produces releases of data pipelines.
Good troubleshooting and application performance tuning experience.
Participates in day-to-day project and production delivery status meetings and provides technical support for faster resolution of issues.
Ability to coordinate work with a geographically distributed team.
Shares knowledge, mentors and provides technical guidance to team members.
Works with various scrum teams to strengthen DevOps and Agile methodologies and improve cycle times, consistency and quality.
Excellent understanding of Hadoop architecture and its various components.
Solid understanding of the Azure Data Platform and its components.

Qualifications
Bachelor's degree in Computer Science, Mathematics, or Engineering.
5+ years of experience in data engineering and data platforms.
3+ years of Hadoop platform implementation and administration skills.
3+ years of experience implementing Azure big data services.
2+ years of experience with Zaloni Data Management Software.
2+ years of experience setting up and administering big data management tools (Zaloni), Databricks, etc.
3+ years of experience running data pipelines at scale in production.
Strong Linux and PowerShell scripting skills.
5+ years of designing, implementing and successfully operationalizing large-scale data lake solutions in production environments using the big data stack (on-prem and Azure).
3+ years of experience implementing end-to-end Azure cloud big data solutions.
2+ years of experience implementing real-time solutions and data integrations.
Hands-on experience building and optimizing data pipeline CI/CD, integrated build and deployment automation, configuration management, and test automation solutions.
Professional training and certifications in various big data solutions (preferred).
Solid understanding of the Azure cloud stack, including ADF Data Flows, Event Hub, Databricks, HDInsight and Azure DevOps.
Deep hands-on experience with Hadoop, Hive, HBase, Spark, Kafka, Snowflake, Python, SQL, Java, Scala, Zeppelin, Spark RDDs and DataFrames.
Communication and presentation skills to articulate functionality, issues and risks to business communities.