Senior Data Engineer
Posted on Oct 31, 2019 by Mason Alexander
Do you want to impact decisions to help people live healthier lives? Are you an innovative Data Engineer who thrives on developing new solutions to overcome challenges?
A leading MedTech company that uses a data-driven approach to improve healthcare is seeking a Senior Data Engineer with experience in a commercial environment.
As a Senior Data Engineer, you will strategically lead key projects that can improve people's health by designing innovative solutions.
To succeed in this role, you will need to work well independently, communicate effectively across teams, and negotiate with confidence.
You will join a newly formed team in Dublin focused on developing a new, cutting-edge big data analytics platform that uses machine learning to improve people's health outcomes. You will be responsible for integrating multiple complex data sources into the data platform, which combines different platforms (e.g. Kubernetes, Hadoop) and various data applications (e.g. Spark, Hive, HBase, Airflow) and supports the Medicare STARs program.
- Design and build data pipelines (mostly in Spark) to process terabytes of data
- Orchestrate data tasks in Airflow to run on Kubernetes/Hadoop for the ingestion, processing and cleaning of data.
- Create Docker images for various applications and deploy them on Kubernetes
- Design and build best-in-class processes to clean and standardize data.
- Troubleshoot production issues in our Elastic Environment
- Work on proofs of concept for Big Data and Data Science
- Model high-volume datasets to maximize performance for our BI & Data Science Team
- Bachelor's degree in Computer Science or similar.
- Hands-on experience with the following technologies is required:
- Demonstrated experience developing processes in Spark
- Significant experience writing complex SQL queries
- Considerable experience building ETL/data pipelines
- At least 1 year of exposure to Kubernetes and Linux containers (e.g. Docker)
- Related/complementary open source software platforms and languages (e.g. Scala, Python, Java, Linux)
- Exposure to the following technologies is beneficial, but not essential:
- Cloud technologies: Amazon AWS or Microsoft Azure
- Previous experience with relational (RDBMS) and non-relational databases
- Experience working in projects with agile/scrum methodologies
- Exposure to DevOps methodology
- Data warehousing principles and architecture, and their implementation in large environments