Data Engineer
Posted on Sep 10, 2024 by fortice
Solihull - Needs to be on site 2 days a week
Business Unit - FS
IR35 Status - mandated PAYE
We are heading up a recruitment drive for a global consultancy that requires a Data Engineer to join them on a major government project, delivered largely remotely.
Key skills (most important): Spark, Airflow, Hadoop, Oozie, Docker, Kubernetes containerization, COS, Scala, SQL, Dremio, Git, GitLab, Jenkins, Bash
About the Role:
As a Data Engineer at BNP Paribas, you will be instrumental in designing, building, and optimizing the data processing pipelines in a centralized data platform "DATAHUB." This platform consolidates, federates, and enhances massive data assets for various use cases, including reporting, analytics, and machine learning. You will work with multiple data sources, ensuring the seamless integration, transformation, and quality of data, while also migrating Hadoop infrastructure to cloud environments.
Key Responsibilities:
- Data Integration: Integrate data from multiple sources and formats into the Raw Layer of the DATAHUB.
- Data Modeling and Pipeline Development: Model data and develop efficient data pipelines to enrich and transform large volumes of data with complex business rules, automating pipelines and streamlining data ingestion. Design and implement scalable, secure data processing pipelines using Scala, Spark, Hadoop, and cloud object storage.
- Data Transformation and Quality: Implement data transformation and quality control processes to ensure data consistency and accuracy. Utilize programming languages such as Scala and SQL and tools like Spark for data transformation and enrichment operations.
- Scheduling with Airflow: Schedule data processing tasks using Airflow.
- Validation Testing: Write and conduct unit and validation tests to ensure the accuracy and integrity of the code developed.
- CI/CD Pipeline Implementation: Set up CI/CD pipelines to automate deployment, unit testing, and development management.
- Documentation: Write technical documentation (specifications, operational documents) to ensure knowledge capitalization.
- Code Improvement: Understand existing code, modify it in line with business requirements, continuously improve it for better performance and maintainability, and prepare relevant documentation.
- Infrastructure Migration: Migrate the existing Hadoop infrastructure to cloud infrastructure built on Kubernetes Engine, Object Storage, Spark as a service, and Airflow as a service.
- Performance Optimization and Security: Ensure the performance and security of the data infrastructure and follow data engineering best practices.
- Production Support and Maintenance: Contribute to production support, incident and anomaly correction, troubleshoot data-related issues, and implement functional and technical evolutions to ensure the stability of production processes.
- Team Collaboration: Work closely with data squads and business teams to understand data needs and provide tailored solutions.
- Agile Experience: Understanding of Agile principles and rituals.
- Software Development Lifecycle (SDLC) awareness
Reference: 2821504452