Data Engineer (ETL, Hadoop, Python, Spark, R) - Banking Client
Posted on Jan 9, 2020 by Salt
Data Engineer (ETL, Hadoop, Python, Spark, R) - Banking Client - Brussels
Duration: 12-month assignment
Rate: €800-€900 per day
The Group Digital Capabilities (GDC) Division ensures competitiveness by delivering reliable and sustainable IT solutions for the financial securities markets.
Our technical teams deliver new IT solutions and improve existing applications for both our internal and external clients. We deploy changes into the production environment in a controlled and structured way that does not compromise production stability, and we provide production support for applications.
Our non-technical people maintain the maturity of IT project delivery with appropriate controls in line with the group's risk appetite, while reducing development and running costs.
Within the Group Digital Capabilities (GDC) Division, the Big Data Analytics team supports the needs for advanced analytics from all the entities of the Group. As a competency centre for analytics, the team helps to transform data into insight using techniques such as text mining, process mining, network analytics or predictive modelling.
Description and what we have to offer - External
The Advanced Analytics team in the Group Digital Capabilities division is currently looking for a new Data Engineer whose core objectives will be to:
- Collect, clean, prepare and load the necessary data - structured or unstructured - onto Hadoop, our Big Data analytics platform, so that they can be used for reporting, creating insights and answering business challenges
- Act as a liaison between the team and other stakeholders, help support the Hadoop cluster and keep all the software that runs on the platform (Spark, R, Python) compatible
- Experiment with new tools and technologies related to data extraction, exploration or processing
- Identify the most appropriate data sources to use for a given purpose and understand their structures and contents, in collaboration with SMEs when required
- Extract structured and unstructured data from the source systems (relational databases, data warehouses, document repositories, file systems), prepare such data (cleanse, re-structure, aggregate) and load them onto Hadoop.
- Actively support reporting teams in the data exploration and data preparation phases. Where data quality issues are detected, liaise with the data supplier to do root cause analysis
- Contribute to the design, build and launch activities
- Ensure the maintenance and support of production applications
- Liaise with Technology Services teams to address infrastructure issues and to ensure that the components and software used on the platform are all consistent
- Experience with understanding and creating data flows, with data architecture, with ETL/ELT development and with processing structured and unstructured data
- Proven experience with using data stored in RDBMSs and experience or good understanding of NoSQL databases
- Ability to write performant SQL statements
- Ability to analyze data, to identify issues like gaps and inconsistencies and to do root cause analysis
- Knowledge of Java
- Experience delivering scripts
- Experience in working with customers to identify and clarify requirements
- Ability to design solutions that are fit for purpose whilst keeping options open for future needs
- Strong verbal and written communication skills, good customer relationship skills
- Have a true agile mindset, capable and willing to take on tasks outside of her/his core competencies to help the team
- Strong technical skills and a strong interest in the financial industry.
The following will be considered assets:
- Knowledge of R, Python, Scala and Spark
- Understanding of the Hadoop ecosystem including Hadoop file formats like Parquet and ORC
- Experience with open source technologies used in Big Data analytics like Spark, Pig, Hive, HBase, Kafka
- Ability to write MapReduce & Spark jobs
- Knowledge of Cloudera
- Knowledge of IBM Mainframe
- Knowledge of Agile development methods such as Scrum