Data Engineer

 Posted 15 days ago
     
2-5 years experience

AI Summary

Design, build, and maintain scalable data pipelines and ETL processes using Databricks and Apache Spark. Collaborate with data scientists to implement Lakehouse architecture and ensure data quality and security.

At Capgemini Engineering, the world leader in engineering services, we bring together a global team of engineers, scientists, and architects to help the world’s most innovative companies unleash their potential. From autonomous cars to life-saving robots, our digital and software technology experts think outside the box as they provide unique R&D and engineering services across all industries. Join us for a career full of opportunities. Where you can make a difference. Where no two days are the same.

Job Description

Key Responsibilities: 

  • Design, build, and maintain data pipelines and ETL processes using Databricks and Apache Spark.
  • Optimize data workflows for performance, scalability, and cost efficiency.
  • Implement a data Lakehouse architecture and manage data ingestion from multiple sources.
  • Collaborate with data scientists and analysts to enable advanced analytics and machine learning workloads.
  • Ensure data quality, governance, and security across all data assets.
  • Monitor and troubleshoot Databricks clusters, jobs, and workflows.
  • Integrate Databricks with cloud services (AWS, Azure, or GCP) and other enterprise systems.
  • Document processes, standards, and best practices for data engineering.

 

Required Skills & Qualifications: 

  • Bachelor’s degree in Computer Science, Data Engineering, or related field.
  • 3+ years of experience in data engineering or big data technologies.
  • Hands-on experience with Databricks, Apache Spark, and PySpark.
  • Strong knowledge of SQL, Python, and data modeling principles.
  • Experience with cloud platforms (AWS, Azure, or GCP) and their data services.
  • Familiarity with Delta Lake, Lakehouse architecture, and data governance.
  • Understanding of CI/CD pipelines and DevOps practices for data workflows.
  • Excellent problem-solving and communication skills.

 

Nice to Have:

  • Experience with Delta Lake and Lakehouse architecture.
  • Knowledge of Azure Data Factory (ADF) for orchestration.
  • Exposure to CI/CD pipelines for data engineering.
  • Experience with SQL and Spark SQL.
  • Azure certifications related to data engineering.

 

Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With a strong heritage of over 55 years, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions, leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud, and data, combined with its deep industry expertise and partner ecosystem.
