Data Engineering

 Posted 4 hours ago
  
 India
  
5-10 years experience

AI Summary

Design, build, and maintain scalable data pipelines and backend services to support analytics and enterprise data platforms. Implement AI-enabled data solutions including RAG pipelines and LLM integrations.

Role Overview

We are seeking a skilled Data Engineer with strong hands-on experience in ETL development, Databricks, PySpark, Python, cloud services, and data warehousing. The primary responsibility of this role is to design, build, and maintain scalable data pipelines and backend data services that support analytics, reporting, application integrations, and enterprise data platforms.

The ideal candidate should also have backend development experience and practical exposure to modern AI-enabled data solutions, including RAG, LangChain, vector databases, embeddings, and LLM-based applications. Generative AI experience is considered an added advantage and should complement the core data engineering responsibilities.

Key Responsibilities

  • Design, develop, and maintain scalable ETL/ELT pipelines for batch and near-real-time data processing.

  • Build data engineering solutions using Python, SQL, PySpark, Apache Spark, Databricks, Airflow, Matillion, DBT, and related technologies.

  • Develop and optimize Databricks notebooks, jobs, workflows, Spark transformations, and Delta Lake-based processing pipelines.

  • Ingest, transform, validate, and load data from APIs, cloud storage, databases, logs, FTP/SFTP servers, files, and enterprise applications.

  • Work with cloud services across AWS and/or Azure, including storage, compute, serverless processing, monitoring, logging, and managed data services.

  • Build and maintain backend services and APIs using Python frameworks such as Flask, FastAPI, or similar technologies to expose data, trigger pipelines, and support downstream applications.

  • Design and support data models, curated datasets, warehouse tables, and lakehouse layers for analytics, reporting, operational dashboards, and AI-driven use cases.

  • Work with modern data warehouses and databases such as Snowflake, Amazon Redshift, PostgreSQL, SQL Server, MySQL, Oracle, OpenSearch, or similar platforms.

  • Implement data quality checks, logging, monitoring, alerting, exception handling, and pipeline failure recovery mechanisms.

  • Collaborate with data analysts, data scientists, backend developers, cloud engineers, and business stakeholders to deliver reliable and production-ready data solutions.

  • Support GenAI-enabled data use cases where required, including RAG pipelines, document ingestion, embedding generation, vector search, and LangChain-based workflows.

  • Assist in integrating enterprise data with LLM applications while ensuring proper metadata filtering, access controls, tenant isolation, and grounded response generation.

  • Participate in CI/CD, version control, deployment, and production support activities using Git, GitHub, Jenkins, Docker, CodePipeline, ECS, or similar tools.

Required Qualifications

  • 6 years of experience in data engineering, ETL development, backend data services, or cloud data platforms.

  • Strong hands-on experience with Python, SQL, PySpark, Apache Spark, and ETL/ELT pipeline development.

  • Practical experience with Databricks Data Engineering, including notebooks, jobs, workflows, Spark jobs, and scalable transformation pipelines.

  • Experience working with cloud platforms such as AWS or Azure.

  • Experience with cloud services such as AWS S3, Glue, Lambda, EMR, Redshift, RDS, ECS, EC2, SQS, Azure Blob Storage, ADLS, Cosmos DB, Azure AI Search, Key Vault, Application Insights, or Log Analytics.

  • Strong understanding of data warehousing concepts, dimensional modeling, data marts, curated layers, and warehouse performance optimization.

  • Experience working with data warehouses and databases such as Snowflake, Redshift, PostgreSQL, SQL Server, MySQL, Oracle, OpenSearch, or similar systems.

  • Experience building backend services or APIs using Python, Flask, FastAPI, or similar backend frameworks.

  • Good understanding of data ingestion patterns, file formats, data validation, schema handling, metadata management, and pipeline orchestration.

  • Experience with workflow orchestration and transformation tools such as Airflow, Databricks Workflows, Matillion, or DBT.

  • Ability to troubleshoot production data issues, optimize Spark jobs, tune SQL queries, and improve pipeline performance.

  • Strong documentation, communication, and cross-functional collaboration skills.

GenAI / AI Add-On Skills

  • Working knowledge of RAG architecture, including document ingestion, chunking, embeddings, retrieval, and response generation.

  • Experience with or exposure to LangChain, prompt orchestration, LLM integration, and AI-powered search workflows.

  • Familiarity with vector databases or vector search platforms such as Azure AI Search, OpenSearch, ChromaDB, FAISS, Pinecone, Weaviate, Milvus, or similar tools.

  • Understanding of how structured, semi-structured, and unstructured data can be prepared and indexed for GenAI applications.

  • Exposure to LLM platforms such as OpenAI, Azure OpenAI, Claude, AWS Bedrock, or Hugging Face.

  • Ability to support AI applications from a data engineering perspective by preparing, securing, indexing, and retrieving enterprise data.

Preferred Qualifications

  • Experience building production-grade data pipelines that process large volumes of records.

  • Experience with Delta Lake, Medallion Architecture, data lakehouse design, and Databricks performance optimization.

  • Experience with backend API development for data access, pipeline triggering, metadata management, or analytics integration.

  • Exposure to enterprise search, document intelligence, log analytics, observability, or security data platforms.

  • Experience with containerized deployments using Docker, AWS ECS, Kubernetes, or similar cloud-native services.

  • Familiarity with CI/CD pipelines, automated deployments, release documentation, and production support practices.

Technical Skills

  • Primary Skills: Python, SQL, ETL/ELT, PySpark, Apache Spark, Databricks, Cloud Services, Data Warehousing

  • Data Engineering: Airflow, Matillion, DBT, Delta Lake, Data Pipelines, Data Quality, Batch Processing, Orchestration

  • Cloud: AWS, Azure, Snowflake

  • Data Platforms: Snowflake, Redshift, PostgreSQL, SQL Server, MySQL, Oracle, OpenSearch

  • Backend: Flask, FastAPI, REST APIs, Python Services, API Integration

  • GenAI Add-On: RAG, LangChain, Vector Databases, Embeddings, LLM Integration, Prompt Engineering

  • DevOps: Git, GitHub, Jenkins, Docker, CodePipeline, UCD, ECS

Education

  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, Artificial Intelligence, Machine Learning, or a related technical field.


At Zensar, we’re “experience-led everything”. We are committed to conceptualizing, designing, engineering, marketing, and managing digital solutions and experiences for over 130 leading enterprises. We are a company driven by a bold purpose: Together, we shape experiences for better futures. Whether for our clients, our people, or the world around us, this belief powers everything we do. At the heart of our culture is ONE with Client - a set of four core values that reflect who we are and how we work: One Zensar, Nurturing, Empowering, and Client Focus.

Part of the $4.8 billion RPG Group, we’re a community of 10,000+ innovators across 30+ global locations, including Milpitas, Seattle, Princeton, Cape Town, London, Zurich, Singapore, and Mexico City. Explore Life at Zensar and join us to Grow. Own. Achieve. Learn. to be the best version of yourself.

We believe the best work happens when individuality is celebrated, growth is encouraged, and well-being is prioritized. We are an equal employment opportunity (EEO) and affirmative action employer, committed to creating an inclusive workplace. All qualified applicants will be considered without regard to race, creed, color, ancestry, religion, sex, national origin, citizenship, age, sexual orientation, gender identity, disability, marital status, family medical leave status, or protected veteran status.
