Senior Data Engineer

 Posted 2 months ago
  
 Peru
  
2-5 years experience

AI Summary

The Senior Data Engineer will develop and maintain large-scale data processing systems using AI-assisted tools to automate pipelines and improve data quality. They will collaborate with data scientists and stakeholders to build infrastructure that supports analytics, machine learning, and business decision-making.

Job Summary

We're opening eyes, hearts and minds to the impact that a pharmacy team can have in changing lives.

 

Join our group of talented, committed team members (pharmacists, pharmacy care coordinators, technologists, product strategists and more) to create and expand the delivery of personalized health support that people didn't even know could be possible.


The Senior Data Engineer for Stellus Rx will be a key member of our Technology Team, working closely with Stellus Rx leaders and across the organization to unlock the health of millions of Americans. Our culture is unabashedly driven by purpose: making a difference to patients and team members while growing at an accelerated rate.


This role is built for a data engineer who uses AI as an active part of their workflow — accelerating pipeline development, automating data quality processes, and enabling richer, faster insights across our Cloud Analytics Data Platform rather than relying on manual, repetitive engineering approaches.


Role and Responsibilities:

AI-Augmented Pipeline Development & Automation

  • Develop, construct, and maintain large-scale data processing systems that collect data from a variety of structured and unstructured sources — using AI code generation tools to accelerate pipeline authoring, reduce boilerplate, and improve code quality.
  • Build and optimize ELT pipelines using AI-assisted tooling to identify bottlenecks, suggest optimizations, and automate routine pipeline maintenance tasks.
  • Identify, design, and implement internal process improvements: use AI to automate manual processes, optimize data delivery, and re-design infrastructure for greater scalability — replacing manual analysis with AI-driven discovery of improvement opportunities.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from various sources; use AI to accelerate infrastructure-as-code authoring and configuration.


AI-Ready Data Preparation & ML Enablement

  • Prepare data for data scientist exploration and discovery using AI-assisted data profiling and quality assessment tools — surfacing anomalies, schema drift, and data gaps faster than manual inspection allows.
  • Perform data wrangling and munging for downstream analytics and machine learning; leverage AI tools to generate and validate transformation logic against business rules.
  • Assemble large, complex datasets that meet functional and non-functional business requirements; use AI to rapidly evaluate dimensional modeling approaches and ontology alignment strategies.
  • Enable large-scale machine learning by designing and maintaining annotated datasets, elastic search approaches, and scalable data lake structures that support AI/ML workloads.


Analytics Pipeline & Insight Generation

  • Create and maintain analytics pipelines that generate data and insight to power business decision-making; use AI-assisted analysis to proactively surface trends, anomalies, and opportunities within pipeline outputs.
  • Collaborate with data scientists, analysts, and business stakeholders on requirements for dimensional modeling, distributed ETL pipelines, and cross-repository data migration.
  • Evaluate, compare, and improve design patterns, data lifecycle approaches, and data ontology alignment — using AI to model trade-offs and accelerate proof-of-concept validation.
  • Work with data and analytics experts to continuously improve the functionality, reliability, and intelligence of data systems.


Root Cause Analysis & Quality Management

  • Perform root cause analysis on internal and external data and processes using AI-assisted investigation tools — replacing slow, manual log and lineage review with faster, AI-accelerated diagnostics.
  • Develop and maintain data quality frameworks; use AI to automate anomaly detection, schema validation, and data contract enforcement across pipelines.
  • Develop a strong understanding of company domains, strategic direction, and user needs to ensure data systems are aligned to business outcomes, not just technical requirements.
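
As a rough illustration of the automated anomaly detection named in the quality-management responsibilities above, the sketch below flags outliers in a pipeline metric such as daily row counts. It is only one possible approach, using a median-based (modified z-score) check from the Python standard library. The function name `flag_anomalies`, the `threshold` default of 3.5, and the sample values are assumptions made for illustration, not part of the Stellus Rx platform.

```python
from statistics import median


def flag_anomalies(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return the indexes of values that look anomalous.

    Hypothetical helper: uses the modified z-score, which is based on the
    median absolute deviation (MAD) and is less sensitive to extreme values
    than a mean/standard-deviation check.
    """
    if not values:
        return []

    med = median(values)
    # Median absolute deviation: typical distance of a value from the median.
    mad = median([abs(v - med) for v in values])

    if mad == 0:
        # At least half the values equal the median; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]

    # 0.6745 scales the MAD so the score is comparable to a standard z-score
    # for normally distributed data.
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Illustrative daily row counts (made-up numbers); the last load is suspicious.
daily_row_counts = [100, 102, 98, 101, 99, 500]
suspicious_days = flag_anomalies(daily_row_counts)
```

In practice a check like this would typically run as a step in the pipeline and alert on the flagged indexes; the threshold would need tuning against real data.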


Qualifications and Requirements:

  • 4+ years of experience in a Data Engineer role.
  • Graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field.
  • Advanced SQL knowledge and experience with relational databases and query authoring.
  • Required: Demonstrated, hands-on experience using AI tools to accelerate data engineering tasks — pipeline development, data quality automation, code generation, or root cause analysis — with specific examples you can speak to.
  • Experience building and optimizing data pipelines, architectures, and datasets.
  • Strong analytical skills for working with unstructured and disconnected datasets.
  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with relational and NoSQL databases including Postgres and Cassandra.
  • Experience with pipeline and workflow management tools: Airflow, Luigi, Azkaban, or similar.
  • Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
  • Experience with stream-processing systems: Storm, Spark Streaming, or similar.
  • Working knowledge of message queuing, stream processing, and highly scalable data stores.
  • Proficiency in object-oriented/scripting languages: Python, Java, Scala, C++, or similar.
  • Experience supporting cross-functional teams in dynamic, agile environments.


Preferred Experience:

  • Experience designing or supporting data infrastructure for AI/ML model training, including annotated datasets and feature stores.
  • Familiarity with AI-assisted data quality or observability platforms (e.g., Monte Carlo, Soda, or similar).
  • Experience with LLM-based data processing pipelines or retrieval-augmented generation (RAG) architectures.
  • Healthcare data experience; familiarity with FHIR/HL7 standards a plus.
  • High English proficiency.


Location

Lima, Lima (Remote)


Department

1420 - Decision Science


Employment Type

Full Time


Minimum Experience

Experienced

