Stellus Rx

Senior Data Engineer

Posted 4 months ago

Peru

⭐ 2-5 years experience

AI Summary

The Senior Data Engineer will develop and maintain large-scale data processing systems using AI-assisted tools to automate pipelines and improve data quality. They will collaborate with data scientists and stakeholders to build infrastructure that supports analytics, machine learning, and business decision-making.

Job Summary

We're opening eyes, hearts and minds to the impact that a pharmacy team can have in changing lives.

Join our group of talented, committed team members (pharmacists, pharmacy care coordinators, technologists, product strategists and more) to create and expand the delivery of personalized health support that people didn't even know could be possible.

The Senior Data Engineer for Stellus Rx will be a key member of our Technology Team, working closely with Stellus Rx leaders and across the organization to unlock the health of millions of Americans. We are a culture that is unabashedly driven by purpose — making a difference to patients and team members while growing at an accelerated rate.

This role is built for a data engineer who uses AI as an active part of their workflow — accelerating pipeline development, automating data quality processes, and enabling richer, faster insights across our Cloud Analytics Data Platform rather than relying on manual, repetitive engineering approaches.

Role and Responsibilities:

AI-Augmented Pipeline Development & Automation

Develop, construct, and maintain large-scale data processing systems that collect data from a variety of structured and unstructured sources — using AI code generation tools to accelerate pipeline authoring, reduce boilerplate, and improve code quality.
Build and optimize ELT pipelines using AI-assisted tooling to identify bottlenecks, suggest optimizations, and automate routine pipeline maintenance tasks.
Identify, design, and implement internal process improvements: use AI to automate manual processes, optimize data delivery, and re-design infrastructure for greater scalability — replacing manual analysis with AI-driven discovery of improvement opportunities.
Build the infrastructure required for optimal extraction, transformation, and loading of data from various sources; use AI to accelerate infrastructure-as-code authoring and configuration.

AI-Ready Data Preparation & ML Enablement

Prepare data for data scientist exploration and discovery using AI-assisted data profiling and quality assessment tools — surfacing anomalies, schema drift, and data gaps faster than manual inspection allows.
Perform data wrangling and munging for downstream analytics and machine learning; leverage AI tools to generate and validate transformation logic against business rules.
Assemble large, complex datasets that meet functional and non-functional business requirements; use AI to rapidly evaluate dimensional modeling approaches and ontology alignment strategies.
Enable large-scale machine learning by designing and maintaining annotated datasets, elastic search approaches, and scalable data lake structures that support AI/ML workloads.

Analytics Pipeline & Insight Generation

Create and maintain analytics pipelines that generate data and insight to power business decision-making; use AI-assisted analysis to proactively surface trends, anomalies, and opportunities within pipeline outputs.
Collaborate with data scientists, analysts, and business stakeholders on requirements for dimensional modeling, distributed ETL pipelines, and cross-repository data migration.
Evaluate, compare, and improve design patterns, data lifecycle approaches, and data ontology alignment — using AI to model trade-offs and accelerate proof-of-concept validation.
Work with data and analytics experts to continuously improve the functionality, reliability, and intelligence of data systems.

Root Cause Analysis & Quality Management

Perform root cause analysis on internal and external data and processes using AI-assisted investigation tools — replacing slow, manual log and lineage review with faster, AI-accelerated diagnostics.
Develop and maintain data quality frameworks; use AI to automate anomaly detection, schema validation, and data contract enforcement across pipelines.
Develop a strong understanding of company domains, strategic direction, and user needs to ensure data systems are aligned to business outcomes, not just technical requirements.

Qualifications and Requirements:

4+ years of experience in a Data Engineer role.
Graduate degree in Computer Science, Statistics, Informatics, Information Systems, or another quantitative field.
Advanced SQL knowledge and experience with relational databases and query authoring.
Required: Demonstrated, hands-on experience using AI tools to accelerate data engineering tasks — pipeline development, data quality automation, code generation, or root cause analysis — with specific examples you can speak to.
Experience building and optimizing data pipelines, architectures, and datasets.
Strong analytic skills working with unstructured and disconnected datasets.
Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational and NoSQL databases including Postgres and Cassandra.
Experience with pipeline and workflow management tools: Airflow, Luigi, Azkaban, or similar.
Experience with AWS cloud services: EC2, EMR, RDS, Redshift.
Experience with stream-processing systems: Storm, Spark Streaming, or similar.
Working knowledge of message queuing, stream processing, and highly scalable data stores.
Proficiency in object-oriented/scripting languages: Python, Java, Scala, C++, or similar.
Experience supporting cross-functional teams in dynamic, agile environments.

Preferred Experience:

Experience designing or supporting data infrastructure for AI/ML model training, including annotated datasets and feature stores.
Familiarity with AI-assisted data quality or observability platforms (e.g., Monte Carlo, Soda, or similar).
Experience with LLM-based data processing pipelines or retrieval-augmented generation (RAG) architectures.
Healthcare data experience; familiarity with FHIR/HL7 standards a plus.
High English proficiency.

Location

Lima, Lima (Remote)

Department

1420 - Decision Science

Employment Type

Full Time

Minimum Experience

Experienced
