Reka

Member of Technical Staff - Data Intelligence

Posted 3 months ago

Singapore, United Kingdom, United States

⭐ 5-10 years experience

AI Summary

Collaborate with researchers to define data quality metrics and build scalable processing pipelines for training World Models. Manage the data stack's CI/CD, track dataset provenance, and optimize compute utilization across pipelines.

In this role, you’ll work closely with model researchers, data infrastructure engineers, and cross-functional partners to make sure our data is high quality and can be produced at petabyte scale in a reliable, efficient way. From understanding how data choices show up in model behavior, to building processing pipelines and running the compute behind them, you’ll help ensure our models are trained on the best data we can get.

What you’ll do

Work with model researchers to define what “good data” means for our models, including quality metrics, validation checks, and acceptance thresholds
Explore open-source datasets and create internal ones best suited to building fundamental World Models
Build algorithms for automated data quality assessment, data domain mixtures, and domain adaptation from synthetic to real data
Track datasets, metadata, provenance, and versions so experiments are reproducible and it’s clear what data went into which training and evaluation runs
Own CI/CD and development tooling for the data stack (GitHub, Python, PyTorch), and automate repetitive workflows to reduce friction
Track and optimize throughput, storage, and compute utilization across pipelines and related assets

What we’re looking for

Strong ML and deep learning fundamentals with experience building and operating large-scale data and/or compute systems
Comfortable moving between research questions and production engineering: you can dig into data, run analyses, and also ship reliable systems
Demonstrated research experience with data composition, data quality, and dataset releases
Ability to design and execute experiments that produce convincing, unbiased outcomes
Practical experience with distributed processing and orchestration (Spark, Ray, Airflow, or equivalents)
Solid Python skills, and familiarity with the tooling around modern model training workflows (datasets, checkpoints, experiment tracking)
Strong instincts around data quality: how to measure it, how to monitor it, and how to prevent regressions as things scale
Able to work in a fast-moving environment, prioritize what matters, and communicate clearly with both researchers and engineers
Bonus: experience with large video datasets, dataset curation for training, or building internal tooling for evaluation/analysis in ML environments

Reka

🧑‍💻 Employees: 11-50
🏢 Industry: Technology, Information and Internet
