Machine Learning Engineer - Pre Training

Posted 3 months ago
$150K - $190K per year
2-5 years experience

About Mindbeam

We are building the next-generation AI infrastructure for open source and enterprise. Our work is deeply research-oriented, and we are passionate about developing groundbreaking innovations that take state-of-the-art AI applications to the next level.

What drives us is not only advancing technology, but empowering the people behind it. We are a community of researchers, engineers, and visionaries who believe that collaboration, curiosity, and openness fuel progress. If you’re motivated by impact and inspired to build tools that others can build upon, you’ll be in the right place.

Mission

Design and optimize large-scale pre-training systems that power Mindbeam’s generative AI models.

Role Expectations

• Build scalable pre-training pipelines for foundation models, optimizing throughput and efficiency.

• Implement distributed training strategies across GPUs/TPUs and high-performance clusters.

• Collaborate with researchers to translate experimental setups into production-ready workflows.

• Develop monitoring and fault-tolerance systems to ensure reliable large-scale training.

• Continuously benchmark and tune performance across hardware and software stacks.

Background

• Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field, or equivalent experience.

• 2+ years of experience with large-scale model training and distributed systems.

• Strong coding skills in Python and familiarity with ML frameworks (PyTorch, TensorFlow, JAX).

• Experience with GPU scheduling, memory optimization, and parallelism strategies.

• Comfort with containerized and orchestrated environments (Docker/Kubernetes).

• Understanding of high-performance computing and networking bottlenecks.

About You

You thrive on scale and complexity. You enjoy solving system-level bottlenecks, pushing hardware and software to their limits, and working closely with researchers to accelerate cutting-edge AI development.
