Design and operate large-scale web crawlers and high-throughput ingestion pipelines to acquire high-quality pre-training data. Collaborate with research teams to align data acquisition strategies with the needs of frontier LLMs for software development.
Poolside
7 Remote Job Openings at Poolside
You will serve as the primary point of contact for customers, troubleshooting and resolving complex technical issues across SaaS and on-prem deployments. Additionally, you will create documentation and tools to help the support team scale while bridging the gap between customers and engineering teams.
Design and implement infrastructure and tooling for researchers and engineers to evaluate base and instruction-following models. Collaborate with research and product teams to define meaningful metrics that measure progress on real-world software development skills.
Design and implement a scalable, self-serve evaluation platform to support research and development. Maintain distributed evaluation pipelines and collaborate with modeling and product teams to improve experimentation tooling.
Member of Engineering (Pre-training / Data Engineering)
Poolside · Full Time · 5 months ago
You will be responsible for building and scaling the Model Factory, which involves architecting and maintaining high-performance pipelines for processing large datasets. Your mission includes delivering diverse and high-quality datasets for training models and collaborating with various teams to ensure model quality.
You will work on the data team to improve the quality of pretraining datasets and generate synthetic data at scale. You will also collaborate with other teams to define what high-quality data is needed and to stay aligned on model capabilities.
Member of Engineering (Pre-training and inference fault tolerance)
Poolside · Full Time · 8 months ago
Identify and troubleshoot hardware problems during training at scale while minimizing GPU idle time. Design and develop tools to accelerate training recovery and improve checkpointing performance.