Artificial Analysis, Inc.

Solutions Engineer — Language Models

Posted a month ago

United States

5-10 years experience

AI Summary

Operate and maintain the Python-based language model benchmarking pipeline, including onboarding new models and validating results. Serve as the primary technical point of contact for AI lab customers to explain methodology and resolve API issues.

About Artificial Analysis

Artificial Analysis is the leading independent AI benchmarking company. We help labs, engineers, and enterprises understand AI capabilities and make critical decisions about their AI strategies. We are the go-to authority on AI for audiences ranging from AI labs and enterprises to media, investors, and policymakers. Our benchmarks don't just measure the cutting edge of AI; they actively shape the frontier.

Our benchmarks and analysis are trusted by hundreds of thousands of users and are the go-to reference for leading AI labs including OpenAI, Google, Meta, NVIDIA and Anthropic, and major publications including the Wall Street Journal, Bloomberg, the Financial Times and The Economist.

We are a team of 35+, on track to triple by year end, backed by Nat Friedman (GitHub, Meta), Daniel Gross (SSI), Andrew Ng (Google Brain, DeepLearning.AI, Amazon), Adam D'Angelo (Quora, Poe, OpenAI), Clem Delangue (Hugging Face) and other industry leaders.

The Opportunity

Artificial Analysis maintains one of the most comprehensive language model benchmarking suites in the industry, evaluating frontier models across quality, speed, and pricing for the AI labs and enterprises that rely on our data.

We're hiring a Solutions Engineer to own the day-to-day operation of our language model benchmarking stack. This is a hands-on, operational role: you'll add new models to our evaluation pipeline, run and debug benchmarks, and serve as the primary technical point of contact for AI lab customers — explaining results, fielding methodology questions, and resolving API endpoint issues over Slack and video calls.

This is not a software engineering role focused on building new systems. It's about running a sophisticated existing stack exceptionally well, consistently and reliably, while being the trusted technical face of Artificial Analysis to our customers.

What You’ll Do

Operate and maintain our Python-based language model benchmarking pipeline end-to-end: onboard new models, configure evaluations, execute benchmark runs, and validate results
Debug issues across the stack — from API endpoint timeouts and errors to unexpected benchmark outputs — and resolve them quickly
Serve as the primary technical contact for AI lab customers: communicate benchmarking results clearly, explain methodology, field technical questions, and troubleshoot integration issues via Slack and video conferencing
Monitor benchmark runs for anomalies, investigate discrepancies, and ensure the accuracy and integrity of published results
Maintain documentation of processes, known issues, and model-specific configurations
Collaborate with the engineering team to flag pipeline improvements and contribute to process refinements
Stay current with new model releases, API changes, and developments across the language model ecosystem

What We’re Looking For

Required:

5+ years of experience in a client-facing technical role — solutions engineering, support engineering, technical consulting, or similar (companies like Stripe, Vercel, Cloudflare, Datadog, Palantir, Accenture, or comparable)
Strong Python proficiency and comfort working with complex codebases you didn't write
Hands-on experience working with AI/ML model APIs (OpenAI, Anthropic, Google, Meta, etc.)
Excellent debugging skills — you can trace issues across APIs, data pipelines, and code
Strong written and verbal English communication skills, with the ability to explain technical concepts clearly to technical stakeholders
Highly responsive and reliable — you take ownership of customer issues and follow through
Comfortable with operational, repeatable work — you find satisfaction in running things well rather than building from scratch
High attention to detail and calm under pressure

Nice to have (not required):

Experience with AI evaluation, benchmarking, or testing methodologies
Familiarity with LLM inference infrastructure (tokenization, latency measurement, throughput metrics)
Experience working in or with AI labs or model providers
Background in B2B SaaS or developer tools

Why Artificial Analysis?

Shape how AI gets built: The leading AI labs track our benchmarks and use them to guide their development priorities. Your work will directly influence the direction of AI.
Become a world expert in AI: You will evaluate every major model, across every major capability, as they are released. Very few roles offer this breadth of exposure to frontier AI.
Work with the most important players in AI: You'll manage relationships with teams at the leading AI labs and major enterprises as a trusted, independent voice.
Join at a defining moment: We're 35+ people and growing fast, backed by some of the most connected investors in AI. The people who join now will shape the product, the team, and the strategy as we scale.
Competitive compensation including equity
Our team is split across San Francisco, Sydney, and Melbourne
