EnCharge AI

Research Engineer, AI Models

Posted a month ago

United States

⭐ 5-10 years experience

AI Summary

Research and implement state-of-the-art techniques to accelerate AI inference and optimize model quality on custom silicon. Build fine-tuning pipelines and benchmarking frameworks to characterize tradeoffs between latency, throughput, and power consumption.

Location: San Francisco, CA (or Remote-friendly with travel)

About EnCharge AI:

EnCharge AI is building the next-generation AI platform. Our novel in-memory-computing architecture delivers a 10x step-function improvement in compute energy efficiency and performance for AI inference workloads. As the demands of artificial intelligence move beyond today's models, we believe the fundamental underlying infrastructure must evolve. We are an experienced team of AI researchers, silicon and systems engineers, and architects backed by leading investors, poised to become the essential platform for the next wave of AI innovation.

The Opportunity:

Modern AI workloads—from large language models to diffusion-based generators to multimodal systems—represent some of the most compute-intensive frontiers in AI, and some of the most promising applications for our hardware’s energy efficiency advantages. We’re building a vertically integrated AI stack that will showcase the transformative potential of our silicon while delivering real value to customers today.

We are seeking a Research Engineer to push the boundaries of AI model quality and efficiency. You’ll build fine-tuning pipelines, develop rigorous benchmarking frameworks, and work at the intersection of ML research and hardware-aware optimization—ensuring our models run beautifully on our silicon.

This is a role for someone who thrives at the boundary between research and engineering. You’ll read papers, implement techniques, and ship production-quality code—all in service of making AI inference faster, cheaper, and better.

Key Responsibilities:

Algorithmic Acceleration: Research and implement state-of-the-art techniques to accelerate AI inference—quantization, sparsity, distillation, speculative decoding, caching strategies, and architectural modifications. Systematically characterize tradeoffs between model quality, latency, throughput, and power consumption to find optimal operating points across different use cases.
Hardware Co-Design: Partner closely with hardware, compiler, and quantization teams to ensure algorithmic improvements translate to real gains on our silicon. Identify optimizations aligned with our architecture's strengths, maximizing throughput while minimizing power. Shape the feedback loop between model development and the hardware roadmap.
Evaluation: Build profiling tools and comprehensive benchmarking frameworks to understand compute bottlenecks, measure model quality across standard and domain-specific evals, and track efficiency metrics. Establish the methodology that informs both algorithmic choices and hardware-software co-design.
Applied Research: Build robust fine-tuning workflows for modern AI models, enabling rapid experimentation with LoRA, adapters, and full fine-tuning. Stay current with the rapidly evolving landscape—evaluate new architectures, implement promising techniques, and contribute insights that inform technical and go-to-market strategy.

Qualifications:

5+ years of experience in ML research, applied ML, or ML systems
Strong fundamentals in Python and PyTorch
Hands-on experience with modern AI models (transformers, diffusion models, or other generative architectures)
Experience fine-tuning large models and building training/evaluation pipelines
Deep understanding of transformers, attention mechanisms, and optimization techniques
Comfort reading and implementing techniques from research papers

Nice to Have:

Experience with efficient inference techniques (KV cache optimization, attention variants, MoE routing, flow matching)
Background in hardware-aware ML optimization or quantization
Familiarity with profiling tools (PyTorch Profiler, Nsight, custom instrumentation)
Publications in generative modeling, efficient inference, or ML systems
Contributions to open-source ML projects
