Define and develop the specifications and micro-architecture for key NPU modules, including in-memory compute and memory orchestration units. Collaborate with hardware and software teams to optimize AI accelerator performance for workloads like LLMs and CNNs.
EnCharge AI
8 Remote Job Openings at EnCharge AI
The Senior Emulation Engineer will set up and maintain emulation platforms while adapting SoC designs for validation. They will collaborate with design and software teams to debug architectures, optimize workloads, and support early software bring-up.
The Principal DFT Engineer will define and implement the end-to-end DFT architecture for complex SoCs, covering areas such as hierarchical DFT, scan compression, and boundary scan. This role also involves developing strategies for in-system test and power-on self-test to ensure chip health in remote edge data centers.
The Hardware Technical Writer will collaborate with internal engineering teams to gather information for creating technical documents such as design specifications, user manuals, and datasheets for AI hardware. Responsibilities also include reviewing specifications, producing government reports, and translating complex ASIC or VLSI concepts into clear documentation.
The engineer will design and build scalable serving infrastructure for video generation models, owning latency, throughput, reliability, and cost optimization. Key tasks include developing robust customer-facing APIs and SDKs, creating compelling demo applications, and building agentic systems leveraging video generation capabilities.
The AI Compiler Engineer will architect, design, and implement optimizations for AI model execution on graph compilers. They will collaborate with hardware architects and AI researchers to enhance performance and enable efficient model deployment.
The LLM Inference Deployment Engineer will optimize, deploy, and scale large language models for high-performance inference on energy-efficient AI accelerators. Responsibilities include utilizing inference runtimes and optimizing model execution for low-latency AI inference.
The AI Research Engineer will research and develop quantization techniques for deep learning models and implement optimizations for efficient inference algorithms. The role involves close collaboration with hardware engineers to optimize model execution for edge devices.