The role involves architecting and implementing optimizations for AI model execution on graph compilers to maximize hardware utilization and reduce latency. You will collaborate with hardware architects and researchers to convert high-level AI models into efficient intermediate representations for inference accelerators.
EnCharge AI
8 Remote Job Openings at EnCharge AI
Define and develop the specifications and micro-architecture for key NPU modules, including in-memory compute and memory orchestration units. Collaborate with hardware and software teams to optimize AI accelerator performance for workloads like LLMs and CNNs.
The Senior Emulation Engineer will set up and maintain emulation platforms while adapting SoC designs for validation. They will collaborate with design and software teams to debug architectures, optimize workloads, and support early software bring-up.
The Principal DFT Engineer will define and implement the end-to-end DFT architecture for complex SoCs, covering areas such as hierarchical DFT, scan compression, and boundary scan. This role also involves developing strategies for in-system test and power-on self-test to ensure chip health in remote edge data centers.
The engineer will design and build scalable serving infrastructure for video generation models, owning latency, throughput, reliability, and cost optimization. Key tasks include developing robust customer-facing APIs and SDKs, creating compelling demo applications, and building agentic systems leveraging video generation capabilities.
The AI Compiler Engineer will architect, design, and implement optimizations for AI model execution on graph compilers. They will collaborate with hardware architects and AI researchers to enhance performance and enable efficient model deployment.
The LLM Inference Deployment Engineer will optimize, deploy, and scale large language models for high-performance inference on energy-efficient AI accelerators. Responsibilities include working with inference runtimes and optimizing model execution for low-latency inference.
The AI Research Engineer will research and develop quantization techniques for deep learning models and implement optimizations for efficient inference algorithms. They will collaborate with hardware engineers to optimize model execution for edge devices.