GPU Acceleration Engineer - Calculation Engine
Massively accelerate the sparse calculation engine of a UK SaaS B2B - Enterprise Planning & Analytics company by porting critical algorithms from Rust/C++ to GPU (CUDA). Transform currently impossible calculations (requiring thousands of years of CPU time) into operations achievable in minutes.
This UK SaaS B2B Enterprise Planning & Analytics company manages planning models reaching 64 quadrillion cells with billions of time periods. Our Hyperblock/Polaris engine is currently limited by:
Legacy CPU architecture (Java/Rust/C++)
Memory constraints on massive sparse structures
Prohibitive calculation times on complex scenarios
Objective: Achieve performance gains of 100x to 1000x via GPU offloading.
Port existing Rust/C++ algorithms to CUDA/GPU
Identify and extract critical calculation paths to accelerate
Optimize sparse matrix operations for GPU architecture
Develop performant Rust ↔ CUDA wrappers
Benchmark and validate performance gains
Design GPU memory management strategies for massive datasets
Implement efficient patterns for sparse structures
Optimize CPU ↔ GPU memory transfers
Manage GPU memory limitations on large-scale calculations
Work with engineering team on integration
Document GPU porting patterns
Participate in code reviews and design reviews
Train the team on GPU best practices
CUDA - Primary GPU development
Rust - Source language for algorithms to port
C++ - Legacy components and CUDA interoperability
(Java - platform context, no dev required)
NVIDIA CUDA (toolkit, libraries: cuBLAS, cuSPARSE)
Rust (ownership model, unsafe blocks, FFI)
GPU Programming (kernels, memory hierarchy, optimization)
Sparse Matrix Operations (compression, storage formats)
Profiling Tools (nvprof, Nsight, perf)
GPU & CUDA (Essential)
✓ Significant CUDA programming experience (3+ years)
✓ Mastery of GPU kernel optimization
✓ Deep knowledge of NVIDIA GPU architecture (memory hierarchy, warps, occupancy)
✓ Experience with sparse calculations on GPU (cuSPARSE or equivalent)
Rust (Essential)
✓ Production Rust development
✓ Mastery of the ownership and borrowing system
✓ Experience with unsafe Rust and FFI (Foreign Function Interface)
✓ Ability to analyze and refactor existing Rust code
C++ (Required)
✓ Modern C++ (C++11/14/17)
✓ C++ ↔ CUDA integration
✓ Templates and metaprogramming (asset)
Algorithms (Required)
✓ Data structures for scientific computing
✓ Sparse matrix algorithms (CSR, COO, etc.)
✓ Performance optimization and profiling
✓ Parallelization and concurrency concepts
🎯 Documented CPU → GPU porting projects
🎯 HPC experience (supercomputers, GPU clusters)
🎯 Memory optimization for large-scale datasets
🎯 Scientific computing or numerical simulation
🎯 Rust interop with other languages (C/C++/Python)
100% remote (France/Europe base preferred)
Occasional travel to London
Frequency: ~1 week/month for team sprints
Project kickoff + key reviews
Intensive collaboration sessions
Start date: As soon as possible