About Power Factors
Power Factors is accelerating the green energy transition by providing advanced analytics and AI insights to operators of renewable energy assets. Our SaaS platforms are used to manage over 250 GW of wind, solar, hydro, and energy storage projects globally. By driving down operational costs and increasing revenue, we are tackling one of the world's most important challenges: making renewable energy the world's leading source of power. Our vision is to create a sustainable world powered by renewable energy. Our mission is to fight climate change with code.
We are looking for a Senior Data Engineer to join the Innovation team as a core member of the PF-LLM programme — our initiative to build a from-scratch multivariate time-series foundation model across a fleet of ~1,000 wind and PV sites. You will be the connective tissue of the entire programme: owning the data foundation that makes state-of-the-art model training possible, the inference service that makes model outputs usable, and the platform integration that puts those outputs in front of pilot customers. From production ETL through to shadow-mode validation pipelines, you will be the engineer who keeps every track moving. This role is critical-path from day one.
The Role — What You'll Do
Data Foundation
- Design and build the production ETL pipeline from source systems to warehouse and feature store at fleet scale, covering the fleet of ~1,000 wind and PV sites across multiple OEMs.
- Own canonical signal schema design across wind and PV asset classes and OEMs — the deepest technical unknown in the programme and the foundation everything else depends on.
- Implement automated data quality gates: sparsity and missingness checks, flatline detection, outlier flagging, and freshness validation, with alerting that generates tickets automatically.
- Implement dataset versioning sufficient to reproduce every trained model from scratch.
- Build and maintain backfill jobs, idempotency guarantees, and retry logic that survive mid-run failure without duplicating data.
- Govern storage and compute costs on the warehouse from day one.
Inference Service & API
- Build the batch and on-demand inference API with contract tests, sized for fleet-wide daily runs.
- Establish latency and throughput baselines; own the cold-start and model-loading strategy.
- Instrument the service with structured logs and metrics from the outset.
Platform Integration
- Integrate forecasts into the Power Factors product platform: authentication and authorisation with customer isolation, observability hooked into the existing stack, and feature flags per customer and per site.
- Build and maintain the shadow validation pipeline: run live inference in parallel with the existing forecast path, log predictions and actuals, and produce weekly validation reports broken down by asset class, OEM, and region.
- Support the pilot customer rollout: enable the product for friendly customers behind feature flags and own incoming data and integration tickets throughout the pilot window.
Collaboration & Documentation
- Work closely with the ML Engineer to align on data quality requirements, feature store interfaces, and the handoff between the data platform and training pipeline.
- Partner with the Tech Lead and Frontend Engineer during platform integration to ensure a clean, maintainable integration surface.
- Contribute to architectural decisions across the programme and document data flows, schemas, and pipeline runbooks to a standard that supports the broader team.
Must-Have Qualifications
- 6+ years of back-end and data engineering experience, with a proven track record of shipping production systems.
- Production-grade ETL/ELT pipeline design at scale: idempotency, retry logic, backfill jobs, incremental loading, and cost-controlled warehouse compute.
- Schema design and data modelling across heterogeneous sources — experience reconciling signals from disparate systems into a canonical, queryable format.
- Data quality engineering: automated quality gates (sparsity, flatline detection, outlier flagging, freshness checks), alerting pipelines, and dataset versioning for ML reproducibility.
- API design and development: RESTful inference services with contract testing, latency and throughput budgeting, and structured observability (logs, metrics, traces).
- Experience integrating ML model outputs into SaaS product surfaces: authentication and authorisation, customer isolation, and feature flag management.
- Cloud infrastructure proficiency (AWS preferred), containerisation (Docker, Kubernetes), and CI/CD pipeline ownership.
- Python and SQL as core tools; hands-on experience with modern warehouse technologies (Snowflake, BigQuery, or Databricks).
- Pipeline orchestration with Airflow, Prefect, Dagster, or equivalent.
- Excellent written and verbal communication skills in English.
Beneficial Qualifications
- Experience with time-series, IoT, or industrial sensor data (SCADA systems, irregular sampling, high-missingness signals) — a significant advantage.
- Familiarity with streaming data platforms (Kafka, Kinesis, or Pub/Sub) for real-time or near-real-time ingestion.
- Experience designing and managing feature stores for ML training and serving.
- Renewable energy domain knowledge: understanding of wind and solar asset operations and the data they produce.
- Experience standing up shadow-mode or A/B comparison pipelines for ML systems — running live inference in parallel with an existing path and logging predictions against actuals.
- Multi-tenancy and platform integration experience in B2B SaaS products.
- Knowledge of data lake architectures, open table formats (Delta Lake, Iceberg), and columnar file formats (Parquet).
- Familiarity with MLOps practices: model registry conventions, retraining triggers, and drift monitoring.
What We Offer
- Comprehensive benefits package including health, dental, and vision coverage, plus dedicated wellness support.
- Generous paid vacation policy.
- Employer RRSP matching program.
- Work-from-abroad opportunities with manager approval.
- Exposure to a global team operating across multiple countries and time zones.
- A humble cause with a clear purpose — you will help us fight climate change with code every day at work.
At AppDirect, we believe that innovation thrives in an environment that houses diversity of excellence, experience, and thought. We respect each AppDirector as their own fingerprint: unique, with no two alike. We foster an environment of inclusion without regard to race, religion, age, sexual orientation, or gender identity, enabling AppDirectors to embrace their uniqueness and do their best work. As such, we strongly encourage applications from Indigenous peoples, racialized people, people with disabilities, people from gender and sexually diverse communities, and/or people with intersectional identities.
By applying to this role, you acknowledge that your application information — including your resume, contact details, and any materials you submit — may be shared with our client (the hiring organization) for the purpose of evaluating your candidacy. We act as a recruiting partner on behalf of this client. Your information will be used solely in connection with this opportunity and handled in accordance with applicable privacy laws.
At AppDirect, we take privacy very seriously. For more information about our use and handling of personal data from job applicants, please read our Candidate Privacy Policy. For more information on our general privacy practices, please see the AppDirect Privacy Notice: https://www.appdirect.com/about/privacy-notice