ProblemShared

Analytics Engineer / Data Engineer

Posted 9 hours ago

United Kingdom

£60,000 - £80,000 per year

⭐ 5-10 years experience


AI Summary

Build the bridge between raw source data and the dbt layer using medallion architecture to ensure analysts have usable data. Responsibilities include modelling bronze source data into silver tables and replacing legacy reporting pipelines with code-first generators.

*Please be aware although we are a remote organisation we do require candidates to reside in the UK.*

Salary: £60,000 - £80,000

Work pattern: Full time

Reporting to: Head of Analytics

Location: Remote


About Us: ProblemShared is a practitioner-led, CQC-regulated digital mind-health provider. We assess and treat adults and young people with ADHD, autism, and specific learning differences, and we partner with NHS trusts, integrated care boards, universities, insurers, and private healthcare providers. The company was founded by clinicians who watched the NHS waitlist problem from inside an A&E and decided to do something about it.

Our data mission: We believe in “data for good”, meaning that good data work transforms how neurodivergent people understand themselves, how practitioners sharpen their craft, and how teams design better support. The stack we work in: Fivetran → Databricks (Unity Catalog) → dbt → semantic layer → Power BI and R/Quarto for partner-grade reporting. Mature in some places, still being built in others. You’d help us close the gaps.

What makes us different

Tool-agnostic, outcome-driven. We care more about a candidate’s mental agility and generalisable knowledge than tool-specific experience.
Neurodiversity-welcoming. Potential over conformity. A meaningful share of our colleagues are neurodivergent. Adjustments to the process are normal; you don’t have to explain why.
Fully remote, flexibly. Work where you do your best thinking. If you’re in London, some of us meet up for coworking once a week, but it’s not mandated.
Psychoeducation, prescribing, educational navigation, and therapy sessions all under one roof.

About the role
You build the bridge between raw source data and the dbt layer the rest of the team builds on, working across the medallion architecture. Your craft is modelling raw, messy, real-world source data (Semble, Zendesk, CMS, finance systems) through the medallion layers (bronze, silver, gold) and ensuring our analysts get data they can use. Our team already includes a senior data engineer (focussed on ingestion and transformation) and a senior analytics engineer (focussed on dbt access control and restructuring the data into analyst-ready schemas); you will deputise for them both to increase capacity and reduce key-person risk. You’ll stay close enough to our analysts that you can tell when a schema decision is going to cause pain three weeks later. The role builds the conduit between raw data and analyst-ready data and ensures that conduit is maintainable and governable.

Why this role matters
Half our most-used source systems land in shapes that are almost-but-not-quite usable. Bronze tables that look like APIs, prescriptions scattered across five systems, person records that don’t quite match. Your work turns those into the silver and gold foundations the rest of the data team builds on — which means every dashboard, every report, and every clinical-quality indicator downstream gets to start from solid ground.

What you will do

Model bronze source data into silver tables for analytics consumption. The headline first project is Semble, which currently lands in a heavily normalised form that’s not ready to be analysed. An old process (Power BI bolted onto complicated exports) provides an illustration of the business logic to be applied in the restructure, but you’ll have the agency to throw that old process away and build something better.
Build upon our cross-system reconciliations for entities that span our stack: prescriptions across Semble, CMS, pharmacy and EMIS; patient identities fuzzy-joined across CMS, NHS, and insurer sources. Data needs to be reconfigured from “where is it from” to “what is it about”, so that our analysts can answer business questions.
Replace a VBA-stitched partner-reporting PowerPoint pipeline with a code-first generator.
Build automated data-quality alerting on referrals and key data feeds. Slack-routable, with a clear ownership trail.
Author new dbt models in your own schema, in separate MRs, reviewed by the Senior Analytics Engineer.
Coach the analysts on engineering skill and discipline — when to use a CTE, when not to, how to use the data platform. We need our analysts to understand engineering and our engineers to understand analytics, so peer teaching and support is a team practice.
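
To make the cross-system reconciliation work above concrete, here is a minimal sketch in plain Python of a fuzzy join on person records, with a match rate that could serve as a baseline. It is an illustration only, not ProblemShared's method: the record fields, the sample rows, the date-of-birth blocking rule, and the 0.8 threshold are all assumptions made for the example, and a production version would more likely run in PySpark or dbt.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Similarity of two names in [0, 1] after normalisation."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_match(left, right, threshold=0.8):
    """Pair each left record with its best-scoring right record.

    Candidates are limited to records sharing a date of birth (a simple
    blocking rule, assumed here); a pair is kept only if the name
    similarity reaches the threshold.
    """
    matches = {}
    for l_rec in left:
        candidates = [r for r in right if r["dob"] == l_rec["dob"]]
        best = max(
            candidates,
            key=lambda r: similarity(l_rec["name"], r["name"]),
            default=None,
        )
        if best is not None and similarity(l_rec["name"], best["name"]) >= threshold:
            matches[l_rec["id"]] = best["id"]
    return matches


def match_rate(left, matches) -> float:
    """Share of left-hand records that found a partner."""
    return len(matches) / len(left) if left else 0.0


# Invented sample rows standing in for two source systems.
cms = [
    {"id": "cms-1", "name": "Alex  Morgan", "dob": "1990-04-12"},
    {"id": "cms-2", "name": "Sam Patel", "dob": "1985-11-02"},
    {"id": "cms-3", "name": "Jo Lee", "dob": "2001-01-30"},
]
insurer = [
    {"id": "ins-9", "name": "alex morgan", "dob": "1990-04-12"},
    {"id": "ins-7", "name": "Samuel Patel", "dob": "1985-11-02"},
]

matches = fuzzy_match(cms, insurer)
rate = match_rate(cms, matches)
print(matches, round(rate, 3))
```

Recording the match rate alongside the threshold that produced it makes later changes to the matching logic comparable against a known starting point.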

What good looks like at 90 days

Semble silver model in production, with at least one downstream dashboard repointed to it.
One cross-system reconciliation (prescriptions or person-key) shipped, with a documented match-rate baseline.
Our contract-reporting pipeline either replaced or with a credible plan to replace it.
You’ve reviewed a non-CS analyst’s SQL at least once in writing, and they’ve shipped the result.

What you bring – Must have

Strong in Python (incl. PySpark) and SQL on a modern cloud lakehouse. Comfortable reading a query plan.
Hands-on dbt — you’ve authored models, written tests, and reviewed others’ models with metadata discipline.
A track record of taking semi-structured source data (APIs, JSON, denormalised exports) and turning it into models other people can build on without fear.
Comfortable disagreeing with a stakeholder or a teammate and still landing the decision together. You argue to learn what’s true/best, not to win.
A view on when AI-supported drafting genuinely adds value and when it doesn’t, and the instinct to read AI-generated code with scepticism before shipping it.
A passion for making the data orderly.

We know nobody ticks every box. If you have most of these, please apply.

What you bring – Nice to have

Databricks (jobs, notebooks, serverless), Unity Catalog ABAC, Asset Bundles, GitHub Actions CI/CD.
Terraform for permissions and schema management.
Spark Declarative Pipelines or streaming experience for large-file ingest.
Fuzzy-matching or entity-resolution experience.
Experience or knowledge of medallion architecture.
Past experience collaborating with data scientists or analysts. Clear understanding of “normalised” vs “tidy/ready-to-analyse”, and when to use each.
Experience in a highly regulated industry working with sensitive data. Healthcare a plus.
Public work we can look at — GitHub, a blog, a talk. Optional, not expected.

Clinical subject-matter work lives with separate hires on the team. On process we’re pragmatic; if you find yourself defending Scrum ceremonies without articulating their trade-offs, you’ll probably find us frustrating.

How we work
You’ll work with pseudonymised clinical and operational data. Information governance is part of how you design schemas, not a step bolted on at the end. We use Unity Catalog ABAC, column-masking UDFs, and row filters by region and caseload. Everything is in version control. Nothing ships from a notebook straight to a stakeholder without peer review.
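
For readers less familiar with the governance controls named above, the following plain-Python sketch shows the idea behind a column mask and a row filter. It is not Unity Catalog syntax and not ProblemShared's configuration: the group name `clinical_governance`, the `patient_ref` column, and the sample rows are hypothetical, chosen only to show that the mask and the filter are applied at read time according to who is asking.

```python
import hashlib


def mask_patient_ref(value: str, viewer_groups: set) -> str:
    """Column mask: privileged viewers see the value, others see a stable token."""
    if "clinical_governance" in viewer_groups:  # hypothetical group name
        return value
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def row_visible(row: dict, viewer_regions: set) -> bool:
    """Row filter: a viewer only sees rows for regions they are assigned to."""
    return row["region"] in viewer_regions


def read_table(rows, viewer_groups: set, viewer_regions: set):
    """Apply the row filter first, then the column mask, to every read."""
    return [
        {**row, "patient_ref": mask_patient_ref(row["patient_ref"], viewer_groups)}
        for row in rows
        if row_visible(row, viewer_regions)
    ]


# Invented sample rows.
referrals = [
    {"patient_ref": "P-0001", "region": "north", "status": "assessed"},
    {"patient_ref": "P-0002", "region": "south", "status": "waiting"},
]

analyst_view = read_table(referrals, {"analysts"}, {"north"})
governance_view = read_table(referrals, {"clinical_governance"}, {"north", "south"})
```

Because the token is a deterministic hash, masked values can still be joined and counted without exposing the underlying reference.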

We use AI coding tools only when they’re additive and always with human oversight. As part of the central data team, you’ll help set the standards the rest of the company follows — so we expect you to know when not to trust them.

We hold a particular stance on how we represent the people in our data. Our patients are neurodivergent, and our framing of them is fit, not fault — dignity precedes identity, recognition over categorisation, the person is the primary recipient of anything we say about them. If those phrases land for you, we’ll probably get on.

Async is the default. Deep-work blocks are real and protected. Slack response cadence is hours, not minutes. Meeting load averages around two hours a day.

What we offer you

Excellent salary
Annual performance-related bonus (discretionary)
Company Pension Scheme
30 days annual leave + public holidays + the option to buy and sell additional leave, and extended leave options such as sabbatical leave
Private health insurance
Blue Light card / discounts
Enhanced family-friendly policies
Flexible working
All-company events and in-person team meet-ups
Access to a range of wellbeing activities
Access to development / training opportunities to support your career ambition
One volunteering day per year

Our Recruitment Process and Next Steps

1. Short call (30 min) Two-way interview; we don’t want to waste your time. We’ll describe the role so you can determine your own interest. There will be technical questions, focussed on d

2. Online Programming Test (15-60 minutes). Focussed on PySpark and dbt. No hard algorithms: we just need to know you find easy programming questions easy (without an AI).

3. Two-Part Final Interview (60 min / 30 min). Technical conversation with the team, followed by a values conversation with the Chief Data Officer. Your choice whether these are back to back or on separate days, whichever is best for you.

We reply within seven days, yes or no.

If you’d like adjustments at any point — written instead of live, extra time, breaks, captions, a quiet room, the questions in advance, anything else — tell us. Many of our team are neurodivergent; we expect to adapt the process to the candidate, not the other way around. No diagnosis or explanation needed.

All applicants welcome

No matter who you are, where you’re from, who you love, what faith you follow, your disability status, your ethnicity, or the gender you identify with, you’re welcome at ProblemShared.

We particularly welcome applications from autistic, ADHD, and other neurodivergent candidates — many of our own team are neurodivergent. You wouldn’t have to mask to do meaningful work here.
