*Please be aware that, although we are a remote organisation, we do require candidates to reside in the UK.*
Salary: £60,000 - £80,000
Work pattern: Full time
Reporting to: Head of Analytics
Location: Remote
About Us: ProblemShared is a practitioner-led, CQC-regulated digital mind-health provider. We assess and treat adults and young people with ADHD, autism, and specific learning differences, and we partner with NHS trusts, integrated care boards, universities, insurers, and private healthcare providers. The company was founded by clinicians who watched the NHS waitlist problem from inside an A&E and decided to do something about it.
Our data mission: We believe in “data for good”, meaning that good data work transforms how neurodivergent people understand themselves, how practitioners sharpen their craft, and how teams design better support. The stack we work in: Fivetran → Databricks (Unity Catalog) → dbt → semantic layer → Power BI and R/Quarto for partner-grade reporting. Mature in some places, still being built in others. You’d help us close the gaps.
What makes us different
Tool-agnostic, outcome-driven. We care more about a candidate’s mental agility and generalisable knowledge than tool-specific experience.
Neurodiversity-welcoming. Potential over conformity. A meaningful share of our colleagues are neurodivergent. Adjustments to the process are normal; you don’t have to explain why.
Fully remote, flexibly. Work where you do your best thinking. If you’re in London, some of us meet up for coworking once a week, but it’s not mandated. Psychoeducation, prescribing, educational navigation, and therapy sessions all under one roof.
About the role
You build the bridge between raw source data and the dbt layer the rest of the team builds on, working across the medallion architecture. Your craft is modelling raw, messy, real-world source data (Semble, Zendesk, CMS, finance systems) through the medallion layers (bronze, silver, gold) and ensuring our analysts get data they can use. Our team already includes a senior data engineer (focussed on ingestion and transformation) and a senior analytics engineer (focussed on dbt access control and restructuring the data into analyst-ready schemas); you will deputise for them both to increase capacity and reduce key-person risk. You’ll stay close enough to our analysts that you can tell when a schema decision is going to cause pain three weeks later. This role builds the conduit between raw data and analyst-ready data and ensures that conduit is maintainable and governable.
Why this role matters
Half our most-used source systems land in shapes that are almost-but-not-quite usable. Bronze tables that look like APIs, prescriptions scattered across five systems, person records that don’t quite match. Your work turns those into the silver and gold foundations the rest of the data team builds on — which means every dashboard, every report, and every clinical-quality indicator downstream gets to start from solid ground.
What you will do
Model bronze source data into silver tables for analytics consumption. The headline first project is Semble, which currently lands in a heavily normalised form that’s not ready to be analysed. An old process (Power BI bolted onto complicated exports) provides an illustration of the business logic to be applied in the restructure, but you’ll have the agency to throw that old process away and build something better.
Build upon our cross-system reconciliations for entities that span our stack: prescriptions across Semble, CMS, pharmacy and EMIS; patient identities fuzzy-joined across CMS, NHS, and insurer sources. Data needs to be reconfigured from “where is it from” to “what is it about” so that our analysts can answer business questions.
Replace a VBA-stitched partner-reporting PowerPoint pipeline with a code-first generator.
Build automated data-quality alerting on referrals and key data feeds. Slack-routable, with a clear ownership trail.
Author new dbt models in your own schema, in separate MRs, reviewed by the Senior Analytics Engineer.
Coach the analysts on engineering skill and discipline — when to use a CTE, when not to, how to use the data platform. We need our analysts to understand engineering and our engineers to understand analytics, so peer teaching and support is a team practice.
What good looks like at 90 days
Semble silver model in production, with at least one downstream dashboard repointed to it.
One cross-system reconciliation (prescriptions or person-key) shipped, with a documented match-rate baseline.
Our contract-reporting pipeline either replaced or with a credible plan to replace it.
You’ve reviewed a non-CS analyst’s SQL at least once in writing, and they’ve shipped the result.
What you bring – Must have
Strong in Python (incl. PySpark) and SQL on a modern cloud lakehouse. Comfortable reading a query plan.
Hands-on dbt — you’ve authored models, written tests, and reviewed others’ models with metadata discipline.
A track record of taking semi-structured source data (APIs, JSON, denormalised exports) and turning it into models other people can build on without fear.
Comfortable disagreeing with a stakeholder or a teammate and still landing the decision together. You argue to learn what’s true/best, not to win.
A view on when AI-supported drafting genuinely adds value and when it doesn’t, and the instinct to read AI-generated code with scepticism before shipping it.
A passion for making the data orderly.
We know nobody ticks every box. If you have most of these, please apply.
What you bring – Nice to have
Databricks (jobs, notebooks, serverless), Unity Catalog ABAC, Asset Bundles, GitHub Actions CI/CD.
Terraform for permissions and schema management.
Spark Declarative Pipelines or streaming experience for large-file ingest.
Fuzzy-matching or entity-resolution experience.
Experience or knowledge of medallion architecture.
Past experience collaborating with data scientists or analysts. Clear understanding of “normalised” vs “tidy/ready-to-analyse”, and when to use each.
Experience in a highly regulated industry working with sensitive data. Healthcare a plus.
Public work we can look at — GitHub, a blog, a talk. Optional, not expected.
Clinical subject-matter work lives with separate hires on the team. On process we’re pragmatic; if you find yourself defending Scrum ceremonies without articulating their trade-offs, you’ll probably find us frustrating.
How we work
You’ll work with pseudonymised clinical and operational data. Information governance is part of how you design schemas, not a step bolted on at the end. We use Unity Catalog ABAC, column-masking UDFs, and row filters by region and caseload. Everything is in version control. Nothing ships from a notebook straight to a stakeholder without peer review.
We use AI coding tools only when they’re additive and always with human oversight. As part of the central data team, you’ll help set the standards the rest of the company follows — so we expect you to know when not to trust them.
We hold a particular stance on how we represent the people in our data. Our patients are neurodivergent, and our framing of them is fit, not fault — dignity precedes identity, recognition over categorisation, the person is the primary recipient of anything we say about them. If those phrases land for you, we’ll probably get on.
Async is the default. Deep-work blocks are real and protected. Slack response cadence is hours, not minutes. Meeting load averages around two hours a day.
What we offer you
Excellent salary
Annual performance-related bonus (discretionary)
Company Pension Scheme
30 days’ annual leave plus public holidays, the option to buy and sell additional leave, and extended leave options such as sabbatical leave
Private health insurance
Blue Light card / discounts
Enhanced family-friendly policies
Flexible working
All-company events and in-person team meet-ups
Access to a range of wellbeing activities
Access to development / training opportunities to support your career ambition
One volunteering day per year
Our Recruitment Process and Next Steps
1. Short call (30 min) Two-way interview; we don’t want to waste your time. We’ll describe the role so you can determine your own interest. There will be technical questions, focussed on d
2. Online programming test (15-60 minutes). Focussed on PySpark and dbt. No hard algorithms – we just need to know you find easy programming questions easy (without an AI).
3. Two-part final interview (60 min/30 min). Technical conversation with the team, followed by a values conversation with the Chief Data Officer. Your choice whether these are back-to-back or on separate days – whatever is best for you.
We reply within seven days, yes or no.
If you’d like adjustments at any point — written instead of live, extra time, breaks, captions, a quiet room, the questions in advance, anything else — tell us. Many of our team are neurodivergent; we expect to adapt the process to the candidate, not the other way around. No diagnosis or explanation needed.
All applicants welcome
No matter who you are, where you’re from, who you love, what faith you follow, your disability status, your ethnicity, or the gender you identify with, you’re welcome at ProblemShared.
We particularly welcome applications from autistic, ADHD, and other neurodivergent candidates — many of our own team are neurodivergent. You wouldn’t have to mask to do meaningful work here.