About Spokeo:
Join our mission to make the world more transparent with data.
Spokeo is a people intelligence platform that helps over 18 million monthly visitors reconnect with friends, reunite with families, and build trust in new relationships. Thousands of companies also trust Spokeo’s 60 billion public records to improve customer research, help verify information, and prevent fraud.
Founded in 2006, Spokeo has built a dedicated, remote-first team with an average tenure of 6.9 years. It has earned recognition from Comparably as a Best Company for Compensation, Employee Happiness, Perks and Benefits, Support for Women, Work-Life Balance, and CEO Leadership.
About this Opportunity
As a Senior Data Engineer at Spokeo, you will develop, optimize, and improve our data systems, including ETL pipelines, storage, and entity resolution. This involves working with infrastructure built in AWS, including Airflow, PySpark, EMR, S3, DynamoDB, and more. This role will help build and improve data products, automation platform features, analytical software packages, and data pipeline orchestration tools.
What You’ll Do:
Build infrastructure and data automation pipelines to ingest, process, and load data from various sources. Automate and integrate new components into the data pipeline.
Collaborate with stakeholders and data science teams to develop data products, including entity resolution and best selection, to efficiently execute product vision and strategy in alignment with organizational goals and priorities.
Create unit and stress-test components to monitor technical performance and ensure that identified issues are resolved.
Develop data analysis tools to provide data insights and capture key metrics.
Research solutions and maintain technical documentation.
Follow best practices for data governance, quality, cleansing, and other ETL-related activities.
Who You Are:
7+ years of development experience in data engineering within a production environment (internships and academic settings excluded).
Proven experience working with large datasets exceeding 100M records or multiple terabytes.
2+ years of development experience with highly scalable distributed systems and cluster architectures on AWS, including EMR.
5+ years of hands-on programming experience with Python.
5+ years of professional experience working in big data ecosystems. Spark is required; PySpark is preferred.
3+ years of experience with SQL, schema design, and dimensional data modeling.
2+ years of professional experience working with dataflow orchestration tools, such as Airflow.
2+ years of experience with non-relational databases (e.g., DynamoDB, Elasticsearch, etc.).
A bachelor’s degree in Computer Science, Information Systems, Mathematics, or a related field is required.
Working at Spokeo
Spokeo offers a bonus program, equity plans, and a 401(k). Once a year, we award discretionary, merit-based salary increases. Additional benefits include 100% medical/dental/vision coverage and unlimited employee PTO.
Spokeo extends written offers to candidates who successfully complete the selection process. Spokeo’s offers include a base salary, participation in a company bonus program, stock options, and comprehensive benefits. The final offer will depend on several factors, including marketplace competition, job level, and the candidate’s experience and skills.
Privacy Notice for Candidates: https://www.spokeo.com/recruiting-policy
Spokeo is an equal-opportunity employer. Applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Spokeo fosters a business culture where ideas and decisions from everyone help us grow, innovate, create the best products, and remain relevant in a rapidly changing world.
Recruiters or staffing agencies: Spokeo is not obligated to compensate any external recruiter or search firm that presents a candidate or their resume or profile to a Spokeo employee without 1) a current, fully executed agreement on file, and 2) being assigned to the open position (as a search) via our applicant tracking solution.