Data Engineer – Web Scraping, LLM Pipelines and Scalable Data Infrastructure

Posted 2 days ago
Mexico
5-10 years experience

AI Summary

Build high-volume web scraping systems and structured datasets to create a strong data foundation. Develop automated ETL pipelines using LLMs and design scalable data architectures on GCP.

The Role:

You’ll help build the data foundation of our product: high‑volume web scraping systems, structured datasets and LLM‑driven processing pipelines. The role combines hands‑on engineering with architectural thinking and suits someone who enjoys turning messy web data into reliable, scalable outputs.

Key Responsibilities:

  • Build new structured datasets by scraping sources such as accelerators, Form D filings and dynamic websites.

  • Develop automated ETL pipelines that parse, clean and transform content using LLMs.

  • Define and maintain database schemas in Supabase or PostgreSQL.

  • Create evaluation frameworks to measure and compare LLM performance across pipeline components.

  • Contribute to the design of scalable data architectures using GCP services.

  • Improve reliability, observability and deployment workflows for scraping and data processing systems.

Requirements:

  • 4+ years of experience building data pipelines, backend services and automated data processing systems.

  • Strong background in web scraping with tools like Scrapy, Playwright or similar.

  • Experience deploying pipelines on cloud platforms such as GCP or AWS.

  • Solid knowledge of ETL frameworks, workflow orchestration (Airflow) and modern data stores (BigQuery, PostgreSQL).

  • Comfortable working with Docker and API frameworks like FastAPI.

  • Clear, fluent communication in English.
