Nir Yu

Data Engineer – Web Scraping, LLM Pipelines and Scalable Data Infrastructure

Posted 22 days ago

Mexico

⭐ 5-10 years experience

AI Summary

Build high-volume web scraping systems and structured datasets to create a strong data foundation. Develop automated ETL pipelines using LLMs and design scalable data architectures on GCP.

The Role:

You’ll help build the data foundation of our product: high‑volume web scraping systems, structured datasets and LLM‑driven processing pipelines. The role combines hands‑on engineering with architectural thinking and suits someone who enjoys turning messy web data into reliable, scalable outputs.

Key Responsibilities:

Build new structured datasets, including scraping accelerators, Form D filings and dynamic web sources.
Develop automated ETL pipelines that parse, clean and transform content using LLMs.
Define and maintain database schemas in Supabase or PostgreSQL.
Create evaluation frameworks to measure and compare LLM performance across pipeline components.
Contribute to the design of scalable data architectures using GCP services.
Improve reliability, observability and deployment workflows for scraping and data processing systems.

Requirements:

4+ years of experience building data pipelines, backend services and automated data processing systems.
Strong background in web scraping with tools like Scrapy, Playwright or similar.
Experience deploying pipelines on cloud platforms such as GCP or AWS.
Solid knowledge of ETL frameworks, workflow orchestration (Airflow) and modern data stores (BigQuery, PostgreSQL).
Comfortable working with Docker and API frameworks like FastAPI.
Clear, fluent communication in English.

Nir Yu

Company size: 201-500 employees
Industry: Staffing and Recruiting
