Lead Data Architect

 Posted 12 hours ago
     
10+ years experience

AI Summary

Design and implement enterprise data platforms optimized for GenAI and AI/ML workloads using Databricks and AWS. Lead the end-to-end development of data pipelines, model onboarding, and MLOps strategies to deliver scalable production solutions.

Why Karsun?

Join Karsun Solutions to grow your career with the company transforming possible for the US Government.

 

At Karsun, collaboration drives our community. We’re committed to building an environment where team members from diverse backgrounds can innovate, learn and grow with us. Here at Karsun, the only limit to your potential is the limit of your curiosity.

 

Join Team Karsun, and Find Your Next!

Summary

Senior/Lead technical data architect to design, build, and operate enterprise data
platforms that power GenAI and AI/ML use cases. This is a highly technical, hands-on
role responsible for data platform architecture, end-to-end data engineering, ML/LLM
pipeline design, production model onboarding, and delivery of scalable
Databricks-centric solutions across cloud environments. Candidate must be AWS
Certified Machine Learning – Specialty.

What You'll Be Doing:

  • Architect and implement enterprise data platforms (batch + streaming)
    optimized for ML, LLMs, and GenAI workloads.
  • Lead design and hands-on implementation of Databricks workspaces, Unity
    Catalog, Delta Lake design patterns, cluster policies, and performance tuning.
  • Build and own end-to-end data pipelines (ingest, transform, feature engineering,
    serving) using PySpark, Databricks Jobs, Spark SQL, Delta Lake, and
    orchestration tools.
  • Design and operationalize model training, fine-tuning (LLM), evaluation,
    deployment, and monitoring pipelines (MLOps/RAG/CAG) integrating
    Databricks MLflow, CI/CD, and infra-as-code.
  • Implement vectorless and vectorization/embedding pipelines, vector store
    integrations, and retrieval layers for RAG (FAISS, Pinecone, Weaviate,
    Milvus).
  • Define data schemas, governance, lineage, access controls, and data product
    APIs; implement Unity Catalog or equivalent for centralized governance.
  • Drive cost/performance optimization for storage, compute (spot/preemptible),
    and query patterns.
  • Collaborate with engineers, data scientists, product owners, and security to
    translate business needs into production GenAI solutions.
  • Mentor and lead engineering teams; conduct architecture and code reviews,
    and run technical deep dives.
  • Implement observability for data and ML pipelines (metrics, logging, data
    quality tests, alerting).
  • Create reproducible experiment tracking, model registry, and rollout strategies
    (canary, shadow testing, rollback).
  • Stay current on GenAI/LLM architectures and evaluate/introduce new tooling
    and frameworks.

Required Qualifications:

  • 8+ years of hands-on experience in data engineering/platform architecture; 3+
    years in an architect or lead role.
  • Proven, hands-on Databricks experience (designing workspaces, Delta Lake,
    performance tuning, productionizing Spark jobs).
  • Deep Spark + PySpark expertise and experience with Databricks Runtime.
  • Strong experience building ML/LLM pipelines and operationalizing models
    (training, fine-tuning, serving).
  • Practical experience with vector embeddings, semantic search, and RAG
    architectures.
  • Solid expertise in Python, common ML libraries (PyTorch, TensorFlow,
    Hugging Face transformers), and MLflow.
  • Cloud platform experience (AWS strongly preferred).
  • Experience with containerization and orchestration, and with open-source
    libraries for structured and unstructured data processing and
    serving/inference.
  • Strong SQL skills; experience with distributed query/warehouse systems and
    Parquet/Avro/Delta formats.
  • CI/CD and infra-as-code experience (Terraform, GitOps, Jenkins/GitHub
    Actions/GitLab CI).
  • Data governance, security, and IAM experience; experience implementing
    row/column-level access controls and data lineage.
  • Demonstrated ability to design for scalability, reliability, and cost efficiency.
  • BA or BS degree in CS, Computer Engineering, Information Technology or a
    related field.

Preferred Qualifications:

  • Prior experience with Databricks Unity Catalog, Photon, and Databricks SQL.
  • Experience integrating Databricks with vector databases (Pinecone, Neo4j) and
    retrieval frameworks (LangChain, LlamaIndex).
  • Familiarity with AWS Bedrock or other managed LLM services.
  • Experience with real-time streaming (Kafka, Kinesis) and stream processing with
    Structured Streaming on Databricks.
  • Certifications: Databricks Certified Professional.
  • Experience with data quality and profiling tools (Great Expectations, Soda).
  • Experience with large-scale ETL frameworks and tools (Airflow, Prefect).

Things to Know:

Commitment to Non-Discrimination

All qualified applicants will receive consideration for employment without regard to disability, status as a protected veteran or any other status protected by applicable federal, state, local, or international law.

 

Salary Range

The proposed salary range for this role is $160,000 to $190,000 USD. The salary range provided is a good faith estimate representative of all experience levels. Karsun considers several factors when extending an offer, including, but not limited to, the role, function and associated responsibilities, and a candidate’s work experience, location, education/training, and key skills.

 

Third Party Resumes: Karsun does not accept unsolicited resumes through or from search firms or staffing agencies. All unsolicited resumes will be considered the property of Karsun and Karsun will not be obligated to pay a placement fee.

 

Clearance Information

This position requires the eligibility to obtain a security clearance. The Defense Industrial Security Clearance Office (DISCO), an agency of the Department of Defense, handles and adjudicates the security clearance process. More information about Security Clearances can be found on the US Department of State government website: https://www.state.gov/m/ds/clearances/c10978.htm

 

Location

To be considered for this role, you must reside in one of the following states: CA, CO, DC, FL, GA, IL, MD, NJ, NY, NC, OH, OK, PA, SC, TX, VA, WV.

 

Applicants must be authorized to work in the U.S. We may consider candidates currently in H-1B status who are eligible for transfer.
