We are Software Mind, an awesome team of engineers who are ready to ramp up any top-notch company’s projects! Our aim? To always be one step ahead. Become part of a multicultural company in constant growth with an excellent work environment certified by Great Place To Work!
About the Project
Software Mind is building a private, tenant-isolated AI assistant for the real estate title and settlement industry. The platform is a retrieval-first (RAG) system that ingests historical email, documents, and structured metadata into a per-tenant vector index, and serves grounded, cited, expert-weighted answers through a chat-style Q&A interface with single sign-on and full audit logging.
The platform is AWS-native with a Python/FastAPI backend, Vue.js frontend, OpenSearch/Pinecone vector store, and OpenAI/Anthropic/Bedrock as LLM providers. You will join a senior, cross-functional LATAM-based team where hands-on AI delivery experience, not just familiarity, is the baseline expectation.
You stand up and own the cloud infrastructure and CI/CD foundation the entire project runs on. Your work is on the critical path from day one: delivery begins with environment provisioning. You design for tenant isolation, observability, and security from the outset, not as an afterthought. This role requires prior experience operating infrastructure for production AI or LLM-based workloads.
Your Responsibilities
Provision and configure a dedicated VPC and segmented cloud environment on AWS
Build the baseline CI/CD pipeline, then maintain and evolve it across all delivery phases
Configure and manage the vector store infrastructure (OpenSearch/Pinecone on AWS)
Set up and manage the observability stack: CloudWatch, X-Ray, alerting thresholds, and LLM-specific monitoring
Implement infrastructure-as-code for all environments (dev, staging, production) using Terraform or CDK
Manage secrets, KMS encryption key configuration, and tenant-scoped access controls
Configure LLM provider connectivity (OpenAI / Anthropic / Amazon Bedrock enterprise tier, zero-data-retention)
Define and implement an environment promotion strategy aligned with the 2-week sprint cadence
Support incremental ingestion pipeline infrastructure requirements and nightly scheduling
Tech Stack: AWS (VPC, ECS, Lambda), Terraform, CDK, OpenSearch, Pinecone, CloudWatch, X-Ray, GitHub Actions, CodePipeline, OpenAI, Anthropic, Bedrock, Cognito, KMS, Docker
Must-Have Skills & Experience
6+ years in DevOps or cloud infrastructure engineering; strong AWS specialisation required
Infrastructure-as-code: Terraform, CloudFormation, or AWS CDK
CI/CD tooling: GitHub Actions, AWS CodePipeline, or equivalent
Core AWS services: VPC, ECS, Lambda, S3, DynamoDB, API Gateway, Cognito, CloudWatch, X-Ray
Experience designing and operating multi-tenant cloud environments with tenant-level data isolation
AI Experience (Required, Not Optional)
At least one project operating infrastructure for a production AI/ML or LLM-integrated system, not just general cloud workloads
Experience configuring and managing vector store infrastructure (OpenSearch, Pinecone, Weaviate, or equivalent) in a production environment
Familiarity with LLM provider APIs (OpenAI, Anthropic, or Amazon Bedrock) in a production/enterprise configuration, including zero-data-retention tier setup
Understanding of AI-specific observability concerns: token usage monitoring, latency profiling for LLM calls, and model response logging
Nice-to-Have
Experience with enterprise SSO and identity federation: Cognito, Okta, or Azure AD
Background in HIPAA, SOC 2, or regulated-data cloud environment configuration
Familiarity with OCR or document processing service infrastructure (AWS Textract, etc.)
We are accepting applications from LATAM countries