DevOps/MLOps Engineer

DealerCX · US · $120k - $150k

Actively hiring · Posted 5 months ago

Role overview

You'll own the entire deployment pipeline and model serving infrastructure. This is a hybrid DevOps + MLOps role – you'll ensure our application deploys reliably AND that our AI models (both frontier and local) serve efficiently.

Our cost optimization strategy requires routing between expensive frontier models (Claude, GPT) and cost-effective local models (Llama, Mistral) based on task complexity. You'll build and own this infrastructure.
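To make the routing idea concrete, here is a minimal sketch of a complexity-based router in Python. It is an illustration under stated assumptions, not DealerCX's actual design: the model names, per-token costs, keyword list, scoring heuristic, and the 0.5 threshold are all hypothetical placeholders, and a production router would likely use richer signals such as task type, tool use, or a learned classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTarget:
    name: str                  # hypothetical identifier, not a real model ID
    kind: str                  # "frontier" (hosted API) or "local" (self-hosted)
    cost_per_1k_tokens: float  # illustrative number, not real pricing


# Assumed targets; a real deployment would load these from configuration.
FRONTIER = ModelTarget("frontier-model", "frontier", 0.015)
LOCAL = ModelTarget("local-model", "local", 0.0005)

# Hypothetical keywords that hint a request needs deeper reasoning.
COMPLEX_HINTS = ("analyze", "multi-step", "compare", "summarize")


def estimate_complexity(prompt: str) -> float:
    """Toy heuristic: return a score in [0, 1] from length and keyword hits."""
    # Longer prompts score higher, saturating at 200 words.
    length_score = min(len(prompt.split()) / 200, 1.0)
    # Any complexity keyword pushes the score to the maximum.
    hint_score = 1.0 if any(h in prompt.lower() for h in COMPLEX_HINTS) else 0.0
    return max(length_score, hint_score)


def route(prompt: str, threshold: float = 0.5) -> ModelTarget:
    """Send complex prompts to the frontier model, everything else to local."""
    return FRONTIER if estimate_complexity(prompt) >= threshold else LOCAL


if __name__ == "__main__":
    for text in ("What time do you open?", "Analyze this service history"):
        target = route(text)
        print(f"{text!r} -> {target.name} ({target.kind})")
```

Keeping the decision in a single `route` function means the threshold and heuristic can be tuned, or replaced outright, without touching the code that calls the models.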

Responsibilities

  • CI/CD pipelines – Automated build, test, and deploy on every push
  • Infrastructure as code – Terraform/Pulumi for reproducible environments
  • Monitoring & alerting – Know when things break before customers do
  • Incident response – Own uptime and reliability
  • Daily deploys – Enable the team to ship to production every day safely
  • Model serving infrastructure – Deploy and serve LLMs (local and API-based)
  • Model router – Build the abstraction layer that routes requests to appropriate models
  • GPU infrastructure – Manage inference servers for local models (Llama, Mistral)
  • Cost optimization – Track and optimize model usage costs
  • Model versioning – Safe rollouts and rollbacks for prompt/model changes
  • Developer experience – Make the team faster through better tooling
  • Scaling – Prepare infrastructure for growth
  • Infrastructure security – Server hardening, network security, firewall configuration, VPC design
  • Secrets management – Vault, AWS Secrets Manager, or similar; no secrets in code
  • Access control – IAM policies, least-privilege principles, SSO integration
  • Vulnerability scanning – Automated scanning in CI/CD, dependency audits, container scanning
  • Intrusion detection – CloudTrail, GuardDuty, or similar; alert on suspicious activity
  • Encryption – Data at rest and in transit; key management
  • Incident response – Work with fractional CISO to implement detection, containment, and recovery procedures
  • Compliance – Support audits and maintain security documentation
  • CI/CD quality gates – Automated tests run on every push; bad code doesn't deploy
  • Test environment management – Staging environments that mirror production
  • LLM output monitoring – Track hallucinations, wrong tool calls, response quality in production
  • Security scanning – Automated vulnerability scanning in CI pipeline
  • Alerting & anomaly detection – Know when something breaks before customers do

Tech stack

  • Cloud: AWS (EC2, RDS, S3, Lambda)
  • Containers: Docker
  • CI/CD: GitHub Actions
  • Database: PostgreSQL (RDS)
  • Caching: Redis
  • Model serving: vLLM, Ollama, or similar for local inference
  • GPU compute: AWS/GCP GPU instances or dedicated inference providers
  • Model routing: Custom abstraction layer for model selection
  • Observability: Datadog, Grafana, or similar for unified monitoring

Qualifications

  • 3+ years DevOps/SRE/Platform engineering experience
  • Strong AWS experience (EC2, RDS, Lambda, IAM, VPC)
  • Infrastructure as code (Terraform, Pulumi, or CloudFormation)
  • CI/CD pipeline design and maintenance
  • Docker and container orchestration
  • Monitoring and observability tools
  • MLOps experience – Model deployment, serving, monitoring
  • GPU infrastructure – Managing inference workloads
  • Experience with LLM serving (vLLM, TGI, Ollama)
  • Kubernetes experience
  • Cost optimization mindset
  • Experience serving both frontier APIs and local models
  • LangChain/LangSmith or similar LLM observability
  • Startup experience – comfort with ambiguity and speed
  • Texas location

How we work

  • AI-augmented development – We use AI tools extensively; you'll automate everything possible
  • Daily deploys – Your pipelines enable the team to ship constantly
  • Async communication – Written updates, minimal meetings
  • On-call rotation – Shared responsibility for production (small team = everyone contributes)

Benefits

  • 401(k)
  • Dental insurance
  • Health insurance
  • Paid time off
  • Vision insurance

Tags & focus areas

Full-time · Remote · AI · MLOps · Generative AI