Responsibilities

Design, implement, and train state-of-the-art ML models for high-impact applications (e.g., NLP, Computer Vision, Network Optimization).
Optimize AI workloads for extreme performance and scalability on large-scale GPU systems such as GB200 NVL72, using tools like Dynamo, vLLM, and other advanced inference engines.
Partner with cross-functional teams to co-design hardware-software solutions that maximize AI processing efficiency.
Build robust tools, data pipelines, evaluation frameworks, and deployment systems.
Track and incorporate the latest AI research and technological advancements.
Contribute to product requirements documents (PRDs) and agile execution (sprint planning and delivery).
Champion a culture of humility, bold innovation, and high-velocity product delivery.

Basic qualifications

Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related technical field.
3+ years of hands-on experience in machine learning, deep learning, and software engineering.
Proficiency in Python; experience with C/C++.
Strong working knowledge of major AI/ML frameworks (PyTorch, TensorFlow, JAX, or similar).
Solid foundation in data structures, algorithms, and software design principles.

Preferred qualifications

Master's or PhD in Computer Science, AI/ML, or a related discipline.
Experience with Large Language Models (LLMs), Generative AI, or Computer Vision.
Familiarity with distributed training frameworks and techniques (e.g., Ray, DeepSpeed, Megatron-LM).
Proven expertise in optimizing models for GPU inference (e.g., TensorRT, Triton Inference Server).
Knowledge of MLOps tools and practices (Kubeflow, MLflow, etc.).
