Responsibilities
- Design, implement, and train state-of-the-art ML models for high-impact applications (e.g., NLP, Computer Vision, Network Optimization).
- Optimize AI workloads for performance and scalability on large-scale GPU systems such as GB200 NVL72, using tools such as Dynamo, vLLM, and other inference engines.
- Partner with cross-functional teams to co-design hardware-software solutions that maximize AI processing efficiency.
- Build robust tools, data pipelines, evaluation frameworks, and deployment systems.
- Track and incorporate the latest AI research and technological advancements.
- Contribute to product requirements documents (PRDs) and agile execution (sprint planning and delivery).
- Champion a culture of humility, bold innovation, and high-velocity product delivery.
Basic qualifications
- Bachelor's degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related technical field.
- 3+ years of hands-on experience in machine learning, deep learning, and software engineering.
- Proficiency in Python; experience with C/C++.
- Strong working knowledge of major AI/ML frameworks (PyTorch, TensorFlow, JAX, or similar).
- Solid foundation in data structures, algorithms, and software design principles.
Preferred qualifications
- Master's or PhD in Computer Science, AI/ML, or a related discipline.
- Experience with Large Language Models (LLMs), Generative AI, or Computer Vision.
- Familiarity with distributed training frameworks and techniques (e.g., Ray, DeepSpeed, Megatron-LM).
- Proven expertise in optimizing models for GPU inference (e.g., TensorRT, Triton Inference Server).
- Knowledge of MLOps tools and practices (e.g., Kubeflow, MLflow).
Tags & focus areas
Full-time, AI, AI Engineer, Machine Learning, Computer Vision