Generative AI Research Engineer, Multimodal, Agent Modeling - SIML

Apple · Cupertino, CA, US · $13k

Actively hiring · Posted 7 months ago

Are you passionate about Generative AI? Are you interested in working on groundbreaking generative modeling technologies to enrich the lives of billions of people? We are driving multiple initiatives focused on advancing generative models, and we are seeking candidates experienced in training, adapting, and deploying large-scale generative models. This role emphasizes AI safety, multimodal understanding and generation, and the development of agentic systems that push the boundaries of what AI can achieve responsibly.

We are the Intelligence System Experience (ISE) team within Apple’s software organization. The team operates at the intersection of multimodal machine learning and system experiences. It oversees a range of experiences such as System Experience (Springboard, Settings), Image Generation, Genmoji, Writing Tools, Keyboards, Pencil & Paper, and Generative Shortcuts, all powered by production-scale ML workflows. Our multidisciplinary ML teams focus on a broad spectrum of areas, including Visual Generation Foundation Models; Multimodal Understanding; Visual Understanding of People, Text, Handwriting, and Scenes; Personalization; Knowledge Extraction; Conversation Analysis; Behavioral Modeling for Proactive Suggestions; and Privacy-Preserving Learning. These innovations form the foundation of the seamless, intelligent experiences our users enjoy every day.

We are looking for research engineers to architect and advance multimodal LLM and Agentic AI technologies, ensuring their safe and responsible deployment in the real world. An ideal candidate will have the ability to lead diverse cross-functional efforts spanning ML modeling, prototyping, validation, and privacy-preserving learning. A strong foundation in machine learning and generative AI, along with a proven ability to translate research innovations into production-grade systems, is essential. Industry experience in Vision-Language multimodal modeling, Reinforcement and Preference Learning, Multimodal Safety, and Agentic AI Safety & Security is highly desirable.

SELECTED REFERENCES TO OUR TEAM’S WORK:

https://arxiv.org/pdf/2507.13575

https://arxiv.org/pdf/2407.21075

https://www.apple.com/newsroom/2024/12/apple-intelligence-now-features-image-playground-genmoji-and-more/

Description

We are looking for a candidate with a proven track record in applied ML research. Responsibilities in the role will include training large-scale multimodal (2D/3D vision-language) models on distributed backends, deploying efficient neural architectures on device and on private cloud compute, and addressing emerging safety challenges to make models and agents robust and aligned with human values.

A key focus of the position is ensuring real-world quality, emphasizing model and agent safety, fairness, and robustness. You will collaborate closely with ML researchers, software engineers, and hardware and design teams across multiple disciplines. The core responsibilities include advancing the multimodal capabilities of large language models and strengthening AI safety and security for agentic workflows. On the user experience front, the work will involve aligning image and video content to the space of LLMs for visual actions and multi-turn interactions, enabling rich, intuitive experiences powered by agentic AI systems.

Preferred Qualifications

Experience with building & deploying AI agents, LLMs for tool use, and Multimodal-LLMs

Minimum Qualifications

M.S. or PhD in Electrical Engineering/Computer Science or a related field (mathematics, physics, or computer engineering) with a focus on computer vision and/or machine learning, or comparable professional experience.

Strong ML and Generative Modeling fundamentals

Experience using one or more of the following: Pre-training or Post-training of Multimodal-LLMs, Reinforcement Learning, Distillation

Familiarity with distributed training

Proficiency in using ML toolkits, e.g., PyTorch

Awareness of the challenges associated with transitioning a prototype into a final product

Proven record of research innovation and demonstrated leadership in both applied research and development

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
