About Us
We are a small, fully remote team of musicians, engineers, marketers, and creators who care deeply about building an outstanding product. We value ambition, ownership, and the ability to move fast while maintaining high quality.
Our mission is to make high-quality, unique vocals accessible to musicians everywhere. We work at the forefront of audio technology, exploring new trends in audio generation, vocal synthesis, and state-of-the-art machine learning solutions to empower musicians to create their best work.
Your Role
As an Audio Machine Learning Engineer, you will focus on solving challenging problems related to AI-based vocals and audio engineering. You will collaborate closely with the AI team to design, develop, and improve large-scale machine learning models for audio applications.
You will play a key role in researching new approaches, building production-ready models, and continuously improving existing systems using modern machine learning techniques.
What You Will Be Doing
- Researching, developing, and improving machine learning models for singing voice synthesis (SVS) and voice conversion
- Experimenting with diffusion-based generative models for vocals and audio
- Working with neural vocoders (e.g., HiFi-GAN–style architectures, large-scale GAN-based or diffusion-based vocoders)
- Designing and improving audio feature extraction pipelines for vocal modeling
- Working with large, high-quality vocal and music datasets
- Improving model quality, robustness, and inference performance
- Integrating new models and improvements into production systems
- Writing clean, efficient, and scalable Python code using PyTorch
Who You Are
- Master’s degree (or higher) in Machine Learning, AI, Computer Science, or a related field
- Strong experience as a Machine Learning Developer, with Python as your primary language
- Hands-on experience with PyTorch for training and deploying deep learning models
- Solid understanding of singing voice synthesis, voice conversion, and modern audio modeling techniques
- Familiarity with diffusion-based generative models and neural vocoders
- Excellent English communication skills, both written and spoken
- Comfortable taking ownership, collaborating with a remote team, and working effectively in a fast-paced startup environment
- Bonus: Background or strong interest in music production, vocals, or audio engineering
Location
We hire within the GMT -3 to GMT +4 time zones. Outside of this range, collaboration becomes challenging.
Compensation
- Full-time, fully remote position
- Competitive compensation based on experience