CASE STUDY


Realistic Virtual Voices Through AI

OVERVIEW

  • Founded mid-2019 by Zohaib Ahmed and Saqib Muhammad
  • Company size — 11 people
  • 150+ paying clients, 70,000+ users
  • Caters to a diverse set of clients, from conversational AI experiences to entertainment, gaming, and film

Resemble AI is a rising company that creates compelling, realistic, and unique voice narration using AI techniques. The distributed team of 11, based out of Toronto, is motivated by its founding ideal: creating high-quality voice characters should be an automated, streamlined process. With today’s technological capabilities, creating an alternative voice to Amazon’s Alexa shouldn’t be so time-consuming and inconvenient.


BACKGROUND

Traditionally, creating a new AI voice is a manual process that requires recording a person’s speech in a studio setting and repeating the recording any time major changes are needed. This is cumbersome and expensive, and it makes quick iteration difficult. Furthermore, a distinctive automated voice is becoming an increasingly important part of a brand’s identity and marketing: most brands have always had unique visual symbols, and now they often want audio identities as well, much like those of larger brands.


As an initial proof of concept, Resemble AI allows cloning a voice from just 5 minutes of recorded audio data. Adding more training data progressively improves the voice model.


SOLUTION

Resemble AI is a machine learning (ML) focused company, since its core product is entirely dependent on ML. Of the eight engineers at Resemble AI, four are primarily ML engineers, so a streamlined model training workflow makes a huge difference. Before using Spell, the team worked directly in the Cloud, manually setting up machine instances, managing scaling, and integrating tools to make machine management easier. This process caused considerable aggravation and meant that much of the team’s time was spent dealing with infrastructure.


Eventually, Resemble AI started using some GPUs in-house, but the small number of machines at their disposal simply wasn’t enough for all the experiments they were running, and that’s when they turned to Spell. Today, they use Spell for production workloads: every customer they onboard goes through a training process in which Resemble AI spins up a job to train a model. They currently process about 50 to 100 users per day on Spell this way.


“With Spell what works out really well is we can remain Cloud agnostic. … Knowing that we can switch from one Cloud provider to another really easily.”

Zohaib Ahmed, CEO

RESULTS

With Spell, Resemble AI was able to cut their costs significantly. Amazon SageMaker and Google Cloud Platform ML both demanded high costs for the short yet frequent training runs the team depends on. Integrating Spell into the team’s workflows proved much more cost-effective while still providing the high-powered capabilities those runs require.


They noted that with Google or Amazon, users end up structuring their code around how the cloud platform wants it set up, which creates latency, makes migration very painful, and adds considerable overhead.


In contrast, Spell alleviates much of the complexity of Cloud development, taking care of many of these moving pieces so that engineers don’t have to handle them themselves. The team also saves a lot of time thanks to Spell’s UI, which they find much more intuitive than those of other Cloud platforms.


With their streamlined workflow in place, Resemble AI hopes to work on scaling concurrency and reducing latency so they can continue to get results to their customers as quickly as possible. Their business is growing quickly, and Spell’s MLOps infrastructure has proven to be a pivotal factor in the success of their production jobs.



Spell is a powerful platform for building and managing machine learning projects. Spell takes care of infrastructure, making machine learning projects easier to start, faster to deliver results, better organized, and safer than when you manage infrastructure on your own.


contact@spell.ml