About Voice Cloning
Voice cloning is the process of creating a synthetic voice from audio recordings of a real person.
Why in NEWS
Music composer A.R. Rahman has stated that his team obtained the necessary permission to use AI software to recreate the voices of the late singers Bamba Bakya and Shahul Hameed.
AI Voice Cloning: A machine learning voice model is trained on real recordings of a speaker. The model learns the spectral characteristics of that voice and can then produce speech that sounds almost identical to the original.
- For example, advanced deep learning techniques can be used to create expressive, emotional voice clones that are claimed to be up to 99% identical to the original voice.
Misuse of Voice Clones: Scammers can take an audio clip of an individual and upload it to an online programme that replicates the voice almost exactly.
Some Examples of Voice Cloning Scams:
- A woman in Arizona received a call from someone who sounded like her daughter, crying and pleading for assistance. The caller claimed to have been kidnapped and requested $5,000 in ransom.
- A CEO of a UK-based energy company was duped into transferring $243,000 to a Hungarian supplier after receiving a call from someone who sounded like his boss, the CEO of the parent company in Germany.
- A Canadian man received a voicemail from someone who sounded like his mother, asking him to call her back immediately. When he called back, he was informed that his mother had been in a car accident and required money for surgery.
Voice Cloning Services: There are numerous applications available online, including popular ones such as Murf, Resemble, and Speechify.
Voice Cloning’s Working Mechanism
Data collection and preprocessing: To successfully clone voices, a large dataset of voice samples is required.
- Once collected, these audio samples are pre-processed, which involves cleaning, organising, and formatting the sounds before feeding them into AI models.
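The cleaning step described above can be illustrated with a minimal sketch. It assumes the audio has already been decoded into a list of floating-point samples between -1.0 and 1.0. The function names and the threshold and peak values are hypothetical, chosen only for illustration, and are not taken from any particular voice cloning tool.

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples quieter than `threshold` (assumed value)."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []  # the whole clip is below the threshold
    return list(samples[loud[0]:loud[-1] + 1])


def peak_normalize(samples, target_peak=0.9):
    """Scale the clip so its loudest sample has magnitude `target_peak`."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


def preprocess(samples):
    """Hypothetical cleaning pipeline: trim silence, then normalise loudness."""
    return peak_normalize(trim_silence(samples))


raw_clip = [0.0, 0.001, 0.2, -0.45, 0.1, 0.0]
clean_clip = preprocess(raw_clip)
print(len(raw_clip), "samples in,", len(clean_clip), "samples out")
```

Real pipelines usually add further steps such as resampling and noise reduction. This sketch covers only the two simplest ones, so that every recording reaches the model at a comparable loudness and without dead air.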
Role of Neural Networks: These AI models analyse the audio data and learn the patterns that characterise a voice, such as pitch, tone, and rhythm of speech.
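One simple example of a pattern hidden in raw audio is its frequency content. The sketch below is not a neural network. It is a hand-written discrete Fourier transform, included only to show the kind of spectral information a model can pick up from samples. The function names and the toy 1,000 Hz test tone are assumptions made for this illustration.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT: return the magnitude of each frequency bin up to Nyquist."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total))
    return magnitudes


def dominant_frequency(frame, sample_rate):
    """Return the frequency (Hz) of the strongest bin, ignoring the DC bin."""
    magnitudes = magnitude_spectrum(frame)
    strongest = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return strongest * sample_rate / len(frame)


# Toy input: a pure 1,000 Hz tone sampled at 8,000 Hz (assumed values).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
print(dominant_frequency(frame, sample_rate))
```

A human voice contains many frequencies at once rather than a single tone, which is why learned models are used instead of a fixed rule like the one above.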
Generative adversarial networks (GANs): GANs are a training framework that improves the realism of cloned voices. A GAN has two components that are trained against each other: the generator and the discriminator.
- The generator’s role is to produce synthetic voices.
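- The discriminator must distinguish between generated and authentic human voices. As training continues, the generator’s output becomes progressively harder to tell apart from real speech.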
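The opposing goals of the two components can be made concrete with the standard binary cross-entropy GAN losses, using the common non-saturating form for the generator. This is a minimal sketch, not the training code of any specific voice cloning product. The scores are assumed to be the discriminator's estimated probability that a clip is real, and the function names are hypothetical.

```python
import math


def _clamp(p, eps=1e-12):
    """Keep a probability strictly inside (0, 1) so log() is always defined."""
    return min(max(p, eps), 1.0 - eps)


def discriminator_loss(real_scores, fake_scores):
    """Low when real clips score near 1 and generated clips score near 0."""
    real_term = -sum(math.log(_clamp(p)) for p in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - _clamp(p)) for p in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Low when the discriminator scores generated clips near 1 (it is fooled)."""
    return -sum(math.log(_clamp(p)) for p in fake_scores) / len(fake_scores)


# Early in training: the discriminator easily spots the fakes.
print(generator_loss([0.1, 0.2]), discriminator_loss([0.9, 0.8], [0.1, 0.2]))
# Later: the fakes are convincing, so the generator's loss is lower.
print(generator_loss([0.8, 0.9]), discriminator_loss([0.9, 0.8], [0.8, 0.9]))
```

In an actual system each loss would be minimised in turn by gradient descent on the corresponding network. The sketch shows only why an improvement for one side registers as a worse score for the other.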
Creating Authentic Voices: Once training is complete, the model can generate new speech that closely matches the sound of the original speaker’s voice.