Can AI voice cloning be used for good?
A promising new technology for people living with ALS
By Mark Armendariz-Gonzales, Clera Rodrigues and Kaleef Starks
What happens when you lose the ability to speak? The gift of speech is a powerful tool, especially if you are a teacher. A teacher is able to take command of a classroom with their voice and connect with their students by talking to them. Losing that ability is the struggle now facing former fifth-grade English teacher Rachel Doboga, whose almost decade-long battle with ALS has taken away her ability to talk. Luckily, with the help of AI voice cloning technology, Doboga has been able to keep a part of her old self alive by preserving her voice.
AI voice cloning technology is proving beneficial in numerous ways, especially for individuals dealing with illnesses. Yet this emerging technology also presents plenty of risks, including imposter scams. So much so that lawmakers, among them Congressman Ted Lieu, Congresswoman Anna Eshoo and Senator Brian Schatz, are calling for regulations on AI voice cloning.
Lieu, who represents the 36th district of California, introduced H. Res. 66, a resolution written entirely by the AI chatbot ChatGPT, in late January this year. The resolution aimed to draw attention to the rapid development of AI technology, calling on Congress to take action to regulate AI.
“I have had the opportunity to meet with a wide variety of experts, including civil rights and liberties advocates, professors, experts with technical expertise, executives from leading AI companies and officials from the Biden-Harris Administration,” says Lieu. “AI will require a whole-of-government approach.”
Lieu also introduced H.R. 4223, the National AI Commission Act, to create a bipartisan commission of experts to advise Congress on what kind of AI to regulate and how to do so.
Governments around the world are still carving out what protective laws and policies to put into effect. AI regulation in the U.S. is still unsettled: there is currently no comprehensive federal law governing AI usage, no significant state law and no considerable work towards a U.S. equivalent of, say, the EU Artificial Intelligence Act.
Even so, existing law can reach deceptive uses. The Federal Trade Commission Act's restriction on unfair or deceptive conduct may apply to anyone who makes, sells or uses a voice cloning tool in a deceptive manner, even without malicious intent and regardless of whether deception is the tool’s intended use or only function.
Doboga, as a teacher, was accustomed to projecting her voice, especially since her class consisted of almost two dozen 10-year-olds. Since her diagnosis in 2015, she has gone from being the loudest person in the room to its quietest. Speaking in her usual tone of voice became too painful for her throat, and her voice quickly faded to barely more than a whisper.
Besides affecting her voice, the disease took its toll on her body, causing her to become a quadriplegic. The once lively English teacher can now only move her feet about an inch and spends most of her days lying in a hospital bed set up in her bedroom alongside her husband’s twin bed.
Rachel Doboga at home in bed next to ventilator. Photo courtesy of Rachel Doboga
Unable to stand the thought of not being able to communicate verbally, Doboga quickly reached out to the local ALS Association chapter and worked with its assistive technology coordinator, who provided her with a microphone headset to amplify her fading voice.
“I eventually started wearing a microphone headset everywhere. That made me feel self-conscious, but it was a necessity,” Doboga says. “My voice also became so nasal I was unintelligible to everyone except my husband, family and best friend who I would talk to for hours in the middle of the night when the monster inside me seemed scariest.”
Michelle Joy Ross - speech pathologist at USC Keck Medicine
In addition to the microphone headset, Doboga was given voice banking software. Voice banking is a type of AI voice cloning software that allows you to create a synthetic voice that matches your natural voice.
Doboga took advantage of this technology a year after her diagnosis and began using the software to preserve her voice. The very first phrase she “banked” was her telling her husband that she loved him. She then recorded a number of additional phrases, ranging from caregiving requests to everyday conversation. Doboga was able to bank a total of 72 phrases before losing her voice completely.
“Seventy-two phrases may sound like a lot, but it's nothing when you consider the fact that that's supposed to last the rest of your life.”
The Greater Good
Although Doboga began voice banking late, the technology has become life-changing for many people dealing with ALS and other illnesses that cause an individual to lose the ability to speak. While much discussion about AI voice cloning centers on its abuse in scams, the technology brings significant benefits in the medical field. In 2018, the ALS Association launched an initiative called “Project Revoice” to help people dealing with ALS record their voices.
AI voice cloning has started to make an impact within the medical field, assisting people battling ALS as well as illnesses such as throat cancer, Parkinson’s disease and pseudobulbar palsy. One major hurdle to putting this technology into the hands of individuals who need it most is the cost. At between $500 and $1,000 for a one-time use of the voice cloning software, the technology is out of most people’s reach. It is, however, covered by disability insurance. Currently, nonprofits such as the Team Gleason Foundation are filling the financial gap by providing funding for patients who are in need of this technology.
Michelle Joy Ross demonstrates voice cloning process for ALS patient
“From a medical model, you are decreasing someone’s social isolation and you are improving their quality of life,” says USC Keck speech-language pathologist Michelle Ross. Working with ALS patients on a daily basis, Ross uses specific technology and software to help patients communicate effectively. One tool, a large tablet-style screen (27 inches with a 16:9 aspect ratio), allows her to help patients compose phrases such as “I’m hungry” and “I would like to use the bathroom” through eye movement. A second type of software allows phrases to be typed, saved and used for day-to-day communication on a laptop computer. Both tools are attached to a wheelchair while patients receive assistance from Ross.
Despite breakthroughs in AI voice cloning, the science remains imperfect. Voice banking can be time-consuming, as it requires a lot of data to clone a patient’s voice. There have also been instances in which patients feel that the banked voice does not sound like their own.
“We can replicate the sounds of their speech, but we can’t really replicate the pitch changes that people use,” says Amy Roman, an ALS Association speech-language pathologist.
Amy Roman - speech pathologist at ALS Association
Roman advises patients to remember, when voice banking, that the result is an AI voice. That said, AI voice cloning is developing and becoming more realistic far faster than governments can regulate it.
Adam Powell, the executive director of the bipartisan USC Election Cybersecurity Initiative, calls Lieu’s legislation a great first step towards regulation.
“We are in another one of these stages where the technology is going to be running faster than any legislative or executive action to try to even observe it, let alone try to shape it,” says Powell.
The legality — or lack thereof — of AI-based voice cloning is still up for debate. What types of vocal clones are permissible? Where do we draw the line? Is posthumous vocal cloning ethical?
“I’ve heard from folks all over the country about the importance of countering the risks posed by AI.”
-Rep. Ted Lieu
Moreover, AI cloning is a global issue that affects everyone everywhere. Powell says, “The United States Senate doesn’t have a monopoly on AI. The Chinese are spending a lot of money on this. The Russians are spending money on this.”
With such realistic, inexpensive voice cloning technology, Powell says, “You can’t believe what you hear. You have to really be skeptical at a level where, we as people, haven’t been skeptical before.”
In cloning a voice, someone might be violating an individual’s rights to privacy and publicity. Ultimately, a person’s vocal identity is being appropriated, which becomes unethical when the clone is used to cause harm.
What Is at Stake?
In discussing the dangers of AI-generated voices, Lieu says, “We will have to rely on free news media to point out these deceptive uses of AI. It’s very difficult for the average person to tell real visual or audio material from fake.”
Despite the great possibilities this technology opens up for learning and healthcare, vocal data is also being widely misused for malicious purposes.
In reference to some of the adverse applications of voice cloning, Lieu says, “AI can be – and already has been – used to mimic photos and videos that do not actually exist. We’ve seen AI used in political campaign ads instead of real photos or videos of a candidate.” For instance, in February of this year, U.S. President Joe Biden’s image and voice were used to produce realistic fake news intended to deceive the general public.
“This presents real danger to the information ecosystem surrounding our campaigns and elections, the most central piece of our democracy,” adds Lieu.
When it comes to political manipulation, Powell says, “The first major AI hack of a national leader that we know of may have been in Russia, where in June of this year, somebody created an AI version of Vladimir Putin, and they had him announcing a general mobilization of the Russian people.”
The hacker was able to feed the video clip to Russian media outlets, and it was broadcast on television and radio stations in regions bordering Ukraine before the Kremlin was able to denounce it as fake.
This branch of AI technology is being developed by people all over the world. “It's not just a matter of making certain that Microsoft and Google are identifying who is real and who isn't,” says Powell. “But we're going to have this coming very rapidly, and things coming from China, Russia and elsewhere.”
As the leader of the USC Election Cybersecurity Initiative, Powell is providing election cybersecurity training in all 50 states to help protect elections and campaigns. Powell says his team is incorporating a curriculum on AI voice cloning technology into all of its security workshops.
Lieu has heard from people all over the country about the importance of countering the risks that AI poses. “I’m encouraged by the collaboration we’ve already seen on this issue and hopeful it will continue,” says Lieu.
Lieu is one of only three computer science majors in Congress. “I’m working hard to express the urgency of this situation to my colleagues,” Lieu says. “AI presents uncharted territory for all of us, which is why it’s so important that we hear from experts, scholars and advocates about how best to respond to this rapidly developing technology.”
Perks of Vocal Clones
Despite the technology’s potential drawbacks, AI-generated voice cloning offers significant benefits. In education, for instance, cloned voices can function as virtual instructors or as peers (not as teachers or tutors, but as interactive social agents) in learning support groups.
An Association for Computing Machinery review of designs dealing with death revealed that technological solutions, such as AI-generated characters depicting the departed, give users a chance to cope with grief and come to terms with their loss.
The benefits of voice cloning, like its risks, are global. One beneficiary is Vivek Kargathiya, an SEO strategist based in Calcutta, India. “Last year, my grandma passed away, and I used to live in with her,” says Kargathiya. “I really missed [her].”
Kargathiya used the AI voice cloning app Speechify to recreate his late grandmother's voice from old voicemails and videos he had saved.
“I didn’t face any challenges over here to use the tool. It was easy to use, but the voice cloning is amazing,” Kargathiya says. “But as we remember, the tool is a tool, and the human voice is a human voice. So there is a little bit of a difference in the voice, but overall the experience was amazing.”
From Victim to Advocate
Before being diagnosed with ALS, Doboga had a very active life. She and her husband would go canoeing and fossil hunting and practice yoga together. Doboga also had a strong love for dance. She knew many styles, including jazz, hip-hop and tango. At one point, she danced with the Moscow Ballet. As her condition worsened and her body began to shut down, Doboga had no choice but to say goodbye to these activities that brought her so much joy. She soon found herself bedridden and in a constant battle with fatigue.
Rachel Doboga and her husband, Evan, share quality time at home. Photo courtesy of Rachel Doboga
Instead of accepting defeat by this terminal disease with no known cure, she did the complete opposite. With the help of her husband and best friend, Doboga began to spread awareness by writing articles and columns for publications such as The Huffington Post and ALS News Today. She also started her own blog, howilivewithALS.com. Her goal is to show the public that ALS is not incurable; it is just underfunded. Doboga’s solution is for the public to fundraise and talk about ALS as much as they can. Becoming an advocate for ALS has helped a bed-bound Doboga find a new sense of purpose. Although the disease has stopped her physically, Doboga remains relentless and continues to be a powerful voice within the ALS community.
“A person’s voice is an intrinsic part of their identity. When I listen to recordings of my voice, I realize how warm and friendly it was,” says Doboga. “Warm and friendly, just like me.”