The Dark Side of Microsoft’s New Voice Cloning Feature: Innovation Enabling Risk
Microsoft will release a new Teams feature that allows users to clone their voice so the system can translate their conversation into different languages in real time. However, this amazing technology has a dark side: malicious attackers may abuse the capability in voice-cloning scams and other social engineering attacks.
The new interpreter agent will simulate the user's speaking voice as it translates into the native languages of meeting participants. As the conversation unfolds, attendees will hear the translated dialogue in the simulated voice of the speaker, allowing two-way conversations, "for a more personal and engaging experience," according to Microsoft.
While I applaud Microsoft and the other companies that are working on similar technology and collectively driving a new era of cross-language communication, such powerful innovation comes with serious risks. Integrating voice cloning into mainstream products will significantly fuel the already problematic and growing deepfake crisis.
A Cybersecurity Nightmare in the Making
Cybercriminals understand how powerful deepfake technology, including the imitation of people's voices, can be for committing fraud, obtaining or resetting credentials, and harassing targets. Technology providers must therefore safeguard such tools with stronger protections to reduce the risk of abuse.
Unfortunately, Microsoft's announcements provide very few details indicating security forethought. As with the recent Microsoft Recall feature debacle, this stands to benefit attackers more than users. Microsoft should have recognized the inherent voice-cloning risks, proactively built in appropriate security controls, and led with them in the marketing announcement. Wrapping such dual-use capabilities in strong security, notification, validation, and authentication controls to limit their misuse would be a good start.
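To illustrate what such controls might look like, here is a minimal, hypothetical Python sketch of a gating layer: explicit and expiring consent, a live re-authentication check, participant disclosure, and a provenance tag on synthesized audio. Every name here (Speaker, authorize_voice_simulation, watermark_chunk, CONSENT_TTL_SECONDS) is my own illustration, not part of Teams or any Microsoft API, and the HMAC tag is a simple stand-in for a real in-signal audio watermark.

```python
import hashlib
import hmac
import time
from dataclasses import dataclass, field

CONSENT_TTL_SECONDS = 90 * 24 * 3600  # consent expires and must be renewed

@dataclass
class Speaker:
    user_id: str
    consented_at: float | None = None  # timestamp of explicit opt-in, if any
    verified_live: bool = False        # passed a live re-authentication check

@dataclass
class Meeting:
    attendees: list[str] = field(default_factory=list)
    notices: list[str] = field(default_factory=list)

def authorize_voice_simulation(speaker: Speaker, meeting: Meeting) -> bool:
    """Allow voice simulation only with valid consent, liveness, and disclosure."""
    now = time.time()
    if speaker.consented_at is None or now - speaker.consented_at > CONSENT_TTL_SECONDS:
        return False  # no valid, unexpired opt-in on record
    if not speaker.verified_live:
        return False  # speaker has not re-authenticated in this session
    # Disclose to every participant that a simulated voice is in use.
    meeting.notices.append(
        f"Notice: {speaker.user_id}'s translated audio uses a simulated voice."
    )
    return True

def watermark_chunk(audio_chunk: bytes, key: bytes) -> bytes:
    """Append a keyed provenance tag so synthetic audio can be identified later."""
    tag = hmac.new(key, audio_chunk, hashlib.sha256).digest()[:8]
    return audio_chunk + tag

# Usage: the capability stays off unless every check passes.
key = b"provenance-key"  # hypothetical per-tenant signing key
alice = Speaker("alice@example.com", consented_at=time.time(), verified_live=True)
meeting = Meeting(attendees=["alice@example.com", "bob@example.com"])
if authorize_voice_simulation(alice, meeting):
    tagged = watermark_chunk(b"synthesized-audio-frame", key)
```

The specifics matter less than the shape: the cloning capability is disabled by default, every synthetic utterance is disclosed to listeners, and the output carries a traceable marker if it is later misused.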