Watsonx Powers Grammys, Security Tests Audio Hijacking
Key Points
- IBM watsonx partnered with the Recording Academy for the 66th Grammy Awards, using a generative AI content engine to streamline creation of multi‑channel stories about over a thousand nominees across nearly 100 categories.
- The watsonx.ai large language model was fine‑tuned on the Academy’s proprietary data, enabling editors to select templates, artists or categories, exclude topics, and instantly generate, re‑phrase, and edit headlines, bullets, and wrap‑ups, saving hundreds of hours of manual work.
- This AI‑driven workflow helped the Grammy digital team deliver engaging content to more than five million music fans worldwide while maintaining brand consistency and creative flexibility.
- IBM Security demonstrated a proof‑of‑concept “audio jacking” attack that hijacks live VoIP conversations, transcribes them with speech‑to‑text, uses an LLM to alter financial instructions, then synthesizes the modified speech with a cloned voice to trick victims into sending money.
- The experiment showed that only a few seconds of a person's voice are needed to create a convincing clone, highlighting emerging risks of voice‑deepfake attacks and the importance of robust security controls.
Sections
- Watsonx Powers Grammy Content Creation - IBM’s watsonx partnered with the Recording Academy to use generative AI for quickly producing and customizing multi‑channel stories that spotlight nominees and categories for the 66th Grammy Awards.
- AI Bot Hijacks Conversation - The speaker explains how a middleman bot can intercept and alter dialogue, highlighting the need for evolving security measures against generative AI threats.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=ZsWzF7g8YTc](https://www.youtube.com/watch?v=ZsWzF7g8YTc)
**Duration:** 00:03:47
- [00:00:00](https://www.youtube.com/watch?v=ZsWzF7g8YTc&t=0s) Watsonx Powers Grammy Content Creation
- [00:03:07](https://www.youtube.com/watch?v=ZsWzF7g8YTc&t=187s) AI Bot Hijacks Conversation
The role of watsonx at the Grammys
and IBM Security's audio jacking experiment,
all on this episode of IBM Tech Now.
What's up, y'all? My name is Ian, and I am back
to bring you the latest and greatest news and announcements about IBM technology.
IBM watsonx recently partnered with the Recording Academy for the 66th Annual Grammy Awards.
The challenge they faced?
Driving captivating content across multiple digital channels in today's highly fragmented media landscape.
Not an easy task when you need to celebrate the achievements and stories
of more than a thousand nominees across nearly 100 categories.
The solution?
AI stories with IBM watsonx,
a generative AI content engine fueled by trusted data.
Essentially the task was to build a content supply chain that would save hundreds of hours of research,
writing and production time while offering creative flexibility and easy review.
This year's solution used the generative AI capabilities of watsonx
to leverage a powerful large language model hosted in the watsonx.ai component.
The model was trained on the Recording Academy's trusted proprietary data.
The AI stories interface let editorial team members choose templates
that featured nominees or categories with a variety of layouts and branding.
Next, they selected an artist or award category to feature as the subject of the post
and any topics to exclude from the output.
AI stories were then created featuring introductory texts,
headlines, bullets, one-liners and wrap-up texts.
Any of these outputs could be regenerated to create alternate phrasings and could be manually edited easily.
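The selection steps described here can be sketched as a small data structure that turns an editor's choices into a prompt. This is a hypothetical illustration only: `StoryRequest`, its field names, the example values and the prompt wording are all assumptions, not the actual AI stories interface or the watsonx.ai API.

```python
from dataclasses import dataclass, field

# Output parts named in the episode.
OUTPUT_PARTS = ["introductory text", "headline", "bullets", "one-liner", "wrap-up text"]


@dataclass
class StoryRequest:
    """Hypothetical container for an editor's choices (not IBM's real schema)."""

    template: str  # a nominee or category layout
    subject: str   # the artist or award category to feature
    excluded_topics: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Build a plain-text prompt; the wording is an assumption."""
        lines = [
            f"Template: {self.template}",
            f"Subject: {self.subject}",
            "Write: " + ", ".join(OUTPUT_PARTS),
        ]
        # Only add the exclusion line when the editor listed topics to avoid.
        if self.excluded_topics:
            lines.append("Do not mention: " + ", ".join(self.excluded_topics))
        return "\n".join(lines)


# Placeholder values, not real nominee data.
request = StoryRequest("nominee spotlight", "Example Artist", ["tour dates"])
prompt = request.to_prompt()
print(prompt)
```

Regenerating an output, as the episode describes, would amount to sending the same request again and keeping whichever phrasing the editor prefers.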
And that's how IBM watsonx and the Recording Academy digital team delivered an engrossing digital experience to more than 5 million music fans worldwide.
To learn more about watsonx at the Grammy Awards, click the link in the description of this video.
Next up is a wild story about how the IBM Security team recently conducted successful audio jacking experiments.
It sounds like something out of a movie taking place in the future,
but audio jacking intercepts and hijacks a live conversation
then uses an LLM to understand the conversation
in order to manipulate audio output that clones the victim's voice.
Essentially, they were able to modify the details of a live financial conversation
occurring between the two speakers and divert money to a fake adversarial account.
It works roughly like this:
the attacker installs malware on a victim's phone or compromises a wireless Voice over IP (VoIP) service.
Next, they utilize speech-to-text capabilities to convert the victim's voice and conversation into text
and allow the LLM to understand the context of the conversation.
Then they instruct the LLM to modify the sentence whenever anyone mentions a bank account.
When the LLM modifies the sentence, the program uses text-to-speech with pre-cloned voices to generate and play the audio.
And before you bump on the cloned voices: nowadays they only need 3 seconds of an individual's voice to clone it.
Finally, the bot switches the victim's bank account number with the attacker's number so funds are deposited into the wrong account.
And just like that, the bot, which is acting as a middleman,
has hijacked the conversation and changed key elements without either of the victims knowing.
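The rewrite step in the middle of that chain can be illustrated on plain text. The sketch below is a toy simulation with no audio, no speech-to-text, no LLM and no voice cloning: a regular expression stands in for the LLM, and the trigger phrase, the account-number pattern and the function name are all assumptions made for illustration, not IBM Security's implementation.

```python
import re

# Assumption: an account number is a standalone run of 8 to 12 digits.
ACCOUNT_NUMBER = re.compile(r"\b\d{8,12}\b")


def rewrite_if_account_mentioned(sentence: str, substitute: str) -> str:
    """Swap any account number in the sentence, but only when the sentence
    mentions a bank account; every other sentence passes through unchanged."""
    if "bank account" not in sentence.lower():
        return sentence
    return ACCOUNT_NUMBER.sub(substitute, sentence)


# Made-up transcript lines standing in for speech-to-text output.
transcript = [
    "Hi, thanks for calling back.",
    "Please send it to my bank account, number 1234567890.",
]
relayed = [rewrite_if_account_mentioned(s, "0000000000") for s in transcript]
for line in relayed:
    print(line)
```

Because only sentences containing the trigger phrase are touched, the rest of the conversation is relayed as spoken, which is why neither party has an obvious cue that anything changed.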
There are many more fine-tuned details to the whole process that are covered in the blog,
but it's another illustration of how security processes will need to continually evolve
as gen AI presents new opportunities for bad faith actors to strike.
To learn more, hit the link below.
Thanks so much for joining me today for this episode of IBM Tech Now.
If you're interested in learning more about the topics I've covered, make sure you explore the links in the description of this video,
and again please don't forget to subscribe to our channel
to stay up to date on what's going on in tech now.