SynthID: Google is Expanding Ways to Protect AI Misinformation

Pankaj Singh 20 May, 2024 • 5 min read

Introduction

With so many AI tools now available, being able to identify AI-generated content has become crucial.

The main concerns are the widespread dissemination of false information and the potential for spreading hate. AI-generated content can create convincing fake news, deepfakes, and other misleading materials that can manipulate public opinion, incite conflict, and damage reputations.

You might wonder whether there is any way to check if content is AI-generated. With Google DeepMind’s SynthID, there now is.

At a time when the internet abounds in AI-generated text, writers and other creators find it increasingly difficult to protect the authenticity and integrity of their work. Differentiating between human-generated and AI-generated content has become imperative to preserve trust and the value of human labor. This is the motivation behind SynthID, a toolkit for watermarking and identifying AI-generated content. The suite is currently in beta. It invisibly embeds digital watermarks so that AI-generated images, audio, text, and video can be identified.

What Exactly is SynthID?

At the 2024 I/O conference, Google announced an expansion of SynthID, a digital watermark originally designed to authenticate synthetic images produced by AI. This Google DeepMind technology is now being extended to text generated in the Gemini app and web experience and to video from Veo, Google’s latest video generation model. Originally introduced the previous year, SynthID aims to safeguard users by providing a reliable method to distinguish between genuine and AI-generated content, thereby combating the spread of misinformation.

SynthID is designed to watermark and identify images created by AI. This innovative technology embeds an imperceptible digital watermark within the pixels of AI-generated content, ensuring that the watermark remains invisible to the naked eye but detectable through specific scanning methods. By examining an image for this unique digital signature, SynthID can determine the probability that an image was produced by an AI, thereby helping to authenticate the origins of digital imagery.

The Importance of Identifying AI-generated Content

SynthID addresses a critical need in the digital landscape: the ability to identify AI-generated content. While not a panacea for misinformation or misattribution, SynthID represents a significant step forward in AI safety. Making AI-generated content traceable promotes transparency and trust, helping users and organizations responsibly engage with AI technologies.

How Does SynthID Work?

SynthID employs advanced deep-learning models and algorithms to embed and detect digital watermarks across different media types:

Watermarking: Digital watermarks are embedded directly into AI-generated content without altering its original quality.
Identification: SynthID scans media for these watermarks, allowing users to verify whether a piece of content was generated by Google’s AI tools.

The Watermarking Process

An LLM generates text one token at a time. Tokens can represent a single character, word, or part of a phrase. To create a sequence of coherent text, the model predicts the next most likely token based on the preceding words and the probability scores assigned to each potential token.

SynthID adjusts the probability score of each predicted token in cases where doing so won’t compromise the output’s quality, accuracy, or creativity. This process is repeated throughout the generated text, embedding a watermark pattern that SynthID can later detect. The sections below cover SynthID for text, music and audio, and images and video.

SynthID for Text

SynthID’s text watermarking capabilities are integrated into the Gemini app and web experience.

This approach embeds watermarks into the text generation process of large language models (LLMs), which predict the next token (character, word, or phrase part) in a sequence. SynthID can watermark text without compromising its quality or creativity by subtly adjusting token probability scores. This method is effective for various text lengths and remains robust under mild transformations like paraphrasing.
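To make the idea of nudging token probabilities concrete, here is a minimal sketch. It is not Google’s algorithm, which this article does not specify. It is a simplified keyed-bias scheme in which a secret key marks some candidate tokens as favored, and the sampler raises their probability. The vocabulary, `SECRET_KEY`, `boost` value, and uniform toy "model" are all assumptions made for illustration.

```python
import hashlib
import random

# Hypothetical toy vocabulary and secret key, for illustration only.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug", "fast", "slow"]
SECRET_KEY = "demo-key"


def keyed_bit(prev_token: str, candidate: str, key: str = SECRET_KEY) -> int:
    """Return a pseudorandom 0/1 bit that depends on the key and the token pair."""
    digest = hashlib.sha256(f"{key}|{prev_token}|{candidate}".encode()).digest()
    return digest[0] & 1


def generate(n_tokens: int, watermark: bool, seed: int, boost: float = 4.0) -> list[str]:
    """Sample tokens from a uniform toy 'model', optionally nudging the probabilities."""
    rng = random.Random(seed)
    tokens = ["<start>"]
    for _ in range(n_tokens):
        prev = tokens[-1]
        # The toy model gives every candidate token the same base score.
        weights = [1.0] * len(VOCAB)
        if watermark:
            # Raise the score of candidates whose keyed bit is 1.
            weights = [w * boost if keyed_bit(prev, tok) else w
                       for w, tok in zip(weights, VOCAB)]
        tokens.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return tokens[1:]


def detection_score(tokens: list[str]) -> float:
    """Fraction of tokens whose keyed bit is 1; roughly 0.5 is expected without a watermark."""
    prev_tokens = ["<start>"] + tokens[:-1]
    hits = sum(keyed_bit(p, t) for p, t in zip(prev_tokens, tokens))
    return hits / len(tokens)


plain = generate(400, watermark=False, seed=1)
marked = generate(400, watermark=True, seed=1)
print(f"unwatermarked score: {detection_score(plain):.2f}")
print(f"watermarked score:   {detection_score(marked):.2f}")
```

In this toy, the watermarked sequence scores well above the unwatermarked one because favored tokens were sampled more often. The gap becomes more reliable as the text gets longer, since the score is an average over more tokens.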

SynthID for Music and Audio

In November 2023, SynthID expanded to include AI-generated music and audio, deploying first through the Lyria model. The watermarking process involves converting the audio wave into a spectrogram, embedding the watermark, and then converting it back. This technique ensures the watermark remains inaudible and resilient to common audio modifications such as compression or speed changes.
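The convert, embed, and convert-back round trip can be sketched with a toy example. This is an assumption-heavy illustration, not SynthID’s method: it transforms a single short frame with a naive Fourier transform rather than building a full spectrogram, and it adds a fixed amount of energy to a few hand-picked frequency bins. `MARK_BINS`, `STRENGTH`, and the test tone are hypothetical, and no attempt is made to keep the change inaudible.

```python
import cmath
import math
import random

N = 64                     # samples in the single analysis frame
MARK_BINS = [20, 23, 27]   # hypothetical frequency bins that carry the mark
STRENGTH = 0.02            # hypothetical embedding strength


def dft(x):
    """Naive discrete Fourier transform: time samples -> complex spectrum."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse transform: complex spectrum -> real time samples."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def embed(samples):
    """Move to the frequency domain, add the mark, and move back."""
    spectrum = dft(samples)
    for k in MARK_BINS:
        # Add energy to bin k and its mirror bin so the output stays real-valued.
        spectrum[k] += STRENGTH * N
        spectrum[N - k] += STRENGTH * N
    return idft(spectrum)


def score(samples):
    """Average real-part energy in the marked bins; near STRENGTH if marked."""
    spectrum = dft(samples)
    return sum(spectrum[k].real for k in MARK_BINS) / (N * len(MARK_BINS))


# Hypothetical test signal: a pure tone plus a little noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 5 * t / N) + rng.uniform(-0.01, 0.01) for t in range(N)]
marked = embed(clean)

print(f"clean score:  {score(clean):.4f}")
print(f"marked score: {score(marked):.4f}")
```

The detector only needs to know which bins to inspect. A production system would have to choose where and how strongly to embed so that the mark stays inaudible and survives edits, which this sketch does not attempt.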

SynthID for Images and Video

SynthID’s watermarking for images and video embeds watermarks directly into pixels and video frames. This method preserves media quality while allowing the watermark to remain detectable even after modifications like cropping or compression. SynthID is integrated with Vertex AI’s text-to-image models and the Veo video generation model, so content produced by these models can be identified as AI-generated.
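As a rough illustration of a pixel-level watermark, the sketch below uses a classic keyed pattern that is added to the image and later detected by correlation. This is a stand-in technique, not the learned approach SynthID uses, and every name and constant here (`KEY`, `STRENGTH`, the gradient test image) is hypothetical. Unlike the behavior described above, this toy would not survive cropping, because the detector assumes the pattern is still aligned with the pixels.

```python
import random

SIZE = 32       # toy image is SIZE x SIZE grayscale
STRENGTH = 2    # hypothetical per-pixel change, in gray levels
KEY = 1234      # hypothetical secret seed shared by embedder and detector


def pattern(key=KEY):
    """Key-derived grid of +1/-1 values, one per pixel."""
    rng = random.Random(key)
    return [[rng.choice((-1, 1)) for _ in range(SIZE)] for _ in range(SIZE)]


def embed(image, key=KEY):
    """Shift each pixel up or down by STRENGTH according to the pattern."""
    pat = pattern(key)
    return [[min(255, max(0, image[y][x] + STRENGTH * pat[y][x]))
             for x in range(SIZE)] for y in range(SIZE)]


def score(image, key=KEY):
    """Correlate the mean-centred image with the pattern; near STRENGTH if marked."""
    pat = pattern(key)
    mean = sum(map(sum, image)) / (SIZE * SIZE)
    total = sum((image[y][x] - mean) * pat[y][x]
                for y in range(SIZE) for x in range(SIZE))
    return total / (SIZE * SIZE)


# Hypothetical smooth test image: a gentle diagonal gradient.
original = [[120 + (x + y) // 4 for x in range(SIZE)] for y in range(SIZE)]
marked = embed(original)

# Mild random noise as a crude stand-in for a lossy edit.
noise = random.Random(7)
degraded = [[p + noise.uniform(-3, 3) for p in row] for row in marked]

for name, img in [("original", original), ("marked", marked), ("degraded", degraded)]:
    print(f"{name:9s} score: {score(img):.2f}")
```

A change of two gray levels per pixel is hard to see, yet summing the correlation over every pixel makes it easy to measure. That is the general reason a watermark can be imperceptible and still detectable.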

Availability and Integration

SynthID technology is available to Vertex AI customers and is integrated into products like ImageFX and VideoFX. Users can identify AI-generated content through features in Google Search and Chrome, promoting widespread use and accessibility of this technology.

Google has also integrated SynthID into Veo, which it describes as its most capable video generation model to date, available to select creators on VideoFX.

Benefits and Limitations

SynthID’s text watermarking works best with longer AI-generated texts and with prompts that allow varied responses. It is less effective with factual prompts and with text that has been extensively rewritten or translated. It significantly improves AI content detection and is a strong deterrent against malicious use of AI, but it is not foolproof against sophisticated adversaries.
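One plausible reason factual prompts are harder to watermark is that the model is nearly certain of the next token, leaving little probability to shift. The worked example below assumes a simplified scheme, not SynthID’s actual one, in which a key marks some candidates as favored and their probability is multiplied by a hypothetical `boost`. It compares how far the share of favored tokens moves for a flat distribution versus a peaked one. The distributions and the `favored` flags are made up for illustration.

```python
def hit_rate(probs, favored, boost=1.0):
    """Probability of sampling a favored token after multiplying favored weights by boost."""
    weights = [p * (boost if f else 1.0) for p, f in zip(probs, favored)]
    return sum(w for w, f in zip(weights, favored) if f) / sum(weights)


def lift(probs, favored, boost=4.0):
    """How much the boost raises the favored-token rate above the unwatermarked rate."""
    return hit_rate(probs, favored, boost) - hit_rate(probs, favored)


# Hypothetical next-token distributions over four candidates.
open_ended = [0.25, 0.25, 0.25, 0.25]   # many acceptable continuations
factual = [0.97, 0.01, 0.01, 0.01]      # one continuation is almost certain
favored = [0, 1, 1, 0]                  # which candidates the key happens to favor

print(f"open-ended lift: {lift(open_ended, favored):.3f}")
print(f"factual lift:    {lift(factual, favored):.3f}")
```

With the flat distribution, the favored share moves from 0.5 to 0.8. With the peaked one, it moves only from 0.02 to about 0.075, so each token carries much less evidence for a detector to pick up.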

Future Developments

Google plans to publish a detailed research paper on SynthID’s text watermarking and open-source the technology through the Responsible Generative AI Toolkit. This will enable developers to integrate SynthID into their models, broadening its impact across the AI ecosystem.

Conclusion

In conclusion, the SynthID tool is a testament to collaborative innovation’s power in AI. By embedding trust and accountability directly into the fabric of digital content, SynthID addresses current challenges and paves the way for a future where AI-generated media can be reliably authenticated. This advancement reinforces the integrity of digital information and sets a new standard for responsible AI usage, benefiting users and creators alike. As we look forward to the continuous development of the SynthID tool, its impact on the digital landscape promises to be both profound and far-reaching.

I hope you found this article helpful in understanding “SynthID: Google is Expanding Ways to Protect AI Misinformation.”