Why the AI boom needs Web3: The privacy challenge

Ocean Protocol Team
Published in Ocean Protocol
Sep 26, 2023


By Sheridan Johns, co-founder of Ocean Protocol, and Leonard Dorlöchter, co-founder at peaq

Let’s play a game: We give you the setup, and you come up with a punchline. Here we go — what does Italy have in common with Twitter’s ex-Chief Twit and the creators staging an online protest at ArtStation? “They all know how to make a statement that goes viral!”

The above punchline was proudly brought to you by none other than ChatGPT, everyone’s favorite AI overlord, but this questionable sense of humor is not what got the model a temporary ban in Italy. The country restricted access to the application due to concerns over user privacy and data. Rome lifted the restrictions after tweaks that allowed users to opt out of letting OpenAI use their dialogues with ChatGPT to help train it.

Data, or rather its allegedly illegal usage, was also what prompted Elon Musk to lash out at Microsoft, accusing it of using Twitter’s data to train an AI model. After all, user data is a crucial pillar of the business model embraced by today’s social media platforms and Big Tech in general. By harvesting user data, they can hyper-personalize the ads they feed to their users to the point where it sometimes feels like they know us better than we know ourselves.

And, as you have correctly guessed, data was at the heart of the ArtStation protests against AI-generated imagery. Art is also a type of data, the kind you need to train a model with hardcore applied statistics to approximate an artist’s genius. The same goes for literature, music, and pretty much anything else. If it can be created, it can also be turned into data, and if it can be turned into data, you can train a model that can generate more data of the same sort.

So yes, data is the punchline. But as the age of AI takes off, this punchline will draw only a few laughs — and a ton of turmoil.

New oil, new clashes

In the age of AI, data is indeed the new oil, as old as this Big Data-era wisdom may sound. It’s the fuel that drives machine learning, the core technology powering AI, and scale is the name of the game: In most cases, the more data you throw in the machine-learning furnace, the better the model you get.

And much like oil, again, data isn’t just a resource, but a source of power. Whoever controls the spice, controls the universe, and whoever controls the data, controls the future of the AI boom.

While we hopefully won’t see data being a factor in actual wars, the age of AI will surely see its fair share of spats and arguments over who gets to do what with data. Even though it’s not regarded as a finite resource, and is easily replicated with a good ol’ copy-paste, there is a zero-sum game element here. If Alice owns the data Bob needs to train his BobGPT, she has a vested interest in making sure he’s not doing that for free.

While she’s at it, though, Alice may want to seriously consider another piece of this puzzle: privacy.

At first glance, there’s an inherent tension between AI and privacy. While training an AI model takes vast amounts of data, the way this data is being harvested often infringes on our privacy. In some cases, as apparently happened with Meta, this violation is more clear-cut, and in others, as with the artists’ protest, it is more nuanced. Part of the dataset that Stable Diffusion was trained on came from the publicly available art on ArtStation, and the result is a model that you can use for commercial and non-commercial purposes alike… with some terms and conditions. So it’s not too difficult to see why the creators were outraged.

AI’s privacy struggles are indicative of a larger and more deep-seated issue. The digital economy’s business model is predicated on surveillance and monetization of user data. This model, often referred to as “surveillance capitalism,” goes against privacy, if not in the letter, then definitely in the spirit. The advent of AI will only kick this problem into a whole new gear.

So while data is growing more valuable by the day, the disputes we’ve seen so far are just the tip of the iceberg when it comes to companies and platforms clashing over data ownership, and users and regulators protesting infringements on privacy. But what if there were a way to reconcile AI’s insatiable hunger for data with the need for privacy?

Enter Web3, the next evolution of the internet, where users, not tech companies, are in control of their data. And enter the second part of this blog post, published by peaq, the Web3 network powering the Economy of Things (EoT). AI and data are crucial components of the EoT as envisaged by peaq, and its co-founder Leonard Dorlöchter has a lot to add, so keep on reading there.

About Ocean Protocol

Ocean Protocol was founded to level the playing field for AI and data. Ocean tools enable people to privately & securely publish, exchange, and consume data.

About peaq

peaq is the go-to blockchain for real-world applications. It is a layer-1 blockchain powering the Economy of Things and the backbone for Decentralized Physical Infrastructure Networks (DePINs).
