The future of AI will orient around datasets… which will put the user in a front-and-centre blind spot
The rise of generative AI has thrown the future of creators (across all industries) into question, with copyright being the key to how it all plays out. Models like ChatGPT have an input-output ‘black-box’ problem: whatever goes into training a model cannot be traced to what comes out. Ownership is therefore an open question: is an AI-generated piece of work owned by the end user who wrote the prompt, the parent company of the large language model (LLM), or the artist whose works trained the model in such a way as to produce it?
There are varying degrees of clarity, of course. Generative-AI music tool Boomy, for example, faced this question by specifying in its terms that any song created on its software is the sole property of Boomy. In doing so, the company inadvertently established the precedent that a right is created when a song is generated using AI. Whether this is the right way to go is disputed, however. A track like ‘Heart on My Sleeve’ was clearly generated by a model trained on Drake and The Weeknd, raising the question of whether they should have a say in its distribution (the song was removed from streaming after apparent takedown requests by Drake and Universal). On the other hand, something like a DALL-E-generated image of soup in the style of vaporwave is less clear in origin and thus trickier to attribute.
For generative models that come pre-trained, blockchain can offer some solutions by effectively stamping ownership of different track or image parts and attributing proportional ownership back to the original creators. This would contrast with the recent US Copyright Office ruling, which determined that entirely AI-generated works lack human authorship and so cannot have copyright attached at all. This could have an impact on streaming services, where human-made, rights-bound tracks would compete side-by-side with AI-generated ones. Watermarking could play a role here as well, allowing distribution platforms to register AI-generated content and potentially screen it, which could protect creators – but it would still be a maze for platforms and creators alike to navigate. Still, other companies are taking a third-way approach: for companies like Electric Sheep, which builds AI-powered rotoscoping for the video industry, the promise of rights-cleared datasets operating on broader licensing models ensures commercial viability from the outset, without the complexity of post-hoc copyright navigation.
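To make the blockchain idea concrete, here is a minimal sketch in Python of how proportional ownership could be stamped onto a work’s content hash. A plain dictionary stands in for the on-chain ledger, and all names (`ProvenanceRegistry`, the share keys) are hypothetical – this illustrates the data structure, not any real product.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class OwnershipRecord:
    # Proportional shares per creator, e.g. {"original_artist": 0.7, "prompt_author": 0.3}
    shares: dict[str, float] = field(default_factory=dict)


class ProvenanceRegistry:
    """Toy stand-in for an on-chain registry: maps content hashes to ownership shares."""

    def __init__(self) -> None:
        self._ledger: dict[str, OwnershipRecord] = {}

    def register(self, content: bytes, shares: dict[str, float]) -> str:
        assert abs(sum(shares.values()) - 1.0) < 1e-9, "shares must total 100%"
        digest = hashlib.sha256(content).hexdigest()
        self._ledger[digest] = OwnershipRecord(shares)
        return digest

    def attribute(self, content: bytes) -> OwnershipRecord | None:
        # Exact-match lookup: only resolves if the bytes are completely unchanged.
        return self._ledger.get(hashlib.sha256(content).hexdigest())


# Usage: stamp a generated track and later look up who gets paid what.
registry = ProvenanceRegistry()
track = b"...rendered audio bytes..."
registry.register(track, {"original_artist": 0.7, "prompt_author": 0.3})
print(registry.attribute(track))  # OwnershipRecord(shares={'original_artist': 0.7, ...})
```

Note the fragility built into this design: attribution only works on an exact byte match, which matters for the gap discussed next.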
Yet there is a gap somewhere in the middle, and it is the one likely to widen: models that users can train themselves, on data of their own. Stable Diffusion, for example, allows users to ‘fine-tune’ the model quickly and easily on images they select, hyper-personalising the output. A user can, for example, train the model on a single artist’s entire body of work (screenshotted or simply photographed, and so stripped of any inbuilt attribution of the kind blockchain could provide – see the sketch below), generate lookalike images in that artist’s precise style, and then share them, make prints, and even potentially sell them on a platform like Etsy. Even if copyright law does come down on the side of attribution to original creators (still an ongoing debate), this sort of process would be nearly impossible to track at scale and harder to crack down on (if millions of people are doing it, can you really catch them all?) – and, notably, it is cheaper, easier, and shields the companies behind the AIs from liability. It is similar to how Napster’s lawyers argued that Napster was merely providing a platform – it was users who were doing the illegal file-sharing. This is the path of least resistance for future development, and thus has no small role to play as we move forward.
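The screenshot problem can be shown in miniature. This sketch (assuming Pillow is installed) simulates ‘screenshotting’ a registered work by re-encoding it: the pixels look the same, but the bytes – and therefore the content hash – change, so any exact-hash registry like the one above loses track of it.

```python
import hashlib
import io

from PIL import Image

# A stand-in for a registered original work.
original = Image.new("RGB", (256, 256), color=(120, 40, 200))
buf = io.BytesIO()
original.save(buf, format="PNG")
original_bytes = buf.getvalue()

# "Screenshotting" the work: same image to the eye, new encoding underneath.
copy = Image.open(io.BytesIO(original_bytes))
buf2 = io.BytesIO()
copy.save(buf2, format="JPEG", quality=90)  # lossy re-encode; metadata is gone
copied_bytes = buf2.getvalue()

print(hashlib.sha256(original_bytes).hexdigest()[:16])
print(hashlib.sha256(copied_bytes).hexdigest()[:16])
# The digests differ, so an exact-hash registry can no longer attribute the copy.
# Robust attribution would need perceptual hashing or watermark detection instead.
```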
The answer now, as it was back in the 2000s with music, is to embrace the new dynamic in a way that rewards original creators while remaining convenient for consumers. Piracy was impossible to fully quell, but streaming offered a simpler, legal alternative at an agreeable price. Of course, Napster being free was the main reason so many ‘90s kids downloaded it, but it was not the only reason: Napster also gave them a convenient way to download files from the internet, something that was previously a major hassle, and convenience always wins. Similarly, AI generation is out of Pandora’s box, and it is no longer possible to stop people from playing around with it and creating new things – but a platform that offered this kind of creative freedom in a context that still rewards original artists, and gave users a way to showcase their creative engagement guilt-free, would doubtless be an easy win. Licensing whole bodies of work for use in AI models that users can play with, rather than micro-attributions via blockchain tracking, would probably be the easiest way to handle the rights side of this – something that Grimes, among others, has already embraced.
Regulation (like the EU’s Digital Services Act) will have to play a role in encouraging this kind of solution, however. The open question of ownership leaves us in a creative and attribution grey zone, where creation and distribution are a free-for-all and no one is responsible for the pitfalls. Is it streaming platforms’ responsibility to block AI tracks? Social platforms’, to discourage their sharing? The AI owners’, to build in attribution? Or users’, to understand the ins and outs of copyright, creation, remuneration, and distribution all on their own, and behave accordingly?
It will have to be everyone’s responsibility, to some extent. Streaming platforms will likely need to incorporate some kind of ‘stamp’ of attribution/verification, purely for transparency purposes. Generative AI companies will likely have to enable blockchain integration to stamp works from input to output. Social platforms will eventually have to take responsibility as content platforms (for much more than just AI) as they move ever further into algorithmically curated content discovery and find it harder to hide behind their position as “social networks” where everything lies with the user. And users and creators alike will have to come to grips with what AI is, what it can do, and how to interact with it. The future is everyone’s problem – because otherwise, the blind spots can quickly become the main frontiers of innovation in all the wrong ways.
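What might such a ‘stamp’ look like at its simplest? A minimal sketch, assuming Pillow and using PNG text chunks: the generator embeds provenance fields at export, and a platform reads them before distribution. The field names (`ai_generated`, `model`, `training_licence`) and values are hypothetical, and plain text chunks are trivially stripped – a production system would use signed credentials (something like C2PA content credentials) rather than this.

```python
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Generator side: embed a provenance stamp when the work is exported.
work = Image.new("RGB", (64, 64), color=(10, 10, 10))
stamp = PngInfo()
stamp.add_text("ai_generated", "true")
stamp.add_text("model", "example-model-v1")                # hypothetical model name
stamp.add_text("training_licence", "licensed-catalogue")   # hypothetical licence tag
work.save("stamped.png", pnginfo=stamp)

# Platform side: read the stamp before distribution and label accordingly.
uploaded = Image.open("stamped.png")
if uploaded.text.get("ai_generated") == "true":
    print("Label as AI-generated; route royalties per licence:",
          uploaded.text.get("training_licence"))
# Caveat: re-encoding the file (as in the screenshot sketch above) strips
# these chunks, which is exactly why unsigned stamps alone are not enough.
```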