Copyright is a type of intellectual property that gives its owner the exclusive right to copy, distribute, adapt, display, and perform a creative work, usually for a limited time. It was created as a direct response to the development of the printing press: the cost of producing new books became so low that authors had little incentive to keep writing new pieces, since the competition would copy them almost instantly. While the history of copyright is very interesting and determines many of the laws that we still have today, what I want to address here is how the core values of copyright protection are misguided and how AI can help us redefine it.
One of the problems I had, even before giving any thought to this issue, is how extremely open to interpretation the rules are, especially when it comes to fair use. There is an abundance of examples of people abusing the system one way or another. Famously, many YouTubers have lost their channels because of copyright strikes by fake companies, and most people infringe copyright daily without realizing it, be it by sharing photos or music, or by trying to consume content that isn't licensed in their region.
Copyright is one of the areas of intellectual property, which also includes trademarks and patents. While I have my problems with those, they are nowhere near the level of jank of copyright. Not only is it country-dependent, and very much so, but there are also treaties between countries and another layer of international copyright on top, which makes it a difficult-to-understand mess. And that's one of the core problems: since it was designed to defend the rights of authors over their pieces in traditional media, it never accounted for the amount of sharing and copying that an internet connection gives every person.
I don't think any copyright law is well-designed if almost everyone is breaking it constantly, and the current state of the web makes it impossible to properly enforce the legislation. That alone should make us want to take another look at it and see what can be improved.
So where does AI fit into this? Well, somewhat recently, some court cases in the US have started setting precedents on how AI-generated art relates to ownership. I firmly believe that AI-generated images and videos are art, and I've been planning for a long time to write about what I think art is, but that will come later. Digital drawing and Photoshop changed the paradigm of the visual arts, and DAWs did the same for music-making. In the same way, it's not the tools you use that define whether what you are doing is artistic, but the choices and intention with which you make it.
Unfortunately, the judges don't seem convinced of that, and recent rulings have not granted artificially generated work any copyright protection. It might have to do with the fact that most datasets are made of actual copyrighted material (used illegally in almost all cases), so it seems counterintuitive to let the resulting material be owned by someone else.
I'd like to believe that my conclusions are logical, and that with time they will become even more obvious as more of the art spaces involve some kind of AI and the copyright office is forced to review the rules. What I hope for is that the disruption will be big enough to prompt some more radical moves. Many other interesting arguments are presented in this substack, which originally motivated me to write this post. The discussion section also houses some good criticism. Part 1 and part 2 of John Menick's take on AI and art are among the best-written articles I've read.
With AI out of the way for now, I want to explore what some of those changes might be and their possible outcomes. Most of the inspiration for this comes from this video and this one from Benn Jordan, who you might also know as The Flashbulb. His main concept is called "Socialized Copyright", and as he explains (you really should watch the video to understand it properly), the government creates a platform on which all kinds of media can be consumed and creators are paid via taxes. In the video, he does a great breakdown of why this is financially viable (much more than I expected) and how this would affect the creative industries. He also explains how "companies profit from the industry of gatekeeping information" and are supported by the government (and your tax dollars) while the artists get the crumbs of the pie.
I find this proposal interesting as it is a very pragmatic way of handling the problem, but it can't apply to everything (software, for example, is also under copyright law but wouldn't work well in this context). Also, like most socialized initiatives, it's no more than a bandage for a problem created by capitalism and the extraction of value along the chain.
My personal view on this topic is a bit more radical and perhaps I need to touch a bit more grass, but for me the only real solution is the destruction of copyright, allowing the free flow of information and forcing companies out of the equation. This last consequence is especially useful since it seems clear to me that the art space will improve once the money incentive is gone. This is not possible to implement in any shape or form nowadays, and maybe never, but I believe that it is the best way to organize ourselves around the question of ownership of information. Having restricted access to something that can be created by chance is not the way forward. Curiosity can be a much better motivator than money.
While one can easily find a myriad of problems in my approach, I want to point to what is, in my opinion, one of the most successful cases: FOSS (free and open-source software). Without getting into the nitty-gritty of the different licenses and what you can do with them, I do not think that anyone will disagree with the statement that almost all of our current technology infrastructure exists thanks to the open-source community. That might not be obvious to a person not versed in computer science, but, at least in my experience, the deeper you go into each layer of a technological process, the more open-source libraries and code you will find.
This comes from the selflessness of thousands of developers who work not with profit in mind, but from a feeling of community, since there are a lot of shared problems that don't need to be solved again every time. This concept of building upon what is already there has allowed us to keep improving our science and computers. I believe that the same can be done with art. The most immediate example of this is the technique known as sampling in music production, in which you take the "binary" (in this case, the final master available to the public) and use it as an instrument in your new creation.
Many will find my arguments naive, since, at the end of the day, it was thanks to capitalism and the innovation it promoted that I get to live how I do today. I won't deny that, but I do believe that it gets way more credit than it deserves. As Chomsky said in an interview (that I cannot find right now), ever since universities let researchers file patents for their findings and effectively allowed them to make way more money from their work, their incentives have become very different, and the quality and importance of the work done has gone down. The same can be applied to art.
Diving a bit deeper into the sampling example, the way it currently works in the music industry illustrates many of the things I have been saying: samples are used everywhere but rarely credited, since crediting them would expose you to a lawsuit from the original composer. Most of the time you can release a song that samples illegally with no problem, as with enough processing it becomes virtually impossible to recognize the original piece. But if the song happens to go viral and attract money, a lot of problems will arise, since under current law you are not the rightful owner of your creation. The clearing of samples has been an issue for a long time and has ruined many songs and albums.
I believe sampling, like AI, is a completely legitimate art form, best illustrated by producers like Madlib or Dilla. While this is a problem that the industry has been dealing with since the inception of hip-hop, another conflict is arising now that AI has made "stem separation" viable, and it is approaching the point where you will be able to run it in real time with few artifacts. This effectively exposes the source code of the music, and it will be one of the major music innovations of this era.
Comparisons between open software and art are not perfect, but the two are much more similar than you might expect at first. One key difference is in the source code. Having the source code available allows us to modify our copies and have assurances over the security and transparency of the software. Generally, the software you obtain will be in the form of a binary file, written in machine code that can't be properly understood unless you have a very rare set of skills. Many companies add a layer of obfuscation on top of that so that their code can't be reverse-engineered, usually sacrificing the performance of the program in the process. Art is different in that the source code is always out in the open (at the end of the day, the only thing that matters is the binary). The laws around it should reflect that.
There are plenty of people making a living from open software, through maintenance, specific features, or donations. The same could be achieved in the art space. Another big advantage of the FOSS community is its strength in numbers, and how its members build upon each other's work. Copyleft licenses are a big part of that, but I always found them a bit too patchy. Forcing people to open source their work is not going to create a collaborative community.
Almost all art builds on something else; artists are mostly a matrix multiplication between influences and personal experiences (with few exceptions). The way our culture and law are structured, it's generally not well-received when your work is too similar to that of others. Attributions of inspiration and references are few and far between. That is not the case in FOSS, where attributions are almost always given, since there is a regulated way in which you can copy someone else.
To clarify, the artist IS the owner of his work, but once it leaves his hands, he has no control over how it is shared, interpreted, or copied. I believe this makes sense in the current state of things. When the only way to share an image was to carry a canvas or have a painter create an imperfect copy, that control was easier to enforce, and, since no one could make a painting like a specific artist could, it didn't matter as much. But that isn't the case anymore, since byte-perfect copies are available with two clicks, and sharing content is effectively instant and free.
I like to imagine how the world could work if capitalism didn't rule how we interact with each other, and what some of the ways to get closer to that might be. I admit that it is difficult to imagine a place in which you aren't motivated by pay or status, but by intrinsic values (hopefully good ones). This kind of thought experiment might not be practical, but, if nothing else, it serves to stick your head out of the capitalist realism bubble.
As a side note, another problem that this system might solve is the reality exposed in this post from DoNotResearch (one of the most exciting institutions nowadays for me). But I leave the possible consequences to the reader.
Edit: I remembered another thing that annoys me about copyright: [extensions](https://en.wikipedia.org/wiki/Copyright_Term_Extension_Act) allow companies to hold copyright almost indefinitely, even if the original artist is long dead. That is wrong in my opinion. Another interesting link
Edit 2: As with most things I do, I no longer like this post, and it contains many inaccuracies. The main point still remains: whether training data has to be licensed or not will determine the future of much of this dilemma. If it is ruled that copyright applies to training data, it will be the end for open-source models and Llama 2 will be as far as we go, while companies will struggle much more than they do now. Right now it is unclear which way it will go, but don't forget that the only reason we have paradigm-changing applications such as GPT-4 is that they broke the licensing agreements on most of their data. My intention with this post was to show how this problem might force a shift in the way ownership of information is understood, but that message was diluted in my writing.
Edit: As of February 2024 there is still no real legislation on these topics. The EU is working hard on it, but licensing doesn't seem to be a major issue for them. In the US, the pending case of NYT vs OpenAI might be the trigger. Analysis