Contrary to what leading tech companies claim, it is entirely possible to ensure that generative AI models respect copyright and compensate authors when appropriate. Now, regulators need to step up to hold the industry accountable for failing to do so.
SEBASTOPOL, CALIFORNIA – Generative artificial intelligence stretches current copyright law in unforeseen and uncomfortable ways. The US Copyright Office recently issued guidance stating that the output of image-generating AI isn’t copyrightable unless human creativity went into the prompts that generated it. But that leaves many questions: How much creativity is needed, and is it the same kind of creativity that an artist exercises with a paintbrush?
Another group of cases deals with text (typically novels and novelists), where some argue that training a model on copyrighted material is itself copyright infringement, even if the model never reproduces those texts in its output. But reading texts has been part of the human learning process for as long as written language has existed. While we pay to buy books, we don’t pay to learn from them.
How do we make sense of this? What should copyright law mean in the age of AI? Technologist Jaron Lanier offers one answer with his idea of data dignity, which implicitly distinguishes between training (or “teaching”) a model and generating output using a model. The former should be a protected activity, Lanier argues, whereas output may indeed infringe on someone’s copyright.