
It May Soon Be Legal to Jailbreak AI to Expose How it Works

A proposed exemption to the DMCA would give researchers permission to break terms of service on AI tools to expose bias, training data, and potentially harmful outputs.
Image: Andrea De Santis via Unsplash

A group of researchers, academics, and hackers is trying to make it easier to break AI companies' terms of service in order to conduct “good faith research” that exposes biases, inaccuracies, and training data, without fear of being sued.

The U.S. government is currently considering an exemption to U.S. copyright law that would allow people to break technical protection measures and digital rights management (DRM) on AI systems in order to learn how they work, probe them for bias, discrimination, and harmful or inaccurate outputs, and examine the data they are trained on. The exemption would allow for “good faith” security and academic research and “red-teaming” of AI products even if the researcher had to circumvent systems designed to prevent that research.

The proposed exemption has the support of the Department of Justice, which said “good faith research can help reveal unintended or undisclosed collection or exposure of sensitive personal data, or identify systems whose operations or outputs are unsafe, inaccurate, or ineffective for the uses for which they are intended or marketed by developers, or employed by end users. Such research can be especially significant when AI platforms are used for particularly important purposes, where unintended, inaccurate, or unpredictable AI output can result in serious harm to individuals.”

Much of what we know about how closed-source AI tools like ChatGPT, Midjourney, and others work comes from researchers, journalists, and ordinary users purposefully trying to trick these systems into revealing something about the data they were trained on (which often includes copyrighted material indiscriminately and secretly scraped from the internet), their biases, and their weaknesses.

Doing this type of research can often violate the terms of service users agree to when they sign up for a system. For example, OpenAI’s terms of service state that users cannot “attempt to or assist anyone to reverse engineer, decompile or discover the source code or underlying components of our Services, including our models, algorithms, or systems (except to the extent this restriction is prohibited by applicable law),” and adds that users must not “circumvent any rate limits or restrictions or bypass any protective measures or safety mitigations we put on our Services.”

Shayne Longpre, an MIT researcher who is part of the team pushing for the exemption, told me that “there is a lot of apprehensiveness about these models and their design, their biases, being used for discrimination, and, broadly, their trustworthiness.”

“But the ecosystem of researchers looking into this isn’t super healthy. There are people doing the work but a lot of people are getting their accounts suspended for doing good-faith research, or they are worried about potential legal ramifications of violating terms of service,” he added. “These terms of service have chilling effects on research, and companies aren’t very transparent about their process for enforcing terms of service.”  

The exemption would be to Section 1201 of the Digital Millennium Copyright Act, a sweeping copyright law. Other 1201 exemptions, which must be applied for and renewed every three years as part of a process through the Library of Congress, allow for the hacking of tractors and electronic devices for the purpose of repair, have carveouts that protect security researchers who are trying to find bugs and vulnerabilities, and in certain cases protect people who are trying to archive or preserve specific types of content. 

There are numerous examples of academic papers, journalistic reporting, and exploratory “red teaming” that people have done with AI tools, all of which involve trying to “jailbreak” or otherwise trick an AI tool into revealing the data it was trained on, showing that it is biased in some way, or showing that it is unreliable.

Longpre and his colleagues published an analysis in March that cited a previous version of Midjourney's terms of service, which read "If You knowingly infringe someone else’s intellectual property, and that costs us money, we’re going to come find You and collect that money from You. We might also do other stuff, like try to get a court to make You pay our legal fees. Don’t do it." Longpre's paper argues that "AI companies have begun using their terms of service to deter analysis."

A lot of jailbreaking and AI research so far has focused on trying to make AI tools recreate copyrighted works, to prove that LLMs are trained on copyrighted material. This was a tactic used by the record industry to show that the AI tools Suno and Udio were trained on copyrighted music, which is now the centerpiece of a massive lawsuit against those companies. It is easy to imagine, then, a scenario in which researchers or journalists use an AI tool to recreate copyrighted works in order to expose the fact that the tool was trained on copyrighted data, that research leads to negative outcomes for the AI company, and the company tries to blame the researcher or journalist for breaking its terms of service. Midjourney has since deleted that passage from its terms of service, which now read: "You may not use the Service to try to violate the intellectual property rights of others, including copyright, patent, or trademark rights. Doing so may subject you to penalties including legal action or a permanent ban from the Service."

In a filing with the copyright office, Harley Geiger of the Hacking Policy Council, which is pushing for the exemption, said that an exemption is “crucial to identifying and fixing algorithmic flaws to prevent harm or disruption,” and added that a “lack of clear legal protection under DMCA Section 1201 adversely affect[s] such research.” The exemption would not stop companies from trying to prevent this type of research, but it would legally protect researchers who violate company terms of service to do so.

In a hearing this spring, the transcript of which was just released, Morgan Reed of the App Association, a lobbying group that represents a host of AI companies and other app developers, argued that researchers should have to obtain prior consent from AI companies to do this type of research. “If you don't contact the company in advance to tell them that you're red teaming, you are essentially a potentially malicious hacker,” Reed said. “So a researcher does the action. They break in. The company whose LLM it was that they went after is unhappy in some way or form or another and goes after them for a copyright breach. So what they really want, what they're really asking for is post-fact liability protection.”

The DVD Copy Control Association, a group that represents major movie studios and one of the pioneers of DRM, is also against the proposed exemption. 
