Announcements

We Need Your Email Address

AI stealing our work. The collapse of social networks. The need to pay journalists to produce impactful journalism. Here is why we are asking for your email address to read 404 Media.
Left to right: Sam Cole, Jason Koebler, Joseph Cox, Emanuel Maiberg.
Image: Sharon Attia.

You might have noticed that over the last few days, we’ve begun requiring readers to sign up with email addresses to read most of the articles on our website. You may be wondering why we did this and what we are trying to sell you. The short answer is that we are primarily trying to sell you our journalism by putting articles that we think you will care about in your inbox, in the hopes that you will one day read something that convinces you that our work is worth paying for. 

The long answer is that, through our own reporting, we are realizing that in order to combat the fracturing of social media platforms, a Google discoverability crisis fueled by AI generated spam and AI-fueled SEO, and a media business environment that is in utter freefall, we need to be able to reach our readers directly using a platform that we own and control. To do that, we need your email address. 

One of the most difficult decisions we had to make when launching 404 Media in August was whether we were going to have a “hard paywall” on every article, meaning that you can only read the site if you pay us money. So far, for the five months this company has existed, we have erred on the side of making almost all of our work available for free with no wall of any kind. We did it that way because ideally we would like our work to reach as many people as possible with as little friction as possible and we want our work to be impactful, which is often easier when more people read it (we are working on a fix for our paid, full-text RSS feeds). In recent weeks, however, our own journalism has made it clear to us that we needed to become more aggressive about asking our readers for their email addresses, which is something that we were sheepish about at first but have decided is essential to keeping this website running–and we hope you’ll come along with us for the journey. 

We started 404 Media in August 2023 with no funding from anyone, with the simple idea that through some mix of paid reader subscriptions, advertising, and turning our work into podcasts, documentaries, and other multimedia projects, we would be able to make enough money to survive and keep publishing journalism that matters. We are happy to report that five months into this experiment, we are not going anywhere soon, and feel energized by the support of our subscribers and readers. We are endlessly thankful to the people who have become paid subscribers (who get bonus audio and written content every week, as well as unlimited, ad-free access to our articles). We couldn’t do this without you and we want and need you to keep telling people about our work. But at the same time, the floor could fall from underneath any of the ways we bring in revenue, showing the need to constantly diversify.

We are human beings who spend all day talking to sources, reading research papers, spelunking the internet, and generally trying to do a mix of important, fun, heartbreaking, infuriating, and eye-opening work that we want to be read and enjoyed by other human beings. We are not part of a large corporation or a large organization, and there is no fallback plan for any of us. 


Before we get to our findings about how AI is ripping off our website: In the free version of our newsletter, we sometimes also include a paid sponsorship from a company. You may have already seen this; we had one with the privacy company DeleteMe, for instance. These are non-invasive and non-tracking, beyond “X number of people clicked the DeleteMe link.” The money these bring in is there to cover our bases for when people unsubscribe (not everyone can subscribe to support our outlet, don’t worry, we understand) or other factors come into play. The vast, vast majority of our money comes from our paid subscribers. And as we’re writing this, we don’t have a sponsor for the newsletter for January or February anyway, which shows, again, the need to diversify where our money comes from. Paid subscribers never see these sponsorships.

So when you give us your email, you’ll receive an email with our biggest story of the day, as well as our round-up on Fridays which showcases our work and includes behind-the-scenes content for paying subscribers. You’ll also periodically receive emails that highlight what we’re all about: the impact we’ve made, why we left corporate media to launch an independent company, etc. Now for the AI stuff.

AI Spam Is Eating the Internet, Stealing Our Work, and Destroying Discoverability

In December, we noticed that articles we spent significant amounts of time on—reporting that involved weeks or months of research, talking to and protecting sources, filing public records requests, paying for and parsing those records, hours or days of writing, editing, and packaging—were being scraped by bots, run through an AI article “spinner” or paraphraser, and republished on random websites. 

Sam’s investigation into the inclusion of child sexual abuse material in the LAION dataset used to train AI image generators, a hugely important and sensitive story that we ultimately worked on over the course of nearly a year before we even launched, consulted with a lawyer on, and spoke to many experts for, quickly became an article called “They Delete A Database To Train AI Generative Images To Contain Child Sexual Abuse Material” on a website called “Nation World News.” Jason’s scoop about a Russian stowaway became “LAX Passenger Arrives on International Flight Without Passport, Visa, Ticket, Report Says” on the Clayton County Register, another site full of AI-cloned articles. Emanuel’s lighthearted interview with John Hittler became “The Man With the ‘worst Last Name In Human History’ Reveals How He Discovered Its Benefits” on “Nation World News” and, separately, “How The Man With the Worst Last Name in Human History Discovered Its Advantages” on “World Nation News,” a totally different website. Joseph’s article about how AI-generated plagiarism is showing up all over Google News, while our articles are not, was quickly picked up by a website called “Digital Information World” in a completely illegible, obviously AI-generated article called “AI-Produced Content Is Being Marketed Across Google News And The Company Is Aware Of It,” apparently written by Dr. Hura Anwar, a dental surgeon who publishes articles on the website roughly every six minutes, all day every day. Digital Information World is, of course, indexed by Google News.

This problem is going to get worse, not better. Over the last few weeks Jason has been researching and experimenting with a series of AI tools that promise to “spin” articles for their users. One, called SpinRewriter, lets users create 1,000 slightly different versions of the same article with a single click and automatically publish them to as many WordPress sites as they want using a paid plugin. It also offers a tool that lets users manage as many websites as they want from a single dashboard. A company called Byword gleefully advertises the “SEO heist” that “stole 3.6M total traffic from a competitor” with this One Weird Trick (exporting the competitor’s sitemap and creating AI-generated versions of 1,800 of their articles). 

A screenshot of Byword explicitly telling users to paste their “competitors' article URLs.”

Jason signed up for a Byword account, fed it the URLs of some of our articles, and was able to instantly generate articles based on them. They were not good, but they were article-shaped and came with AI-generated images. Byword also allows you to use AI to generate social media posts about the articles. Byword can connect to WordPress, has a feature where you can “Generate articles by scraping lists of your competitors’ URLs,” and is planning to launch a tool that will allow people to generate articles based directly on the sitemap of the website they’re trying to “compete with.” This is all powered by GPT-4, and larger operations require an OpenAI API key, which is particularly notable given OpenAI’s apparent respect for the craft of journalism. Byword explains on its website that Google doesn’t actually care about AI-generated articles and that you will not be penalized for using AI to generate articles en masse, a position the company probably feels comfortable noting because that is basically what Google itself told us when we pointed out that AI-generated articles were showing up on Google News. Google told us in a statement at the time that it focuses on the quality of articles on Google News rather than on how they were produced, that is, by a human or an AI (despite the AI-generated stuff that we found on Google News also being of shit quality).

Requiring an email address to read our articles has, for the moment, stopped our content from being scraped and repurposed by AI. It will also, we hope, serve as a preventative measure against the impacts of the internet being flooded by all of this AI-generated dreck. We are worried that a flood of low-quality, AI-generated bullshit (articles written by robots to appease a robotic search ranking algorithm) is going to drown out what we do, and make it harder to organically find our work. This idea is not a random thesis we have, or paranoia about AI progress. The authors of a study about the general degradation in the search results of Google, Bing, and DuckDuckGo warn that “AI will only exacerbate the problem,” and told Jason in an email: “ChatGPT is absolutely capable of producing content that is indistinguishable from what we typically classified as a ‘content farm.’ AI can easily serve as an accelerator for this, probably making it even harder for search engines to produce good result pages.” Being able to email you our articles will let us bypass Google altogether.

Journalism For Humans By Humans

In case this is your first time on our website: 404 Media is made up of four human tech journalists who formerly worked at VICE’s Motherboard. We have each worked in the field for more than 10 years, and we have all done work that has led to Congressional investigations, hundreds of millions of dollars of fines against companies big and small, consumer rights laws being passed, etc. Each of us has also, at multiple points in our careers, sat in meetings with SEO consultants who have suggested that we make sure our articles are optimized to be read by robots so that they will rank higher on Google. SEO is endlessly boring and we are all bored typing about it and none of us have any desire to spend time making sure we write optimized subheadlines, publish “anchor pages” that have a series of internal links, and stack keywords to ensure that an algorithm thinks our article is good. Worse, we have no interest in selecting which articles to write based on whether an SEO consultant’s spreadsheet thinks a high-ranking result on a well-searched topic is “winnable,” which is a thing that happens all the time and that many outlets use to help dictate their coverage. Playing the SEO game to some extent has become a necessary chore for journalists to get their work discovered by readers, and to stand out and compete within an online news environment that’s increasingly being dog-walked by opaque algorithms that prefer ad-flooded websites and machine-friendly keyword trash over real, human-created writing and reporting.

This brings us to social media platforms, which we will keep posting to but which are an increasingly unreliable way to reach readers because of mass fragmentation, ever-changing algorithms that favor engagement, and the fact that Elon Musk has driven Twitter into the ground. In our careers we have seen business models and editorial strategies rapidly change based on whatever Facebook’s algorithm happens to be doing on any given day, and we have seen scores of our colleagues lose their jobs because those business models were irrevocably intertwined with an ever-changing Black Box algorithm and the whims of a gigantic tech company. 

Since we began requiring email registration, we have gotten criticism from a small but very vocal group of people about how this wall is gross, manipulative, or otherwise coercive. You are welcome to feel this way if you want to, but know that newspapers are failing at a rate of two per week, that nearly every one of our friends has been laid off at some point over the last few years, and that there are very few viable journalism jobs left in this country. Many of those jobs are at the New York Times, which is a very good newspaper but is increasingly marching toward being the overwhelmingly dominant media outlet, and which can’t and will not do many of the stories that we publish. We are unwilling to sit idly by and let AI endlessly scrape and repackage our work, or let clueless media executives drive companies we have worked for into the ground, and we are trying to strike a balance where we can monetize our work in a way that is not annoying and allows us to make enough revenue to continue to do the work we do.

Advertisers Don’t Want Sites Like Jezebel to Exist
The ‘Brand Safety’ and ‘Suitability’ industries have financially crushed the news business by keeping ads away from articles that their ‘sentiment analysis’ algorithms think will make people sad or upset.

When sites like Buzzfeed, Gawker, and VICE were growing, there was the pervasive dream that it would be possible to make huge sums of money by simply getting as many pageviews as possible and advertising against them. Overall, this strategy has not worked for a variety of systemic, technological, and mismanagement reasons, as evidenced by the fact that nearly every digital media company that chased scale has lost vast sums of money. Collectively, these companies have laid off thousands of talented journalists and have watched ad revenues dry up as Facebook and Google dominate the industry and ad-tech platforms systematically drive down rates on good journalism with concepts like “brand safety.” While we would like as many people as possible to read our articles and for the experience to be as frictionless as possible, we are unwilling to make the exact same mistakes as failing media companies that have convinced both readers and their staffs that their work is worthless and their labor is expendable and interchangeable. We are unwilling to gamble our livelihoods or our company’s future on the idea that we can build a successful business model by focusing exclusively on collecting a fraction of a fraction of a fraction of a penny every time someone clicks one of our articles. 

We do have some programmatic ads on the site, which paid members do not see, but they bring in pennies. Their purpose is more to make ends meet if something happens to another of our revenue streams, and to cover essentials like our podcasting software. Having these ends-meeting ads doesn’t mean we shy away from covering the advertising industry, either. If anything, we’re more aggressive than anyone, exposing the brand safety industry for the journalism-destroying machine it is, and investigating shady ad-based surveillance platforms (with our work then directly leading to connected companies being cut off). 

We have found in the last week that when we require people to give us their email addresses, our email list grows (obviously), and more people quickly become paid subscribers. (In part, this might be because when you sign up, our site will send you a welcome email showing you all of the changes we’ve made in the world in just a few months with our journalism.) This email list makes our journalism and our company more economically viable, and allows us to keep doing the work that we’re doing, at a time when good websites are shutting down and good journalists are being laid off for reasons like “We were notified by Authentic Brands Group (ABG) that the license under which the Arena Group operates the Sports Illustrated (SI) brand and SI related properties has been officially revoked by ABG. As a result of this license revocation, we will be laying off staff that work on the SI brand,” and “the billionaire who owns this company doesn’t understand it and is tired of losing money.” 

So, again: We need your email address, because it is, at the moment, the way that we can best ensure that real people like yourself see and read our work. You can enter it below.
