Leaked Documents Show Nvidia Scraping ‘A Human Lifetime’ of Videos Per Day to Train AI

· Aug 5, 2024 at 12:06 PM

Internal emails, Slack conversations and documents obtained by 404 Media show how Nvidia created a yet-to-be-released video foundational model.

Leaked Documents Show Nvidia Scraping ‘A Human Lifetime’ of Videos Per Day to Train AI — Unsplash / Nvidia logo

🖥️

404 Media is an independent website whose work is written, reported, and owned by human journalists and whose intended audience is real people, not AI scrapers, bots, or a search algorithm. Sign up to support our work and for free access to this article. Learn why we require this here.

Nvidia scraped videos from Youtube and several other sources to compile training data for its AI products, internal Slack chats, emails, and documents obtained by 404 Media show.

When asked about legal and ethical aspects of using copyrighted content to train an AI model, Nvidia defended its practice as being “in full compliance with the letter and the spirit of copyright law.” Internal conversations at Nvidia viewed by 404 Media show when employees working on the project raised questions about potential legal issues surrounding the use of datasets compiled by academics for research purposes and YouTube videos, managers told them they had clearance to use that content from the highest levels of the company.

A former Nvidia employee, whom 404 Media granted anonymity to speak about internal Nvidia processes, said that employees were asked to scrape videos from Netflix, YouTube, and other sources to train an AI model for Nvidia’s Omniverse 3D world generator, self-driving car systems, and “digital human” products. The project, internally named Cosmos (but different from the company’s existing Cosmos deep learning product), has not yet been released to the public.

This post is for paid members only

Become a paid member for unlimited ad-free access to articles, bonus podcast content, and more.

Sign up for free access to this post

Free members get access to posts like this one along with an email round-up of our week's stories.

Account

Navigation

Follow us

Leaked Documents Show Nvidia Scraping ‘A Human Lifetime’ of Videos Per Day to Train AI

This post is for paid members only

Sign up for free access to this post

Account

Navigation

Follow us

Leaked Documents Show Nvidia Scraping ‘A Human Lifetime’ of Videos Per Day to Train AI

Subscribe

This post is for paid members only

Sign up for free access to this post