For the last couple of weeks, Perplexity, the “AI-powered answer engine,” has been criticized for ignoring long-held standards about what kind of data it can scrape to power its AI, not properly crediting sources, and lying about it. This is, by now, standard practice for the generative AI industry, but should be especially unsurprising given the origin story of Perplexity, which included creating a series of fake accounts and AI-generated research proposals to scrape Twitter, as CEO Aravind Srinivas recently explained on the Lex Fridman podcast.
According to Srinivas, all he and his cofounders Denis Yarats and Johnny Ho wanted to do was build cool products with large language models, back when it was unclear how that technology would create value. Inspired by early Perplexity investor Elad Gil, who was excited by the idea of a company that could disrupt Google Search, Srinivas and his cofounders decided to prove the value of the company by creating a tool that would allow users to search Twitter in a way they hadn’t been able to previously.
Perplexity built a tool that translated natural language into queries written in SQL, a programming language for managing and processing information in databases. This would allow users to easily surface information about Twitter by asking Perplexity’s AI questions in plain English. A couple of examples Srinivas gave on the podcast are “Who is Lex Fridman following that Elon Musk is also following,” or “what are the most recent tweets that were liked by both Elon Musk and Jeff Bezos.” This is data that is on Twitter, but that users couldn’t easily find. Now they can, thanks to a demo Perplexity called “Bird SQL.”
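To make the idea concrete, here is a rough sketch of the kind of SQL a question like “Who is Lex Fridman following that Elon Musk is also following” could be translated into. This is not Perplexity’s code or Twitter’s real schema: the `users` and `follows` tables, the `followed_by_both` helper, and the placeholder accounts are all assumptions made up for illustration, and the translation step itself (the part the language model would handle) is not shown.

```python
import sqlite3

# Hypothetical, tiny stand-in for a database of scraped follow relationships.
# Table names, columns, and rows are invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, handle TEXT UNIQUE);
CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER);
INSERT INTO users VALUES
    (1, 'lexfridman'), (2, 'elonmusk'),
    (3, 'account_a'), (4, 'account_b'), (5, 'account_c');
INSERT INTO follows VALUES (1, 3), (1, 4), (2, 4), (2, 5);
""")

# One way the plain-English question could be expressed in SQL:
# find every account that appears as a followee of both people.
QUERY = """
SELECT u.handle
FROM follows AS f1
JOIN follows AS f2 ON f1.followee_id = f2.followee_id
JOIN users AS u ON u.id = f1.followee_id
WHERE f1.follower_id = (SELECT id FROM users WHERE handle = ?)
  AND f2.follower_id = (SELECT id FROM users WHERE handle = ?)
ORDER BY u.handle
"""


def followed_by_both(conn, handle_a, handle_b):
    """Return the handles that both handle_a and handle_b follow."""
    return [row[0] for row in conn.execute(QUERY, (handle_a, handle_b))]


print(followed_by_both(conn, "lexfridman", "elonmusk"))  # ['account_b']
```

The point of a tool like this is that the user never sees or writes the query; they type the question and the model produces something like the `SELECT` statement above.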