07/16/2024

Exa: Redesigning Search for AI

Exa Co-Founders pictured left to right: Jeff Wang and Will Bryk.

We at Lightspeed are excited to announce our investment in Exa, an AI-powered search engine built and optimized for the needs of AI agents.

Lately, we’ve devoted a lot of thought to the coming agentic web: new web infrastructure that specifically supports AI agents and their capabilities. This will change the status quo, because the ideal infrastructure for AI agents will look different from the ideal infrastructure for human users.

Why do we need an agentic web? To start, AI agents will need access to up-to-date, accurate information to best complete their tasks. Large language models memorize vast amounts of data, but that data quickly becomes stale and is not reliably retrievable. Retrieval-augmented generation has emerged as a key paradigm for enabling large language models to reason over information outside of the training corpus, but most implementations today focus on private or internal information. Ideally, AI agents would be able to retrieve information across the entire public internet via API, an important tool in their toolkit. This will require new infrastructure: the agentic web.

Building the agentic web presents many challenges, both technical and economic. To start, existing web infrastructure has degraded due to competing incentives to serve the needs of advertisers rather than users. Traditional search engines have more incentive to drive ad clicks and impressions than to generate helpful answers to user requests. Further, savvy website owners have reverse-engineered the signals that the major search engines use to rank websites, leading to an entire industry of “search engine optimization,” which tailors websites to appease the search engines rather than visitors and pollutes the results with low-quality content farms.

It turns out, content isn’t always king. Modern search engines fail to distinguish between the thing you are looking for vs. content that merely discusses the thing you are looking for. This distinction might sound subtle, but it’s critical for giving AI agents the best information to reason over. For example, a search for “software engineers proficient in Go” would ideally return the personal websites or social media profiles of such engineers, rather than pages that talk about the Go language. In other words, the ideal search engine would understand the notion of an “entity” as distinct from mere content discussing a general topic area.

AI agents have different needs from humans: the optimal response to a search query coming from an AI is not necessarily the same as the optimal response to one coming from a human. AI agents don’t need to see ads; they need to see results. They don’t want just the first few pages. They want all the relevant results, to take advantage of growing context windows. None of this would be a problem if there were separate search infrastructure for AI agents. Unfortunately, both humans and AIs have been forced to consume the same search results. It’s one size fits all, the worst of all worlds.

This is where Exa comes in. Exa is an embedding-based search engine built for AI agents. Exa ingests and indexes up-to-date content from the web and exposes that data to LLM-based applications via a search API. The API is powered by Exa’s own novel “link prediction” foundation model, which is specifically tuned for understanding search queries and returning relevant links from its index. Importantly, Exa respects the notion of an entity: a query for “top open source AI models” would actually return links to Mistral, Llama, etc., whereas the same query on Google returns websites that talk about open source AI rather than the particular models themselves.

Exa returns exactly what you ask for — in this case, actual startups.
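
To make the idea of embedding-based search concrete, here is a minimal sketch in Python. It is not Exa’s model or API. It assumes documents and queries have already been mapped to vectors, and it ranks links by cosine similarity. The three-dimensional vectors, the example.com URLs, and the function names are all hypothetical, chosen only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, index, k=2):
    """Return the k URLs whose embeddings are most similar to the query."""
    scored = [(cosine(query_vec, vec), url) for url, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [url for _, url in scored[:k]]

# Hypothetical toy index: URL -> made-up embedding.
# A real system would produce these vectors with a trained model.
INDEX = {
    "https://example.com/model-a": [0.9, 0.1, 0.0],
    "https://example.com/model-b": [0.8, 0.2, 0.1],
    "https://example.com/blog-about-models": [0.2, 0.9, 0.1],
}

# Hypothetical query embedding that sits close to the "model" pages.
query = [1.0, 0.0, 0.0]
print(top_k(query, INDEX, k=2))
```

In this toy setup the two model pages rank above the page that merely discusses models, because their vectors point in nearly the same direction as the query. A production system would replace the linear scan with an approximate nearest-neighbor index to stay fast across billions of documents.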

Exa solves massive technical challenges at scale. The team has built and tuned its own web crawling infrastructure for ingesting information from across the web and incorporating it into its growing index. Exa’s crawlers ingest data across the web to identify the highest-quality parts of the internet. In addition to their custom foundation model, Exa built their own unique vector database in Rust, designed to scale to large query volumes across billions of documents at low latency.

Will Bryk and Jeff Wang, co-founders of Exa, are up to the task. They’ve been friends and collaborators since their days studying computer science at Harvard, hacking on problem sets and technical projects late into the night. Will then went on to be an early engineer at Cresta, where he developed a reputation for being a learning machine and creative problem solver. Jeff cut his teeth at Plaid, where he was known for his incredible engineering velocity and ability to find product-market fit for new projects.

With Exa, the internet is new and exciting again. Exa gives us a real sense of nostalgia for the old internet, before all the cruft that’s developed over time. AI demands a refresh. We’re excited to lead Exa’s $17M Series A round alongside Nvidia Ventures and Y Combinator.  

Exa is on an incredible mission to redesign the internet for AI. Come be part of it here – Exa is hiring across all roles, including ML research, engineering, and operations.
