Jul 11, 2025 · Posted by Jacky Liang
Every conversation about AI and LLM apps eventually lands on the same buzzwords: retrieval augmented generation (RAG), vector databases, context engineering (yet another new term!!!), prompt engineering (apparently this is out now?), and so on.
Every few months, a new term arrives to make the last one obsolete.
Strip away the VC-approved, Twitter-140-char-friendly jargon and you'll find something simpler underneath…
AI is just search.
That’s it.
Unfortunately, the tech industry got drunk on vector databases, thinking they could solve everything. Two years after the 2023 vector DB investment peak, companies are learning that similarity != relevance, and that sometimes good ol’ lexical search destroys semantic search.
Building a coding agent? A customer support chatbot? E-commerce search? Different problems need different search techniques.
There’s a reason Claude Code is nibbling at Cursor’s market share, so much so that Cursor literally hired Claude Code’s leads: it’s all in its search.
Before we get into why AI is just search, we first need to walk through the history of why AI and LLMs need search at all.
Large language models are trained on data up to a certain date, known as the cut-off date, meaning they know nothing about anything that happened after that point. Even the most recently released models like Claude Sonnet 4 have a cut-off date of March 2025.
Cut-off date aside, what if you want to ask an LLM about today’s weather? About how your company won its most recent deal (where the data lives in Slack and Salesforce, not anywhere public)? About why Timescale changed their name to TigerData?
You now need external data.
Retrieval augmented generation (RAG) with vector search became the default way to give LLMs external data they don’t implicitly have: information past their cut-off date, or private/proprietary data that wasn’t part of their training.
Vector search promised to solve this “not enough information” problem by finding information that is most semantically similar to the question.
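To make that concrete, here’s a minimal sketch of the RAG pattern: embed your documents, embed the question, pull back the most similar chunks, and stuff them into the prompt. The embed() function below is a toy stand-in for whatever embedding model you’d actually call (Voyage AI, OpenAI, etc.); nothing here is any particular vendor’s API.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real embedding model: a hashed bag-of-words
    vector. Real models capture semantics; this just makes the sketch
    runnable end to end."""
    vec = np.zeros(256)
    for word in text.lower().split():
        vec[hash(word) % 256] += 1.0
    return vec

def retrieve(question: str, docs: list[str], k: int = 3) -> list[str]:
    # Embed every document and the question, then rank by cosine similarity.
    doc_vecs = np.array([embed(d) for d in docs])
    q = embed(question)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    # You get back the "most similar" chunks; whether they're the most
    # *relevant* ones is exactly the problem discussed below.
    return [docs[i] for i in np.argsort(sims)[::-1][:k]]

def build_prompt(question: str, docs: list[str]) -> str:
    context = "\n---\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```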
Obviously, like all good hype cycles, the industry ran with it because it sounded cool and AI-native. Companies like Pinecone (I’ve worked there, btw), Weaviate, and Qdrant all rode this AI wave and raised massive rounds in late 2023 because folks believed vector search could handle any workload.
Vector search and vector databases became the go-to solution for all your external AI data problems. Embedding model providers like Voyage AI also rode this wave, because you need embedding models to translate text into its semantic mathematical representation: vectors.
So now, the entire tech industry believes AI apps = vector databases. You NEED a vector database for every AI app.
Two years later, in 2025, the climate for vector database companies looks... rough. I should know: I lived through the downturn firsthand at one of those vector database companies.
And the reason is pretty simple…
Turns out, vector databases actually aren't THE solution for everything. There are inherent limitations and downsides to using, implementing, and maintaining them that the industry is finally discovering.
Vector search gives you "most similar" stuff, but not necessarily "most relevant" stuff. This is especially painful when it comes to coding, or any use case that requires specificity.
When coding, if you're searching for getUserById, you need an exact match of the function name. getUserById is an identifier, not a concept, but vector search might return findUserByEmail, updateUserProfile, or deleteUserAccount because they're semantically similar. Close enough for conversational use cases; completely wrong for code.
In customer support, when you need the manual for part "P/N 4B0-959-855-A", you need that exact document. "P/N 4B0-959-855-A" is a reference number, not meaningful text—but vector search gives you the top 10 most semantically similar part numbers like "4B0-959-855-B" or "4B0-959-856-A", which is useless when you're trying to fix a broken machine.
For e-commerce, searching for Nike SKU "DQ4312-101" should return that exact product first. "DQ4312-101" is a product code, not descriptive content—but vector search might surface "DQ4312-102" (wrong colorway) or "DQ4311-101" (different shoe entirely) because the numbers are similar. Costly mistakes if you're shipping out the wrong sneakers, times 1,000.
When searching for "Dark Side of the Moon" on Spotify, you want the exact Pink Floyd album, not similar song names like Kelly Clarkson's "Dark Side" or "The Killing Moon" by Echo & the Bunnymen.
Vector search should not be applied to text where semantic similarity is irrelevant.
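One practical pattern falls straight out of these examples: if a query contains an identifier-like token (a part number, a SKU, a camelCase function name), route it to exact lexical matching instead of the vector index. Here’s a minimal sketch of that routing idea; the regex and the branch names are illustrative assumptions, not anyone’s production logic.

```python
import re

# Identifier-ish tokens: part numbers/SKUs like "4B0-959-855-A" or
# "DQ4312-101", and camelCase symbols like "getUserById".
# Purely illustrative; tune for your own data.
IDENTIFIER = re.compile(
    r"[A-Z0-9]+(?:[-/][A-Z0-9]+)+"    # e.g. DQ4312-101, 4B0-959-855-A
    r"|[a-z]+(?:[A-Z][a-z0-9]*)+"     # e.g. getUserById
)

def route(query: str) -> str:
    """Decide which search strategy a query should use."""
    if IDENTIFIER.search(query):
        return "lexical"  # identifiers need exact matches, not neighbors
    return "vector"       # fuzzy intent is where semantic search shines

assert route("manual for P/N 4B0-959-855-A") == "lexical"
assert route("comfortable running shoes for flat feet") == "vector"
```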
Claude Code uses pure lexical search (keyword matching) instead of vector search when hunting for relevant context (Where is this function defined? What files import this module? How is this API endpoint implemented?), and the results speak for themselves.
As someone who used Cursor for 12 months (thank you Zack Proser for intro-ing me to it), was one of the biggest Cursor simps, and swore AI coding couldn't get better, I canceled my Cursor sub this week. I did not think I would ever do this.
But… Claude Code is that much better.
With Cursor, you constantly need to manually tag files using @ symbols because it often can't find the right context on its own. You need to know your codebase exceptionally well just to help the AI understand what's relevant.
One big reason people love Claude Code is that it finds the right files automatically; you don’t need to manually tag a bunch of folders and files. In large codebases, or codebases new to you, this is an especially beautiful experience.
Claude Sonnet 4 and Opus 4 in Claude Code don’t guess.
They search in a surgically precise way using good ol’ grep, a 50-year-old utility.
For example—need to find React components using hooks?
grep -r "useState\|useEffect" --include="*.jsx" --include="*.tsx"
Need files importing a specific module?
grep -r "import.*react-router" --include="*.js"
Claude Code goes one step further in its lexical search implementation.
Claude will keep searching for matches (AKA agentic search) until it either finds what it needs OR rules out that no such dependency or function exists. Only then does it write code, knowing it (or you) hasn’t already been written elsewhere, preventing spaghetti code and redundant implementations, a very common problem with Cursor’s agent.
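Here’s a rough sketch of what that loop looks like, escalating from precise patterns to broad ones and only giving up after everything comes back empty. This is my illustration of the idea, not Claude Code’s actual implementation, and the patterns are assumptions.

```python
import subprocess

def agentic_search(symbol: str, root: str = ".") -> list[str]:
    """Hunt for `symbol` with increasingly broad grep patterns."""
    patterns = [
        rf"(def|function|const) {symbol}\b",  # likely a definition
        rf"import .*\b{symbol}\b",            # imported from elsewhere
        rf"\b{symbol}\b",                     # any reference at all
    ]
    for pattern in patterns:
        result = subprocess.run(
            ["grep", "-rnE", pattern, root],
            capture_output=True, text=True,
        )
        if result.stdout:  # found matches: reuse them, don't rewrite
            return result.stdout.splitlines()
    return []  # ruled out: safe to write a fresh implementation
```

Only when every pattern comes back empty does the agent conclude the symbol doesn’t exist yet and start writing new code.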
For coding, similarity != relevance. Similarity is fuzzy; relevance is precise and exact.
Note: agentic search for coding agents is not new (relevant reading 1, 2, 3), but I’d say Claude Code perfected it.
Cursor’s team certainly seems to agree that Claude Code is better: in July 2025 they literally hired two of Claude Code’s leads, Boris Cherny and Cat Wu.
Hmm…
I made a prediction that Cursor may ditch vector search for code entirely (they currently use turbopuffer as their vector database) and switch to pure lexical search. 450,000 impressions on LinkedIn later, I think this prediction may not be so off-base.
"Okay Jacky," you may say, "this all sort of makes sense, but how does this help me and my product at all?"
Good question!
If there are things you should take away from reading this piece, it's the following: similarity != relevance; exact identifiers (function names, part numbers, SKUs) demand exact lexical matching; and different problems need different search techniques.
And here’s the thing—most real-world AI apps actually need both lexical and vector approaches working together.
This is called hybrid search, and it’s where the industry is heading.
In the coming articles, I’ll show you how to build multi-search systems using Postgres that combine the surgical precision of lexical search, the fuzzy matching of full-text search, and the semantic understanding of vector search.
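As a teaser, here’s a minimal sketch of what hybrid search can look like in Postgres, assuming a docs table with a tsvector column (body_tsv), a pgvector embedding column, and a simple 50/50 weighted blend; every name and weight here is an illustrative assumption.

```python
import psycopg2

def hybrid_search(query: str, query_vec: list[float], k: int = 10):
    """Blend full-text rank with vector similarity into one score.
    Assumes: docs(id, title, body_tsv tsvector, embedding vector)
    and the pgvector extension. All names/weights are illustrative."""
    vec_literal = "[" + ",".join(map(str, query_vec)) + "]"
    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:
        cur.execute(
            """
            SELECT id, title,
                   0.5 * ts_rank(body_tsv, plainto_tsquery('english', %s))
                 + 0.5 * (1 - (embedding <=> %s::vector)) AS score
            FROM docs
            ORDER BY score DESC
            LIMIT %s
            """,
            (query, vec_literal, k),
        )
        return cur.fetchall()
```

A weighted sum is the simplest way to blend the two signals; reciprocal rank fusion is a popular alternative, and the tradeoffs between them are exactly what those coming articles will dig into.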