Searcha page DIY search engine rivals Google's early storage

In 1998, Google launched its search engine, initially named Backrub, operating on a Stanford campus server with 40 GB of data and housed in a case made of Duplo blocks. As of 2025, Google’s search capabilities require multiple data centers.

Ryan Pearce has created a DIY search engine called Searcha Page, including a privacy-focused version named Seek Ninja, with the server located in his laundry room alongside his washer and dryer. Pearce states, “Right now, in the laundry room, I have more storage than Google in 2000 had. And that’s just insane to think about.”

The server was initially in Pearce’s bedroom but was moved to the utility room due to excessive heat. “The heat hasn’t been absolutely terrible, but if the door is closed for too long, it is a problem,” he says.

Searcha Page’s results are improving, with its database containing 2 billion entries, expected to reach 4 billion within six months. In comparison, Google had 24 million pages in 1998 and 400 billion by 2020, as revealed during the U.S. v. Google LLC antitrust trial.

Pearce’s engine uses large language models for keyword expansion and context understanding. “What I’m doing is actually very traditional search,” Pearce says. “It’s what Google did probably 20 years ago, except the only tweak is that I do use AI to do keyword expansion and assist with the context understanding, which is the tough thing.”

AI has been a key part of search engines, including tools like reverse image search, Google’s RankBrain, and Bing’s 90% ML-driven results in 2019. AI is now seen as a way to build and scale search engines efficiently.

Pearce utilizes “upgrade arbitrage,” purchasing old but powerful server hardware. His 32-core AMD EPYC 7532 CPU, which cost over $3,000 in 2020, now costs under $200 on eBay. “I could have gotten another chip for the same price, which would have had twice as many threads, but it would have produced too much heat,” he says.

The entire system cost $5,000, with $3,000 spent on storage. Pearce’s codebase is around 150,000 lines of code, with an estimated 500,000 lines of iterative work.

Searcha Page and Seek Ninja use SambaNova for speedy access to the Llama 3 model at a low cost. Annie Shea Weckesser, SambaNova’s CMO, notes that access to low-cost models is increasingly becoming essential for solo developers like Pearce, adding that the company is “giving developers the tools to run powerful AI models quickly and affordably, whether they’re working from a home setup or running in production.”

Pearce uses the Common Crawl repository to build his crawler. “I really appreciate them. I wish I could give them back something, but maybe when I’m bigger,” he says.

An initial attempt to use a vector database failed, resulting in “very artistic” results. Pearce now uses LLM-generated summaries of pages. Wilson Lin, another DIY search engine developer, uses a self-created vector search tool called CoreNN and relies on nine separate cloud services to keep costs low. “It’s a lot cheaper than [Amazon Web Services]—a significant amount,” Lin says. “And it gives me enough capacity to get somewhere with this project on a reasonable budget.”

Pearce originally envisioned a small-site search engine similar to Marginalia, favoring small sites over Big Tech. “Someone from China actually reached out to me because . . . I think he wanted an uncensored search engine that he wanted to feed through his LLM, like his agent’s search,” he says.

Expanding beyond English would require new datasets. Pearce plans to move the search engine to a colocation facility once traffic reaches a certain threshold and is generating modest revenue through affiliate-style advertising.

“My plan is if I get past a certain traffic amount, I am going to get hosted,” Pearce says. “It’s not going to be in that laundry room forever.”

The application deadline for Fast Company’s Most Innovative Companies Awards is Friday, October 3, at 11:59 p.m. PT.

Searcha page DIY search engine rivals Google’s early storage

Aytun Çelebi

Related Posts

Apple overhauls Siri for iOS 27 WWDC plans AI chatbot reveal

X copies Bluesky’s Starterpacks, rolls out curated account lists

Apple targets 20M AI pin units by 2027 battles OpenAI hardware

Microsoft launches Xbox app on all Arm-based Windows 11 PCs

LATEST

Meta unleashes Threads ads globally across 400 million users

Apple overhauls Siri for iOS 27 WWDC plans AI chatbot reveal

X copies Bluesky’s Starterpacks, rolls out curated account lists

Apple targets 20M AI pin units by 2027 battles OpenAI hardware

Microsoft launches Xbox app on all Arm-based Windows 11 PCs

Adobe supercharges Acrobat packs with 12 new AI editing tools

Blue Origin challenges Starlink with multi-orbit TeraWave internet system

YouTube TV launches fully customizable multiview for all live channels

Anthropic revises Claude’s Constitution with 80 new pages of AI ethics

Türkiye competition authority raids Temu offices

© 2021 TechBriefly is a Linkmedya brand.

Searcha page DIY search engine rivals Google’s early storage

Related Posts

LATEST

© 2021 TechBriefly is a Linkmedya brand.

Follow Us