Billions of pages a day.
In a blog post today, Google says it has identified 1 trillion unique URLs on the web. The real number is actually higher, they say, but some web pages have multiple URLs with exactly the same content, or URLs that are auto-generated copies of each other.
What they note way down in the fourth paragraph, however, is that they don’t actually index all of those pages, so you can’t find them on Google. Estimates of the true size of the Google index put it at a mere 40 billion pages or so.
Why don’t they index all the pages they’ve found? Some of them are spam. But indexing sites is also very expensive, and the fact that Google re-indexes many news sites, blogs, and other rapidly changing sites every 15 minutes makes it more expensive still. So they make value judgments about what to index and what to skip. And most of the web is left out.
Google also says, “But we’re proud to have the most comprehensive index of any search engine,” and, in the same post: “Even after removing those exact duplicates, we saw a trillion unique URLs, and the number of individual web pages out there is growing by several billion pages per day.”
That means there are:

1,000,000,000,000 web pages,
   40,000,000,000 web pages indexed by Google, and
       11,000,000 terrorist pages indexed in our Terror web site search engine.
So there are far more pages not indexed than indexed.
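To make those orders of magnitude concrete, here is a quick back-of-the-envelope calculation using the round figures above (an illustrative sketch only; the counts themselves are estimates):

```python
# Back-of-the-envelope comparison of the round figures quoted in this post.
total_urls     = 1_000_000_000_000   # ~1 trillion unique URLs Google says it has seen
google_indexed = 40_000_000_000      # ~40 billion pages estimated in Google's index
terror_indexed = 11_000_000          # ~11 million pages in our terror web site search engine

indexed_share = google_indexed / total_urls
print(f"Share of known URLs that Google indexes: {indexed_share:.1%}")      # 4.0%
print(f"Pages left out of Google's index: {total_urls - google_indexed:,}") # 960,000,000,000
print(f"Google's index vs. ours: {google_indexed / terror_indexed:,.0f}x")  # 3,636x
```

In other words, even by its own numbers Google indexes only about 4% of the URLs it knows about.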
Our sources say there is a bigger search engine; watch for updates.
Update: It’s here.
G.
Labels: 11 million, 40 billion, index, Orders of magnitude, Search Engines, trillion