Search Engines

[Babbage's difference engine]

A different sort of search... for number series...


The search button is the most powerful button ever invented.

Google™ is the desktop of the web, and the share of each day we spend in it searching for information is increasing steadily.

Yet it has strange limitations.

*Finding the right keywords to find the content is a skill in itself.
*Sorting through the results is a visual sifting activity unsupported by tools. What we need is for search result sets to be returned both in user-visible form and in XML format, so that generic filtering and visualization tools can be used to assist the searcher. Gigablast seems to be the first off the blocks with an "XML Search Feed" that can be re-processed.
*The page ranking scheme overlooks many information sources (such as this site) with few or no inbound links.
*The source form of the content that is searchable is HTML, and secondarily PDF. Yet most of the content we want, especially in intra-company searches, is not on the intranet, but locked up in proprietary source formats (Microsoft Word) and in proprietary databases or version control systems.

My current focus is on this last issue.
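
To make the second limitation above more concrete, here is a minimal sketch of what a generic filtering tool could do with an XML result set. The feed layout (the results, result, title, url and summary elements) and the function names parse_results and filter_results are invented for illustration; they are not taken from Gigablast's feed or from any other real search engine.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical feed: these element names are assumptions made for this
# sketch, not the format of any real search engine's XML output.
SAMPLE_FEED = """\
<results query="difference engine">
  <result>
    <title>Babbage's Difference Engine</title>
    <url>http://example.org/babbage/difference-engine.html</url>
    <summary>A mechanical calculator for tabulating polynomial series.</summary>
  </result>
  <result>
    <title>Search engine optimisation tips</title>
    <url>http://example.com/seo/tips.html</url>
    <summary>How to raise your page ranking.</summary>
  </result>
  <result>
    <title>Number series and finite differences</title>
    <url>http://example.org/maths/series.html</url>
    <summary>Using differences to extend a number series.</summary>
  </result>
</results>
"""


def parse_results(xml_text):
    """Turn the XML feed into a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": r.findtext("title", default=""),
            "url": r.findtext("url", default=""),
            "summary": r.findtext("summary", default=""),
        }
        for r in root.iter("result")
    ]


def filter_results(results, must_contain=None, host=None):
    """Keep results whose title or summary mentions a phrase and/or
    whose URL points at a given host. Either test may be omitted."""
    kept = []
    for r in results:
        text = (r["title"] + " " + r["summary"]).lower()
        if must_contain and must_contain.lower() not in text:
            continue
        if host and urlparse(r["url"]).hostname != host:
            continue
        kept.append(r)
    return kept


if __name__ == "__main__":
    # Example: show only the hits served from example.org.
    for r in filter_results(parse_results(SAMPLE_FEED), host="example.org"):
        print(r["title"], "-", r["url"])
```

Once the results are ordinary data rather than a rendered page, the same list could just as easily be sorted, grouped by site, or handed to a visualization tool; the filter shown here is only one of the simplest possibilities.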

Search Engine Sightings
* A Search Engine Smorgasbord
Abandoning the security of Google's home page, Dave dips a toe in uncharted waters...

Dave's Search Engines
* Mark I
Dave describes his first effort with the search engine that powers this site...

* Mark II
Dave's Mark II search engine to search a Corporate SourceSafe version control database...

* Mark III
Dave's Mark III search requirements specification...

Google™ is a trademark of Google Inc.


This article updated February 23, 2004.