Roomba Floor Vac Robot US$199.95

Would you like someone else to sweep the kitchen floor? Let the Roomba™ Floor Vac handle that! Like a pool cleaner, this innovative robot moves automatically to vacuum and sweep up dirt, dust, spilled cereal and food crumbs from short-pile carpet, hardwood floors and kitchen tile.
Just press a button to tell Roomba what size room to clean. Once set, the cordless, rechargeable Roomba goes to work — navigating around obstacles, protected by its non-marring bumper and guided by infrared sensors. A side brush thoroughly cleans next to walls and hard-to-reach places.

An included device creates an invisible infrared "wall" to keep Roomba from crossing open areas as wide as 20 feet! Extra infrared "virtual wall" unit is available (IR107). One charge of the included NiMH battery gives you enough power to clean up to three medium-size rooms. Extra battery available (IR106). Measures 13 3/4" diameter x 3 3/4" high and weighs 7 1/2 lbs. 90-day warranty.

All Robots

Use this generic robot to set rules for all web bots.
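
Rules for this "generic robot" live in a site's /robots.txt file, where the wildcard user-agent applies to every bot at once. A minimal sketch (the paths and the bot name are placeholders):

```
# /robots.txt -- the wildcard record applies to all web bots
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/

# A named record overrides the wildcard for that one robot
User-agent: ExampleBot
Disallow: /
```

Many of the robots listed below can be addressed individually this way, using the name they send in their User-agent field.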

Ahoy! The Homepage Finder

ahoy

maintenance

Ahoy! is an ongoing research project at the University of Washington for finding personal Homepages.

www.cs.washington.edu/research/ahoy/doc/home.html

Alkaline

AlkalineBOT

indexing

Unix/NT internet/intranet search engine.

http://www.vestris.com/alkaline

ananzi

EMC Spider

indexing

Arachnophilia

Arachnophilia

The purpose of this run (undertaken by HaL Software) was to collect approximately 10k HTML documents for testing automatic abstract generation.

ArchitextSpider

ArchitextSpider

indexing, statistics

Its purpose is to generate a Resource Discovery database, and to generate statistics. The ArchitextSpider collects information for the Excite and WebCrawler search engines.

Architext Software

ASpider (Associative Spider)

ASpider/0.09

indexing

ASpider is a CGI script that searches the web for keywords given by the user through a form.

AURESYS

AURESYS/1.0

indexing,statistics

AURESYS is used to build a personal database for a user searching for information. The database is structured so it can be analysed. AURESYS can find new servers by incrementing IP addresses, and it generates statistics...

http://crrm.univ-mrs.fr

BackRub

BackRub/*.*

indexing, statistics

Big Brother

Big Brother

maintenance

Macintosh-hosted link validation tool.

Francois Pottier

BlackWidow

BlackWidow

indexing, statistics

Started as a research project and now is used to find links for a random link generator. Also is used to research the growth of specific sites.

bright.net caching robot

Die Blinde Kuh

caching

BSpider

bspider

indexing

BSpider crawls within the Japanese domain for indexing.

not yet

CACTVS Chemistry Spider

CACTVS Chemistry Spider

indexing.

Locates chemical structures in Chemical MIME formats on WWW and FTP servers and downloads them into database searchable with structure queries (substructure, fullstructure, formula, properties etc.).

Checkbot

Checkbot/x.xx LWP/5.x

maintenance

Checkbot checks links in a given set of pages on one or more servers. It reports links which returned an error code.
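
A link checker of this kind boils down to two steps: pull the anchors out of a page, then report every target whose HTTP status is an error code. A minimal Python sketch in that spirit — not Checkbot's actual code, and the function names are invented here:

```python
# Checkbot-style link validation sketch (hypothetical, simplified).
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return all anchor targets found in an HTML page."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links

def broken_links(statuses):
    """Given {url: http_status}, return the links that returned an error code."""
    return sorted(url for url, status in statuses.items() if status >= 400)
```

A real checker would fetch each extracted link (e.g. with urllib), collect the status codes, and feed them to `broken_links` to produce the error report.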

CMC/0.01

CMC/0.01

maintenance

This CMC/0.01 robot collects the information of the page that was registered to the music specialty searching service.

http://www2.next.ne.jp/cgi-bin/music/help.cgi?phase=robot

Combine System

combine

indexing

An open, distributed, and efficient harvester.

http://www.ub2.lu.se/~tsao/combine.ps

ComputingSite Robi/1.0

robi

indexing,maintenance

Intelligent agent used to build the ComputingSite Search Directory.

Tecor Communications S.L.

http://www.computingsite.com/robi/

Conceptbot

conceptbot

indexing

The Conceptbot spider is used to research concept-based search indexing techniques. It uses a breadth-first search to spread out the number of hits on a single site over time. The spider runs at irregular intervals and is still under construction.
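
The breadth-first, load-spreading order Conceptbot describes can be sketched as one queue per host, served round-robin, so no single site absorbs a burst of consecutive hits. A Python sketch under that assumption (the link graph and names are illustrative, not Conceptbot's code):

```python
# Breadth-first frontier that spreads requests across hosts by
# serving per-host queues in round-robin order.
from collections import deque, OrderedDict
from urllib.parse import urlparse

def crawl_order(seed_urls, link_graph):
    """Return URLs in a breadth-first order that alternates between hosts.

    link_graph maps a URL to the list of URLs it links to.
    """
    queues = OrderedDict()   # host -> deque of pending URLs
    seen = set()

    def enqueue(url):
        if url in seen:
            return
        seen.add(url)
        host = urlparse(url).netloc
        queues.setdefault(host, deque()).append(url)

    for url in seed_urls:
        enqueue(url)

    order = []
    while queues:
        # Take one URL from each host's queue before revisiting any host.
        for host in list(queues):
            url = queues[host].popleft()
            order.append(url)
            for linked in link_graph.get(url, []):
                enqueue(linked)
            if not queues[host]:
                del queues[host]
    return order
```

A real spider would fetch each URL as it is dequeued and extract links as it goes; here the graph is given up front so the ordering logic stands alone.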

http://www.aptltd.com/~sifry/conceptbot

CS-HKUST WISE: WWW Index and Search Engine

CS-HKUST-IndexServer/1.0

Its purpose is to generate a Resource Discovery database, and validate HTML. Part of an on-going research project on Internet Resource Discovery at Department of Computer Science, Hong Kong University of Science and Technology (CS-HKUST).

CyberSpyder Link Test

cyberspyder

link validation, some html validation

CyberSpyder Link Test is intended to be used as a site management tool to validate that HTTP links on a page are functional and to produce various analysis reports to assist in managing a site.

http://www.cyberspyder.com/cslnkts1.html

DeWeb(c) Katalog/Index

Deweb/1.01

indexing, mirroring, statistics

Its purpose is to generate a Resource Discovery database, perform mirroring, and generate statistics. Uses a combination of an Informix(tm) database and WN 1.11 server software for indexing/resource discovery, full-text search, and text excerpts.

Die Blinde Kuh

Die Blinde Kuh

indexing

The robot is used for indexing and checking the registered URLs in the German-language search engine for kids. It is a non-commercial one-woman project of Birgit Bachmann, living in Hamburg, Germany.

http://www.blinde-kuh.de/robot.html (German language)

DienstSpider

dienstspider/1.0

indexing

Indexing and searching the NCSTRL(Networked Computer Science Technical Report Library) and ERCIM Collection.

Antonis Sidiropoulos

Digimarc MarcSpider

Digimarc WebReader/1.2

maintenance

Examines image files for watermarks. In order not to waste internet bandwidth with yet another crawler, we have contracted with one of the major crawlers/search engines to provide us with a list of specific URLs of interest to us. If a URL is to an image, we may read the image, but we do not crawl to any other URLs. If a URL is to a page of interest (usually due to CGI), then we access the page to get the image URLs from it, but we do not crawl to any other pages.

Digimarc Corporation

http://www.digimarc.com/prod_fam.html

Digimarc Marcspider/CGI

Digimarc CGIReader/1.0

maintenance

Similar to Digimarc MarcSpider, MarcSpider/CGI examines image files for watermarks but is more focused on CGI URLs. In order not to waste internet bandwidth with yet another crawler, we have contracted with one of the major crawlers/search engines to provide us with a list of specific CGI URLs of interest to us. If a URL is to a page of interest (via CGI), then we access the page to get the image URLs from it, but we do not crawl to any other pages.

Digimarc Corporation

http://www.digimarc.com/prod_fam.html

DNAbot

DNAbot/1.0

indexing

A search robot written in 100% Java, with its own built-in database engine and web server. Currently in Japanese.

http://xx.dnainc.co.jp/dnabot/

DownLoad Express

downloadexpress

graphic download

Automatically downloads graphics from the web.

DownLoad Express Inc

http://www.jacksonville.net/~dlxpress

DragonBot

DragonBot

indexing

Collects web pages related to East Asia.

EIT Link Verifier Robot

EIT-Link-Verifier-Robot/0.2

maintenance

Combination of an HTML form and a CGI script that verifies links from a given starting point (with some controls to prevent it going off-site or limitless).

Emacs-w3 Search Engine

Emacs-w3/v[0-9\.]+

indexing

Its purpose is to generate a Resource Discovery database. This code has not been looked at in a while, but will be spruced up for the Emacs-w3 2.2.0 release sometime this month. It will honor the /robots.txt file at that time.

Esther

esther

indexing

This crawler is used to build the search database at http://search.falconsoft.com/.

http://search.falconsoft.com/

Felix IDE

FELIX IDE

indexing, statistics

Felix IDE is a retail personal search spider sold by The Pentone Group, Inc.

The Pentone Group, Inc.

http://www.pentone.com

FetchRover

ESI

maintenance, statistics

FetchRover fetches Web Pages. It is an automated page-fetching engine. FetchRover can be used stand-alone or as the front-end to a full-featured Spider. Its database can use any ODBC compliant database server, including Microsoft Access, Oracle, Sybase SQL Server, FoxPro, etc.

http://www.engsoftware.com/spiders/

fido

fido

indexing

Fido is used to gather documents for the search engine provided in the PlanetSearch service, which is operated by the Philips Multimedia Center. The robot runs on an ongoing basis.

http://www.planetsearch.com/info/fido.html

Fish search

Fish-Search-Robot

indexing

Its purpose is to discover resources on the fly. A version exists that is integrated into the Tübingen Mosaic 2.4.2 browser (also written in C).

Fouineur

fouineur

indexing, statistics

This robot automatically builds a database that is used by our own search engine. It auto-detects the language (French, English, and Spanish) used in each HTML page.

http://fouineur.9bit.qc.ca/informations.html

Freecrawl

Freecrawl

indexing

The Freecrawl robot is used to build a database for the EuroSeek service.

Jesper Ekhall

FunnelWeb

FunnelWeb-1.0

indexing, statistics

Its purpose is to generate a Resource Discovery database, and generate statistics. Localised South Pacific Discovery and Search Engine, plus distributed operation under development.

GCreep

gcreep

indexing

Indexing robot to learn SQL.

Instrumentpolen AB

http://www.instrumentpolen.se/gcreep/index.html

GetBot

???

maintenance

GetBot's purpose is to index all the sites it can find that contain Shockwave movies. It is the first bot or spider written in Shockwave. The bot was originally written at Macromedia on a hungover Sunday as a proof of concept. - Alex Zavatone 3/29/96.

GetterroboPlus Puu

Getterrobo-Plus

The Puu robot is used to gather data from sites registered in the search engine "straight FLASH!!", to build a page announcing the update status of the registered sites. The robot runs every day.

marunaka

http://marunaka.homing.net/straight/getter/

GetURL

GetURL.rexx v1.05

maintenance, mirroring

Its purpose is to validate links, perform mirroring, and copy document trees. Designed as a tool for retrieving web pages in batch mode without the encumbrance of a browser. Can be used to describe a set of pages to fetch, and to maintain an archive or mirror. Is not run by a central site and accessed by clients - is run by the end user or archive maintainer.

Golem

golem

maintenance

Golem generates status reports on collections of URLs supplied by clients. Designed to assist with editorial updates of Web-related sites or products.

Geoff Duncan

http://www.quibble.com/golem/

Google.Com

Googlebot

indexing, statistics

GoogleBot is the web crawler and indexing agent for the new Google.Com search engine. Google.Com has some nice search features that give it much potential in the online search market. Google is one of the search engines accessed from Netscape's Netcenter portal.

Google, Inc.

http://google.com/

Gromit

Gromit

indexing

Gromit is a targeted Web spider that indexes legal sites contained in the AustLII legal links database.

http://www2.austlii.edu.au/~dan/gromit/

Hämähäkki

Hämähäkki

indexing

Its purpose is to generate a Resource Discovery database from the Finnish (top-level domain .fi) www servers. The resulting database is used by the search engine.

http://www.fi/www/spider.html

HamBot

hambot

indexing

Two HamBot robots are used (stand-alone and browser-based) to aid in building the database for HamRad Search - The Search Engine for Search Engines. The robots are run intermittently and perform nearly identical functions.

http://www.hamrad.com/

havIndex

havIndex

indexing

havIndex allows individuals to build a searchable word index of user-specified lists of URLs. havIndex does not crawl; rather, it requires one or more user-supplied lists of URLs to be indexed. havIndex does (optionally) save URLs parsed from indexed pages.

hav.Software and Horace A. (Kicker) Vallas

http://www.hav.com/

HI (HTML Index) Search

AITCSRobot/1.1

indexing

Its purpose is to generate a Resource Discovery database. This robot traverses the net and creates a searchable database of Web pages. It stores the title string of the HTML document and the absolute URL. A search engine provides boolean AND and OR query models, with or without filtering against a stop list of words. A feature is provided for Web page owners to add their URL to the searchable database.

HKU WWW Octopus

HKU WWW Robot,

indexing

HKU Octopus is an ongoing project for resource discovery in the Hong Kong and China WWW domain. It is a research project conducted by three undergraduates at the University of Hong Kong.

ht://Dig

htdig

indexing

http://www.htdig.org/howitworks.html

HTMLgobble

HTMLgobble v2.2

mirror

A mirroring robot. Configured to stay within a directory, sleeps between requests, and the next version will use HEAD to check if the entire document needs to be retrieved.

IBM_Planetwide

IBM_Planetwide,

indexing, maintenance, mirroring

Restricted to IBM owned or related domains.

Imagelock

Mozilla 3.01 PBWF (Win95)

maintenance

IncyWincy

IncyWincy/1.0b1

Various Research projects at the University of Sunderland.

Informant

Informant

indexing

The Informant robot continually checks the Web pages that are relevant to user queries. Users are notified of any new or updated pages. The robot runs daily, but the number of hits per site per day should be quite small, and these hits should be randomly distributed over several hours. Since the robot does not actually follow links (aside from those returned from the major search engines such as Lycos), it does not fall victim to the common looping problems. The robot will support the Robot Exclusion Standard by early December, 1996.

http://informant.dartmouth.edu/about.html

InfoSeek Robot 1.0

InfoSeek Robot 1.0

indexing

Its purpose is to generate a Resource Discovery database. Collects WWW pages for both InfoSeek's free WWW search and commercial search. Uses a unique proprietary algorithm to identify the most popular and interesting WWW pages. Very fast, but never has more than one request per site outstanding at any given time. Has been refined for more than a year.

Infoseek Sidewinder

Infoseek Sidewinder

indexing

Mike Agostino

InfoSpiders

InfoSpiders

search

Application of artificial life algorithm to adaptive distributed information retrieval.

Ingrid

INGRID/0.1

Indexing

Ilse c.v.

Inktomi Slurp

slurp

indexing, statistics

Indexing documents for the HotBot search engine (www.hotbot.com), collecting Web statistics.

Inktomi Corporation

http://www.inktomi.com/slurp.html

Inspector Web

inspectorwww

maintenance: link validation, html validation, image size

Provides inspection reports which give advice to WWW site owners on missing links, image resize problems, syntax errors, etc.

http://www.greenpac.com/inspector/ourrobot.html

IntelliAgent

'IAGENT/1.0'

indexing

IntelliAgent is still in development. Indeed, it is very far from completion. I'm planning to limit the depth at which it will probe, so hopefully IAgent won't cause anyone much of a problem. At the end of its completion, I hope to publish both the raw data and original source code.

Iron33

Iron33

indexing, statistics

The robot "Iron33" is used to build the database for the WWW search engine "Verno".

Takashi Watanabe

http://verno.ueda.info.waseda.ac.jp/iron33/history.html

Israeli-search

IsraeliSearch/1.0

indexing.

JCrawler

jcrawler

indexing

JCrawler is currently used to build the Vietnam topic specific WWW index for VietGATE.

Jeeves

jeeves

indexing, maintenance, statistics

Jeeves is basically a web-mirroring robot built as a final-year degree project. It will have many nice features and is already web-friendly. Still in development.

Jobot

Jobot/0.1alpha libwww-perl/4.0

standalone

Its purpose is to generate a Resource Discovery database. Intended to seek out sites of potential "career interest". Hence - Job Robot.

JoeBot

JoeBot/x.x,

JumpStation

jumpstation

indexing

Jonathon Fletcher

Katipo

Katipo/1.0

maintenance

Watches all the pages you have previously visited and tells you when they have changed.

http://www.vuw.ac.nz/~newbery/Katipo/Katipo-doc.html

KDD-Explorer

KDD-Explorer

indexing

KDD-Explorer is used for indexing valuable documents which will be retrieved via an experimental cross-language search engine, CLINKS.

Kazunori Matsumoto

not available

Kilroy

*

indexing,statistics

Used to collect data for several projects. Runs constantly and visits sites no faster than once every 90 seconds.

OCLC

http://purl.org/kilroy

KIT-Fireball

KIT-Fireball

indexing

The Fireball robots gather web documents in German language for the database of the Fireball search service.

Gruner + Jahr Electronic Media Service GmbH

http://www.fireball.de/technik.html (in German)

KO_Yappo_Robot

ko_yappo_robot

indexing

The KO_Yappo_Robot robot is used to build the database for the Yappo search service by k,osawa (part of AOL). The robot runs on random days and visits sites in a random order.

http://yappo.com/

LabelGrabber

label-grabber

Grabs PICS labels from web pages, submits them to a label bureau

The label grabber searches for PICS labels and submits them to a label bureau.

http://www.w3.org/PICS/refcode/LabelGrabber/index.htm

LinkWalker

linkwalker

maintenance, statistics

LinkWalker generates a database of links. We send reports of bad ones to webmasters.

http://www.seventwentyfour.com/tech.html

Lockon

Lockon

indexing

This robot gathers only HTML documents.

Seiji Sasazuka & Takahiro Ohmori

logo.gif Crawler

logo_gif_crawler

indexing

Meta-indexing engine for corporate logo graphics. The robot runs at irregular intervals and will only pull a start page and its associated /.*logo\.gif/i (if any). It will be terminated once a statistically significant number of samples has been collected.

Lycos

Lycos/x.x

indexing

This is a research program in providing information retrieval and discovery in the WWW, using a finite memory model of the web to guide intelligent, directed searches for specific information needs.

Dr. Michael L. Mauldin

Magpie

Magpie/1.0

indexing, statistics

Used to obtain information from a specified list of web pages for local indexing. Runs every two hours, and visits only a small number of sites.

MediaFox

mediafox

indexing and maintenance

The robot is used to index meta information of a specified set of documents and update a database accordingly.

Lars Eilebrecht

none

MerzScope

MerzScope

WebMapping

The robot is part of a Web-mapping package called MerzScope, used mainly by consultants and webmasters to create and publish maps of the World Wide Web.

(Client based robot)

http://www.merzcom.com

MOMspider

MOMspider/1.00 libwww-perl/0.40

maintenance, statistics

To validate links, and generate statistics. It's usually run from anywhere.

Monster

Monster/vX.X.X -$TYPE ($OSTYPE)

maintenance, mirroring

The Monster has two parts: a Web searcher and a Web analyzer. The searcher is intended to build the list of WWW sites in a desired domain (for example, it can build a list of all WWW sites in the mit.edu, com, org, etc. domains). In the User-agent field, $TYPE is set to 'Mapper' for the Web searcher and 'StAlone' for the Web analyzer.

Motor

Motor

indexing

The Motor robot is used to build the database for the www.webindex.de search service operated by CyberCon. The robot is under development; it runs at random intervals and visits sites in a priority-driven order (.de/.ch/.at first, root and robots.txt first).

Muscat Ferret

MuscatFerret

indexing

Used to build the database for the EuroFerret.

Olly Betts

Mwd.Search

MwdSearch

indexing

Robot for indexing Finnish (top-level domain .fi) web pages for a search engine called Fifi. Visits sites in random order.

(none)

NEC-MeshExplorer

NEC-MeshExplorer

indexing

The NEC-MeshExplorer robot is used to build the database for the NETPLAZA search service operated by NEC Corporation. The robot searches URLs around sites in Japan (JP domain). The robot runs every day, and visits sites in a random order.

web search service maintenance group

http://netplaza.biglobe.or.jp/keyword.html

Nederland.zoek

Nederland.zoek

indexing

This robot indexes all .nl sites for the search-engine of Nederland.net.

System Operator Nederland.net

NetCarta WebMap Engine

NetCarta CyberPilot Pro

indexing, maintenance, mirroring, statistics

The NetCarta WebMap Engine is a general purpose, commercial spider. Packaged with a full GUI in the CyberPilot Pro product, it acts as a personal spider that works with a browser to facilitate context-based navigation. The WebMapper product uses the robot to manage a site (site copy, site diff, and extensive link management facilities). All versions can create publishable NetCarta WebMaps, which capture the crawled information. If the robot sees a published map, it will return the published map rather than continuing its crawl. Since this is a personal spider, it will be launched from multiple domains. This robot tends to focus on a particular site. No instance of the robot should have more than one outstanding request out to any given site at a time. The User-agent field contains a coded ID identifying the instance of the spider; specific users can be blocked via robots.txt using this ID.

NetCarta WebMap Engine

NetMechanic

WebMechanic

Link and HTML validation

NetMechanic is a link validation and HTML validation robot run using a web page interface.

Tom Dahm

http://www.netmechanic.com/faq.html

NetScoop

NetScoop

indexing

The NetScoop robot is used to build the database for the NetScoop search engine.

http://www.netmechanic.com/faq.html

NHSE Web Forager

NHSEWalker/3.0

indexing

To generate a Resource Discovery database.

Nomad

Nomad-V2.x

indexing

Richard Sonnen

Northern Light

gulliver

indexing

Gulliver is a robot to be used to collect web pages for indexing and subsequent searching of the index.

Mike Mulligan

http://www.nlsearch.com/

nzexplorer

explorersearch

indexing, statistics

This crawler is used to build a search database.

Occam

Occam

indexing

The robot takes high-level queries, breaks them down into multiple web requests, and answers them by combining disparate data gathered in one minute from numerous web sites, or from the robots cache. Currently the only user is me.

Open Text Index Robot

Open Text Site Crawler

indexing

This robot is run by Open Text Corporation to produce the data for the Open Text Index.

http://index.opentext.net/OTI_Robot.html

Orb Search

Orbsearch/1.0

indexing

Orbsearch builds the database for Orb Search Engine. It runs when requested.

http://orbsearch.home.ml.org

Pack Rat

packrat or *

both maintenance and mirroring

Used for local maintenance and for gathering web pages so that local statistical info can be used in artificial intelligence programs. Funded by NEMOnline.

Patric

patric

statistics

(contained at http://www.nwnet.net/technical/ITR/index.html ).

toney@nwnet.net

http://www.nwnet.net/technical/ITR/index.html

PerlCrawler 1.0

perlcrawler

indexing

The PerlCrawler robot is designed to index and build a database of pages relating to the Perl programming language.

Matt McKenzie

http://www.xav.com/scripts/xavatoria/index.html

PGP Key Agent

PGP-KA/1.2

indexing

This program searches for the PGP public key of the specified user.

Phantom

Duppies

indexing

Designed to allow webmasters to provide a searchable index of their own site as well as to other sites, perhaps with similar content.

Pioneer

Pioneer

indexing, statistics

Pioneer is part of an undergraduate research project.

PlumtreeWebAccessor

PlumtreeWebAccessor

indexing for the Plumtree Server

The Plumtree Web Accessor is a component that customers can add to the Plumtree Server to index documents on the World Wide Web.

http://www.plumtree.com/

Popular Iconoclast

gestaltIconoclast/1.0 libwww-FM/2.17

statistics

This guy likes statistics.

http://gestalt.sewanee.edu/ic/info.html

Resume Robot

Resume Robot

indexing.

James Stakelum

Road Runner: The ImageScape Robot

roadrunner

indexing

Create Image/Text index for WWW.

LIM Group

RoadHouse Crawling System

RHCS

indexing.

Robot used to build the database for the RoadHouse search service project operated by Perceval.

Robbie the Robot

Robbie

indexing

Used to define document collections for the DISCO system. Robbie is still under development and runs several times a day, but usually only for ten minutes or so. Sites are visited in the order in which references are found, but no host is visited more than once in any two-minute period.

Robot Francoroute

Robot du CRIM 1.0a

indexing, mirroring, statistics

Part of the RISQ's Francoroute project for researching francophone resources. Uses the Accept-Language tag and reduces demand accordingly.

Marc-Antoine Parent

Roverbot

Roverbot

indexing

Targeted email gatherer utilizing user-defined seed points and interacting with both the webserver and MX servers of remote sites.

GlobalMedia Design (Andrew Cowan & Brian

SafetyNet Robot

SafetyNet Robot 0.1,

indexing.

Finds URLs for K-12 content management.

Scooter

Scooter

indexing

Scooter is AltaVista's prime index agent.

AltaVista

http://www.altavista.com/av/content/addurl.htm

Search.Aus-AU.COM

Search-AU

indexing: gather content for an indexing service

Search-AU is a development tool I have built to investigate the power of a search engine and web crawler, giving me access to a database of web content (HTML / URLs) and addresses etc., from which I hope to build more accurate stats about the .au zone's web content.

http://Search.Aus-AU.COM/

Senrigan

Senrigan

indexing

This robot now gets HTML documents only from the .jp domain.

SG-Scout

SG-Scout

indexing

Does a "server-oriented" breadth-first search in a round-robin fashion, with multiple processes.

Shai'Hulud

Shai'Hulud

mirroring

Used to build mirrors for internal use.

Dimitri Khaoustov

Simmany Robot Ver1.0

SimBot

indexing, maintenance, statistics

The Simmany Robot is used to build the Map (DB) for the Simmany service operated by HNC (Hangul & Computer Co., Ltd.). The robot runs weekly, and visits sites that have useful Korean information in a defined order.

http://simmany.hnc.net/irman1.html

SiteTech-Rover

SiteTech-Rover

indexing

Originated as part of a suite of Internet Products to organize, search & navigate Intranet sites and to validate links in HTML documents.

Smart Spider

ESI

indexing

Classifies sites using a Knowledge Base. The robot collects web pages, which are then parsed and fed to the Knowledge Base. The Knowledge Base classifies the sites into any of hundreds of categories.

http://www.engsoftware.com/robots.htm

Snooper

snooper

Solbot

solbot

indexing

Builds data for the Kvasir search service. Only searches.

Spanner

Spanner

indexing,maintenance

Used to index/check links on an intranet.

http://www.kluge.net/NES/spanner/

SpiderBot 1.0 - P.F.C. "Recuperador páginas Web" de Ignacio Cruzado Nuño (U.B.U.)

yes

indexing

Recovers Web Pages and saves them on your hard disk. Then it reindexes them.

Ignacio Cruzado Nuño: Student of "Computer Engineering" at Burgos University (Spain)

http://pisuerga.inf.ubu.es/lsi/Docencia/TFC/ITIG/icruzadn/details.htm

Suke

suke

indexing

This robot visits mainly sites in Japan.

http://www.kuro.net/robot/index.ja.html

TACH Black Widow

tach_bw

maintenance: link validation

Exhaustively recurses a single site to check for broken links.

Michael Jennings

http://theautochannel.com/~mjenn/bw-syntax.html

Tarantula

yes

indexing

Tarantula gathers information for the German search engine Nathan. Robot history: started February 1997.

http://www.nathan.de/

tarspider

tarspider

mirroring

Olaf Schreck

Tcl W3 Robot

dlw3robot/x.y (in TclX by http://hplyot.obspm.fr/~dl/)

maintenance, statistics

Its purpose is to validate links, and generate statistics.

Laurent Demailly

TechBOT

TechBOT

statistics, maintenance

TechBOT is constantly upgraded. Currently it is used for link validation, load time measurement, HTML validation, and much more.

TechAID Internet Services

http://www.echaid.net/TechBOT/

Templeton

templeton

mirroring, mapping, automating web applications

Templeton is a very configurable robot for mirroring, mapping, and automating applications on retrieved documents.

http://www.bmtmicro.com/catalog/tton/

The Jubii Indexing Robot

JubiiRobot/version#

indexing, maintenance

Its purpose is to generate a Resource Discovery database and validate links. Used for indexing the .dk top-level domain as well as other Danish sites for a Danish web database, and for link validation.

The NorthStar Robot

NorthStar

indexing

Recent runs (26 April 94) will concentrate on textual analysis of the Web versus GopherSpace (from the Veronica data) as well as indexing.

The NWI Robot

VWbot_K

discovery,statistics

A resource discovery robot, used primarily for the indexing of the Scandinavian Web.

Sigfrid Lundberg, Lund university, Sweden

http://vancouver-webpages.com/VWbot/aboutK.shtml

The Peregrinator

Peregrinator-Mathematics/0.7

This robot is being used to generate an index of documents on Web sites connected with mathematics and statistics. It ignores off-site links, so does not stray from a list of servers specified initially.

Jim Richardson

The Web Moose

WebMoose

statistics, maintenance

This robot collects statistics and verifies links. It builds a graph of its visit path.

http://www.nwlink.com/~mikeblas/webmoose/

the World Wide Web Wanderer

WWWWanderer v3.0

statistics

Run initially in June 1993, its aim is to measure the growth in the web.

Matthew Gray

TITAN

TITAN/0.1

indexing

Its purpose is to generate a Resource Discovery database, and copy document trees. Our primary goal is to develop an advanced method for indexing the WWW documents. Uses libwww-perl.

Yoshihiko HAYASHI

http://isserv.tas.ntt.jp/chisho/titan-help/eng/titan-help-e.html

TitIn

titin

indexing, statistics

TitIn is used to index all titles of Web servers in the .hr domain.

http://www.foi.hr/~dpavlin/titin/tehnical.htm

UCSD Crawl

UCSD-Crawler

indexing, statistics

Should hit ONLY within UC San Diego - trying to count servers here.

URL Check

urlck

maintenance

The robot is used to manage, maintain, and modify web sites. It builds a database detailing the site, builds HTML reports describing the site, and can be used to up-load pages to the site or to modify existing pages and URLs within the site. It can also be used to mirror whole or partial sites. It supports HTTP, File, FTP, and Mailto schemes.

http://www.cutternet.com/products/urlck.html

URL Spider Pro

URL Spider Pro/1.5

indexing: gather content for an indexing service

URL Spider Pro builds Targeted Search Engines.

Infostreak Software

http://www.infostreak.com/us.htm

Valkyrie

Valkyrie libwww-perl

indexing

Used to collect resources from Japanese Web sites for ODIN search engine.

http://kichijiro.c.u-tokyo.ac.jp/odin/robot.html

Victoria

Victoria

maintenance

Victoria is part of a groupware product.

Adrian Howard

vision-search

vision-search/3.0

indexing.

Intended to be an index of computer vision pages, containing all pages within n links (for some small n) of the Vision Home Page.

Voyager

Voyager

indexing, maintenance

This robot is used to build the database for the Lisa Search service. The robot is launched manually and visits sites in a random order.

Voyager Staff

VWbot

VWbot_K

indexing

Used to index BC sites for the searchBC database. Runs daily.

http://vancouver-webpages.com/VWbot/aboutK.shtml

W3M2

W3M2/x.xxx

indexing, maintenance, statistics

To generate a Resource Discovery database, validate links, validate HTML, and generate statistics.

w3mir

w3mir

mirroring.

W3mir uses the If-Modified-Since HTTP header and recurses only into the directory and subdirectories of its start document. Known to work on U*ixes and Windows NT.
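
A minimal sketch of the conditional GET that the If-Modified-Since header enables, using Python's standard library; the URL and timestamp are illustrative placeholders, not part of w3mir:

```python
import email.utils
import urllib.request

# Build a request that lets the server answer "304 Not Modified"
# instead of resending an unchanged page. The URL and timestamp
# below are illustrative placeholders.
def conditional_request(url, last_fetch_epoch):
    req = urllib.request.Request(url)
    # HTTP dates use RFC 1123 format, e.g. "Sun, 06 Nov 1994 08:49:37 GMT"
    req.add_header("If-Modified-Since",
                   email.utils.formatdate(last_fetch_epoch, usegmt=True))
    return req

req = conditional_request("http://example.com/index.html", 784111777)
```

A mirror would send this request and re-download only when the server replies 200 rather than 304.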

Web Core / Roots

root/0.1

indexing, maintenance

Parallel robot developed at Minho University in Portugal to catalog relations among URLs and to support a special navigation aid.

Jorge Portugal Andrade

WebBandit Web Spider

WebBandit/1.0

Resource Gathering / Server Benchmarking

Multithreaded, hyperlink-following, resource finding webspider.

http://pw2.netcom.com/~wooger/

WebCatcher

webcatcher

indexing

WebCatcher gathers web pages that Japanese college students want to visit.

 

WebCopy

WebCopy/(version)

mirroring

Its purpose is to perform mirroring. WebCopy can retrieve files recursively using the HTTP protocol. It can be used as a delayed browser or as a mirroring tool. It cannot jump from one site to another.
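
The same-site restriction can be sketched as a simple host comparison; the URLs are invented placeholders, not WebCopy's actual logic:

```python
from urllib.parse import urlparse

# Follow a link only when its host matches the start document's host.
def same_site(start_url, link):
    return urlparse(start_url).netloc == urlparse(link).netloc
```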

 

webfetcher

WebFetcher/0.8

mirroring

Don't wait! OnTV's WebFetcher mirrors whole sites down to your hard disk on a TV-like schedule. Catch w3 documentation. Catch discovery.com without waiting! A fully operational web robot for NT/95 today, most UNIX soon, Mac tomorrow.

weblayers

weblayers/0.0

maintenance

Its purpose is to validate, cache, and maintain links. It is designed to maintain the cache generated by the emacs w3 mode (N*tscape replacement) and to support annotated documents (keeping them in sync with the original document via diff/patch).

 

WebLinker

WebLinker/0.0 libwww-perl/0.1

maintenance

It traverses a section of the web, doing URN->URL conversion. It will be used as a post-processing tool on documents created by automatic converters such as LaTeX2HTML or WebMaker. At the moment it works at full speed, but is restricted to local sites. External GETs will be added, but these will run slowly. WebLinker is meant to be run locally, so if you see it elsewhere let the author know!

 

WebQuest

webquest

indexing

WebQuest will be used to build the databases for various web search services that will be in service by early 1998. Until the end of January 1998, WebQuest will run from time to time; after that, it will run daily (for a few hours, and very slowly).

 

WebReaper

webreaper

indexing/offline browsing

Freeware app which downloads and saves sites locally for offline browsing.

Mark Otway

webs

webs

statistics

The webs robot gathers the last-modified dates of WWW servers' top pages. The collected statistics set the priority of WWW server data collection for the webdew indexing service. Indexing in webdew is done manually.

Recruit Co., Ltd.

http://webdew.rnet.or.jp/service/shank/NAVI/SEARCH/info2.html#robot

WebSpider

webspider

maintenance, link diagnostics

http://www.csi.uottawa.ca/~u610468

WebStolperer

WOLP

indexing

The robot gathers information about specified web projects and generates knowledge bases in JavaScript or its own format.

 

http://www.suchfibel.de/maschinisten/text/werkzeuge.htm (in German)

WebVac

webvac/1.0

mirroring

webwalk

webwalk

indexing, maintenance, mirroring, statistics

Its purpose is to generate a Resource Discovery database, validate links, validate HTML, perform mirroring, copy document trees, and generate statistics. Webwalk is easily extensible to perform virtually any maintenance function which involves web traversal, in a way much like the '-exec' option of the find(1) command. Webwalk is usually used behind the HP firewall.

Rich Testardi
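
The find(1) "-exec" style idea described above can be sketched as a traversal that applies an arbitrary callback to every reachable page; the graph and callback here are invented placeholders, not webwalk's implementation:

```python
# Depth-first walk over a link graph, applying a caller-supplied
# maintenance action to every page, in the spirit of find(1) -exec.
def walk(graph, start, action, seen=None):
    seen = set() if seen is None else seen
    if start in seen:
        return seen
    seen.add(start)
    action(start)
    for link in graph.get(start, []):
        walk(graph, link, action, seen)
    return seen

pages = {"/": ["/a", "/b"], "/a": ["/b"], "/b": []}
visited = []
walk(pages, "/", visited.append)
```

Any maintenance function (link check, HTML validation, mirroring) can be plugged in as the action.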

WebWalker

WebWalker

maintenance

WebWalker performs WWW traversal for individual sites and tests for the integrity of all hyperlinks to external sites.

Fah-Chun Cheong

WebWatch

WebWatch

maintenance, statistics

Its purpose is to validate HTML and generate statistics. It checks URLs modified since a given date.

Joseph Janos

Wget

wget

mirroring, maintenance

Wget is a utility for retrieving files using HTTP and FTP protocols. It works non-interactively, and can retrieve HTML pages and FTP trees recursively. It can be used for mirroring Web pages and FTP sites, or for traversing the Web gathering data. It is run by the end user or archive maintainer.

Hrvoje Niksic
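
The core step of the recursive retrieval Wget performs can be sketched as link extraction plus URL resolution; the HTML below is an inline placeholder rather than a real network fetch, and the class is illustrative, not Wget code:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Extract the links from a fetched page and resolve them against the
# page's URL, so they can be queued for the next round of retrieval.
class LinkExtractor(HTMLParser):
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

page = '<a href="docs/">Docs</a> <a href="http://other.example/">Other</a>'
extractor = LinkExtractor("http://example.com/index.html")
extractor.feed(page)
```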

WhoWhere Robot

whowhere

indexing

Gathers data for email directory from web pages.

Rupesh Kapoor

Wild Ferret Web Hopper #1, #2, #3

Hazel's Ferret Web hopper

indexing maintenance statistics

The wild ferret web hoppers are designed as specific agents to retrieve data from all available sources on the internet. They work in an onion format, hopping from spot to spot one level at a time over the internet. The information is gathered into different relational databases, known as "Hazel's Horde". The information is publicly available and will be free for browsing at www.greenearth.com. The effective date of the data posting is to be announced.

Wired Digital

hotwired

indexing

WWWC Ver 0.2.5

WWWC

maintenance

Tomoaki Nakashima.

XGET

XGET

mirroring

Its purpose is to retrieve updated files. It is run by the end user. History: 1997.

http://www2.117.ne.jp/~moremore/x68000/soft/soft.html

KIT-Fireball/2.0

KIT-Fireball/2.0

indexing

Web robot for the German search engine, FireBall.

FireBall

http://www.fireball.de/

Lycos.Com

Lycos_Spider_(T-Rex)

indexing

This robot is used by Lycos to search the Internet for useful content and to crawl pages and sites that have been submitted to them via their Add URL page.

Lycos Corporation

http://www.lycos.com