| User Agent | | Verified | Date Added |
| Alexa/Internet Archive | | Yes | 8/02/2004 19:47:49 |
| Alexa.com and archive.org (building the Internet Archive) |
| Arale | | | 10/02/2004 20:44:54 |
| A multithreaded Java web spider. Download entire web sites or specific resources from the web. Render dynamic sites to static pages. Empty user agent. |
| Bloodhound | | | 11/02/2004 0:16:23 |
| Bloodhound will download a whole web site, depending on the number of links to follow that the user specifies. Empty user agent. |
| burglar | | | 12/09/2006 0:32:09 |
|
| Check&Get | | Yes | 6/02/2004 0:15:29 |
| Check&Get is a handy and powerful bookmark manager and web monitoring program that lets you organize your browser bookmarks, check your favorite Internet pages, and detect whether their content has changed or become unavailable. |
| collect | | | 12/09/2006 0:33:52 |
|
| copier | | | 12/09/2006 0:34:06 |
|
| Custo | | Yes | 6/02/2004 21:45:59 |
| Capable of reading HTML, CSS, JavaScript, and Shockwave Flash, Custo allows you to quickly retrieve information about the structure of a Web site. |
| DeWeb(c) Katalog/Index | | Yes | 11/02/2004 16:53:41 |
| Its purpose is to generate a Resource Discovery database, perform mirroring, and generate statistics. Uses a combination of Informix(tm) Database and WN 1.11 server software for indexing/resource discovery, full-text search, and text excerpts. |
| extract | | | 12/09/2006 0:34:51 |
|
| FurlBot | | Yes | 2/06/2006 0:37:14 |
| Step 1: Sign up and add Furl to your browser. Step 2: Browse the Web and save any page with a single click. Step 3: Retrieve and share your pages easily. |
| GetURL | | Yes | 12/02/2004 20:46:59 |
| Its purpose is to validate links, perform mirroring, and copy document trees. Designed as a tool for retrieving web pages in batch mode without the encumbrance of a browser. Can be used to describe a set of pages to fetch, and to maintain an archive or mirror. It is not run by a central site and accessed by clients; it is run by the end user or archive maintainer. |
| GrabNet | | Yes | 7/09/2006 23:30:04 |
| Grab snips of information from the World Wide Web, including images, text, and URLs, to help you reuse and organize sites within a customized collection of folders on your desktop. |
| Heritrix | | Yes | 8/02/2004 0:05:08 |
| Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project. |
| HP Web PrintSmart | | | 21/02/2007 22:35:50 |
| Fell for bad bot trap (guestbook trap). Utility from HP to capture and print web pages. |
| HTMLgobble | | Yes | 14/02/2004 0:25:26 |
| A mirroring robot. Configured to stay within a directory, sleeps between requests, and the next version will use HEAD to check if the entire document needs to be retrieved. |
| HTTrack | | Yes | 6/02/2004 21:51:04 |
| HTTrack is a free (GPL, libre/open source) and easy-to-use offline browser utility. |
| IBM_Planetwide | | | 7/03/2004 23:48:26 |
| Restricted to IBM owned or related domains. |
| Internet Explorer DigExt | | Yes | 1/06/2006 23:20:56 |
| Internet Explorer's "Make Available Offline" feature, also known as subscriptions. |
| iSiloX | | Yes | 5/11/2005 23:55:37 |
| iSiloX is the desktop application that converts content to the iSilo™ 3.x/4.x document format, enabling you to carry that content on your Palm OS® PDA, Pocket PC PDA, Windows® CE Handheld PC, or Windows® computer for viewing using iSilo™. It is currently available for Windows® and Mac OS X. The X in the name iSiloX represents the "transformation" of content functionality provided by iSiloX. |
| JBot Java Web Robot | | | 8/03/2004 0:36:08 |
| Java web crawler to download web sites. User agent can be changed by the user. |
| JoBo Java Web Robot | | Yes | 8/03/2004 0:40:18 |
| JoBo is a web site download tool. The core web spider can be used for any purpose. User agent can be changed by the user. |
| JOC Web Spider | | Yes | 6/02/2004 21:57:49 |
| Download websites to your HD and navigate offline! |
| JoeBot | | | 8/03/2004 0:44:38 |
| JoeBot is a generic web crawler implemented as a collection of Java classes which can be used in a variety of applications, including resource discovery, link validation, mirroring, etc. It currently limits itself to one visit per host per minute. |