User-Agents Database

User Agents

PageBoy (added 25/07/2005 1:14:43)
The robot visits at regular intervals.
Panoptic (added 7/02/2004 23:15:22)
Panoptic is a new generation search engine offering very high quality results. It offers a unique combination of metadata and full text indexing from a variety of sources and does a great job of finding home pages. It can support your web site, portal, e-commerce and customer service initiatives.
ParaSite (added 7/02/2004 23:39:09)
ParaSite is an incredibly powerful spider which went through several different versions over the course of two years. It is designed to index a substantial portion of the web quickly. ParaSite runs using a server and multiple downloaders. Each downloader runs a number of threads, capable of indexing five to ten documents per second. Since this is a parallel implementation, multiple downloaders can be run simultaneously. The server sorts the incoming URLs into queues and hands off batches of URLs to the downloaders for indexing.
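
The ParaSite entry above describes a server that queues incoming URLs and hands off batches to multithreaded downloaders. The following Python snippet is a minimal sketch of that batching idea; the queue layout, batch size, seed URLs and worker count are assumptions for illustration, not details of the ParaSite implementation.

import queue
import threading
import urllib.request

url_queue = queue.Queue()      # server-side queue of URLs waiting to be crawled
BATCH_SIZE = 10                # assumed batch size handed to each downloader

def next_batch():
    """Server side: hand a batch of queued URLs to a downloader."""
    batch = []
    while len(batch) < BATCH_SIZE:
        try:
            batch.append(url_queue.get_nowait())
        except queue.Empty:
            break
    return batch

def downloader(worker_id):
    """Downloader side: fetch every URL in the batch it was handed."""
    for url in next_batch():
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                body = resp.read()
            print(f"worker {worker_id} fetched {url} ({len(body)} bytes)")
        except OSError as exc:
            print(f"worker {worker_id} failed on {url}: {exc}")

# Seed the queue and run two downloader threads in parallel.
for seed in ("http://example.com/", "http://example.org/"):
    url_queue.put(seed)
workers = [threading.Thread(target=downloader, args=(i,)) for i in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()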
pegasus (added 25/07/2005 1:18:21)
pegasus gathers information from HTML pages (7 important tags). The indexing process can be started from one or more starting URLs or from a range of IP addresses.

This robot was created as an implementation of a final project at the Informatics Engineering Department, Institute of Technology Bandung, Indonesia.
PerlCrawler (added 25/07/2005 1:22:03)
The PerlCrawler robot is designed to index and build a database of pages relating to the Perl programming language.
PGP Key Agent (added 25/07/2005 1:29:29)
This program searches for the PGP public key of the specified user.

Originated as a research project at Salerno University in 1995.
Phantom (added 25/07/2005 1:22:53)
Designed to allow webmasters to provide a searchable index of their own site as well as of other sites, perhaps with similar content.
PhpDig (added 25/07/2005 1:23:47)
A small robot and search engine written in PHP.
Picsearch (added 8/02/2004 19:52:18)
Picsearch indexes pictures from the web using a web crawler that identifies itself as 'Psbot'.
Pimptrain.com's robot (added 25/07/2005 1:26:02)
Crawls remote sites as part of a search engine program.
Pioneer (added 25/07/2005 1:26:45)
Pioneer is part of an undergraduate research project.
pipeLiner (added 10/11/2004 12:33:34)
PlumtreeWebAccessor (added 25/07/2005 1:35:23)
The Plumtree Web Accessor is a component that customers can add to the Plumtree Server to index documents on the World Wide Web.
Poppi (added 27/07/2005 22:54:07)
Poppi is a web-indexing crawler that runs weekly, gathering and indexing hypertext, multimedia and executable file formats.

Created by Antonio Provenzano in April 2000, it was acquired by Tomi Officine Multimediali srl and is about to be released as a commercial service.
Portal Juice Spider (added 25/07/2005 1:27:47)
Indexes web documents for the Portal Juice vertical portal search engine.

Indexing the web since 1998 for the purposes of offering our commercial Portal Juice search engine services.
PortalB Spider (added 27/07/2005 22:55:04)
The PortalB Spider indexes selected sites for high-quality business information.
Project XP5 (added 7/02/2004 23:37:56)
proximic (added 9/12/2009 0:50:47)
Raven Search (added 27/07/2005 23:00:25)
Raven was written for the express purpose of indexing the web. It can process hundreds of URLs in parallel at a time. It runs on a sporadic basis as testing continues. It is really several programs running concurrently, and it takes four computers to run Raven Search; it is scalable in sets of four.
RBSE Spider (added 27/07/2005 23:01:45)
Developed and operated as part of the NASA-funded Repository Based Software Engineering Program at the Research Institute for Computing and Information Systems, University of Houston - Clear Lake.
RDSIndexer (added 7/02/2004 23:44:14)
Information Resource Management Tool/Web Portal
Resume Robot (added 27/07/2005 23:02:38)
Road Runner: The ImageScape Robot (added 27/07/2005 23:23:18)
Robbie the Robot (added 27/07/2005 23:24:37)
Used to define document collections for the DISCO system. Robbie is still under development and runs several times a day, but usually only for ten minutes or so. Sites are visited in the order in which references are found, but no host is visited more than once in any two-minute period.

The DISCO system is a resource-discovery component in the OLLA system, which is a prototype system, developed under DARPA funding, to support computer-based education and training.
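
Robbie's rule that no host is visited more than once in any two-minute period is a simple per-host politeness policy. Below is a minimal Python sketch of such a policy; the interval mirrors the description above, but the function and data structure are assumed illustrations rather than Robbie's actual code.

import time
from urllib.parse import urlparse

MIN_INTERVAL = 120.0     # seconds: visit any one host at most once per two minutes
last_visit = {}          # host -> timestamp of the most recent request

def polite_fetch(url, fetch):
    """Fetch url with fetch(url), waiting if its host was seen too recently."""
    host = urlparse(url).netloc
    elapsed = time.monotonic() - last_visit.get(host, float("-inf"))
    if elapsed < MIN_INTERVAL:
        time.sleep(MIN_INTERVAL - elapsed)
    last_visit[host] = time.monotonic()
    return fetch(url)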
RoboCrawl Spider (added 27/07/2005 23:26:43)
The Canadian Content robot indexes for its search database.

Our robot is a newer project at Canadian Content.
Robot Francoroute (added 11/02/2004 23:17:13)
Part of the RISQ's Francoroute project for researching francophone resources. Uses the Accept-Language tag and reduces demand accordingly.
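
Since the Francoroute entry mentions use of the Accept-Language tag, here is a small hedged example of sending that request header in Python; the URL, language value and user-agent string are placeholders, not details of the Francoroute robot itself.

import urllib.request

# Ask the server for a French version of the page, as a language-targeted
# crawler might do (illustrative only).
req = urllib.request.Request(
    "http://example.org/",
    headers={"Accept-Language": "fr", "User-Agent": "example-crawler/0.1"},
)
with urllib.request.urlopen(req, timeout=10) as resp:
    page = resp.read()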
RuLeS (added 27/07/2005 23:31:49)
SafeDNS Search Bot (added 18/10/2015 8:24:58)
The main reason for us at SafeDNS to collect web pages is to correctly categorize Internet resources and to develop new technologies and products for SafeDNS.
SafetyNet Robot (added 27/07/2005 23:32:43)
Finds URLs for K-12 content management.
scrapy-redis (added 14/11/2016 13:07:20)
Distributed crawling/scraping
Search.Aus-AU.COM (added 27/07/2005 23:35:45)
Search-AU is a development tool I have built to investigate the power of a search engine and web crawler, giving me access to a database of web content (HTML / URLs) and addresses from which I hope to build more accurate statistics about the .au zone's web content. The robot started crawling from http://www.geko.net.au/ on March 1st, 1998, and after nine days had 70 MB of compressed ASCII in a database to work with. I hope to run a refresh of the crawl every month initially, and soon every week, bandwidth and CPU allowing. If the project warrants further development, I will turn it into an Australian (.au) zone search engine and make it commercially available for advertising to cover the costs, which are starting to mount up. --dez (980313 - black friday!)
Senrigan (added 31/07/2005 23:35:36)
This robot now fetches HTML pages only from the .jp domain.
SG-Scout (added 31/07/2005 23:38:47)
Does a "server-oriented" breadth-first search in a round-robin fashion, with multiple processes.
Sherlock Holmes Search Engine (added 8/02/2004 0:08:16)
Sherlock Holmes is a universal search engine – a system for gathering and indexing of textual data (text files, web pages, ...), both locally and over the network.
Shim-Crawler (added 6/02/2006 15:38:26)
Shim-Crawler was written by Shim Wonbo of the Chikayama-Taura laboratory. The main goal behind writing the crawler is to collect web pages for research related to web search and data mining. Recently, we have also been planning to use it for crawling weblogs. The crawler is used by the members of the Chikayama-Taura Laboratory to crawl web pages only for research purposes. Our crawling policy distinctly respects the general crawling norms. Though we duly understand the concerns of webmasters, we would like to assure you that our crawler only crawls pages for research and not for any business use. Please have a glance at our crawling policy for a better understanding. We sincerely appreciate your co-operation and support.
ShopWiki (added 23/02/2009 0:49:32)
ShopWiki finds products using Web crawlers, similar to other search engines. This means we look in a Web site's domain for robots.txt files, which tell our crawlers which files they may search. All Web sites have the ability to define what parts of their domain are off-limits to specific robot user agents. ShopWiki respects and obeys all robots.txt files.
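
Checking robots.txt before fetching, as ShopWiki describes, can be done in Python with the standard urllib.robotparser module. This is a general-purpose sketch of that check, not ShopWiki's own crawler code; the user-agent string and URLs are placeholders.

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://example.com/robots.txt")
rp.read()                                    # download and parse the site's robots.txt

url = "http://example.com/products/widget"
if rp.can_fetch("examplebot", url):          # is this user agent allowed to fetch the URL?
    print("allowed to crawl", url)
else:
    print("robots.txt disallows", url)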
Sift (added 31/07/2005 23:42:25)
Subject directed (via key phrase list) indexing.
Simmany Robot Ver1.0 (added 31/07/2005 23:43:33)
The Simmany Robot is used to build the Map (DB) for the simmany service operated by HNC (Hangul & Computer Co., Ltd.). The robot runs weekly, and visits sites that have useful Korean information in a defined order.

This robot is part of the simmany service and the simmini products. Simmini is a Web product that makes use of the indexing and retrieving modules of simmany.
SiteSpider (added 7/02/2004 23:38:20)
The indexer is capable of indexing up to 1,000 documents per site, and the information is stored to a database searchable by clients. The user can then utilize a simple search server protocol to query the database and generate a search service for their site.
Sleek (added 31/07/2005 23:32:53)
Crawls remote sites and performs link popularity checks before inclusion.
Snipebot (added 3/04/2013 11:06:20)
Spock Crawler (added 2/08/2007 23:23:01)
As part of Spock's mission to index every single human being on the planet, we have developed a crawler to collect pages all over the Internet.
Sven (added 7/02/2004 23:40:46)
Empty user agent.
SWISH-E (added 8/02/2004 0:12:03)
SWISH-E is a fast, powerful, flexible, free, and easy-to-use system for indexing collections of Web pages or other files.
t6labs (added 24/12/2006 22:54:34)
T6 Labs is an R&D lab that applies higher-order tensor analysis to a variety of industry-related problems. Philosophically, all problems in this world where there is an information overload, be it the web or computational fluid dynamics, need higher-order tensor analysis for better abstraction. Higher-order tensor analysis techniques developed by T6 Labs are currently being used to develop SPAC, a search engine personalization and collaboration platform.
TeraText AGLS Harvester (added 7/02/2004 23:50:11)
A text database system and search engine built for handling large text collections.
The NorthStar Robot (added 25/07/2005 1:05:10)
Recent runs (26 April 94) will concentrate on textual analysis of the Web versus GopherSpace (from the Veronica data) as well as indexing.
The Peregrinator (added 25/07/2005 1:19:41)
This robot is being used to generate an index of documents on Web sites connected with mathematics and statistics. It ignores off-site links, so does not stray from a list of servers specified initially.
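
Restricting a crawl to an initially specified list of servers, as the Peregrinator does by ignoring off-site links, amounts to a host whitelist applied to every discovered URL. The snippet below is a hedged Python illustration of that filter; the host names are placeholders and the function is not taken from the Peregrinator itself.

from urllib.parse import urlparse

# Placeholder list of the servers the crawl is allowed to stay on.
ALLOWED_HOSTS = {"www.maths.example.edu", "stats.example.ac.uk"}

def on_allowed_host(url):
    """Keep a discovered link only if it points at one of the initial servers."""
    return urlparse(url).netloc in ALLOWED_HOSTS

links = [
    "http://www.maths.example.edu/papers/",
    "http://unrelated.example.com/",
]
print([u for u in links if on_allowed_host(u)])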
TheRarestParser (added 23/02/2009 0:30:43)
TheRarestParser is my bot which goes out collecting words used in web pages for “The Rarest Words” project.
Thunderstone Webinator (added 7/02/2004 23:48:37)
Webinator is a Web walking and indexing package that allows a Website administrator to easily create and provide a high quality retrieval interface to collections of HTML documents.
