Search services: Excite
Header
Service name: Excite
Last update of this description: 3.9.1996
Description written by: Kai Halttunen
General information
-
Type of service (according to TK's typology): Robot based index
-
Access (free, commercial): Free
-
Volume: 50 000 000 pages, 1 000 000 articles from 10 000 Usenet newsgroups,
50 000 reviews, Usenet classified advertisements from the past two weeks,
hourly news update from Reuters
-
URLs known: 50 000 000
-
Number of documents indexed: 50 000 000
-
Publisher: Architext Software
-
URL for Top-level Page: http://www.excite.com/
-
Mirror sites: No
-
URL for the organization: http://www.architext.com/
-
History:
-
Update frequency of the whole database: once a week
-
Document rating, reviews, "added value" included: Yes (Net Directory section)
-
Registration needed: No
-
Costs: No
-
Performance:
-
Response time: Fast
-
Time outs: No
-
Image download time:
Harvesting
-
Harvesting software: Excite's own spider
-
Robot (type; follows robot exclusion standard?): Yes
-
Method:
-
Human:
-
Automatic: Yes
-
User registration: Yes
-
User deletable: No
-
Depth first:
-
Breadth first: Yes
-
Type coverage:
-
WWW: Yes
-
gopher:
-
WAIS:
-
ftp:
-
telnet (OPACs):
-
UseNet News: Yes
-
Listserv:
-
IRC:
-
Other databases (numeric, commercial): Database of over 50,000 web site
reviews by Excite and Usenet classified advertisements from the past two
weeks.
-
Multimedia products (images, movie, sounds):
-
Other types:
-
Geographic coverage:
-
Subject coverage (General or specialized content): General
-
Update frequency for visiting the same sites/documents again: Once a
week
-
Number of dead links:
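
To illustrate the two harvesting properties recorded above (breadth-first
traversal and compliance with the robot exclusion standard), here is a minimal
Python sketch that crawls a small in-memory link graph. It is not Excite's
spider: the link graph, the robots.txt content, the agent name ExampleSpider
and the function crawl_breadth_first are all assumptions made for the example.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical in-memory "web": page URL -> outgoing links (no network access).
LINKS = {
    "http://example.org/": ["http://example.org/a", "http://example.org/private/x"],
    "http://example.org/a": ["http://example.org/b", "http://example.org/"],
    "http://example.org/b": [],
    "http://example.org/private/x": ["http://example.org/secret"],
}

# Hypothetical robots.txt for the example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawl_breadth_first(start, links, robots_txt, agent="ExampleSpider"):
    """Visit pages level by level, skipping URLs that robots.txt disallows."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    queue = deque([start])   # FIFO queue gives breadth-first order
    seen = {start}
    visited = []
    while queue:
        url = queue.popleft()
        if not rules.can_fetch(agent, url):
            continue         # excluded pages are neither read nor expanded
        visited.append(url)
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


order = crawl_breadth_first("http://example.org/", LINKS, ROBOTS_TXT)
print(order)
```

Because the queue is first-in, first-out, all pages one link away from the
start page are visited before any page two links away. A depth-first robot
would use a stack instead.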
Indexing
-
Indexing software: Excite for Web Servers (EWS)
-
What is indexed:
-
Extracted information, fields indexed:
-
Titles:
-
Headings:
-
Header information (included metainformation):
-
File information (size, date):
-
Links (URLs):
-
The anchor text of links:
-
Other HTML tags:
-
Summary/excerpts (how generated): Yes ("Summaries are taken from the
text within a page being indexed. This does not include META tags. Preference
is also given to punctuated sentences. Using our 'concept based' technology,
our software attempts to determine dominant themes or terms on a page,
and then selects the lines for the summary that best contain these terms.
These themes are then used as search terms or "keywords" for people to
search for while looking for the site. We feel this will generally produce
a more accurate summary and keyword relationship than just selecting the
first few lines of text from a document, as most other search engines do.")
(Excite FAQ)
-
Full text: Yes
-
What is not indexed: META tags
-
Separate metainformation provided by the search service:
-
Human cataloguing and indexing:
-
Human summary/abstract, excerpt, review: Reviews in NetDirectory section
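
The Excite FAQ passage quoted under "Summary/excerpts" describes choosing
summary lines that best contain a page's dominant terms, with a preference for
punctuated sentences. The Python sketch below shows one way such a selection
could work. It is only an assumption-laden illustration: plain term frequency
stands in for Excite's undisclosed "concept based" technology, and the stopword
set, the 0.5 punctuation bonus, the sample page and the function names are
invented for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (an assumption, not Excite's list).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for", "on", "are", "it"}


def words(text):
    """Lower-case word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(lines, n_terms=2, n_lines=2):
    """Return the n_lines lines that best contain the page's dominant terms."""
    counts = Counter(w for line in lines for w in words(line))
    dominant = {term for term, _ in counts.most_common(n_terms)}

    def score(line):
        hits = sum(1 for w in words(line) if w in dominant)
        # Small bonus for punctuated sentences, as the FAQ mentions.
        bonus = 0.5 if line.rstrip().endswith((".", "!", "?")) else 0.0
        return hits + bonus

    ranked = sorted(range(len(lines)), key=lambda i: score(lines[i]), reverse=True)
    return [lines[i] for i in sorted(ranked[:n_lines])]  # keep page order


# Hypothetical page text.
PAGE = [
    "Home | About | Contact",
    "Sailing boats are built for sailing on open water.",
    "Our sailing school teaches boats handling.",
    "Copyright 1996",
    "Boats need regular care, and sailing boats need more.",
]

summary = summarize(PAGE)
print(summary)
```

In this sample the navigation bar and the copyright line contain none of the
dominant terms, so they drop out of the summary, which is the effect the FAQ
contrasts with simply taking the first few lines of a document.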
Retrieval system:
Search software:?
Type of retrieval system:
-
Boolean (exact match): Yes
-
Best match: Yes
-
Combination: Yes
-
Vector retrieval:
-
nonverbal (citation indexing):
-
Other:
Query structures and operations supported
-
Natural language: Yes
-
Word list (no Boolean operators associated): Yes
-
Boolean query: Yes
-
Boolean operators:
-
AND: Yes
-
OR: Yes
-
NOT: Yes
-
Nesting (parentheses supported): Yes
-
Restrictions:
-
mixing of operators:
-
number of search keys:
-
distance in number of words:
-
distance in text structure:
-
bound phrases:
-
Other:
-
Ranking algorithm:
-
ranking factors:
-
calculation of scores:
-
User weighted words: A + sign forces the search key to be present in every
document retrieved. To give certain words in the search statement extra
weight, repeat them.
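
The two user weighting devices noted above (a + sign to require a term, and
repetition to give a term extra consideration) can be sketched in Python as
follows. This is a hedged illustration only: the scoring formula and the names
parse_query and score are assumptions, since the form records nothing about
how Excite actually calculates scores.

```python
from collections import Counter


def parse_query(query):
    """Split a query into required terms and per-term weights."""
    required = set()
    weights = Counter()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            token = token[1:]
            required.add(token)   # "+term" must occur in the document
        weights[token] += 1       # repeating a term raises its weight
    return required, weights


def score(document, required, weights):
    """Weighted term-frequency score; 0 if a required term is missing."""
    doc_terms = Counter(document.lower().split())
    if any(term not in doc_terms for term in required):
        return 0
    return sum(weight * doc_terms[term] for term, weight in weights.items())


required, weights = parse_query("+excite search search engine")
print(score("excite is a search engine", required, weights))
print(score("a search engine", required, weights))
```

The second document matches "search" and "engine" but lacks the required term
"excite", so it is rejected regardless of its other matches.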
Search terms:
-
Truncation:
-
Not supported:
-
Automatic: Yes
-
stemming algorithm (morphological)
-
add wildcard (mechanical)
-
left (mechanical)
-
right (mechanical): Yes
-
Manual
-
What is the default and is it user changeable?: Not changeable
-
String match features:
-
regular expressions:
-
internal masking:
-
case sensitive specify:
-
others:
-
Any limits for a search term (character sets supported): Latin-1 characters
can be searched but are not displayed.
-
Any limits for the size of a result set: No
WHAT IS SEARCHABLE:
-
Possibility to specify source types: Yes (WWW,Usenet,Reviews, Classified)
-
System searches as default:
-
URL:
-
Title, headings:
-
Keywords:
-
Summary:
-
Fulltext: Yes
-
cited URL, anchor text:
-
others:
-
User selectable search fields: No
-
URL:
-
Title, headings:
-
keywords:
-
Summary:
-
Fulltext:
-
cited URL, anchor text:
-
others:
-
Other search options:
-
Stopword list:
-
Does the system use a stopword list?: Yes
-
How is the stopword list constructed?: (e.g. words exceeding a given
absolute frequency are automatically put into the stopword list)
-
Can the stopword list be sidestepped in a search?: (e.g. in a phrase
search)
SEARCH IMPROVEMENT:
-
Concept search: Yes
-
Query expansion: Yes (from the result list)
-
Controlled Vocabulary, thesauri: No
-
Relevance feedback, find similar: Yes - find similar
-
Improve your search support or form: Yes (form)
-
Navigation and graphical features:
-
Other features:
RESULT DISPLAY:
-
Result set information:
-
total: No
-
subsets: No
-
Possible to choose number of displayed hits?: No
-
Is the number of hits displayed limited by the service?: No
-
What can be displayed:
-
URL:
-
Hotlink to original document: Yes (title)
-
Title, headings: Title
-
Keywords: No
-
Summary: Excerpt (few most important sentences)
-
Fulltext: No
-
cited URL, anchor text: No
-
Show hits in context: No
-
Highlight hits: No
-
document size: No
-
document last updated: No
-
document last visited: No
-
Pre-defined display formats: No
-
Other display options: Yes. Once the results are displayed, they can be
sorted by site (sorting by confidence is the default).
-
Information about relevance scores:
-
Score displayed?: Yes (%)
-
Matching terms: No
-
Sorting:
-
URL-based: Yes (grouped by site)
-
others (size, number of links): Relevance based
-
Afterprocessing of the result by the service:
-
duplicate check: No
-
link check: No
-
Browsing structure (Subject catalogue), Organization of the result:
Subject catalogue in Net Directory section.
-
Browsing structure integrated with index?: No
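
The two result orderings recorded above (by confidence, the default, and
grouped by site) can be sketched in Python as follows. The confidence
percentages are invented for the example and the function names are
assumptions; only the URLs are taken from this description.

```python
from urllib.parse import urlsplit

# Hypothetical hits: (confidence in percent, URL). The scores are made up.
HITS = [
    (77, "http://www.excite.com/advanced.query.html"),
    (84, "http://www.architext.com/"),
    (91, "http://www.excite.com/FAQ.html"),
]


def by_confidence(hits):
    """Default ordering: highest confidence first."""
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


def by_site(hits):
    """Group hits by host name; sites appear in order of their best hit."""
    groups = {}
    for score, url in by_confidence(hits):
        groups.setdefault(urlsplit(url).netloc, []).append((score, url))
    return groups


ranked = by_confidence(HITS)
grouped = by_site(HITS)
for site, site_hits in grouped.items():
    print(site, site_hits)
```

Grouping starts from the confidence ranking, so each site's hits stay in
relevance order within their group.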
User interface
-
General description of interface: Query field. Two pull-down menus to
choose source type (Web, Reviews, Usenet, Classified) and search type
(concept, keyword)
-
Clarity of interface: Clear
-
Clarity of search page or index: Clear
-
Text-Only support:
-
HTML Forms support: Yes
-
URL for Forms Search Page: http://www.excite.com/advanced.query.html
or http://www.excite.com/
-
Query input form:
-
Optional forms for input:
-
simple but limited:
-
structured:
-
free not limited: Yes
-
other supported:
-
Non-Forms support: No
-
URL for Non-Forms Search Page:
-
Adaptations to special browsers (Netscape, lynx): Netscape recommended
-
Online Help?:
-
URL for FAQ Page: http://www.excite.com/FAQ.html
-
URL for Help Page: http://www.excite.com/cgi/comsubhelp.cgi?display=html;path=/query.html;section=search;Help=Help
-
Navigation Aids: Yes
-
Search Tutorials: No
-
Sample Searches: No
-
Server Load Indicators: No
-
What's New page: No
-
What's Popular page: No
Documentation
-
Manual:
-
Literature:
-
Reviews:
URL for Copyright/Legal Page: http://www.excite.com/disclaimer.html
(Disclaimer)
URL for Subscription Page:
URL for Creator's Page: http://corp.excite.com/
Our evaluation of the service
(Summary, strong points, weaknesses, criticism, recommendations to users,
etc.)
The database is quite small and U.S. oriented. The display of search results
is poor. Concept search returns results, but it cannot handle different
languages well. It is difficult to evaluate what is indexed in the database,
and the concept-based technology behaves rather fuzzily. If the result display
reflects what is indexed, the indexing is done very poorly.
Traugott Koch
(Traugott.Koch@ub2.lu.se)
Anna Brümmer, anna@munin.ub2.lu.se
Lotta Åstrand, lotta@munin.ub2.lu.se
Kai Halttunen, likaha@uta.fi
Eero Sormunen, lieeso@uta.fi
Anne Suoniemi, tmansu@uta.fi
Last update: 96-09-03