Tuesday, October 14, 2008

Mark P. McCahill
From Wikipedia, the free encyclopedia
Mark P. McCahill (born February 7, 1956) has been involved in developing and popularizing a number of Internet technologies since the late 1980s.
Mark McCahill received a BA in Chemistry at the University of Minnesota in 1979, spent one year doing analytical environmental chemistry, and then joined the University of Minnesota Computer Center's microcomputer support group as an Apple II and CDC Cyber programmer.
In 1989, McCahill led the team at the University of Minnesota that developed one of the first popular Internet e-mail clients, POPmail, for the Macintosh (and later the PC). The usage of graphical user interface clients for Internet standards-based protocols proved to be one of the dominant themes in the popularization of the Internet. At about the same time as POPmail was being developed, Steve Dorner at the University of Illinois at Urbana-Champaign developed Eudora, and the user interface conventions found in these early efforts continue to be present in modern-day e-mail clients.
In 1991, McCahill led the original Gopher development team (Farhad Anklesaria, Paul Lindner, Dan Torrey, Bob Alberti), which invented a simple way to navigate distributed information resources on the Internet. Gopher's menu-based hypermedia combined with full-text search engines paved the way for the popularization of the World Wide Web and was the de facto standard for Internet information systems in the early to mid 1990s.
McCahill is credited with the first known Usenet usage of the phrase "surfing the internet", which was later popularized by Jean Armour Polly (who did not know of McCahill's post at the time). McCahill later explained that his choice of the word surfing was inspired by windsurfing, a favorite sport of his.[1]
Working with other pioneers such as Tim Berners-Lee, Marc Andreessen, Alan Emtage and Peter J. Deutsch (creators of Archie) and Jon Postel, McCahill was involved in creating and codifying the standard for Uniform Resource Locators (URLs).
In 1994-95 McCahill's team developed GopherVR, a 3D user interface for the Gopher protocol to explore how spatial metaphors could be used to organize information and create social spaces. While there was significant interest in the mid-1990s in 3D Internet-enabled information/social spaces (see VRML), the limited capabilities of mainstream hardware resulted in little uptake of these technologies. Mark McCahill is currently involved in the Croquet project along with David P. Reed, Andreas Raab, David A Smith, Julian Lombardi, and Alan Kay.
In April 2007, McCahill left the University of Minnesota to join the Office of Information Technology at Duke University as an architect of 3-D learning and collaborative systems.
Archie search engine
From Wikipedia, the free encyclopedia
Archie is a tool for indexing FTP archives, allowing people to find specific files. It is considered to be the first Internet search engine.[1] The original implementation was written in 1990 by Alan Emtage, Bill Heelan, and J. Peter Deutsch, then students at McGill University in Montreal.
The earliest versions of archie simply contacted a list of FTP archives on a regular basis (contacting each roughly once a month, so as not to waste too many resources on the remote servers) and requested a listing. These listings were stored in local files to be searched using the Unix grep command. Later, more efficient front- and back-ends were developed, and the system spread from a local tool, to a network-wide resource, to a popular service available from multiple sites around the Internet. Such archie servers could be accessed in multiple ways: using a local client (such as archie or xarchie); telnetting to a server directly; sending queries by electronic mail; and later via World Wide Web interfaces.
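A rough sketch of that stored-listings-plus-grep approach (illustrative only, not Archie's actual code; the directory layout and file naming are assumptions):

```python
import os
import re

def search_listings(listing_dir, pattern):
    """Grep-style search over locally stored FTP listing files.

    Each file in listing_dir is assumed to hold one archive's
    recursive directory listing, one entry per line.
    """
    regex = re.compile(pattern)
    matches = []
    for name in os.listdir(listing_dir):
        path = os.path.join(listing_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                if regex.search(line):
                    # assume each listing file is named after its FTP host
                    matches.append((name, line.strip()))
    return matches

# Example: find entries mentioning "gopher" across all stored listings
# for host, entry in search_listings("listings", r"gopher"):
#     print(host, entry)
```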
The name derives from the word "archive", but is also associated with the comic book series of the same name. This was not originally intended, but it certainly acted as the inspiration for the names of Jughead and Veronica, both search systems for the Gopher protocol, named after other characters from the same comics.
Web crawler
From Wikipedia, the free encyclopedia
For the search engine of the same name, see WebCrawler.
For the fictional robots called Scutters, see Red Dwarf characters#The Skutters.
A web crawler (also known as a web spider, web robot, or—especially in the FOAF community—web scutter[1]) is a program or automated script which browses the World Wide Web in a methodical, automated manner. Other less frequently used names for web crawlers are ants, automatic indexers, bots, and worms.[2]
This process is called web crawling or spidering. Many sites, in particular search engines, use spidering as a means of providing up-to-date data. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine that will index the downloaded pages to provide fast searches. Crawlers can also be used for automating maintenance tasks on a website, such as checking links or validating HTML code. Also, crawlers can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam).
A web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies.
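A minimal sketch of this seed-and-frontier loop (illustrative only; URL filtering, politeness delays, and robots.txt handling are omitted, and all names are invented):

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seeds, max_pages=50):
    """Breadth-first crawl: visit seed URLs, add discovered links to the frontier."""
    frontier = deque(seeds)           # URLs still to visit (the "crawl frontier")
    visited = set()
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        if url in visited:
            continue
        visited.add(url)
        try:
            html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
        except Exception:
            continue                  # skip unreachable pages
        pages[url] = html             # store the copy for later indexing
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if absolute.startswith("http") and absolute not in visited:
                frontier.append(absolute)
    return pages

# pages = crawl(["https://example.org/"])
```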

Given the current size of the Web, even large search engines cover only a portion of the publicly available internet; a study by Lawrence and Giles (2000) showed that no search engine indexes more than 16% of the Web.
Cho et al. (1998) made the first study on policies for crawling scheduling. Their data set was a 180,000-page crawl from the stanford.edu domain, on which a crawling simulation was run with different strategies.
Abiteboul et al. (2003) designed a crawling strategy based on an algorithm called OPIC (On-line Page Importance Computation). In OPIC, each page is given an initial sum of "cash" which is distributed equally among the pages it points to. It is similar to a PageRank computation, but it is faster and is only done in one step. An OPIC-driven crawler downloads first the pages in the crawling frontier with the higher amounts of "cash".
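A toy sketch of the cash heuristic described above (this illustrates the idea, not the OPIC algorithm as published; the graph, names, and tie-breaking are invented):

```python
def opic_next_url(frontier, cash):
    """Pick the frontier URL with the most accumulated 'cash'."""
    return max(frontier, key=lambda url: cash.get(url, 0.0))

def opic_distribute(url, outlinks, cash):
    """Distribute a fetched page's cash equally among the pages it points to."""
    amount = cash.pop(url, 0.0)
    if outlinks:
        share = amount / len(outlinks)
        for target in outlinks:
            cash[target] = cash.get(target, 0.0) + share

# Toy example on a tiny link graph (A links to B and C, and so on)
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
frontier = set(graph)                       # everything starts unfetched
cash = {url: 1.0 for url in graph}          # each page starts with equal cash
order = []
while frontier:
    url = opic_next_url(frontier, cash)
    frontier.remove(url)
    order.append(url)
    opic_distribute(url, graph[url], cash)
print(order)   # the download order chosen by the cash heuristic
```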
Boldi et al. (2004) used simulation on subsets of the Web of 40 million pages from the .it domain and 100 million pages from the WebBase crawl, testing breadth-first against depth-first, random ordering and an omniscient strategy.
Examples of Web crawlers
The following is a list of published crawler architectures for general-purpose crawlers (excluding focused web crawlers), with a brief description that includes the names given to the different components and outstanding features:
RBSE [22] was the first published web crawler. It was based on two programs: the first program, "spider", maintains a queue in a relational database, and the second program, "mite", is a modified www ASCII browser that downloads the pages from the Web.
WebCrawler [10] was used to build the first publicly-available full-text index of a subset of the Web. It was based on lib-WWW to download pages, and another program to parse and order URLs for breadth-first exploration of the Web graph. It also included a real-time crawler that followed links based on the similarity of the anchor text with the provided query.
World Wide Web Worm [23] was a crawler used to build a simple index of document titles and URLs. The index could be searched by using the grep Unix command.
WebRACE [20] is a crawling and caching module implemented in Java, and used as a part of a more generic system called eRACE. The system receives requests from users for downloading web pages, so the crawler acts in part as a smart proxy server. The system also handles requests for "subscriptions" to Web pages that must be monitored: when the pages change, they must be downloaded by the crawler and the subscriber must be notified. The most outstanding feature of WebRACE is that, while most crawlers start with a set of "seed" URLs, WebRACE is continuously receiving new starting URLs to crawl from.
UbiCrawler (Boldi et al., 2004) is a distributed crawler written in Java, and it has no central process. It is composed of a number of identical "agents", and the assignment function is calculated using consistent hashing of the host names. There is zero overlap, meaning that no page is crawled twice, unless a crawling agent crashes (then another agent must re-crawl the pages from the failing agent). The crawler is designed to achieve high scalability and to be tolerant of failures.
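The host-to-agent assignment via consistent hashing could be sketched as follows (an illustration of the general technique, not UbiCrawler's actual implementation; agent names and the number of ring replicas are arbitrary):

```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    """Assigns host names to crawling agents via consistent hashing.

    Each agent is mapped to several points on a hash ring; a host is
    handled by the agent owning the first point at or after the host's
    own hash. If an agent disappears, only its hosts move to other agents.
    """
    def __init__(self, agents, replicas=100):
        self.ring = []
        for agent in agents:
            for i in range(replicas):
                self.ring.append((self._hash(f"{agent}#{i}"), agent))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)

    def agent_for(self, hostname):
        h = self._hash(hostname)
        keys = [point for point, _ in self.ring]
        index = bisect_right(keys, h) % len(self.ring)
        return self.ring[index][1]

ring = ConsistentHashRing(["agent-1", "agent-2", "agent-3"])
print(ring.agent_for("www.example.org"))   # the same host always maps to the same agent
```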
FAST Crawler [24] is a distributed crawler, used by Fast Search & Transfer, and a general description of its architecture is available.[citation needed]
Lycos
From Wikipedia, the free encyclopedia

Type: Subsidiary
Founded: 1994
Headquarters: Waltham, Massachusetts, United States
Revenue: $ UNKNOWN
Employees: 72 in US (2007)
Parent: Daum Communications
Website: Lycos Search Home
Type of site: Search Engine and Web Portal
Registration: optional
Available in: multilingual
Launched: April 13, 1995
Current status: active
A screenshot of Lycos.com
Lycos is a search engine and web portal centered around broadband entertainment content. It began as a search engine research project by Dr. Michael Loren Mauldin of Carnegie Mellon University in 1994. Bob Davis joined the company as its CEO and first employee in 1995. Lycos then enjoyed several years of astounding growth and, in 1999, became the most visited online destination in the world with a global presence in more than 40 countries. Lycos was sold to Terra Networks of Spain in May 2000 for $5.4 billion, forming a new company, Terra Lycos, maintaining its position as one of the world's largest Internet companies. Shortly after the merger Davis left the company to become a venture capitalist with Highland Capital Partners in Boston. In October 2004, Lycos was sold by Terra's parent company, Telefonica, to Daum Communications Corporation, the second largest Internet portal in Korea, becoming, once again, Lycos, Inc. Lycos remains a top 25 Internet destination in the US.
The founder and CEO of Lycos from inception was Bob Davis, a native of Boston who incorporated the company in Massachusetts and concentrated on building it into an advertising-supported web portal. Lycos grew from a crowded field in 1995 to become the most-visited web portal in the world in the spring of 1999 (as measured by visits to all of its sites).
In 1996, the company completed the fastest IPO, from inception to offering, in NASDAQ history and in 1997 became one of the first profitable internet businesses in the world. Over the course of the next several years Lycos acquired nearly two dozen high profile internet brands including Tripod, Gamesville, WhoWhere, Wired Digital, Quote.com, Angelfire, and Raging Bull.
Lycos Europe was a joint venture between Bertelsmann and Lycos, but has always been a distinct corporate entity. Although Lycos Europe is the largest of the overseas ventures, several other companies also entered into joint venture agreements, including Lycos Canada, Lycos Korea, and Lycos Asia.
Netscape
From Wikipedia, the free encyclopedia
Netscape Communications (formerly known as Netscape Communications Corporation and commonly known as Netscape) is an American computer services company, best known for its web browser. The browser was once dominant in terms of usage share, but lost most of that share to Internet Explorer during the first browser war. By the end of 2006, the usage share of Netscape browsers had fallen, from over 90% in the mid 1990s, to less than 1%.
Netscape stock traded between 1995 and 2003, latterly as a subsidiary of AOL LLC; it became a holding company following Netscape's purchase by AOL in 1998. The Netscape brand is still extensively used by AOL, and some services currently offered under the Netscape brand, other than the web browser, include a discount Internet service provider and a popular social news website. In December 2007, AOL announced it would no longer be updating the Netscape browser. Tom Drapeau, director of AOL's Netscape Brand, announced that the company would stop supporting Netscape software products as of March 1, 2008.[1] The decision met mixed reactions from the community, with many arguing that the termination of product support was significantly belated. The Internet security site Security Watch stated that a trend of infrequent security updates for AOL's Netscape caused the browser to become a "security liability", specifically the 2005-2007 versions, Netscape Browser 8.[2] Asa Dotzler, one of Firefox's original programmers, greeted the news with "good riddance" in his blog post, but praised the various members of the Netscape team over the years for enabling the creation of Mozilla in 1998.[3] Others protested and petitioned AOL, through online petitions and campaigns, to continue providing vital security fixes to unknowing or loyal users of its software, as well as to protect a well-known brand.[4][5][6]
Netscape Internet Service


Netscape ISP Logo
Netscape ISP is a "high speed" 56 kbit/s dial-up service offered at $9.95 per month[7] ($6.95 with a 12-month commitment). The company serves webpages in a compressed format to increase effective speeds up to 1300 kbit/s (average 500 kbit/s). The Internet service provider is run by AOL under the Netscape brand. The low-cost ISP was officially launched on January 8, 2004.[8] Its main competitor is NetZero. Netscape ISP's advertising is generally aimed at a younger demographic, e.g., college students and people just out of school, as an affordable way to gain access to the Internet. Additional features can be added to the service at extra cost, such as:
PC Anti-virus Protection
Advanced Spam Blocker
E-mail VirusScan
Extra E-mail Storage
Extra E-mail Addresses
History
Netscape was founded as Mosaic Communications Corporation on April 4, 1994, the brainchild of Jim Clark, who had recruited Marc Andreessen as co-founder and Kleiner Perkins Caufield & Byers as investors. Clark recruited other early Netscape team members from SGI and NCSA Mosaic. The company's first product was the web browser, called Mosaic Netscape 0.9, released on October 13, 1994. This browser was subsequently renamed Netscape Navigator, and the company took the 'Netscape' name on November 14, 1994[9] to avoid trademark ownership problems with NCSA, where the initial Netscape employees had previously created the NCSA Mosaic web browser. The Mosaic Netscape web browser utilized some NCSA Mosaic code with NCSA's permission, as noted in the application's "About" dialog box. Netscape made a very successful IPO on August 9, 1995. The stock was set to be offered at $14 per share, but a last-minute decision doubled the initial offering to $28 per share. The stock's value soared to $75 on the first day of trading, nearly a record for first-day gain. The company's revenues doubled every quarter in 1995.[10]
Aliweb
From Wikipedia, the free encyclopedia
ALIWEB (Archie Like Indexing for the WEB) can be considered the first Web search engine, as its predecessors were either built with different purposes (the Wanderer, Gopher) or were literally just indexers (Archie, Veronica and Jughead).
First announced in November 1993[1] by developer Martijn Koster, and presented in May 1994[2] at the First International Conference on the World Wide Web at CERN in Geneva, ALIWEB preceded WebCrawler by several months.[3]
ALIWEB allowed users to submit the locations of index files on their sites,[4][3] which enabled the search engine to include webpages and add user-written page descriptions and keywords. This empowered webmasters to define the terms that would lead users to their pages, and also avoided the use of bots (e.g. the Wanderer) which used up bandwidth. As relatively few people submitted their sites, ALIWEB was not very widely used.
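An illustrative sketch of the submit-an-index-file idea (the field names loosely echo the IAFA-style templates ALIWEB accepted, but the exact format and the parser below are assumptions):

```python
# An illustrative ALIWEB-style index file describing one document on a site
SAMPLE_INDEX = """\
Template-Type: DOCUMENT
Title:         Perl Frequently Asked Questions
URI:           /public/perl/faq.html
Description:   Answers to common questions about the Perl language.
Keywords:      perl, programming, faq
"""

def parse_records(text):
    """Split an index file into blank-line-separated records of 'Field: value' lines."""
    records = []
    for chunk in text.strip().split("\n\n"):
        record = {}
        for line in chunk.splitlines():
            field, _, value = line.partition(":")
            record[field.strip()] = value.strip()
        records.append(record)
    return records

print(parse_records(SAMPLE_INDEX)[0]["Keywords"])   # -> "perl, programming, faq"
```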
Martijn Koster, who was also instrumental in the creation of the Robots Exclusion Standard,[5][6] detailed the background and objectives of ALIWEB with an overview of its functions and framework in the paper he presented at CERN.[2]
Koster is not associated with a commercial website which uses the aliweb name.[7]
Google search
From Wikipedia, the free encyclopedia
Google
The Google homepage (using Safari web browser)
URL: www.google.com (list of domain names)
Commercial?: yes
Type of site: Search Engine
Registration: optional
Available language(s): multilingual (~100)
Owner: Google Inc.
Created by: Larry Page and Sergey Brin
Launched: September 15, 1997[1]
Revenue: from AdWords
Current status: active
Google search is a Web search engine owned by Google, Inc., and it is the most used search engine on the Web. Google receives several hundred million queries each day through its various services.
The domain google.com attracted at least 135 million U.S. visitors in May 2008 according to Compete.com.[2]
Functionality


Image of definition link provided for many search terms.
The Google search engine has many intuitive features that make it more functional, which may have played a role in making it as popular as it is today. Google is one of the top ten most-visited websites today.[6] Its features include a definition link for most searches (including dictionary words), a count of the results returned for a search, links to alternative searches (e.g. if a query appears to be misspelled, a link is offered to the results for the corrected spelling), and many more. It is unknown whether functionality, speed, or luck brought it its peak status.
Inktomi Corporation
From Wikipedia, the free encyclopedia
For the Lakota spider-trickster god, see Iktomi.

Inktomi Corporation was a California company that provided software for Internet service providers. It was founded in 1996 by UC Berkeley professor Eric Brewer and graduate student Paul Gauthier. The company was initially founded based on the real-world success of the search engine they developed at the university. After the bursting of the dot-com bubble, Inktomi was acquired by Yahoo!.
History
Inktomi's software was incorporated in the widely-used HotBot search engine, which displaced AltaVista as the leading web-crawler-based search engine, and which was in turn displaced by Google. In a talk given to a UC Berkeley seminar on Search Engines[1] in October 2005, Eric Brewer credited much of the AltaVista displacement to technical differences of scale (Inktomi used distributed network technology, while AltaVista ran everything on a single machine).
The company went on to develop Traffic Server, a proxy cache for web traffic and on-demand streaming media. Traffic Server found a limited marketplace due to several factors, but was deployed by several large service providers including AOL. In November 1999 Inktomi acquired Webspective; in August 2000 Inktomi acquired Ultraseek Server from Disney's Go.com; in September, 2000, Inktomi acquired FastForward Networks[2]; in December 2000, Inktomi acquired the Content Bridge Business Unit from Adero, a content delivery network, which had formed the Content Bridge Alliance with Inktomi, AOL and a number of other ISPs, hosting providers and IP transport providers; and in June 2001 Inktomi acquired eScene Networks. Webspective developed technology for synchronizing and managing content across a host of distributed servers to be used in clustered or distributed load-balancing. Fast Forward developed software for the distribution of live streaming media over the Internet using "app-level" multicast technology. eScene Networks developed software that provided an integrated workflow for the management and publishing of video content (now owned by Media Publisher, Inc.). With this combination of technologies, Inktomi became an "arms merchant" to a growing number of Content Delivery Network (CDN) service providers. Inktomi stock peaked with a split-adjusted price of $241 a share in March 2000.
Corporate officers
David C. Peterschmidt - Chairman, President and Chief Executive Officer
Dr. Eric A. Brewer - Chief Scientist
Timothy J. Burch - Vice President of Human Resources
Ted Hally - Senior Vice President and General Manager of Network Products
Jerry M. Kennelly - Executive Vice President, Chief Financial Officer and Secretary
Al Shipp - Senior Vice President of Worldwide Field Operations
Timothy Stevens - Senior Vice President of Business Affairs, General Counsel and Assistant Secretary
Steve Hill - Vice President of Europe
Board of directors
David C. Peterschmidt - Chairman, President and Chief Executive Officer, Inktomi Corporation
Dr. Eric A. Brewer - Chief Scientist, Inktomi Corporation
Frank Gill - Retired Executive Vice President, Intel Corporation
Fredric W. Harman - General Partner, Oak Investment Partners
Alan F. Shugart - Chief Executive Officer, Al Shugart International
Live Search
From Wikipedia, the free encyclopedia
For the internet portal, see Windows Live Personalized Experience.
Live Search

Live Search homepage
URL: http://search.live.com
Commercial?: Yes
Type of site: Search Engine
Registration: Optional
Available language(s): Multilingual
Owner: Microsoft
Created by: Microsoft
Launched: March 8, 2006 (beta); September 11, 2006 (1.0); September 26, 2007 (2.0)
Current status: Active
Live Search (formerly Windows Live Search and MSN Search) is the name of Microsoft's web search engine, designed to compete with the industry leaders Google and Yahoo!. Live Search is accessible through Microsoft's Live.com and MSN.com web portal. Currently, Live Search is the fourth most used search engine after Google, Baidu, and Yahoo![1]
The search engine offers some innovative features, such as the ability to view additional search results on the same web page (instead of needing to click through to subsequent search result pages) and the ability to dynamically adjust the amount of information displayed for each search-result (i.e. just the title, a short summary, or a longer summary). It also allows the user to save searches and see them updated automatically on Live.com.
History
Live Search
The first public beta of Live Search was unveiled on March 8, 2006, with the final release on September 11, 2006 replacing MSN Search.
On March 21, 2007, it was announced that Microsoft would separate its Live Search developments from the Windows Live services family. Live Search was integrated into the Live Search and Ad Platform headed by Satya Nadella, part of Microsoft's Platform and Systems division. As part of this change, Live Search was consolidated with Microsoft adCenter.[2]
In the roll-over from MSN Search to Live Search, Microsoft stopped using Picsearch as their image search provider and started performing their own image search, fueled by their own internal image search algorithms.[3]
info.com
From Wikipedia, the free encyclopedia

The Info.com logo
Info.com is a metasearch engine which provides results from leading search engines and pay-per-click directories, including Google, Yahoo!, MSN Search, Ask, LookSmart, About and Open Directory.
Info.com is partnered with other search providers to include comparison shopping and product reviews, a selection of news, health, pictures, video, classifieds, eBay, jobs, White Pages and Yellow Pages, tickets, flights, hotels, weather, maps and directions.

Web search query
From Wikipedia, the free encyclopedia
A web search query is a query that a user enters into a web search engine to satisfy his or her information needs. Web search queries are distinctive in that they are unstructured and often ambiguous; they vary greatly from standard query languages, which are governed by strict syntax rules.
Types
There are three broad categories that cover most web search queries[1]:
Informational queries – Queries that cover a broad topic (e.g., colorado or trucks) for which there may be thousands of relevant results.
Navigational queries – Queries that seek a single website or web page of a single entity (e.g., youtube or delta airlines).
Transactional queries – Queries that reflect the intent of the user to perform a particular action, like purchasing a car or downloading a screen saver.
Search engines often support a fourth type of query that is used far less frequently:
Connectivity queries – Queries that report on the connectivity of the indexed web graph (e.g., Which links point to this URL?, and How many pages are indexed from this domain name?).
Characteristics
Most commercial web search engines do not disclose their search logs, so information about what users are searching for on the Web is difficult to come by[2]. Nevertheless, a 2001 study[3] that analyzed queries from the Excite search engine showed some interesting characteristics of web search:
The average length of a search query was 2.4 terms.
About half of the users entered a single query while a little less than a third of users entered three or more unique queries.
Close to half of the users examined only the first one or two pages of results (10 results per page).
Less than 5% of users used advanced search features (e.g., Boolean operators like AND, OR, and NOT).
The top three most frequently used terms were and, of, and sex.
A study of the same Excite query logs revealed that 19% of the queries contained a geographic term (e.g., place names, zip codes, geographic features, etc.)[4].
A 2005 study of Yahoo's query logs revealed 33% of the queries from the same user were repeat queries and that 87% of the time the user would click on the same result[5]. This suggests that many users use repeat queries to revisit or re-find information.
In addition, much research has shown that query term frequency distributions conform to the power law, or long tail distribution curves. That is, a small portion of the terms observed in a large query log (e.g. > 100 million queries) are used most often, while the remaining terms are used less often individually. [6] This example of the Pareto principle (or 80-20 rule) allows search engines to employ optimization techniques such as index or database partitioning, caching and pre-fetching.
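A toy illustration of why such a skewed distribution rewards caching (the query log below is invented; the point is only that a tiny cache of head queries can serve a large share of traffic):

```python
from collections import Counter

# A toy query log: a few queries dominate, most appear only once,
# loosely mimicking the long-tail shape described above (data is invented).
query_log = (
    ["weather"] * 500 + ["news"] * 300 + ["maps"] * 150 +
    ["python tutorial"] * 40 + ["rare query %d" % i for i in range(200)]
)

counts = Counter(query_log)
total = sum(counts.values())

# Cache only the handful of most frequent queries and measure the hit rate.
cache = {q for q, _ in counts.most_common(3)}
hits = sum(c for q, c in counts.items() if q in cache)
print(f"a cache of 3 queries out of {len(counts)} answers {hits / total:.0%} of traffic")
```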
References
1. ^ Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schutze (2007), Introduction to Information Retrieval, Ch. 19
2. ^ Dawn Kawamoto and Elinor Mills (2006), AOL apologizes for release of user search data
3. ^ Amanda Spink, Dietmar Wolfram, Major B. J. Jansen, Tefko Saracevic (2001). "Searching the web: The public and their queries". Journal of the American Society for Information Science and Technology 52 (3): 226–234. doi:10.1002/1097-4571(2000)9999:9999<::AID-ASI1591>3.3.CO;2-I.
4. ^ Mark Sanderson and Janet Kohler (2004). "Analyzing geographic queries". Proceedings of the Workshop on Geographic Information (SIGIR '04).
5. ^ Jaime Teevan, Eytan Adar, Rosie Jones, Michael Potts (2005). "History repeats itself: Repeat Queries in Yahoo's query logs". Proceedings of the 29th Annual ACM Conference on Research and Development in Information Retrieval (SIGIR '06): 703-704. doi:10.1145/1148170.1148326.
6. ^ Ricardo Baeza-Yates. "Applications of Web Query Mining", Springer Berlin / Heidelberg, pp. 7-22
Advertising
Advertising is a form of communication that typically attempts to persuade potential customers to purchase or to consume more of a particular brand of product or service. Many advertisements are designed to generate increased consumption of those products and services through the creation and reinforcement of "brand image" and "brand loyalty". For these purposes, advertisements sometimes embed their persuasive message with factual information. Every major medium is used to deliver these messages, including television, radio, cinema, magazines, newspapers, video games, the Internet and billboards. Advertising is often placed by an advertising agency on behalf of a company or other organization.
History
In June 1836, the French newspaper La Presse was the first to include paid advertising in its pages, allowing it to lower its price, extend its readership and increase its profitability; the formula was soon copied by all titles. Around 1840, Volney Palmer established a predecessor to advertising agencies in Boston.[7] Around the same time, in France, Charles-Louis Havas extended the services of his news agency, Havas, to include advertisement brokerage, making it the first French group to organize. At first, agencies were brokers for advertisement space in newspapers. N. W. Ayer & Son was the first full-service agency to assume responsibility for advertising content. N.W. Ayer opened in 1869, and was located in Philadelphia.[7]
When radio stations began broadcasting in the early 1920s, their programs were not at first a vehicle for paid advertising; the first radio stations were established by radio equipment manufacturers and retailers who offered programs in order to sell more radios to consumers.
A fierce battle was fought between those seeking to commercialise the radio and people who argued that the radio spectrum should be considered a part of the commons – to be used only non-commercially and for the public good. The United Kingdom pursued a public funding model for the BBC, originally a private company but incorporated as a public body by Royal Charter in 1927. In Canada, advocates like Graham Spry were likewise able to persuade the federal government to adopt a public funding model. However, in the United States, the capitalist model prevailed with the passage of the 1934 Communications Act, which created the Federal Communications Commission.[8] To placate the socialists, the U.S. Congress did require commercial broadcasters to operate in the "public interest, convenience, and necessity".[9] Nevertheless, public radio does exist in the United States of America. In the early 1950s, the DuMont television network began the modern trend of selling advertisement time to multiple sponsors. Previously, DuMont had had trouble finding sponsors for many of its programs and compensated by selling smaller blocks of advertising time to several businesses.
Excite
Excite is an Internet portal, and as one of the "dotcoms" of the 1990s (along with Yahoo! and Netscape), was once one of the most recognized brands on the Internet.
Excite offers a variety of services, including search, web-based email, instant messaging, stock quotes, and a customizable user homepage. The content is collated from over 100 different sources.
History
Excite was founded as Architext in 1994 by Graham Spencer, Joe Kraus, Mark Van Haren, Ryan McIntyre, Ben Lutch and Martin Reinfried, who were all computer science students at Stanford University (except for Kraus, who was a political science major). In July 1994 International Data Group paid them $100,000 to develop an online service.
In January 1995, Vinod Khosla (also a former Stanford student), a partner at venture capital firm Kleiner Perkins Caufield & Byers, arranged $250,000 first round backing with $1.5 million in ten months. Geoff Yang of Institutional Venture Partners brought in an additional $1.5 million in financing. Excite was formally launched in December 1995.
In January 1996, George Bell joined Excite as its Chief Executive Officer. Excite also bought two search engines (Magellan and WebCrawler), and signed exclusive distribution agreements with Netscape, Microsoft, Apple, and other companies. On April 4, 1996, Excite went public with an initial offering of two million shares priced at $8.50 per share, and in June 1997, Intuit, maker of Quicken and TurboTax, purchased a 19% stake in Excite — a deal worth $40 million — and finalized a seven-year partnership deal.
On October 16, 1997, Excite purchased comparison shopping service company Netbot for around $300 million. At the same time Intuit announced the launch of "Excite Business & Investing". Later that year a deal with Ticketmaster to provide direct online ticketing was finalized.
On March 31, 1998, Excite reported a net loss of approximately $30.2 million, and according to its Q1 report it only had enough available capital to meet obligations through December.[1] At that time Excite's borrowings under its bank line of credit exceeded $6.1 million, compared with only $1.2 million in borrowings reported in 1997.
In December 1998, Yahoo! was in negotiations with Excite to purchase them for $5.5 billion to $6 billion. However, prompted by Kleiner Perkins, @Home Network's Chairman and CEO Thomas Jermoluk met with Excite’s Chairman and CEO George Bell on December 19, and Excite was subsequently acquired by @Home Network on January 19, 1999.
Excite@Home
Main article: @Home Network
The $6.7 billion merger of Excite and @Home became one of the largest mergers of two Internet companies ever; it combined @Home's high speed internet services and existing portal with Excite’s search engine and portal. The new portal also moved towards personalized web portal content, a concept now commonplace.
The new company became "Excite@Home", though the stock symbol and the company's name in regulatory filing records remained as "At Home Corporation" (ATHM). Six months after the merger, Tom Jermoluk stepped down as CEO of Excite@Home but remained Chairman of the Board, and Excite’s George Bell, who was the President of the Excite division of @Home after the merger, became the new CEO of combined Excite@Home.
Following the merger, the Excite division purchased iMall for about $425 million in stock, and also online greeting card company Blue Mountain Arts for 11.2 million shares of stock (approximately $430 million worth), and paid $350 million in cash. Excite also acquired photo-sharing company Webshots for $82.5 million in stock. Excite furthermore paid for sponsorship of Infiniti Indy car driver Eddie Cheever, Jr., through the 2000 and 2001 racing seasons for an undisclosed amount.
However, the merger between Excite and @Home fell disastrously short of expectations. Online advertising revenue plummeted. Cable network ISP revenue continued to grow. On September 21, 2000 George Bell announced plans to step down as CEO by March 2001. Stock value had dropped 90% during his tenure.
On April 23, 2001, Patti S. Hart, the former CEO of Telocity, joined Excite@Home as its third CEO (and @Home's fourth). In the same announcement, George Bell resigned and left the company completely. The company also reported first-quarter net loss of $61.6 million, or 15 cents per share, on revenue of $142.8 million compared with a loss of $4.6 million, or 1 cent, on revenue of $138 million in the same period the prior year.
On June 11, 2001, Excite@Home announced that it had raised $100 million in financing from Promethean Capital Management and Angelo Gordon & Co. Part of the deal was that the loan was repayable immediately if Excite@Home stock was delisted by Nasdaq. The loan, structured as a note convertible into shares of Excite, had an interest rate of zero.
By August 20, 2001, Excite@Home had replaced its auditors Ernst & Young with PricewaterhouseCoopers and received a demand for the immediate repayment of $50 million in debt from Promethean Capital Management and Angelo Gordon & Co. Furthermore, Cox Cable and Comcast announced that they would separate from Excite@Home by the first quarter of 2002.
On September 13, 2001, Excite@Home sold Blue Mountain Arts for $35 million to American Greetings - less than 5% of what it had paid less than two years earlier.
On October 1, 2001, Excite@Home filed for Chapter 11 bankruptcy protection with the U.S. Bankruptcy Court for the Northern District of California. The company's remaining 1,350 employees were laid off over the following months into the first quarter of 2002. As part of the agreement, @Home's national high-speed fiber network access would be sold back to AT&T for $307 million in cash. At Home Liquidating Trust became the successor company to Excite@Home, charged with the sale of all assets of the former company.
At the end of 2001, the Webshots assets were purchased by the company's founders for $2.4 million in cash from the Bankruptcy Court.
SAPO
SAPO (Portuguese for toad), Servidor de Apontadores Portugueses, is a brand and subsidiary company of the Portugal Telecom Group. It is a Portuguese internet service provider that started as a search engine when it was founded in 1995.

History
SAPO was created on September 4, 1995 at the University of Aveiro by seven members of the Computer Science Center of the University. The name came from the acronym of the service, S.A.P. (Servidor de Apontadores Portugueses), from which it was a short step to SAPO.
In 1997 the members of the Computer Science Center left the university and founded a company called Navegante, and SAPO became the property of that company. The portal then began to be exploited commercially.
Later, in September 1998, Saber & Lazer - Informática e Comunicação S.A. bought SAPO from Navegante. Under Saber & Lazer, SAPO launched new services: free e-mail, virtual shopping and new features for the search engine.
Still in that year, due to increasing traffic, SAPO and Telepac signed an agreement under which Telepac became SAPO's new internet service provider.
In September 1999, PT Multimédia acquired 74.9% of Saber e Lazer, and in March 2000 SAPO was assigned to PTM.com, with the objective of bringing all internet projects together in a single company. Currently the company is 100% held by PTM.com (which belongs to Portugal Telecom, after selling it in 2005).
After some improvements to infrastructure and access, the ADSL access service was finally launched in June 2002, starting an era of new content for the portal.
On March 28, 2006, SAPO XL was launched: a project for broadband content, in which the main content is videos, on-line television transmission and real-time transmission of events.

ChaCha (search engine)
ChaCha is a search engine that pays human guides to answer questions for users. This is a technique known as social searching. ChaCha was created by Scott A. Jones, inventor and entrepreneur, and Brad Bostic, the chairman of Bostech Corporation. ChaCha is based in Carmel, Indiana, a suburb of Indianapolis.
The alpha version of the search engine was launched on September 1, 2006. A beta version was introduced on November 6, 2006.[1] The service reported 20,000 guides had registered by year end.[2] ChaCha also raised $6 million in development funds, including support from Bezos Expeditions, a personal investment firm owned by Jeff Bezos, the entrepreneur behind Amazon.com.[3]
Relevance (information retrieval)
Relevance most commonly refers to topical relevance or aboutness, i.e. to what extent the topic of a result matches the topic of the query or information need. Relevance can also be interpreted more broadly, referring to generally how "good" a retrieved result is with regard to the information need. The latter definition of relevance, sometimes referred to as user relevance, encompasses topical relevance and possibly other concerns of the user such as timeliness, authority or novelty of the result.

History
The formal study of relevance began in the 20th century with the study of what would later be called bibliometrics. In the 1930s and 1940s, S. C. Bradford used the term "relevant" to characterize articles relevant to a subject (cf., Bradford's law). In the 1950s, the first information retrieval systems emerged, and researchers noted the retrieval of irrelevant articles as a significant concern. In 1958, B. C. Vickery made the concept of relevance explicit in an address at the International Conference on Scientific Information.[1]
Since 1958, information scientists have explored and debated definitions of relevance. A particular focus of the debate was the distinction between "relevance to a subject" or "topical relevance" and "user relevance".
1. ^ Mizzaro, S. (1997). Relevance: The Whole History. Journal of the American Society for Information Science. 48, 810‐832.
2. ^ F. Diaz, Autocorrelation and Regularization of Query-Based Retrieval Scores. PhD thesis, University of Massachusetts Amherst, Amherst, MA, February 2008, Chapter 3.
3. ^ W. B. Croft, “A model of cluster searching based on classification,” Information Systems, vol. 5, pp. 189–195, 1980.
4. ^ a b A. Griffiths, H. C. Luckhurst, and P. Willett, “Using interdocument similarity information in document retrieval systems,” Journal of the American Society for Information Science, vol. 37, no. 1, pp. 3–11, 1986.
5. ^ X. Liu and W. B. Croft, “Cluster-based retrieval using language models,” in SIGIR ’04: Proceedings of the 27th annual international conference on Research and development in information retrieval, (New York, NY, USA), pp. 186–193, ACM Press, 2004.
6. ^ a b E. M. Voorhees, “The cluster hypothesis revisited,” in SIGIR ’85: Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 188–196, ACM Press, 1985.
7. ^ S. Preece, A spreading activation network model for information retrieval. PhD thesis, University of Illinois, Urbana-Champaign, 1981.
8. ^ T. Qin, T.-Y. Liu, X.-D. Zhang, Z. Chen, and W.-Y. Ma, “A study of relevance propagation for web search,” in SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 408–415, ACM Press, 2005.
9. ^ A. Singhal and F. Pereira, “Document expansion for speech retrieval,” in SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 34–41, ACM Press, 1999.
10. ^ F. Diaz, “Regularizing query-based retrieval scores,” Information Retrieval, vol. 10, pp. 531–562, December 2007.
Northern Light Group
Northern Light Group, LLC is a company specializing in strategic research portals, enterprise search technology, and text analytics solutions. The company provides custom, hosted, turnkey solutions for its clients.
Enterprise search
From Wikipedia, the free encyclopedia
Enterprise search is the practice of identifying and enabling specific content across the enterprise to be indexed, searched, and displayed to authorized users.
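As a rough illustration of the "authorized users" aspect, the sketch below indexes documents together with access-control lists and filters results by the querying user's groups (the class, field names, and ACL model are all invented for illustration):

```python
from collections import defaultdict

class SecureIndex:
    """Toy enterprise index: full-text lookup filtered by document ACLs."""
    def __init__(self):
        self.postings = defaultdict(set)   # term -> set of doc ids
        self.acl = {}                      # doc id -> groups allowed to see it
        self.docs = {}                     # doc id -> original text

    def add(self, doc_id, text, allowed_groups):
        self.docs[doc_id] = text
        self.acl[doc_id] = set(allowed_groups)
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query, user_groups):
        """Return only documents that match the query AND the user may read."""
        terms = query.lower().split()
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        allowed = set(user_groups)
        return [d for d in hits if self.acl[d] & allowed]

index = SecureIndex()
index.add("hr-001", "salary bands for engineering", ["hr"])
index.add("wiki-42", "engineering onboarding guide", ["hr", "engineering"])
print(index.search("engineering", ["engineering"]))   # -> ['wiki-42'] only
```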
Major vendors
Autonomy:
IDOL Server
K2 Enterprise (formerly Verity)
Dieselpoint: Search & Navigation
Endeca: Information Access Platform
Exalead: exalead one:enterprise
Fast Search & Transfer (acquired by Microsoft):
Enterprise Search Platform (ESP)
RetrievalWare (formerly Convera)
Open Text:
Hummingbird Search Server
Livelink Search
Sinequa: Connect to Knowledge™
Sinequa CS
Vivisimo: Vivisimo Velocity
X1 Technologies: X1 Enterprise Search Suite
ZyLAB Technologies: ZyIMAGE Information Access Platform
Specialized vendors
Aduna: Enterprise search solutions based on Guided Exploration
AskMeNow: S3 - Semantic Search Solution
conceptsearching
conceptSearch
conceptClassifier
Centric Software: structured and unstructured product data (from the acquisition of Product Sight)[1]
Centric Insight
Coveo
Coveo G2B Information Access Suite
Coveo G2B for Email
Coveo G2B for CRM
Coveo G2B for Multimedia
dtSearch
Desktop/Network
Engine(SDK)
Web
Publish
Expert System S.p.A.: Cogito
Funnelback: Funnelback Search
InQuira: InQuira
ISYS Search Software:
ISYS:desktop
ISYS:web
ISYS:sdk
QUADRA - "The Question Answering Digital Research Assistant"
SoftInform: SearchInform Server
Siderean Software: Seamark Navigator
TeraText: TeraText Suite
USU AG: USU KnowledgeMiner - self-learning Enterprise Search for multiple formats and systems
Whatever sa/nv: Knowledge Plaza - "The place beyond search". The Enterprise Social Search platform
Superplatforms
IBM: OmniFind
Microsoft: SharePoint Search Services
SAP: TREX and SAP Enterprise Search
Oracle Corporation: Secure Enterprise Search 10g
USU: KnowledgeMiner
Mindbreeze: Mindbreeze Enterprise Search Platform
SAP-oriented
IXULT: HIVE
SAP: TREX and SAP Enterprise Search
Microsoft-oriented
conceptsearching:
conceptClassifier for SharePoint
Coveo: Coveo G2B Information Access Suite
dtSearch: dtSearch
Innerprise: ES.NET 2004
Forward IT: Forward Search
Mondosoft: MondoSearch
Microsoft: Microsoft Search Server
Search appliances
Google: Google Search Appliance
Queplix: Queplix Universal Search Appliance
Thunderstone: Thunderstone Search Appliance
Hosted services
Adeptic: AESS Adeptic Enterprise Search Suite
Blossom Software: Blossom Enterprise Search
VisualSciences: Search
Aspect Search: Hosted Search Service
Lower-cost, web-oriented
Autonomy:Ultraseek
IBM: OmniFind Yahoo! Edition
Open source
Aduna: Open source enterprise search solutions based on Guided Exploration
Apache Lucene
Flax: Open source, based on Xapian.
Nutch web crawler based on Apache Lucene
Solr Enterprise Search Server based on Apache Lucene
See also
Enterprise information access
Knowledge management
Text mining

Further reading
"Making Search Work - Implementing Web, Intranet & Enterprise search" Martin White (2007) ISBN 978-1-85604-502-2
External links
SharePoint Magazine
Butler Group: Enterprise Search and Retrieval (Oct 2006)
CMS Watch: The 2008 Enterprise Search Report (Nov 2007)
Gartner: Magic Quadrant for Information Access Technology, 2007 (before 2005 titled "Enterprise Search")
Goebel Group: Enterprise Desktop Search Tool Matrix
CMSWatch, Enterprise Search Vendor List
References
^ Kenneth Wong, "Intelligence Beyond the Known Universe", Cadalyst, April 2007
Quaero
Quaero (Latin: I seek) is a European research and development program which has the goal of developing multimedia and multilingual indexing and management tools for professional and general public applications (such as search engines)[1]. The European Commission approved the aid granted by France on 11 March 2008.[2]
This program is supported by the OSEO. It is a French project with the participation of several German partners. The consortium is led by Thomson. Other companies involved in the consortium are: France Télécom, Exalead, Bertin Technologies, Jouve, Grass Valley GmbH, Vecsys, LTU Technologies, Siemens A.G. and Synapse Développement. Many public research institutes are also involved, including LIMSI-CNRS, INRIA, IRCAM, RWTH Aachen, University of Karlsruhe, IRIT, Clips Imag, GET, INRA; as well as other public organisations such as INA, BNF, LIPN and DGA.
According to the AII press release, the main targeted applications can be divided into three broad classes: multimedia indexing and search tools for professional and general public use, including mobile environments; professional solutions for production, post-production, management and distribution of multimedia documents; and facilitation of access to cultural heritage such as audiovisual archives and digital libraries.
The search engine
The search engine application has been the focus of many news articles. As a consequence, Quaero is often cited as a European competitor to Google, as well as to other commercial search engines such as Yahoo, MSN and Ask.com.
Quaero is not intended to be a text-based search engine but is mainly meant for multimedia search. The search engine will use techniques for recognizing, transcribing, indexing, and automatic translation of audiovisual documents and it will operate in several languages. There is also mention of automatic recognition and indexing of images.
According to an article in The Economist[4], Quaero will allow users to search using a "query image", not just a group of keywords. In a process known as "image mining", software that recognises shapes and colours will be used to look for and retrieve still images and video clips that contain images similar to the query image. (The software is supplied by LTU Technologies.) A technique called "keyword propagation" will be used so that when Quaero finds a descriptionless image which contains elements of or completely matches a properly labelled image, it will append the description from the labelled image to the unlabelled one. This will ensure faster searches and a definite enrichment of the web, also linguistically, as the primary interface and query terms were supposed to be in French and German.
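A highly simplified sketch of the keyword-propagation idea described above (this is not Quaero's or LTU Technologies' actual method; the function, threshold, similarity measure, and toy data are all assumptions):

```python
def propagate_keywords(labelled, unlabelled, similarity, threshold=0.9):
    """Assign each unlabelled image the description of its closest labelled match.

    labelled:   {image_id: (feature_vector, description)}
    unlabelled: {image_id: feature_vector}
    similarity: any function returning a score in [0, 1] for two feature vectors
    """
    propagated = {}
    for image_id, features in unlabelled.items():
        best_description, best_score = None, 0.0
        for _, (ref_features, description) in labelled.items():
            score = similarity(features, ref_features)
            if score > best_score:
                best_description, best_score = description, score
        if best_score >= threshold:
            propagated[image_id] = best_description
    return propagated

# Toy demo with made-up 3-element "colour histogram" features and an overlap measure
def overlap(a, b):
    return sum(min(x, y) for x, y in zip(a, b)) / max(sum(a), sum(b))

labelled = {"img1": ((0.7, 0.2, 0.1), "Eiffel Tower at night")}
unlabelled = {"img2": (0.68, 0.22, 0.10)}
print(propagate_keywords(labelled, unlabelled, overlap))
```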
While France will be researching image searching, Germany was supposed to be advancing voice clip and sound media searches, with the intention of transcribing their content to text and translating it into other languages, before it pulled out of the project. This would also allow for "query sound clips", following the paradigm of the "query image" mentioned above.