
different users. An increasingly popular use of filtering agents among parents and educators is to block web sites that contain inappropriate or indecent material. Examples include Cyberpatrol, Cybersitter, and NetNanny. Whereas these are software programs that individual users download or install, N2H2 provides server-based solutions, in which filtering is applied to all users connected to the server. The same filtering scheme can be used to remove only the unwanted portion of a web document. In the WebFilter implementation developed by Axel Boldt, the filter is a proxy server that retrieves a document and removes prescribed features such as advertising banners or large graphics before presenting it to the user.

Figure 7.10 Various functions of a filtering agent.
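The proxy-style filtering described above, in which prescribed features are removed from a retrieved document, can be illustrated with a minimal sketch. This is not the WebFilter code: the rule names AD_HINTS and MAX_PIXELS, the class StripFilter, and the function filter_document are hypothetical, and the sketch covers only the rewriting step, not the retrieval and forwarding that a real proxy server performs.

```python
from html.parser import HTMLParser

# Hypothetical rules, chosen for illustration only.
AD_HINTS = ("/ads/", "banner")   # substrings that mark an image as an ad
MAX_PIXELS = 400                 # drop images wider or taller than this


class StripFilter(HTMLParser):
    """Re-emits a document, omitting <img> tags that match the rules."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []

    def _unwanted(self, attrs):
        attrs = dict(attrs)
        src = (attrs.get("src") or "").lower()
        if any(hint in src for hint in AD_HINTS):
            return True
        for key in ("width", "height"):
            value = attrs.get(key) or ""
            if value.isdigit() and int(value) > MAX_PIXELS:
                return True
        return False

    def handle_starttag(self, tag, attrs):
        if tag == "img" and self._unwanted(attrs):
            return  # drop the unwanted image
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags are emitted once, from their original text.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def filter_document(html):
    """Return the document with unwanted images removed."""
    parser = StripFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Calling filter_document on a page that holds a banner image and a small logo would keep the logo and the surrounding text and drop the banner. Matching on substrings of the image address is a deliberately crude rule; a practical filter would need rules maintained by the user or the server operator.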

Finally, the filter can be used as an agent that selects (that is, filters) documents from all incoming messages. This use of information filtering is gaining popularity because of the tremendous growth in junk e-mail and spamming on UseNet newsgroups. For example, suppose that you have received 50 e-mail messages. Rather than opening and reading them one by one, you can use a filtering program that assigns a value to each message based on your selection profile. A message from a known advertiser gets a score of zero, and a message dealing with your favorite subject gets a higher score. The result is then displayed on your screen so that you can decide which messages to read and whether to respond. InfoScan, a filtering program, displays the result on a radar screen (see fig. 7.11), where only five of the 50 messages are selected as relevant; the ones closest to the center of the radar screen have the highest scores.

Figure 7.11 InfoScan's radar screen presents the result of filtering 50 documents.
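The scoring step described above, in which each message receives a value based on a selection profile, can be sketched as follows. How InfoScan actually computes its scores is not stated here, so the keyword weights, the Profile class, and the functions score and rank are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    # Hypothetical selection profile: keyword weights and blocked senders.
    interests: dict = field(default_factory=dict)
    blocked_senders: set = field(default_factory=set)


def score(message, profile):
    """Return 0 for a blocked sender, else the sum of matched keyword weights."""
    if message["sender"].lower() in profile.blocked_senders:
        return 0
    words = set(message["subject"].lower().split())
    return sum(weight for keyword, weight in profile.interests.items()
               if keyword in words)


def rank(messages, profile, top=5):
    """Return up to `top` (score, message) pairs with a positive score, best first."""
    scored = [(score(m, profile), m) for m in messages]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return relevant[:top]
```

With a profile that weights "economics" at 3 and "internet" at 2, a message titled "Internet economics seminar" would score 5, while the same subject line from a blocked advertiser would score zero and be left out of the ranked list. The highest-scoring messages correspond to those nearest the center of the radar screen.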

An interesting application of this filtering agent is gaining support as a way to counter spamming on the UseNet. A UseNet newsgroup can be either moderated or unmoderated. A moderated group has one or more moderators who screen all messages before forwarding them to the UseNet. The majority of newsgroups are unmoderated for several reasons: UseNet users prefer unfettered, equal participation; unpaid moderators have to spend time and effort to screen messages; and messages may be delayed unnecessarily. However, because of the increasing level of abuse in many newsgroups, some type of moderation will be needed for most newsgroups in the near future. A hybrid solution is to use an intelligent software agent. This "bot moderation" or "robomoderation" screens messages, rejecting those that contain phrases such as "MAKE EASY MONEY" or that are cross-posted to many newsgroups. The robomoderator also handles notification, acceptance, and forwarding automatically, reducing the workload of human moderators. For example, the Secure Team-based UseNet Moderation Program (STUMP), a freely available program (see "Online Resources" at the end of this chapter), not only saves time needed for moderation but also allows messages to be archived as web pages.
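The two screening rules mentioned above, rejecting a banned phrase and rejecting heavy cross-posting, can be sketched in a few lines. This is not how STUMP is implemented; the function moderate, the list BANNED_PHRASES, and the cross-post limit MAX_NEWSGROUPS are hypothetical, and the limit of five groups is an arbitrary assumption.

```python
from email import message_from_string

BANNED_PHRASES = ("MAKE EASY MONEY",)  # example phrase taken from the text
MAX_NEWSGROUPS = 5                     # hypothetical cross-post limit


def moderate(raw_article):
    """Return ("reject", reason) or ("accept", "") for one raw article."""
    article = message_from_string(raw_article)

    # Rule 1: count the groups listed in the Newsgroups header.
    header = article.get("Newsgroups", "")
    groups = [g.strip() for g in header.split(",") if g.strip()]
    if len(groups) > MAX_NEWSGROUPS:
        return ("reject", f"cross-posted to {len(groups)} newsgroups")

    # Rule 2: look for banned phrases in the subject and plain-text body.
    body = "" if article.is_multipart() else article.get_payload()
    text = (article.get("Subject", "") + "\n" + body).upper()
    for phrase in BANNED_PHRASES:
        if phrase in text:
            return ("reject", f"contains banned phrase: {phrase}")

    return ("accept", "")
```

An article that passes both rules would be forwarded, and a rejected one would trigger a notification to its author; those steps, along with any review by human moderators, are left out of the sketch.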

Although user-oriented filtering agents are acquiring more diverse uses, a middle ground in terms of network efficiency may be to use intermediaries, which filter information in the middle of the acquisition process. If a large number of such intermediaries exist, consumers are also guaranteed a choice. One thing to note, however, is that intermediaries increasingly rely on advertising, which may cause consumers to doubt the objectivity of their search results. In some economic activities, independent third-party status is clearly important, and information search is one of them. An element of trust and neutrality is necessary so that filtering is not seen as the result of censorship or blatant advertising. Therefore, rather than relying on advertising, search intermediaries may benefit from adopting micropayment methods by which consumers pay a small amount, say a penny, for each search, and intermediaries guarantee full and unbiased access to their databases.

An efficient search mechanism is critical in guaranteeing seller competitiveness and consumer welfare. To make searches efficient, sellers must be willing to offer the maximum amount of information about their products, the selection process must be based on clearly defined and objective criteria, and consumers must be allowed full access to this information. Online content is growing, and information filtering programs are beginning to address the problem of information overload, pointing to a more efficient market for searches. But a technical problem remains: setting a standard for describing a digital document, which would facilitate the task of summarizing documents and compiling search databases. Also, search services, being the first significant commercial projects on the Internet, increasingly depend on advertising revenues to provide a service that is essential for the electronic marketplace to be efficient. This section examines these issues and also briefly compares consumer searches with advertising (two topics discussed in this chapter and the last) to examine whether one channel of information or the other may be more desirable for electronic commerce.

Cataloging millions of web documents is significantly different from compiling a phone directory or an economic database because of the diversity of web documents. They are generally in multimedia format. That is, a document contains not only text, which can be summarized by abstracting a few keywords, but also graphics, sound files, and animated images. A suitable standard for describing such complex files is a prerequisite to building an efficient search database.

Geographic information systems managers are familiar with metadata standards, by which all geographic data is summarized and described. Metadata is data about data, and it accompanies all distributed geographic data to simplify importing and exporting it. Metainformation, in the same manner, is defined as information about information. Metainformation describes an information product: its variables, size, quality, author, and other characteristics. Metainformation itself can be an information product. In fact, in search markets what is exchanged is not information products but metainformation. To illustrate the concept of metainformation, take the case of data and metadata. Suppose you have census data for the city of Austin, Texas. The data contains the number of households in each census tract, sorted by age group, income group, home ownership status, and marital status. The data set is a spreadsheet with columns of variables and rows of census tracts. The column headings are written as AGE01, AGE02, and so on for, say, 10 age groups; and INC01, INC02, and so on for, say, 20 income groups, and so


Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.

All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Read EarthWeb's privacy statement.
