Searching the World Wide Web: The Need for a Revolution
A recent review of a classic study on the origins of the printing
press brought this sage commentary which might seem relevant to the challenges
of the World Wide Web: “Ancient and Medieval scribes had faced tremendous
difficulties in preserving the knowledge that they already possessed, which,
despite their best efforts, inevitably grew more corrupted and fragmented over
time. With the establishment of printing presses, accumulation of knowledge was
for the first time possible. Rather than spending most of their energies
searching for scattered manuscripts and copying them, scholars could now focus
their efforts on revision of these texts and the gathering of new data”
(Duffy). Anyone using the Web probably
feels like the ancient scholar searching for any scrap of information, with
one exception: the Web searcher turns up thousands of links for every search
and then has to sort through the results of what must appear to be the most
random of searching processes.
At first glance the ability to search the Web must seem
reasonably straightforward. As one
cyberculture legal specialist suggests, “The genius of the World Wide Web lies
in its formatting language, called hypertext markup language (HTML), which
permits users to move rapidly from one document to another. Each HTML document
on the Web has a unique address that corresponds to the computer on which it is
stored. If users know the address of a document that they wish to see, they can
then access it by typing its address into their Web browser. Users can also
move from document to document by "hyperlinking" to other Web
material. Typically a hyperlink takes the form of highlighted text describing
the contents of another document. If the viewer of one document wants to view
the contents of the document described by the link, then she simply
"clicks" on the hyperlink, which "transports" the user to
the address of the desired document. Because of the tremendous amount of
material published in HTML format, the Web now provides educators, students,
professionals, entrepreneurs, and ordinary citizens with a powerful tool for
the acquisition and dissemination of information. The Web promises to become
the public library of the twenty-first century and threatens to make the
shopping mall a thing of the past. More than any comparable communications
innovation, the Web epitomizes the Information Age” (“Recent Developments”). However, this description concerns a
single document, assuming you have found or started with one, and the tracking
from it to others. When you need to
search through the entire Web, now encompassing billions of documents, the task
becomes more daunting.
There is no alchemy involved in World Wide Web searching. There are category search engines (“These
search engines allocate site entries to one of a set of predefined categories
after a review by a human being, and thereby create a growing and structured
database of manually reviewed sites”), general search engines (each of which
“automatically scans the net for any site it can find”), and specialized search
engines (providing “access to specific types of information, such as
newsgroups, legal information, research information, and other specific
categories of data.”) (“Types of Search Engines”). In fact, no search engine covers the entire Web; by
one recent estimate, the best any search engine does is to cover 56 percent of
the Web, something over a billion pages (Sullivan). Being able to search a billion pages should seem like a godsend
to most, but uncertainty about what any given search will turn up, and doubt
about whether the best sources have been found, must remain a nagging
concern.
That searching the Web is difficult can be seen in
new efforts to map or visualize cyberspace,
such as the Atlas of Cyberspace (http://www.cybergeography.org/atlas/atlas.html). It describes itself as “an atlas of maps and
graphic representations of the geographies of the new electronic territories of
the Internet, the World-Wide Web and other emerging Cyberspaces. These maps of
Cyberspaces - cybermaps - help us
visualize and comprehend the new digital landscapes beyond our computer screen,
in the wires of the global communications networks and vast online information
resources. The cybermaps, like maps of the real world, help us navigate the new
information landscapes, as well being objects of aesthetic interest. They have
been created by 'cyber-explorers' of many different disciplines, and from all
corners of the world.” This is a
different kind of atlas than one normally sees, in that visionary, conceptual,
and imaginative images are placed alongside more realistic
representations.
At the least, archivists and records
managers need to develop better searching tools for use on
the World Wide Web. This could come
through the development of new kinds of Web clearinghouses or through
services providing advice from expert records
professionals. As some have argued
persuasively, human agents play roles similar to, and often more critical
than, those of intelligent software agents (Zick). But
the matter is more complicated than such simple comparisons suggest. Archivists and records managers need to
consider a whole host of new and different actions when dealing with the
World Wide Web.