June 15 - 17, 2003   
Wequassett Inn, Cape Cod   
Chatham, Massachusetts   
NSF/JISC Workshop
 
Background
 
   
   

Ubiquitous Knowledge Environments - The Cyberinfrastructure Information Ether

The January 2003 NSF report “Revolutionizing Science and Engineering Through Cyberinfrastructure” [1] led off with the observation that “multiple accelerating trends are converging and crossing thresholds in ways that show extraordinary promise … in how we create, disseminate, and preserve scientific and engineering knowledge.” That study concluded that the “National Science Foundation should establish and lead a large-scale, interagency, and internationally coordinated Advanced Cyberinfrastructure Program (ACP) to create, deploy, and apply cyberinfrastructure in ways that radically empower all scientific and engineering research and allied education.” It envisioned a program “to build more ubiquitous, comprehensive digital environments that become interactive and functionally complete for research communities in terms of people, data, information, tools, and instruments and that operate at unprecedented levels of computational, storage, and data transfer capacity.”

Were it not that the Panel’s charge limited it to science and engineering, this observation would no doubt have been extended to knowledge resulting from all scholarly and creative pursuits. Digital libraries were envisioned a decade ago as the answer to networked knowledge environments, and much progress has been made towards that goal, but the pursuit of the goal has also extended our view beyond that which could have been anticipated a decade ago.

What began as an effort to create “digital libraries” has transformed into something much more dynamic than was originally envisioned. Certainly, the idea of curated, network-accessible repositories was (and remains) a fundamental need of scholarly communication, as was the notion that these repositories should support information in multiple formats, representations, and media. But not until serious efforts were made to build such resources, particularly for non-print digital content (audio, image, and video, for example), did it become apparent that this venture would stretch the bounds of computer and information science and, indeed, require the articulated confluence of multiple computer and information science disciplines. It now appears that these emerging tools and capabilities hold the potential to transform the conduct of disciplinary research itself, and even to foster the creation of new areas of investigation at the interstices of existing disciplines.

Much of the potential for new digital content is only beginning to be understood. See Tim Rowe’s work on paleoradiology, recently featured on NPR [2], for an excellent example of an emerging source of novel data of substantial scientific interest. An appropriate network-based information infrastructure would make access to and utilization of such information sources routine and intuitive. However, the science, the knowledge that results from projects such as these, is not fundamentally different because of the digital medium chosen to express it. Knowledge communicated or expressed in digital form is, at its core, the same as that expressed by other means.

Early efforts, largely focused on building digital replicas of print-based materials, rapidly morphed into a new mode of communication among scientists. This mode enables rapid dissemination of new findings and discussion and debate around them, leading to major reductions in the time needed to reach fully vetted results, and it offers a new form of scholarly communication infrastructure that holds the promise of enabling fuller exploitation of knowledge. Whereas traditional means of scholarly communication led to highly compartmentalized knowledge along axes of discipline, geography, and individual, the networked knowledge paradigm holds out the potential for exhaustive exploitation of new and existing knowledge, across all previous barriers. The breakdown of disciplinary barriers creates a multiplicative effect unrealizable with traditional approaches. Our present vocabulary falls short of capturing the essence of this phenomenon. Traditional characterizations feel passive in light of what is emerging, and traditional infrastructures such as scholarly journals and research libraries fall short of delivering on this potential. The terms “repository” and “library” capture only a small portion of what is emerging.

But developing this new scholarly communication infrastructure (embedded in the larger cyberinfrastructure advocated in the referenced report) is no mere software development activity. Experience has already shown that it engages the best minds in computer and information science, in fields including network and systems design, human-computer interaction, artificial intelligence, information retrieval, information organization, machine translation, database systems, and complexity theory. The real challenge is to build systems supporting scholarly communication that yield new capabilities and capacities so effectively and efficiently that they are intuitive and transparent in their operation. Indeed, a serious measure of success may be how simple the resulting infrastructure appears to the casual (and serious) user. This is what we refer to here as the Ubiquitous Knowledge Environment, or the “information ether.” Underlying this concept is a vision that “information retrieval” could become a dated notion, a vestige of an era in which information was difficult to locate, and even more difficult to acquire.

Realizing such a vision requires not only advances derived from research but also international coordination, as noted in the Cyberinfrastructure report. The European Commission has also taken note of this opportunity and the research needed to realize it. Its DELOS-initiated “6th Framework Programme” [3] calls for a “new infrastructure and environment … created by the integration and use of computing, communications, and digital content on a global scale.” NSF and the DELOS Network of Excellence on Digital Libraries have organized six joint working groups “whose aim is to define a research agenda … for cooperation between the EU and US researchers” [4]. Working groups include:

  • Spoken-Word Digital Audio Collections
  • Information Extraction from Digital Libraries
  • Personalization and Recommender Systems in Digital Libraries
  • ePhilology: Emerging Language Technologies and Rediscovery of the Past
  • Digital Imagery for Significant Cultural and Historical Materials
  • Preservation and Archiving

This proposal to the National Science Foundation suggests that a serious investigation and development of “Cyberinfrastructure” is, indeed, both timely and strategically important to the United States, and that the Panel’s recognition of the need for a new infrastructure supporting scholarly communication and collaboration is both necessary and well timed. We propose to bring together a group of recognized national and international scholars and researchers to propose a plan for the long-term research necessary to realize such a scholarly communication infrastructure, building on the Cyberinfrastructure results and complementary results of a related study in which NSF was encouraged to initiate a program of research into the science of information management.

The Science of Information Management

The November 2002 report “Technology Requirements for Information Management” [5] documents the findings of an NSF / DARPA / NASA study panel conducted “in support of application domains of particular government interest, including digital libraries, mission operations, and scientific research.” Of particular interest to the federal sponsors was the pursuit of research and development to support applications typified as “highly distributed with very large amounts of data and a high degree of heterogeneity of sources, data, and users.”

What is becoming increasingly clear from these studies is the universal recognition of the need for a focused research effort, beyond the development of faster hardware and greater bandwidth, to reap the benefit of Moore’s Law in the use of information. Digital library research began this exploration nearly ten years ago and, while notably successful in what it achieved, may be more noteworthy for clarifying what remains beyond reach.

Digital libraries are not the total solution to the problems identified, nor are they the sole path to the opportunities envisioned. But they are part of the solution, just as computers, networks, and operating systems are part of the solution. And they are on the path, just as the other components are. The challenge, however, is to understand what else may be on that path, particularly at a conceptual or theoretical level. This is what the RIACS report identifies as the science of information management. “A science of information management would deal with the underlying principles of information management and how humans deal with it. By contrast, information technology provides the tools and systems to achieve the desired functions and goals.” [6]

Some of the defining characteristics of the problems to which a science of information management applies are:

  • Heterogeneity of information systems and sources
  • Heterogeneity of users and information providers
  • Seemingly unbounded scale of data and users
  • Need for preservation and records management for an indefinite future and into indeterminate future environments
  • Evolving legal and social frameworks
  • Varying human and organizational capabilities and behavior
  • Curated information
  • Real-time operation
  • Collaborative, synchronous processes
  • Both proactive and reactive uses
  • Incomplete and uncertain information
  • Enormous range of granularity of data
  • Need for comprehensive metadata
  • Increasing emphasis on the semantics of and correlative relationships among data

Interoperation is considered a key and well-known requirement, and one for which the needs are better known than the solutions. Some characteristics of interoperation for common applications such as collaboration, visualization, and decision-making include interoperation:

  • of sources, services, and ontologies
  • with information about possible futures
  • in spite of incompleteness, inconsistency, and uncertainty
  • that supports authentication, privacy, and security.

A ubiquitous information infrastructure is envisioned by the studies cited. The successful realization of such an infrastructure will be supported by a science of information management that will yield new generations of knowledge environments that evolve in pace with advances in computing and communications.

This proposal initiates the detailed development of such a program with an invited workshop of national and international researchers and scholars, to be held June 15-17, 2003, at Wequassett Inn in Chatham, Massachusetts. Approximately 40 participants are anticipated for the 2.5-day workshop. A report will be produced for the National Science Foundation following the workshop that details the participants’ recommendations and rationales. While this proposal is not specifically proposing it, a follow-up conference open to a hundred or more participants is anticipated, where the report would be presented, critiqued, and vetted, with the intent of providing NSF the detailed advice and counsel of the scientific community regarding the development of a research program to develop a science of information management.


1. Atkins, Daniel E., et al., “Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue Ribbon Advisory Panel on Cyberinfrastructure,” January 2003, available online at http://www.communitytechnology.org/nsf_ci_report/
2. See NPR’s story on Tim Rowe at http://www.npr.org/display_pages/features/feature_1183920.html
3. Schauble, Peter and Alan F. Smeaton, “An International Research Agenda for Digital Libraries,” October 1998, available online at http://www.iei.pi.cnr.it/DELOS/NSF/Brussrep.htm
4. DELOS/NSF Joint Working Groups, http://delos-noe.iei.pi.cnr.it/activities/internationalforum/Joint-WGs/joint-wgs.html
5. Graves, Sara, Craig A. Knoblock, and Larry Lannom, “Technology Requirements for Information Management,” RIACS Technical Report 2.07, November 2002, available online at http://www.riacs.edu/trs/
6. RIACS Technical Report 2.07; see previous footnote.