Suggestions for Student Research Projects
The project ideas and descriptions below are grouped into five
major areas. In many cases the work in one area may be combined with the
work in another, so some of the boundaries are artificial. The areas include:
- Augmentation
- Document Processing
- Patterns
- Standards
- Virtualization
Augmentation
Augmentation is the study of how the computer may be used
to augment or supplement human endeavors. Specifically, it examines how
human skill repertoires may be moved to the computer system. In this process,
the natural human skill may be affected in several ways. The natural skill
may atrophy, and that atrophy may in turn affect other skills, with
undesirable side effects. Ideally, the movement of basic
tasks to the computer will free the human to focus on higher level skills.
Augmentation is closely associated with the writings of Douglas Engelbart.
Engelbart's work on augmentation was closely associated with information
and documents, and so are the ideas proposed below. Keep in mind, however,
that augmentation is in no way restricted to documents
and document artifacts.
- Conduct an experiment to measure the impact on human skills
of moving some portion of a skill repertoire to an artifact, e.g. what
is the impact on the human ability to calculate of having and making regular
use of a calculator?
- Develop a model of some subprocess of information processing, e.g. commenting,
showing at an NGOMS or KAT level the activities required to engage in the process.
Then show how that process may be allocated between human and artifact to minimize
the cognitive load on the human. Finally, train the human in the modified/augmented
process and measure the impact of the augmentation in comparison to the
traditional mode of processing.
- Develop a set of agents in accord with the Spring classification of
agents as executive, collaborative, communications, service, etc., and
measure user reaction to the use of these agents to test the hypothesis
that higher level agents will meet more resistance than lower level agents.
- Develop and test a set of mechanisms that may provide persona characteristics
for agents, and measure whether the manifested persona has the predicted impact
on user acceptance of and trust in a given agent.
- Refine the functional and structural agent architecture proposed by
Spring, and see if it bears out as an aid to agent development for CASCADE.
- Test the relative merits of augmenting human intellectual activity
by the following mechanisms:
- provide raw information (data tables)
- provide processed information summatively (statistical processing of
the tables)
- provide processed information visually (visualization of the data tables)
- provide the results of action on information (simulation or expert
systems approaches based on the data tables and rules)
Assess which of the mechanisms is most productive for humans:
- in the short term and long term use of the system
- for trained and untrained users
- for casual users and expert users
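The contrast among these mechanisms can be sketched on a toy data set. The table, the rule, and the threshold below are invented for illustration, and the visual mechanism is omitted since it would require a plotting library:

```python
from statistics import mean, stdev

# Hypothetical data table: monthly figures for two regions.
table = {
    "east": [12, 15, 11, 14, 18, 16],
    "west": [22, 9, 30, 8, 25, 10],
}

def raw(table):
    """Mechanism 1: present the raw data table unchanged."""
    return table

def summative(table):
    """Mechanism 2: statistical summary of each row of the table."""
    return {k: {"mean": mean(v), "stdev": stdev(v)} for k, v in table.items()}

def rule_based(table):
    """Mechanism 4: act on the data with a simple rule, returning advice
    rather than numbers (a stand-in for simulation/expert-system output)."""
    advice = {}
    for k, v in table.items():
        s = summative({k: v})[k]
        advice[k] = "volatile: investigate" if s["stdev"] > s["mean"] / 2 else "stable"
    return advice
```

The experiment would present the output of each mechanism to different subject groups and compare task performance across the short/long term and trained/untrained dimensions listed above.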
- Develop a specific agent for CASCADE which provides some form of augmentation.
A few of the "agents" that have been discussed for different
versions of CASCADE include:
- For Standards CASCADE
- an agent to manage the balloting of the standard. The agent would send
out the ballot, send out reminders, and provide interim and final balloting
results.
- an agent to watch comments being posted to a document and inform people
of the comments that they are most likely to want to look at.
- For Education CASCADE
- an agent to help instructors translate lecture notes into student reading
notes -- this agent would have to be able to parse notes at some level
and let the instructor know where more information is needed. This agent
might work with the agent described below to gather data about students'
explicit and implicit reactions to a lecture and identify places in the
lecture where elaboration, clarification, examples, etc. might improve
the lecture.
- an agent to watch questions being asked by students so as to identify
answers given to previous questions that might satisfy the information
need of the student. More specifically, the agent might keep a table of
stemmed and stopped question tokens and match an asked question against
this table. If the question matches an entry in the table to some criteria,
the answer given to the table question would be presented to the student.
The match between the asked and the table question might be rated higher
or lower based on other variables such as the lecture that caused the question
to be asked, the history of the student, the time in the term, etc. Questions
not answered by the system would be passed to a human and the stemmed question
and the answer provided by the human would become an additional entry
in the table. The agent might also suggest some modifications to the answer
to assure that personal information is removed and points are made as clearly
as possible. If questions and answers came in series, the agent would have
to pay attention to this. Finally, questions for which matched answers were
not appropriate might be used to add negative weighting terms to the stemmed
question for which the inappropriate answer was uncovered.
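The matching behavior described above can be sketched roughly as follows. The stop list, the crude suffix-stripping stemmer, and the Jaccard overlap threshold are all placeholder choices, not the mechanisms the project would actually settle on:

```python
# Sketch of the question-matching agent: stem and stop an incoming
# question, compare its token set against previously answered questions,
# and either return the stored answer or escalate to a human.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "what", "how", "do", "does", "in"}

def stem(token):
    # Crude suffix stripping as a stand-in for a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def tokens(question):
    words = question.lower().replace("?", "").split()
    return {stem(w) for w in words if w not in STOPWORDS}

class QuestionTable:
    def __init__(self, threshold=0.5):
        self.entries = []           # list of (token_set, answer)
        self.threshold = threshold  # minimum overlap needed to reuse an answer

    def ask(self, question, human=None):
        q = tokens(question)
        best, score = None, 0.0
        for toks, answer in self.entries:
            overlap = len(q & toks) / len(q | toks) if q | toks else 0.0
            if overlap > score:
                best, score = answer, overlap
        if best is not None and score >= self.threshold:
            return best
        # No adequate match: pass the question to a human and store the pair.
        answer = human(question) if human else None
        if answer is not None:
            self.entries.append((q, answer))
        return answer
```

The extra variables mentioned above (originating lecture, student history, time in the term, negative weighting terms) would enter as adjustments to the overlap score.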
- For Journal CASCADE
- an agent to coordinate the reviews of an article, selecting possible
reviewers for the editor to select from, reminding reviewers about their
commitments, and keeping the editor abreast of the progress toward review.
The agent might compose draft letters for the editor to use in sending
out an acceptance or rejection letter.
- an agent to coordinate preparation of a journal article by a group
of authors. This might indeed be several interrelated agents -- agents
to maintain bibliographic references and to produce articles in different
forms and with different citation formats; agents to do web crawling looking
for information pertinent to the subject of the article; agents to search
through raw data looking for methods of analysis that might highlight the
findings in the data; etc.
Document Processing
This is perhaps the single biggest area of potential research. The projects
in this area include everything from trying to define fundamental terms
and reengineer processes, to building prototype systems, to assessing
the utility of prototype systems.
- Design and conduct a series of surveys and other data gathering experiments
focused on developing a model of how users see documents. The goal of this
work is to provide system designers with better data about how users see
and manipulate documents. For example, Spring conducted a series of informal
surveys in the late 1980s that attempted to define the "prototypical"
document. Similar studies might be done today. Other studies might include
analysis of the prototypical elements of a document -- these might be used
as input to the SGML design process, etc.
- What do users see as the unit processes that are applied to documents?
Spring suggests that they are subprocesses of creation, dissemination,
storage and retrieval. Do these hold up today? How do electronic documents
compare with paper documents with respect to these processes?
- Conduct a study of any category of documents over a span of 20 years.
Ask what is changing about them and what is staying the same -- e.g. increase
in use of graphics, increase in number of references, increase in number
of authors, increase/decrease in average length, etc. For those aspects
that are found to be changing, suggest how document processing systems
might be modified to support them.
- Structured electronic documents have great potential to add new processing
capability to document management. At the same time, the specification
of a structured electronic document might be too general to be of much
use, or too specific to allow appropriate flexibility. Conduct an analysis
of existing DTDs to assess the nature of the structure imposed by each
and the impact it is likely to have on the automation of processing for
those documents.
- Prototype and test a system for DTD development in a corporate setting.
How well does the system enable designers to gather and incorporate user
data that makes for a usable (parser processing) and useful (business goal
attainment) DTD?
- Prototype and test a hypertext system developed in accord with the
theoretical specifications of one of the proposed standard models for hypertext
systems, e.g. Dexter, Amsterdam, etc.
- Carry forward the work of Randy Trigg on link typology by proposing an
extended hypertext link typology that includes link directionality, link
classes and subclasses. The extended typology might include link attributes
like creation time, registration, traversals, etc.
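One way the extended typology might be represented is as a link record carrying those attributes. The field names, classes, and traversal semantics below are illustrative only:

```python
from dataclasses import dataclass, field
import time

# Sketch of an extended hypertext link record carrying the typology
# attributes mentioned above: directionality, class/subclass, creation
# time, registration, and traversal count.
@dataclass
class Link:
    source: str
    target: str
    link_class: str = "reference"   # e.g. reference, annotation, revision
    subclass: str = ""              # finer typing within the class
    bidirectional: bool = False
    created: float = field(default_factory=time.time)
    registered: bool = False        # known to a central link base?
    traversals: int = 0

    def traverse(self, from_node):
        """Follow the link, honoring directionality and counting the use."""
        if from_node == self.source:
            self.traversals += 1
            return self.target
        if self.bidirectional and from_node == self.target:
            self.traversals += 1
            return self.source
        raise ValueError(f"link not traversable from {from_node}")
```

Recording traversals on the link itself is one design choice; a system might instead keep usage data in a separate link base so that links remain immutable.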
- Spring has suggested that documents have historically been passive
objects but will, with time, become active. Develop and implement a
model for active documents and conduct a proof of concept experiment that
shows how it might be used in applications such as direct e-mail marketing,
voting and balloting, etc.
- Develop a system for corporate policy and procedure document development
based on a specific DTD. Demonstrate the impact of such a system on the
efficiency of the policy development process and the quality of products, i.e.
the policies and procedures, in terms of such things as creation time,
update time, correctness, frequency of use, ease of finding information,
etc.
- Develop an approach to navigating a hypertext space. Demonstrate the
strengths and weaknesses of the system in contrast with existing systems.
(This project might well be closely tied to the infospace project.)
- Develop a system that does some level of inductive structuring and
formatting of a content stream as it is processed.
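A minimal sketch of such inductive structuring, assuming plain text input and purely surface-level heuristics (a real system would refine or learn its rules):

```python
# Sketch of inductive structuring: tag each line of a plain content
# stream as a heading, list item, or paragraph using surface heuristics.
# The specific rules below are placeholders for illustration.
def classify_line(line):
    s = line.strip()
    if not s:
        return "blank"
    if s.startswith(("-", "*")) or (s[0].isdigit() and s[1:2] == "."):
        return "item"
    # Short lines with no final punctuation look like headings.
    if len(s) < 40 and not s.endswith((".", ",", ";", ":")):
        return "heading"
    return "para"

def structure(stream):
    """Return (tag, text) pairs for the non-blank lines of a stream."""
    return [(classify_line(ln), ln.strip()) for ln in stream if ln.strip()]
```

The tagged output could then drive formatting, or be emitted as markup for downstream processing.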
- Develop a visualization tool for one or another level of "document
entities":
- develop a tool for the visualization of a single large document, finding
some way to compactly and intuitively visualize an SGML-like data structure
including in the visualization both the elements and their attributes.
- develop a tool that visualizes a partial document space in a hypertext
web. This means that the visualization must represent both the nodes and
the links. Find a way to compact the space without loss so as to represent
a larger amount of "document territory" than current systems.
- develop a tool that visualizes a large document space and that allows
for zooming. Thus the visualization has to be computationally and visually
scalable from low to high levels of granularity.
Patterns
The work on patterns begins with a paper in the late 1980s by Spring
and Grahm on engineering the interface. The basic concept is that there
must be principles of significant predictive power that can be used to
develop interfaces that can be predicted in advance to be workable for
a given population in a given domain. The most specific implementation
of this approach to date is in the master's thesis of Skogseid.
- Extend or refine the set of patterns developed by Skogseid -- this
involves making a preliminary assessment of the utility of the "patterns"
as generalized descriptions of interface design.
- Develop a preliminary set of patterns for software design in an area
other than interface design. Consider for example:
- patterns for client server systems,
- patterns for secure systems,
- patterns for visualization systems, or
- patterns for agent design, etc.
Use Skogseid's approach, or some variation of it as a model for verifying
the usability of the patterns.
- Conduct an experiment in which two groups of students are trained in
software development in a particular area -- interactive systems, client-server
systems, etc. The control group is taught using a standard approach, the
treatment group is taught using a pattern approach with patterns previously
verified. As a final test, develop and assess a variety of measures of
software quality and determine whether software development by the treatment
group is any different from that of the control group. Examine both process
variables and outcome variables.
- In the initial work on patterns for interface design, the patterns
were imagined as the principles that would underlie static design. That
is to say, if the system were stable, they could be used to build a predictably
reliable system. It is clear that there are a series of dynamics that also
underlie system analysis and design. These include: the user population
changes over time, individual users grow in expertise, the process which
is being modeled on the system will change as a result of being modeled
computationally, and the process will change as the environment within which
it exists changes. Develop and test a theory pertaining to these dynamics.
- Closely related to the process of system analysis and design is the
process of reengineering. To some extent, SAD dictates that we formalize
an existing process -- we automate it. Business process reengineering (BPR)
says that the process that should be automated may be very different
from the historical process. Theoretically, this may be viewed as the difference
between automating and informating as put forward by Zuboff. Develop a
set of patterns that dictate how the SAD process should examine the process
of "informating". Develop an experiment that seeks to assess
the impact of the use of the developed patterns on the SAD process. The
assessment should include both formative and summative assessments.
Standards
Standards is a pretty open field. There are lots of things to be done
here. Unfortunately, there is little research on which to base the efforts.
I have compared much of it to driving pilings in the sand upon which later
researchers will be able to build research structures. The department has
been responsible for several important pieces of research in this field --
on anticipatory standards, standards for human computer interaction, the
standardization process, free ridership in standards, and financing the
standards process. Many of the more important pieces of work that have
to be done in this area are long term projects.
- Take any standard of interest that has been developed recently and
conduct a retrospective study of the members of the committee to try to
determine the forces that influenced the shape of the final standard --
group process, politics, economics, corporate strategy, technology etc.
- Find a committee interested in using electronic methods to speed the
standard review process and see if you can get a subset of them to use
CASCADE and gather and analyze the data to assess the impact of CASCADE
on the speed of the development process and on the quality of the final
product.
- Review the minutes and documents emanating from JTC-1 over a two year
period and conduct an analysis of the process from the point of view of
the paper trail making recommendations for how the process might be improved.
- Develop a system for metering and billing for access to electronically
delivered standards in such a way as to increase access appropriately while
assuring the SDOs that their products are secure.
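A minimal sketch of the metering side, assuming an invented per-page rate and monthly cap (the actual pricing model, and the security mechanisms, would be research questions in themselves):

```python
# Sketch of a metering ledger for electronically delivered standards:
# each access is logged and billed per page viewed, with a per-standard
# cap per user so that heavy legitimate use stays affordable.
class Meter:
    PER_PAGE = 0.10   # dollars per page delivered (assumed rate)
    CAP = 5.00        # cap per user per standard (assumed)

    def __init__(self):
        self.log = []  # (user, standard, pages)

    def access(self, user, standard, pages):
        """Record that a user retrieved some pages of a standard."""
        self.log.append((user, standard, pages))

    def bill(self, user):
        """Total charges per standard for one user, capped per standard."""
        charges = {}
        for u, std, pages in self.log:
            if u == user:
                charges[std] = charges.get(std, 0.0) + pages * self.PER_PAGE
        return {std: min(c, self.CAP) for std, c in charges.items()}
```

The log also provides the SDO with usage data, which is itself a research asset for studying how standards are actually consulted.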
- Review all the minutes and documents pertaining to one individual standardization
effort and develop a model of the real process that might then be codified
and later theoretically and operationally modified in some experiments.
Virtualization
The work here begins with the paper on mapping abstract data to virtual
reality. The work might be based in part on the work on "infospace"
that was done on the Silicon Graphics workstation.
- Develop an experiment using the current implementation of infospace
and test its functionality against browsing a more conventionally organized
set of data.
- Develop an input module for infospace that does some level of automatic
mapping of the data.
- Develop a module for infospace that allows the user to examine an extended
data object attached to one of the infospace objects, i.e. a document.
- Develop an experiment using the lens version of infospace and test
its functionality against browsing a more conventionally delivered set
of retrieved documents.
- Develop a module for infospace that maps some subset of the data to
a non-visual dimension, e.g. auditory -- pitch, loudness, melody, etc.
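The pitch mapping, for example, can be sketched as a linear interpolation of one data dimension into a range of MIDI note numbers; the chosen range and the clamping policy are arbitrary illustrative choices:

```python
# Sketch of mapping one data dimension to an auditory dimension:
# linearly interpolate a value into a pitch range, expressed here as
# MIDI note numbers 48-84 (roughly C3 to C6).
LOW_NOTE, HIGH_NOTE = 48, 84

def to_pitch(value, lo, hi):
    """Map a value in [lo, hi] to a MIDI note in [LOW_NOTE, HIGH_NOTE]."""
    if hi == lo:
        return LOW_NOTE
    frac = (value - lo) / (hi - lo)
    frac = min(max(frac, 0.0), 1.0)   # clamp out-of-range values
    return round(LOW_NOTE + frac * (HIGH_NOTE - LOW_NOTE))

def sonify(values):
    """Map a whole data series to pitches, scaled to its own range."""
    lo, hi = min(values), max(values)
    return [to_pitch(v, lo, hi) for v in values]
```

Loudness and melody could be handled analogously, each consuming a different data dimension.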
- Conduct an experiment on a data set that measures interactions among
mapped data dimensions -- e.g. optimal number of mapped dimensions for
novices versus individuals trained in system use; interactions between
simultaneously mapped pre-attentive stimuli, etc.