Suggestions for Student Research Projects
The project ideas and descriptions below are grouped into five
major areas. In many cases the work in one area may be combined with the
work in another, so some of the boundaries are artificial. The areas include:
- Augmentation
- Document Processing
- Patterns
- Standards
- Virtualization
Augmentation
Augmentation is the study of how the computer may be used
to augment or supplement human endeavors. Specifically, it examines how
human skill repertoires may be moved to the computer system. In this process,
the natural human skill may be affected in several ways. The natural skill
may atrophy, and that atrophy may in turn affect other skills, with
undesirable side effects. Ideally, the movement of basic
tasks to the computer will free the human to focus on higher level skills.
Augmentation is closely associated with the writings of Douglas Engelbart.
Engelbart's work on augmentation was closely associated with information
and documents, and so are the ideas proposed below. Keep in mind, however,
that augmentation is in no way restricted to documents
and document artifacts.
- Conduct an experiment to measure the impact on human skills
of moving some portion of a skill repertoire to an artifact, e.g. what
is the impact on the human ability to calculate of having and making regular
use of a calculator?
- Develop a model of some subprocess of information processing, e.g. commenting,
showing at an NGOMS or KAT level the activities required to engage in the process.
Then show how that process may be allocated between human and artifact to minimize
the cognitive load on the human. Finally, train the human in the modified/augmented
process and measure the impact of the augmentation in comparison to the
traditional mode of processing.
- Develop a set of agents in accord with the Spring classification of
agents as executive, collaborative, communications, service, etc., and
measure user reaction to the use of these agents to test the hypothesis
that higher level agents will meet more resistance than lower level agents.
- Develop and test a set of mechanisms that may provide persona characteristics
for agents, and measure whether the manifested persona has the predicted impact
on user acceptance of and trust in a given agent.
- Refine the functional and structural agent architecture proposed by
Spring, and see if it bears out as an aid to agent development for CASCADE.
- Test the relative merits of augmenting human intellectual activity
by the following mechanisms:
- provide raw information (data tables)
- provide processed information summatively (statistical processing of
the tables)
- provide processed information visually (visualization of the data tables)
- provide the results of action on information (simulation or expert
systems approaches based on the data tables and rules)
Assess which of the mechanisms is most productive for humans:
- in the short term and long term use of the system
- for trained and untrained users
- for casual users and expert users
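The contrast among these mechanisms can be sketched on a toy data set. The table, the rule, and the threshold below are invented for illustration, and the visual mechanism is omitted since it would require a plotting library:

```python
from statistics import mean, stdev

# Hypothetical data table: monthly figures for two regions.
table = {
    "east": [12, 15, 11, 14, 18, 16],
    "west": [22, 9, 30, 8, 25, 10],
}

def raw(table):
    """Mechanism 1: present the raw data table unchanged."""
    return table

def summative(table):
    """Mechanism 2: statistical summary of each row of the table."""
    return {k: {"mean": mean(v), "stdev": stdev(v)} for k, v in table.items()}

def rule_based(table):
    """Mechanism 4: act on the data with a simple rule, returning advice
    rather than numbers (a stand-in for simulation/expert-system output)."""
    advice = {}
    for k, v in table.items():
        s = summative({k: v})[k]
        advice[k] = "volatile: investigate" if s["stdev"] > s["mean"] / 2 else "stable"
    return advice
```

The experiment would present the output of each mechanism to different subject groups and compare task performance across the short/long term and trained/untrained dimensions listed above.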
- Develop a specific agent for CASCADE which provides some form of augmentation.
A few of the "agents" that have been discussed for different
versions of CASCADE include:
- For Standards CASCADE
- an agent to manage the balloting of the standard. The agent would send
out the ballot, send out reminders, and provide interim and final balloting
results.
- an agent to watch comments being posted to a document and inform people
of the comments that they are most likely to want to look at.
- For Education CASCADE
- an agent to help instructors translate lecture notes into student reading
notes -- this agent would have to be able to parse notes at some level
and let the instructor know where more information is needed. This agent
might work with the agent described below to gather data about students'
explicit and implicit reactions to a lecture and identify places in the
lecture where elaboration, clarification, examples, etc. might improve
the lecture.
- an agent to watch questions being asked by students so as to identify
answers given to previous questions that might satisfy the information
need of the student. More specifically, the agent might keep a table of
stemmed and stopped question tokens and match an asked question against
this table. If the question matches an entry in the table to some criteria,
the answer given to the table question would be presented to the student.
The match between the asked and the table question might be rated higher
or lower based on other variables such as the lecture that caused the question
to be asked, the history of the student, the time in the term, etc. Questions
not answered by the system would be passed to a human and the stemmed question
and the answer provided by the human would become an additional entry
in the table. The agent might also suggest some modifications to the answer
to assure that personal information is removed and points are made as clearly
as possible. If questions and answers came in series, the agent would have
to pay attention to this. Finally, questions for which matched answers were
not appropriate might be used to add negative weighting terms to the stemmed
question for which the inappropriate answer was uncovered.
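The matching behavior described above can be sketched roughly as follows. The stop list, the crude suffix-stripping stemmer, and the Jaccard overlap threshold are all placeholder choices, not the mechanisms the project would actually settle on:

```python
# Sketch of the question-matching agent: stem and stop an incoming
# question, compare its token set against previously answered questions,
# and either return the stored answer or escalate to a human.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "what", "how", "do", "does", "in"}

def stem(token):
    # Crude suffix stripping as a stand-in for a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def tokens(question):
    words = question.lower().replace("?", "").split()
    return {stem(w) for w in words if w not in STOPWORDS}

class QuestionTable:
    def __init__(self, threshold=0.5):
        self.entries = []           # list of (token_set, answer)
        self.threshold = threshold  # minimum overlap needed to reuse an answer

    def ask(self, question, human=None):
        q = tokens(question)
        best, score = None, 0.0
        for toks, answer in self.entries:
            overlap = len(q & toks) / len(q | toks) if q | toks else 0.0
            if overlap > score:
                best, score = answer, overlap
        if best is not None and score >= self.threshold:
            return best
        # No adequate match: pass the question to a human and store the pair.
        answer = human(question) if human else None
        if answer is not None:
            self.entries.append((q, answer))
        return answer
```

The extra variables mentioned above (originating lecture, student history, time in the term, negative weighting terms) would enter as adjustments to the overlap score.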
- For Journal CASCADE
- an agent to coordinate the reviews of an article, selecting possible
reviewers for the editor to select from, reminding reviewers about their
commitments, and keeping the editor abreast of the progress toward review.
The agent might compose draft letters for the editor to use in sending
out an acceptance or rejection letter.
- an agent to coordinate preparation of a journal article by a group
of authors. This might indeed be several interrelated agents -- agents
to maintain bibliographic references and to produce articles in different
forms and with different citation formats; agents to do web crawling looking
for information pertinent to the subject of the article; agents to search
through raw data looking for methods of analysis that might highlight the
findings in the data; etc.
Document Processing
This is perhaps the single biggest area of potential research. The projects
in this area include everything from trying to define fundamental terms
and reengineer processes, to building prototype systems, to assessing
the utility of prototype systems.
- Design and conduct a series of surveys and other data gathering experiments
focused on developing a model of how users see documents. The goal of this
work is to provide system designers with better data about how users see
and manipulate documents. For example, Spring conducted a series of informal
surveys in the late 1980s that attempted to define the "prototypical"
document. Similar studies might be done today. Other studies might include
analysis of the prototypical elements of a document -- these might be used
as input to the SGML design process, etc.
- What do users see as the unit processes that are applied to documents?
Spring suggests that they are subprocesses of creation, dissemination,
storage and retrieval. Do these hold up today? How do electronic documents
compare with paper documents with respect to these processes?
- Conduct a study of any category of documents over a span of 20 years.
Ask what is changing about them and what is staying the same -- e.g. increase
in use of graphics, increase in number of references, increase in number
of authors, increase/decrease in average length, etc. For those aspects
that are found to be changing, suggest how document processing systems
might be modified to support them.
- Structured electronic documents have great potential to add new processing
capability to document management. At the same time, the specification
of a structured electronic document might be too general to be of much
use, or too specific to allow appropriate flexibility. Conduct an analysis
of existing DTDs to assess the nature of the structure imposed by each
and the impact it is likely to have on the automation of processing for
those documents.
- Prototype and test a system for DTD development in a corporate setting.
How well does the system enable designers to gather and incorporate user
data that makes for a usable (parser processing) and useful (business goal
attainment) DTD?
- Prototype and test a hypertext system developed in accord with the
theoretical specifications of one of the proposed standard models for hypertext
systems, e.g. Dexter, Amsterdam, etc.
- Carry forward the work of Randy Trigg on link typology by proposing an
extended hypertext link typology that includes link directionality, link
classes and subclasses. The extended typology might include link attributes
like creation time, registration, traversals, etc.
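One way the extended typology might be represented is as a link record carrying those attributes. The field names, classes, and traversal semantics below are illustrative only:

```python
from dataclasses import dataclass, field
import time

# Sketch of an extended hypertext link record carrying the typology
# attributes mentioned above: directionality, class/subclass, creation
# time, registration, and traversal count.
@dataclass
class Link:
    source: str
    target: str
    link_class: str = "reference"   # e.g. reference, annotation, revision
    subclass: str = ""              # finer typing within the class
    bidirectional: bool = False
    created: float = field(default_factory=time.time)
    registered: bool = False        # known to a central link base?
    traversals: int = 0

    def traverse(self, from_node):
        """Follow the link, honoring directionality and counting the use."""
        if from_node == self.source:
            self.traversals += 1
            return self.target
        if self.bidirectional and from_node == self.target:
            self.traversals += 1
            return self.source
        raise ValueError(f"link not traversable from {from_node}")
```

Recording traversals on the link itself is one design choice; a system might instead keep usage data in a separate link base so that links remain immutable.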
- Spring has suggested that documents have historically been passive
objects but will, with time, become active. Develop and implement a
model for active documents and conduct a proof of concept experiment that
shows how it might be used in applications such as direct e-mail marketing,
voting and balloting, etc.
- Develop a system for corporate policy and procedure document development
based on a specific DTD. Demonstrate the impact of such a system on the
efficiency of the policy development process and the quality of products, i.e.
the policies and procedures, in terms of such things as creation time,
update time, correctness, frequency of use, ease of finding information,
etc.
- Develop an approach to navigating a hypertext space. Demonstrate the
strengths and weaknesses of the system in contrast with existing systems.
(This project might well be closely tied to the infospace project.)
- Develop a system that does some level of inductive structuring and
formatting of a content stream as it is processed.
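A minimal sketch of such inductive structuring, assuming plain text input and purely surface-level heuristics (a real system would refine or learn its rules):

```python
# Sketch of inductive structuring: tag each line of a plain content
# stream as a heading, list item, or paragraph using surface heuristics.
# The specific rules below are placeholders for illustration.
def classify_line(line):
    s = line.strip()
    if not s:
        return "blank"
    if s.startswith(("-", "*")) or (s[0].isdigit() and s[1:2] == "."):
        return "item"
    # Short lines with no final punctuation look like headings.
    if len(s) < 40 and not s.endswith((".", ",", ";", ":")):
        return "heading"
    return "para"

def structure(stream):
    """Return (tag, text) pairs for the non-blank lines of a stream."""
    return [(classify_line(ln), ln.strip()) for ln in stream if ln.strip()]
```

The tagged output could then drive formatting, or be emitted as markup for downstream processing.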
- Develop a visualization tool for one or another level of "document
entities":
- develop a tool for the visualization of a single large document, finding
some way to compactly and intuitively visualize an SGML-like data structure
including in the visualization both the elements and their attributes.
- develop a tool that visualizes a partial document space in a hypertext
web. This means that the visualization must represent both the nodes and
the links. Find a way to compact the space without loss so as to represent
a larger amount of "document territory" than current systems.
- develop a tool that visualizes a large document space and that allows
for zooming. Thus the visualization has to be computationally and visually
scalable from low to high levels of granularity.
Patterns
The work on patterns begins with a paper in the late 1980s by Spring
and Grahm on engineering the interface. The basic concept is that there
must be principles of significant predictive power that can be used to
develop interfaces that can be predicted in advance to be workable for
a given population in a given domain. The most specific implementation
of this approach to date is in the master's thesis of Skogseid.
- Extend or refine the set of patterns developed by Skogseid -- this
involves making a preliminary assessment of the utility of the "patterns"
as generalized descriptions of interface design.
- Develop a preliminary set of patterns for software design in an area
other than interface design. Consider for example:
- patterns for client server systems,
- patterns for secure systems,
- patterns for visualization systems, or
- patterns for agent design, etc.
Use Skogseid's approach, or some variation of it as a model for verifying
the usability of the patterns.
- Conduct an experiment in which two groups of students are trained in
software development in a particular area -- interactive systems, client-server
systems, etc. The control group is taught using a standard approach, the
treatment group is taught using a pattern approach with patterns previously
verified. As a final test, develop and assess a variety of measures of
software quality and determine whether software development by the treatment
group is any different from that of the control group. Examine both process
variables and outcome variables.
- In the initial work on patterns for interface design, the patterns
were imagined as the principles that would underlie static design. That
is to say, if the system were stable, they could be used to build a predictably
reliable system. It is clear that there are a series of dynamics that also
underlie system analysis and design. These include: the user population
changes over time, individual users grow in expertise, the process which
is being modeled on the system will change as a result of being modeled
computationally, and the process will change as the environment within which
it exists changes. Develop and test a theory pertaining to these dynamics.
- Closely related to the process of system analysis and design is the
process of reengineering. To some extent, SAD dictates that we formalize
an existing process -- we automate it. Business process reengineering (BPR)
says that the process that should be automated may be very different
from the historical process. Theoretically, this may be viewed as the difference
between automating and informating as put forward by Zuboff. Develop a
set of patterns that dictate how the SAD process should examine the process
of "informating". Develop an experiment that seeks to assess
the impact of the use of the developed patterns on the SAD process. The
assessment should include both formative and summative assessments.
Standards
Standards is a pretty open field. There are lots of things to be done
here. Unfortunately, there is little research on which to base the efforts.
I have compared much of it to driving pilings in the sand upon which later
researchers will be able to build research structures. The department has
been responsible for several important pieces of research in this field --
on anticipatory standards, standards for human computer interaction, the
standardization process, free ridership in standards, and financing the
standards process. Many of the more important pieces of work that have
to be done in this area are long term projects.
- Take any standard of interest that has been developed recently and
conduct a retrospective study of the members of the committee to try to
determine the forces that influenced the shape of the final standard --
group process, politics, economics, corporate strategy, technology etc.
- Find a committee interested in using electronic methods to speed the
standard review process and see if you can get a subset of them to use
CASCADE and gather and analyze the data to assess the impact of CASCADE
on the speed of the development process and on the quality of the final
product.
- Review the minutes and documents emanating from JTC-1 over a two year
period and conduct an analysis of the process from the point of view of
the paper trail making recommendations for how the process might be improved.
- Develop a system for metering and billing for access to electronically
delivered standards in such a way as to increase access appropriately while
assuring the SDOs that their products are secure.
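A minimal sketch of the metering side, assuming an invented per-page rate and monthly cap (the actual pricing model, and the security mechanisms, would be research questions in themselves):

```python
# Sketch of a metering ledger for electronically delivered standards:
# each access is logged and billed per page viewed, with a per-standard
# cap per user so that heavy legitimate use stays affordable.
class Meter:
    PER_PAGE = 0.10   # dollars per page delivered (assumed rate)
    CAP = 5.00        # cap per user per standard (assumed)

    def __init__(self):
        self.log = []  # (user, standard, pages)

    def access(self, user, standard, pages):
        """Record that a user retrieved some pages of a standard."""
        self.log.append((user, standard, pages))

    def bill(self, user):
        """Total charges per standard for one user, capped per standard."""
        charges = {}
        for u, std, pages in self.log:
            if u == user:
                charges[std] = charges.get(std, 0.0) + pages * self.PER_PAGE
        return {std: min(c, self.CAP) for std, c in charges.items()}
```

The log also provides the SDO with usage data, which is itself a research asset for studying how standards are actually consulted.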
- Review all the minutes and documents pertaining to one individual standardization
effort and develop a model of the real process that might then be codified
and later theoretically and operationally modified in some experiments.
Virtualization
The work here begins with the paper on mapping abstract data to virtual
reality. The work might be based in part on the work on "infospace"
that was done on the Silicon Graphics workstation.
- Develop an experiment using the current implementation of infospace
and test its functionality against browsing a more conventionally organized
set of data.
- Develop an input module for infospace that does some level of automatic
mapping of the data.
- Develop a module for infospace that allows the user to examine an extended
data object attached to one of the infospace objects, i.e. a document.
- Develop an experiment using the lens version of infospace and test
its functionality against browsing a more conventionally delivered set
of retrieved documents.
- Develop a module for infospace that maps some subset of the data to
a non-visual dimension, e.g. auditory -- pitch, loudness, melody, etc.
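The pitch mapping, for example, can be sketched as a linear interpolation of one data dimension into a range of MIDI note numbers; the chosen range and the clamping policy are arbitrary illustrative choices:

```python
# Sketch of mapping one data dimension to an auditory dimension:
# linearly interpolate a value into a pitch range, expressed here as
# MIDI note numbers 48-84 (roughly C3 to C6).
LOW_NOTE, HIGH_NOTE = 48, 84

def to_pitch(value, lo, hi):
    """Map a value in [lo, hi] to a MIDI note in [LOW_NOTE, HIGH_NOTE]."""
    if hi == lo:
        return LOW_NOTE
    frac = (value - lo) / (hi - lo)
    frac = min(max(frac, 0.0), 1.0)   # clamp out-of-range values
    return round(LOW_NOTE + frac * (HIGH_NOTE - LOW_NOTE))

def sonify(values):
    """Map a whole data series to pitches, scaled to its own range."""
    lo, hi = min(values), max(values)
    return [to_pitch(v, lo, hi) for v in values]
```

Loudness and melody could be handled analogously, each consuming a different data dimension.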
- Conduct an experiment on a data set that measures interactions among
mapped data dimensions -- e.g. optimal number of mapped dimensions for
novices versus individuals trained in system use; interactions between
simultaneously mapped pre-attentive stimuli, etc.