SIS Faculty Search Candidates (February 27 - March 31, 2003)
|
Thursday,
February 27, 2003
11:00 am - 12:00 noon
Place: 501 IS Building
|
SIS Faculty
Candidate
Joseph Kabara
Visiting Assistant Professor
Department of Information Science and Telecommunications
"WIRELESS DATA NETWORK
DESIGN"
Abstract: Growth
in cell phone usage clearly illustrates the demand
for universal connectivity, with users now accustomed
to ubiquitous voice service expecting similar access
to data services. Satisfying this new demand requires
the truly pervasive wireless Internet, which, in
turn, requires a supporting wireless access network
infrastructure. Currently, wireless network planning
tools are aimed at coverage based design; that
is, the design ensures that an adequate signal-to-interference
ratio (SIR) is maintained in the area where service
is provided. The SIR criterion is well suited to
current wireless voice and low speed data rate
services or for sparsely deployed high speed wireless
terminals. However, emerging wireless access networks
require a fundamentally different approach to support
high speed data to many users. The network design
solution must account for user density, expected
user subscriber profiles, traffic models for various
applications, and support for Quality of Service
(QoS) classes, in addition to SIR requirements.
This problem is formulated and solved as a Constraint
Satisfaction Problem.
|
Monday,
March 10, 2003
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 2:00 - 2:30 p.m.
Place: 1st floor conference room, IS Building
|
SIS Faculty
Candidate
Eun G. Park
Ph.D., Department
of Information Studies
University of California, Los Angeles
"ENSURING AUTHENTIC RECORDS
IN ELECTRONIC RECORDKEEPING SYSTEMS:
EUN PARK'S RESEARCH AND FUTURE DIRECTION"
Abstract: Eun Park's
research has focused broadly on digital resource
management, primarily on the authenticity and reliability
of digital information and records in their creation,
management and preservation in terms of different
communities of practice. Her dissertation identified
the variables for ensuring authenticity in student
records systems and suggested a conceptual framework
of authenticity requirements and authentication
processes in electronic records management. Her
ongoing research interests include curricula development
of digital resources, metadata, language construction
by different communities, and authenticity requirements
of electronic recordkeeping systems.
|
Wednesday,
March 12, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 2:00 - 2:30 p.m.
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Madhusudhan Govindaraju
Postdoctoral Fellow, Extreme!
Computing Lab, Indiana University
"XCAT 2.0: COMPONENT-BASED
PROGRAMMING MODEL FOR GRID SERVICES"
Abstract: The most
important recent development in Grid systems is
the adoption of the web services model as a basic
architecture for Grid services. XCAT is a component
framework that is consistent with this model. It
allows application programmers to build distributed
applications by composing software components running
on remote resources. XCAT combines the concepts
of the Common Component Architecture (CCA) and
Open Grid Services Infrastructure (OGSI) specification.
We will discuss the architecture of XCAT and a
scripting based programming model for building
distributed applications.
|
Thursday,
March 13, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 10:45 am - 11:45 am
Place: 501 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Jiang Li
Ph.D. candidate in Computer
Science Department
Rensselaer Polytechnic Institute
"END-TO-END MULTICAST
CONGESTION CONTROL"
Abstract: IP multicast
is an efficient mechanism to disseminate data to
multiple recipients concurrently, where the source
sends only one copy of the data and each receiver
gets the data individually. One of the challenging
research problems in IP multicast is to provide
good congestion control protocols, i.e. to keep
the sending rate commensurate with the available
bandwidth of the path between the source and each
receiver. It is challenging due to the complicated
patterns of congestion on different paths and the
tremendous heterogeneity among them. My research
focuses on end-to-end solutions that do not require
upgrade of routers. In this talk I will present
two different congestion control schemes for IP
multicast.
The first scheme, LE-SBCC (Loss-Event
Oriented Source-based Multicast Congestion Control),
is purely source-based. It is very easy to deploy
because most of the functions are deployed on the
source, while only minimal support, sending back
packet loss indications, is required on the receiver
side. In this scheme, the source applies a cascade
of filters to receiver feedback and gets a set
of congestion signals approximately equivalent
to those from the most congested path. All receivers
will receive data at the same rate which converges
approximately to the equivalent TCP rate of the
worst congested path.
The second scheme, GMCC (Generalized
Multicast Congestion Control), requires support
from both source and receiver sides. However, it
can be easily configured to run as a single-rate
scheme (where all receivers receive data at the
same rate, like LE-SBCC) or as a multi-rate scheme
(where different receivers can receive data at
different rates) by changing a parameter of the
source. When running in multi-rate mode, GMCC provides
multiple sub-sessions for the original session.
Each sub-session involves an independent single-rate
congestion control. Novel methods are used to decide
when a receiver should join or leave a sub-session
to get data at an appropriate speed. |
Friday,
March 14, 2003
Faculty Coffee
Time: 9:00-9:30 am
Place: 1st floor conference room, IS Building
Public Colloquium
Time: 11:30 am - 12:30 pm
Place: 403 IS Building
Meeting with students
Time: 9:30 - 10:00 am
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Filippo Menczer
Assistant Professor
"MINING, MAPPING, MODELING
AND CRAWLING THE WEB"
Abstract: Can we
model the scale-free distribution of Web links
under realistic assumptions about the behavior
of page authors? Can a Web crawler efficiently
locate an unknown relevant page? These questions
are receiving much attention due to their potential
impact for understanding the structure of the Web
and for building better search engines. This talk
will discuss the semantic maps obtained by analyzing
the connection between similarity functions based
on text, link and semantic cues across a massive
number of page pairs. These maps uncover some striking
relationships. For example, link probability displays
a phase transition between a region where it is
not determined by content and one where it decays
with textual distance according to a power law.
This relationship suggests a novel Web growth model
that is shown to accurately predict the distribution
of page degree, based on textual content and assuming
only local knowledge of degree for existing pages.
A similar phase transition is found between link
probability and semantic distance, and both results
indicate that efficient paths can be discovered
by Web crawling algorithms based on textual and/or
categorical cues. I will conclude by surveying
a number of applications of these findings to the
evaluation and design of more efficient, effective,
and scalable search engines and crawlers.
|
Monday,
March 17, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Miles Efron
PhD Candidate, School
of Information and Library Science
University of North Carolina, Chapel Hill
"A STATISTICAL APPROACH
TO DIMENSIONALITY ESTIMATION FOR
INFORMATION RETRIEVAL UNDER THE
LATENT SEMANTIC INDEXING MODEL"
Abstract: Latent
Semantic Indexing (LSI) is an extension of Salton's
Vector Space Model (VSM) of information
retrieval (IR). LSI uses dimensionality reduction
to improve the VSM similarity function. This dimensionality
reduction derives a statistical model of term-document
relationships. In a number of empirical studies
these statistical models have improved retrieval
performance over traditional, keyword-based approaches.
However, LSI saddles researchers with an important
question: given a corpus with n documents and p
terms, how many dimensions should we retain? A
model with too few dimensions will lead to near-random
similarity judgments. On the other hand, admitting
too much complexity into a model incurs an overfitting
effect. According to Deerwester et al., choosing
the optimal dimensionality is "crucial" to
successful retrieval under LSI. However, in the
unsupervised learning environment of IR, a rigorous
definition of optimality or model goodness of fit
has remained elusive.
This talk compares several statistical methods
for identifying the optimal dimensionality of an
LSI system. I will report research on techniques
that seek models that are optimal in a practical
sense. But this research also tries to put LSI's
dimensionality truncation on firm theoretical ground.
Towards this goal, I will introduce amended parallel
analysis (APA), a novel dimensionality estimation
approach whose rationale suggests that LSI's dimensionality
reduction is merited to the extent that the indexing
terms depart from statistical independence.
|
Tuesday & Wednesday
March 18 & 19, 2003
Faculty Coffee
Tuesday, March 18
Time: 3:00-4:00 pm
Place: 503, IS Building
Public Colloquium
Wednesday, March 19
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Wednesday, March 19
Time: 9:30 - 10:00 am
Place: 1st floor conference room, IS
Building |
SIS Faculty
Candidate
P. Bryan Heidorn
University of Illinois
"BEYOND PAPER ON THE
WEB: HIGHLY FUNCTIONAL FLORA"
Abstract: A Flora
is not a book but an abstract representation of
the flora of an area. The Flora is a collection
of information about individual plant species and
their interrelationships. Unlike a novel or a journal
article, published Flora should be open documents
and should never be said to be bound and finished.
The Flora should be more like a serial publication
where new contributions can be made on a regular
basis, sometimes supporting and adding to previous
contributions and sometimes refuting and modifying
previous contributions. Electronic publishing
can afford us the flexibility to create these living
documents. This is not to deny the continuing value
of paper documents but the electronic medium can
also allow us to make paper copies at will of any
section of the Flora that is needed.
Published flora have evolved to meet
many uses, both purely scientific and of immediate
practicality. It is the need to easily find information
and easily use information that has shaped the
structure of paper Flora over the past centuries.
These same information needs should shape the structure
and function of electronic Flora but free of some
of the constraints of paper (and acknowledging
the unique limits of digital documents or their
delivery systems). If we simply convert the paper
documents to digital format, we do nothing to capitalize
on the time-tested internal structure of multi-volume,
paper-oriented documents which have evolved to
address specific information needs. These structures
can be more fully exploited in the process of digital
conversion of the material. While some structural
aspects of paper-based publishing are difficult
to bring to the electronic medium, other structural
aspects of paper documents are artifacts of limitations
of the paper medium. Why do we need to flip to
another page or volume to find the definition of
a word? Why must we serially flip through pages
looking for taxonomically related categories? Why
do we need to resort to tables of contents and
back-of-book indices to find entries? Frequently
these processes can be simplified or eliminated
with improved computer support tools.
This talk describes a project
to improve information system functionality by
exploiting the natural structure within text as
well as the inherent information structure of the
domain of Flora. This talk describes the process
of digital conversion and integration of taxonomic
(morphological) descriptions, glossaries and categorical
thesauri. The programs that the Biological Information Browsing
Environment team (http://www.biobrowser.org)
developed for the digitization include
automatic text segmentation, automated XML markup,
taxonomic browsing, structure-based indexing, automatic
thesaurus extraction for query expansion, on-line
definitions, and finally web-based visualization
tools. The last part of the talk addresses the
evaluation of the system.
The object of the design is not to
return facsimiles of traditional paper-based publications
but to alter the information structure to allow
more natural access to the materials.
|
Friday,
March 21, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 501 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
James B. D. Joshi
PhD Candidate, School of
Electrical and Computer Engineering, Purdue University
"A GENERALIZED TEMPORAL
ROLE BASED ACCESS CONTROL FRAMEWORK FOR DEVELOPING
SECURE APPLICATIONS"
Abstract: A key
issue in computer system security is to protect
information against unauthorized access. Emerging
workflow-based applications in healthcare, manufacturing,
the financial sector, and e-commerce inherently
have complex, time-based access control requirements.
To address the diverse security needs of these
applications, a Role Based Access Control (RBAC)
approach can be used as a viable alternative to
traditional discretionary and mandatory access
control approaches. The key features of RBAC include
policy neutrality, support for least privilege,
and efficient access control management. However,
existing RBAC approaches do not address the growing
need for supporting time-based access control requirements
for these applications.
In this talk, I will present a Generalized Temporal
Role Based Access Control (GTRBAC) model that combines
the key features of the RBAC model with a powerful
temporal framework. The proposed GTRBAC model allows
specification of a comprehensive set of time-based
access control policies, including temporal constraints
on role enabling, user-role and role-permission
assignments, and role activations. The model provides
an event-based mechanism for supporting context-based
access control, as well as expressing dynamic
access control policies, which are crucial for
developing secure workflow-based enterprise applications.
In addition, the temporal hierarchies and separation
of duty constraints facilitated by GTRBAC allow
the development of security policies for commercial
enterprises. I will discuss various design guidelines
for managing complexity and building secure systems
based on this model, as well as an XML-based GTRBAC
policy specification language that is currently
being developed for building secure XML-based distributed
applications. |
Monday,
March 24, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS
Building |
SIS Faculty
Candidate
Amanda Spink
Associate Professor
School of Information Sciences and Technology
The Pennsylvania State University
"WEB SEARCHING 1997-2002:
ISSUES & TRENDS"
Abstract: The past
decade has seen huge growth in Web search engine
usage on a global scale. Studies that identify search
trends and model user interaction with Web search
engines are key to improving search engine performance
and evaluation. This presentation will discuss
key trends and issues from a major ongoing series
of longitudinal studies of public Web searching
from 1997 to 2002. The large Web query data sets
examined in these ongoing studies were supplied
by the Web search companies – Excite, Ask
Jeeves, Fast and Alta Vista.
This research identifies: (1) major
trends in public Web searching from 1997 to 2002,
including changes in search topics and search characteristics
related to queries, terms, sessions, pages viewed
and search duration, question and request querying,
multimedia searches, medical/health searching,
e-commerce searching, sexual searches, agent Web
interaction and global regional differences in
Web search; and (2) significant new and more complex
models of human information related behaviors associated
with Web search engine interaction, including successive
searching and multitasking. A human behavior approach
to search is discussed, including implications
for search engine design and further research.
This research is part of a large
collaborative project with Jim Jansen, and the
Web companies – Fast and Alta Vista.
|
Tuesday,
March 25, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 404 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Jaudelice C. de Oliveira
"NEW PREEMPTION POLICIES
FOR DIFFSERV-AWARE TRAFFIC ENGINEERING
TO MINIMIZE REROUTING IN MPLS NETWORKS"
Abstract: In the
Multiprotocol Label Switching (MPLS) network context,
preemption is the act of selecting a Label Switched
Path (LSP) which will be removed from a given path
in order to give room to another LSP with
a higher priority. More specifically, preemption
attributes determine whether an LSP with a certain setup
preemption priority can preempt another LSP with
a lower holding preemption priority from a given
path, when there is a competition for available
resources. The preempted LSP may then be
rerouted.
In this talk, I will present a new preemption
policy which is complemented with an adaptive
scheme that aims to minimize rerouting. The
preemption policy (V-PREPT) is versatile, simple, and
robust, combining the three main preemption optimization criteria:
number of LSPs to be preempted, priority of LSPs
to be preempted, and amount of bandwidth
to be preempted. Using V-PREPT, a service provider
can balance the objective function that will be
optimized in order to stress the desired criteria.
V-PREPT is complemented by an adaptive scheme and
is then called Adapt-V-PREPT. The new adaptive
policy selects lower priority LSPs that can afford
to have their rate reduced. The selected LSPs will
fairly reduce their rate in order to accommodate
the new high-priority LSP setup request. Heuristics
for both the simple preemption policy and the adaptive
preemption scheme are derived. Simulation results
show the heuristics' accuracy. Performance comparisons
among a non-preemptive approach, V-PREPT, Adapt-V-PREPT,
and a policy purely based on priority and holding
time are also provided.
|
Wednesday,
March 26, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
William W. Cohen
Visiting Associate
Professor, Center for Automated Learning
and Discovery
Carnegie Mellon University
"SIMILARITY-BASED QUERYING
OF HETEROGENEOUS DATABASES: AN OVERVIEW OF THE
WHIRL SYSTEM"
Abstract: The vision
of a large, widely-distributed knowledge base that
can be created collectively by many autonomous
contributors is being pursued by several technical
communities, including the communities associated
with the semantic web; peer-to-peer databases;
large-scale data integration across the "deep
web"; and large-scale information extraction
from the web. However, information that is drawn
from many different sources is usually very hard
to reason with. Typical problems include terminological
differences, as well as the interleaving of unstructured
textual information with structured data-like information.
WHIRL is a representation system that addresses
some of these problems. WHIRL incorporates ideas
from both relational database systems and statistical
information retrieval: specifically, it extends
(a subset of) SQL by adding special features for
reasoning about the similarity of fragments of
text. WHIRL strictly generalizes both logical deduction
and ranked retrieval of documents; it can be implemented
efficiently; and it greatly facilitates the construction
of question-answering systems that use information
found at multiple Web sites. Certain types of WHIRL
queries can be interpreted as nearest-neighbor-like
machine learning algorithms, and WHIRL can be used
to automatically collect background knowledge for
certain learning tasks from the web. Recent experiments
also suggest that the similarity metrics used in
WHIRL are very competitive with more computationally
expensive metrics.
In this talk I will survey previous work using
WHIRL, and also present some recent work comparing
the underlying similarity metrics. Parts of this
work are joint with Wei Fan, Stephen Fienberg,
Haym Hirsh, Pradeep Ravikumar, and Sarah Zelikovitz.
|
Friday,
March 28, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Jeffrey Pomerantz
Ph.D. Candidate,
School of Information Studies,
Syracuse
University, Syracuse, New York
"QUESTION TAXONOMIES
FOR DIGITAL REFERENCE TRIAGE"
Abstract: The growth
in the past decade of both the infrastructure and
the number of users of the Internet has enabled
a corresponding growth in the number of users of
digital reference services on the Internet. This
increase has led to an increase in the number of
questions received by these services, putting a
strain on the human intermediaries employed therein.
The ability of a digital reference service to scale
up to handle an increasingly large number of questions
is directly affected by the amount of automation
employed by that service:
the more processes that are automated, the more
of the human intermediaries' time and effort can be
dedicated to tasks that cannot yet be automated.
There is, now more than ever, an increased and
immediate need in digital reference for automation.
This
study identifies (1) the types of questions that
are received by digital reference services, according
to several taxonomies of questions at different
levels of linguistic analysis, and (2) the rules
by which questions are triaged within and between
services (triage being the process of routing and
assigning questions to expert digital reference
question answerers and other services). Taxonomies
of questions
are identified through an extensive review of literature
that deals with questions, from several fields:
desk and digital reference, question answering,
and linguistics. The rules by which questions are
triaged are identified through a think-aloud study
of digital reference triagers performing the task
of triage. The goal of this study is to develop
specifications according to which an automated
triage system can
be built. These specifications will take into account
the question type, as well as other attributes
of questions that affect triage decisions. These
taxonomies of questions may also prove to be useful
as the
basis for algorithms for automating other steps
in the digital reference process.
|
Monday,
March 31, 2003
Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building
Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building
Meeting with students
Time: 10:00 - 10:30 am
Place: 1st floor conference room, IS Building |
SIS Faculty
Candidate
Peter Y. Wu
Visiting Assistant
Professor
Department of Information Science and Telecommunications
"AN OBJECT-ORIENTED
SYSTEM FOR DISTRIBUTED WORKFLOW AUTOMATION"
Abstract: When
solving problems, we often apply the notion of
vicinity: when things are far apart, they do
not affect one another. On the other hand, we
gather related information into one place.
We examine our solution of building space/time
sub-division to form discrete neighborhoods in
two problems: cartographic map overlay and industrial
production planning. Motivated by our approach
to the solution, we describe an object-oriented system
for workflow automation.
For each business process, we capture into one
logical unit the protocol knowledge along with
all relevant information. We call it the workflow
object. The workflow object can then serve as
a model-driven application in a distributed environment,
coordinating interactivities among the various
agents in the network. The workflow system comprises
two sub-systems: the workflow definition desktop,
and the distributed workflow servers. The workflow
definition desktop provides a visual editor and
a simulator for design and validation, producing
workflow templates kept in a repository. We fill
out a template to create a workflow object. The
workflow server initiates the workflow instance
and monitors its execution by coordinating agent
activities accordingly.
The object-oriented design of the workflow object
facilitates locality of reference and provides
ease of administration. Based on the interoperability
in the design of workflow objects, we argue that
the system offers flexibility of model-driven
application architecture, and versatility since
the system scales up well, being built on the
object-oriented composition of workflows. With
product and material availability accessible
through internet services, we may apply workflow
automation to compose services for supply chain
management on the internet.
|