School of Information Sciences

 

  Events
SIS Faculty Search Candidates ( February 27 - March 31, 2003 )
 
     
     
 
 

Thursday,
February 27, 2003
11:00 am - 12:00 noon
Place: 501 IS Building

SIS Faculty Candidate

Joseph Kabara
Visiting Assistant Professor
Department of Information Science and Telecommunications

"WIRELESS DATA NETWORK DESIGN"

Abstract: Growth in cell phone usage clearly illustrates the demand for universal connectivity, with users now accustomed to ubiquitous voice service expecting similar access to data services. Satisfying this new demand requires a truly pervasive wireless Internet, which, in turn, requires a supporting wireless access network infrastructure. Currently, wireless network planning tools are aimed at coverage-based design; that is, the design ensures that an adequate signal-to-interference ratio (SIR) is maintained in the area where service is provided. The SIR criterion is well suited to current wireless voice and low-speed data services or to sparsely deployed high-speed wireless terminals. However, emerging wireless access networks require a fundamentally different approach to support high-speed data to many users. The network design solution must account for user density, expected user subscriber profiles, traffic models for various applications, and support for Quality of Service (QoS) classes, in addition to SIR requirements. This problem is formulated and solved as a Constraint Satisfaction Problem.

   

 

Monday,
March 10, 2003

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 2:00 - 2:30 p.m.
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Eun G. Park
Ph.D., Department of Information Studies
University of California, Los Angeles

"ENSURING AUTHENTIC RECORDS
IN ELECTRONIC RECORDKEEPING SYSTEMS:
EUN PARK'S RESEARCH AND FUTURE DIRECTION"

Abstract: Eun Park's research has focused broadly on digital resource management, primarily on the authenticity and reliability of digital information and records in their creation, management and preservation in terms of different communities of practice. Her dissertation identified the variables for ensuring authenticity in student records systems and suggested a conceptual framework of authenticity requirements and authentication processes in electronic records management. Her ongoing research interests include curricula development of digital resources, metadata, language construction by different communities, and authenticity requirements of electronic recordkeeping systems.

   

 

Wednesday,
March 12, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 2:00 - 2:30 p.m.
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Madhusudhan Govindaraju
Postdoctoral Fellow, Extreme! Computing Lab, Indiana University

"XCAT 2.0: COMPONENT-BASED
PROGRAMMING MODEL FOR GRID SERVICES"

Abstract: The most important recent development in Grid systems is the adoption of the web services model as a basic architecture for Grid services. XCAT is a component framework that is consistent with this model. It allows application programmers to build distributed applications by composing software components running on remote resources. XCAT combines the concepts of the Common Component Architecture (CCA) and the Open Grid Services Infrastructure (OGSI) specification. We will discuss the architecture of XCAT and a scripting-based programming model for building distributed applications.

   

 

Thursday,
March 13, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 10:45 am - 11:45 am
Place: 501 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Jiang Li
Ph.D. Candidate, Computer Science Department
Rensselaer Polytechnic Institute

"END-TO-END MULTICAST CONGESTION CONTROL"

Abstract: IP multicast is an efficient mechanism to disseminate data to multiple recipients concurrently, where the source sends only one copy of the data and each receiver gets the data individually. One of the challenging research problems in IP multicast is to provide good congestion control protocols, i.e. to keep the sending rate commensurate with the available bandwidth of the path between the source and each receiver. It is challenging due to the complicated patterns of congestion on different paths and the tremendous heterogeneity among them. My research focuses on end-to-end solutions that do not require upgrade of routers. In this talk I will present two different congestion control schemes for IP multicast.

The first scheme, LE-SBCC (Loss-Event Oriented Source-based Multicast Congestion Control), is purely source-based. It is very easy to deploy because most of the functions reside on the source, while receivers need only minimal support for sending back packet loss indications. In this scheme, the source applies a cascade of filters to receiver feedback and gets a set of congestion signals approximately equivalent to those from the most congested path. All receivers will receive data at the same rate, which converges approximately to the equivalent TCP rate of the most congested path.

The second scheme, GMCC (Generalized Multicast Congestion Control), requires support from both the source and receiver sides. However, it can be easily configured to run as a single-rate scheme (where all receivers receive data at the same rate, like LE-SBCC) or as a multi-rate scheme (where different receivers can receive data at different rates) by changing a parameter of the source. When running in multi-rate mode, GMCC provides multiple sub-sessions for the original session. Each sub-session involves an independent single-rate congestion control. Novel methods are used to decide when a receiver should join or leave a sub-session to get data at an appropriate speed.
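The single-rate behavior described above, where every receiver gets data at roughly the TCP-equivalent rate of the most congested path, can be illustrated with a minimal sketch. It is not the LE-SBCC or GMCC algorithm: the filter cascade and the sub-session logic are not modeled, the throughput formula is the common simplified TCP model (an assumption, not taken from the abstract), and the receiver names and path figures are hypothetical.

```python
import math


def tcp_equivalent_rate(mss_bytes, rtt_s, loss_rate):
    """Simplified TCP throughput model: MSS / (RTT * sqrt(2p/3)), in bytes/s.

    Assumption: this well-known approximation stands in for whatever
    rate estimate the actual schemes use.
    """
    return mss_bytes / (rtt_s * math.sqrt(2.0 * loss_rate / 3.0))


def single_rate(receivers, mss_bytes=1460):
    """Return (receiver, rate) for the most constrained path.

    `receivers` maps a receiver name to (round-trip time in seconds,
    packet loss rate). A single-rate session cannot send faster than
    the slowest path can sustain.
    """
    rates = {
        name: tcp_equivalent_rate(mss_bytes, rtt, loss)
        for name, (rtt, loss) in receivers.items()
    }
    worst = min(rates, key=rates.get)
    return worst, rates[worst]


# Hypothetical paths: (RTT in seconds, loss rate).
receivers = {"r1": (0.050, 0.001), "r2": (0.200, 0.02), "r3": (0.100, 0.005)}
worst, rate = single_rate(receivers)
print(f"session rate limited by {worst}: {rate:.0f} bytes/s")
```

A multi-rate scheme would instead group receivers so that each group is limited only by its own slowest member.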

   

 

Friday,
March 14, 2003

Faculty Coffee
Time: 9:00-9:30 am
Place: 1st floor conference room, IS Building

Public Colloquium
Time: 11:30 am - 12:30 pm
Place: 403 IS Building

Meeting with students
Time: 9:30 - 10:00 am
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Filippo Menczer
Assistant Professor

"MINING, MAPPING, MODELING
AND CRAWLING THE WEB"

Abstract: Can we model the scale-free distribution of Web links under realistic assumptions about the behavior of page authors? Can a Web crawler efficiently locate an unknown relevant page? These questions are receiving much attention due to their potential impact on understanding the structure of the Web and on building better search engines. This talk will discuss the semantic maps obtained by analyzing the connection between similarity functions based on text, link and semantic cues across a massive number of page pairs. These maps uncover some striking relationships. For example, link probability displays a phase transition between a region where it is not determined by content and one where it decays with textual distance according to a power law. This relationship suggests a novel Web growth model that is shown to accurately predict the distribution of page degree, based on textual content and assuming only local knowledge of degree for existing pages. A similar phase transition is found between link probability and semantic distance, and both results indicate that efficient paths can be discovered by Web crawling algorithms based on textual and/or categorical cues. I will conclude by surveying a number of applications of these findings to the evaluation and design of more efficient, effective, and scalable search engines and crawlers.

   

 

Monday,
March 17, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Miles Efron
PhD Candidate, School of Information and Library Science
University of North Carolina, Chapel Hill

"A STATISTICAL APPROACH TO DIMENSIONALITY ESTIMATION FOR INFORMATION RETRIEVAL UNDER THE LATENT SEMANTIC INDEXING MODEL"

Abstract: Latent Semantic Indexing (LSI) is an extension of Salton's Vector Space Model (VSM) of information retrieval (IR). LSI uses dimensionality reduction to improve the VSM similarity function. This dimensionality reduction derives a statistical model of term-document relationships. In a number of empirical studies these statistical models have improved retrieval performance over traditional, keyword-based approaches.

However, LSI saddles researchers with an important question: given a corpus with n documents and p terms, how many dimensions should we retain? A model with too few dimensions will lead to near-random similarity judgments. On the other hand, admitting too much complexity into a model incurs an overfitting effect. According to Deerwester et al., choosing the optimal dimensionality is "crucial" to successful retrieval under LSI. However, in the unsupervised learning environment of IR, a rigorous definition of optimality or model goodness of fit has remained elusive.

This talk compares several statistical methods for identifying the optimal dimensionality of an LSI system. I will report research on techniques that seek models that are optimal in a practical sense. But this research also tries to put LSI's dimensionality truncation on firm theoretical ground. Towards this goal, I will introduce amended parallel analysis (APA), a novel dimensionality estimation approach whose rationale suggests that LSI's dimensionality reduction is merited to the extent that the indexing terms depart from statistical independence.

   

 

Tuesday & Wednesday
March 18 & 19, 2003

Faculty Coffee
Tuesday, March 18
Time: 3:00-4:00 pm
Place: 503, IS Building

Public Colloquium
Wednesday, March 19
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Wednesday, March 19
Time: 9:30 - 10:00 am
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

P. Bryan Heidorn
University of Illinois

"BEYOND PAPER ON THE WEB: HIGHLY FUNCTIONAL FLORA"

Abstract: A Flora is not a book but an abstract representation of the flora of an area. The Flora is a collection of information about individual plant species and their interrelationships. Unlike a novel or a journal article, a published Flora should be an open document and should never be said to be bound and finished. The Flora should be more like a serial publication where new contributions can be made on a regular basis, sometimes supporting and adding to previous contributions and sometimes refuting and modifying previous contributions. Electronic publishing can afford us the flexibility to create these living documents. This is not to deny the continuing value of paper documents, but the electronic medium can also allow us to make paper copies at will of any section of the Flora that is needed.

Published Flora have evolved to meet many uses, both purely scientific and of immediate practicality. It is the need to easily find information and easily use information that has shaped the structure of paper Flora over the past centuries. These same information needs should shape the structure and function of electronic Flora, but free of some of the constraints of paper (and acknowledging the unique limits of digital documents or their delivery systems). If we simply convert the paper documents to digital format, we do nothing to capitalize on the time-tested internal structure of multi-volume, paper-oriented documents, which have evolved to address specific information needs. These structures can be more fully exploited in the process of digital conversion of the material. While some structural aspects of paper-based publishing are difficult to bring to the electronic medium, other structural aspects of paper documents are artifacts of limitations of the paper medium. Why do we need to flip to another page or volume to find the definition of a word? Why must we serially flip through pages looking for taxonomically related categories? Why do we need to resort to tables of contents and back-of-book indices to find entries? Frequently these processes can be simplified or eliminated with improved computer support tools.

This talk describes a project to improve information system functionality by exploiting the natural structure within text as well as the inherent information structure of the domain of Flora. It covers the process of digital conversion and integration of taxonomic (morphological) descriptions, glossaries and categorical thesauri. The Biological Information Browsing Environment team (http://www.biobrowser.org) developed programs for the digitization, including automatic text segmentation, automated XML markup, taxonomic browsing, structure-based indexing, automatic thesaurus extraction for query expansion, on-line definitions and, finally, web-based visualization tools. The last part of the talk addresses the evaluation of the system.

The object of the design is not to return facsimiles of traditional paper-based publications but to alter the information structure to allow more natural access to the materials.

   

 

Friday,
March 21, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 501 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

James B. D. Joshi
PhD Candidate, School of Electrical and Computer Engineering, Purdue University

"A GENERALIZED TEMPORAL ROLE BASED ACCESS CONTROL FRAMEWORK FOR DEVELOPING SECURE APPLICATIONS"

Abstract: A key issue in computer system security is to protect information against unauthorized access. Emerging workflow-based applications in healthcare, manufacturing, the financial sector, and e-commerce inherently have complex, time-based access control requirements. To address the diverse security needs of these applications, a Role Based Access Control (RBAC) approach can be used as a viable alternative to traditional discretionary and mandatory access control approaches. The key features of RBAC include policy neutrality, support for least privilege, and efficient access control management. However, existing RBAC approaches do not address the growing need for supporting time-based access control requirements for these applications.

In this talk, I will present a Generalized Temporal Role Based Access Control (GTRBAC) model that combines the key features of the RBAC model with a powerful temporal framework. The proposed GTRBAC model allows specification of a comprehensive set of time-based access control policies, including temporal constraints on role enabling, user-role and role-permission assignments, and role activations. The model provides an event-based mechanism for supporting context based access control, as well as expressing dynamic access control policies, which are crucial for developing secure workflow-based enterprise applications. In addition, the temporal hierarchies and separation of duty constraints facilitated by GTRBAC allow the development of security policies for commercial enterprises. I will discuss various design guidelines for managing complexity and building secure systems based on this model, as well as an XML-based GTRBAC policy specification language that is currently being developed for building secure XML-based distributed applications.
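One of the ideas above, a temporal constraint on role enabling, can be made concrete with a minimal sketch. It does not reproduce the GTRBAC model or its policy language: the `Role` class, the daily enabling window, and the role, user, and permission names are all hypothetical, and events, hierarchies, and separation of duty are left out.

```python
from dataclasses import dataclass, field
from datetime import time


@dataclass
class Role:
    name: str
    permissions: set = field(default_factory=set)
    # Hypothetical daily enabling window [enabled_from, enabled_until).
    # This is an illustration only, not GTRBAC constraint syntax.
    enabled_from: time = time(0, 0)
    enabled_until: time = time(23, 59, 59)

    def is_enabled(self, at: time) -> bool:
        """A role can be used only while its enabling window is open."""
        return self.enabled_from <= at < self.enabled_until


def can_access(user_roles, user, permission, at):
    """Grant access only via a role that is assigned to the user,
    enabled at the requested time, and holds the permission."""
    return any(
        role.is_enabled(at) and permission in role.permissions
        for role in user_roles.get(user, [])
    )


# Hypothetical policy: the day-shift role is enabled from 08:00 to 16:00.
day_nurse = Role("day_nurse", {"read_chart"}, time(8, 0), time(16, 0))
assignments = {"alice": [day_nurse]}

print(can_access(assignments, "alice", "read_chart", time(9, 30)))  # inside the window
print(can_access(assignments, "alice", "read_chart", time(20, 0)))  # outside the window
```

The same user-role assignment yields different decisions at different times, which is the kind of time-based requirement that a purely static role assignment cannot express.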

   

 

Monday,
March 24, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Amanda Spink
Associate Professor
School of Information Sciences and Technology
The Pennsylvania State University

"WEB SEARCHING 1997-2002: ISSUES & TRENDS"

Abstract: The past decade has seen huge growth in Web search engine usage on a global scale. Studies that identify search trends and model user interaction with Web search engines are key to improving search engine performance and evaluation. This presentation will discuss key trends and issues from a major ongoing series of longitudinal studies of public Web searching from 1997 to 2002. The large Web query data sets examined in these ongoing studies were supplied by the Web search companies Excite, Ask Jeeves, Fast and Alta Vista.

This research identifies: (1) major trends in public Web searching from 1997 to 2002, including changes in search topics and search characteristics related to queries, terms, sessions, pages viewed and search duration, question and request querying, multimedia searches, medical/health searching, e-commerce searching, sexual searches, agent Web interaction and global regional differences in Web search; and (2) significant new and more complex models of human information-related behaviors associated with Web search engine interaction, including successive searching and multitasking. A human behavior approach to search is discussed, including implications for search engine design and further research.

This research is part of a large collaborative project with Jim Jansen, and the Web companies – Fast and Alta Vista.

   

 

Tuesday,
March 25, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 404 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Jaudelice C. de Oliveira

"NEW PREEMPTION POLICIES FOR DIFFSERV-AWARE TRAFFIC ENGINEERING
TO MINIMIZE REROUTING IN MPLS NETWORKS"

Abstract: In the Multiprotocol Label Switching (MPLS) network context, preemption is the act of selecting a Label Switched Path (LSP) which will be removed from a given path in order to give room to another LSP with a higher priority. More specifically, preemption attributes determine whether an LSP with a certain setup preemption priority can preempt another LSP with a lower holding preemption priority from a given path, when there is a competition for available resources. The preempted LSP may then be rerouted.

In this talk, I will present a new preemption policy which is complemented with an adaptive scheme that aims to minimize rerouting. The preemption policy (V-PREPT) is versatile, simple, and robust, combining the three main preemption optimization criteria: number of LSPs to be preempted, priority of LSPs to be preempted, and amount of bandwidth to be preempted. Using V-PREPT, a service provider can balance the objective function that will be optimized in order to stress the desired criteria. V-PREPT is complemented by an adaptive scheme and is then called Adapt-V-PREPT. The new adaptive policy selects lower-priority LSPs that can afford to have their rate reduced. The selected LSPs will fairly reduce their rate in order to accommodate the new high-priority LSP setup request. Heuristics for both the simple preemption policy and the adaptive preemption scheme are derived. Simulation results show the heuristics' accuracy. Performance comparisons among a non-preemptive approach, V-PREPT, Adapt-V-PREPT, and a policy purely based on priority and holding time are also provided.

   

 

Wednesday,
March 26, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

William W. Cohen
Visiting Associate Professor, Center for Automated Learning and Discovery
Carnegie Mellon University

"SIMILARITY-BASED QUERYING OF HETEROGENEOUS DATABASES: AN OVERVIEW OF THE WHIRL SYSTEM"

Abstract: The vision of a large, widely-distributed knowledge base that can be created collectively by many autonomous contributors is being pursued by several technical communities, including the communities associated with the semantic web; peer-to-peer databases; large-scale data integration across the "deep web"; and large-scale information extraction from the web. However, information that is drawn from many different sources is usually very hard to reason with. Typical problems include terminological differences, as well as the interleaving of unstructured textual information with structured data-like information.

WHIRL is a representation system that addresses some of these problems. WHIRL incorporates ideas from both relational databases systems and statistical information retrieval: specifically, it extends (a subset of) SQL by adding special features for reasoning about the similarity of fragments of text. WHIRL strictly generalizes both logical deduction and ranked retrieval of documents; it can be implemented efficiently; and it greatly facilitates the construction of question-answering systems that use information found at multiple Web sites. Certain types of WHIRL queries can be interpreted as nearest-neighbor-like machine learning algorithms, and WHIRL can be used to automatically collect background knowledge for certain learning tasks from the web. Recent experiments also suggest that the similarity metrics used in WHIRL are very competitive with more computationally expensive metrics.

In this talk I will survey previous work using WHIRL, and also present some recent work comparing the underlying similarity metrics. Parts of this work are joint with Wei Fan, Stephen Fienberg, Haym Hirsh, Pradeep Ravikumar, and Sarah Zelikovitz.
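The idea of querying by textual similarity rather than exact match can be sketched as a small ranked "similarity join" between two lists of names. This is not WHIRL and does not reflect its query language or implementation; it assumes TF-IDF weighted cosine similarity as the metric, and the function names and sample records are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build unit-length TF-IDF vectors (token -> weight) for each text."""
    docs = [t.lower().split() for t in texts]
    doc_freq = Counter(tok for d in docs for tok in set(d))
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = {
            tok: (1 + math.log(count)) * math.log(1 + n / doc_freq[tok])
            for tok, count in tf.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({tok: w / norm for tok, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * v.get(tok, 0.0) for tok, w in u.items())


def similarity_join(left, right, top_k=3):
    """Rank all (left, right) pairs by textual similarity, best first."""
    vectors = tfidf_vectors(left + right)
    lv, rv = vectors[: len(left)], vectors[len(left):]
    pairs = [
        (cosine(a, b), l, r)
        for a, l in zip(lv, left)
        for b, r in zip(rv, right)
    ]
    return sorted(pairs, reverse=True)[:top_k]


# Hypothetical records from two sources that name the same entities differently.
left = ["Carnegie Mellon University", "University of Pittsburgh"]
right = ["Pittsburgh Univ", "Carnegie Mellon Univ"]
for score, l, r in similarity_join(left, right, top_k=2):
    print(f"{score:.3f}  {l!r} ~ {r!r}")
```

An exact-match join on these names would return nothing; ranking pairs by similarity recovers the likely matches despite the terminological differences.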

   

 

Friday,
March 28, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 2:00 - 2:30 pm
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Jeffrey Pomerantz
Ph.D. Candidate, School of Information Studies,
Syracuse University, Syracuse, New York

"QUESTION TAXONOMIES FOR DIGITAL REFERENCE TRIAGE"

Abstract: The growth in the past decade of both the infrastructure and the number of users of the Internet has enabled a corresponding growth in the number of users of digital reference services on the Internet. This increase has led to an increase in the number of questions received by these services, putting a strain on the human intermediaries employed therein. The ability of a digital reference service to scale up to handle an increasingly large number of questions is directly affected by the amount of automation employed by that service: the more processes that are automated, the more of the human intermediaries' time and effort can be dedicated to tasks that cannot yet be automated. There is, now more than ever, an immediate need for automation in digital reference.

This study identifies (1) the types of questions that are received by digital reference services, according to several taxonomies of questions at different levels of linguistic analysis, and (2) the rules by which questions are triaged within and between services (triage being the process of routing and assigning questions to expert digital reference question answerers and other services). Taxonomies of questions are identified through an extensive review of literature that deals with questions, from several fields: desk and digital reference, question answering, and linguistics. The rules by which questions are triaged are identified through a think-aloud study of digital reference triagers performing the task of triage. The goal of this study is to develop specifications according to which an automated triage system can be built. These specifications will take into account the question type, as well as other attributes of questions that affect triage decisions. These taxonomies of questions may also prove to be useful as the basis for algorithms for automating other steps in the digital reference process.

   

 

Monday,
March 31, 2003

Faculty Coffee
Time: 9:00-9:45 am
Place: 503, IS Building

Public Colloquium
Time: 11:00 am - 12:00 noon
Place: 403 IS Building

Meeting with students
Time: 10:00 - 10:30 am
Place: 1st floor conference room, IS Building

SIS Faculty Candidate

Peter Y. Wu
Visiting Assistant Professor
Department of Information Science and Telecommunications

"AN OBJECT-ORIENTED SYSTEM FOR DISTRIBUTED WORKFLOW AUTOMATION"

Abstract: When solving problems, we often apply the notion of vicinity: when things are far apart, they do not affect one another. Related information, on the other hand, we gather into one place. We examine our solution of building space/time sub-division to form discrete neighborhoods in two problems: cartographic map overlay and industrial production planning. Motivated by our approach to the solution, we describe an object-oriented system for workflow automation.

For each business process, we capture into one logical unit the protocol knowledge along with all relevant information. We call it the workflow object. The workflow object can then serve as a model-driven application in a distributed environment, coordinating interactivities among the various agents in the network. The workflow system comprises two sub-systems: the workflow definition desktop, and the distributed workflow servers. The workflow definition desktop provides a visual editor and a simulator for design and validation, producing workflow templates kept in a repository. We fill out a template to create a workflow object. The workflow server initiates the workflow instance and monitors its execution by coordinating agent activities accordingly.

The object-oriented design of the workflow object facilitates locality of reference and eases administration. Based on the interoperability in the design of workflow objects, we argue that the system offers the flexibility of a model-driven application architecture, and versatility, since the system scales up well, being built on the object-oriented composition of workflows. With product and material availability accessible through internet services, we may apply workflow automation to compose services for supply chain management on the internet.

   
 
   
     

 



School of Information Sciences, University of Pittsburgh,
135 North Bellefield Avenue, Pittsburgh, PA 15260
Tel: 412.624.3988 | Fax: 412.624.5231 
For information about Admissions & Financial Aid, please contact
Shabana Reza at 800.672.9435

Information Science & Technology Email: isinq@sis.pitt.edu
Telecommunications Email: teleinq@sis.pitt.edu
Library & Information Science Email: lisinq@sis.pitt.edu

