Goals and Methodology
My research
addresses the challenges inherent in the detection and extraction of
relevant information from a large corpus of information, most of which is
irrelevant to a particular need or interest.
My research focuses on the development of
methodologies to enable people to access relevant information regardless of
their personal experience or proficiency in foreign languages. It crosses
media barriers (text, image, audio, etc.), and informs the design and
development of advanced information retrieval (IR) systems. With such IR
systems, users would be able to look for relevant information comfortably
and effectively, to search for documents in multiple languages without
necessarily knowing those languages, and to access relevant information in
various forms of media, including text, image, audio, and video.
I approach my research goal
from an interdisciplinary perspective. I believe that computer science
research, especially information retrieval, can and should leverage several
thousand years of knowledge about organizing, accessing, and managing
information. This legacy strongly shapes our vision for and understanding
of libraries and other information centers, and it can be leveraged to further
improve the computer-aided information access process.
Research Projects
Working with our CMU partner, we aim in this project to develop new technologies for Adaptive Filtering (AF).
 | GALE Project: Adaptive, Robust, and Distilled Information Access
Funded by DARPA, 2005-2010 |
We propose to develop and demonstrate an integrated set of techniques for building an adaptive, robust, and
distilled information access module for the Distillation Engine. Our
module will be able to handle electronic newswire, transcriptions of
broadcast news, telephone conversations, and talk shows, and web
documents from newsgroups and weblogs, no matter whether they are
originally in English, Chinese, Arabic, or surprise languages.
 | DiLearn Project: A Digital Library for Learning Digital Library
Funded by University of Pittsburgh, Provost's Innovation of Teaching Award
Project Duration: 2005-2006 |
This project will
provide students with an interactive, integrated, and active
learning environment. The goals of the proposed project are: 1) to
review and organize current research and practice activities in the
DL area with the aim of helping students to quickly understand and
master the core concepts, challenges, and approaches of the field;
2) to design and develop a digital library that stores the outcome
of our study in goal 1, and more importantly, to provide easy access
to the stored information; 3) to evaluate rigorously the usefulness
of the developed DL in the context of teaching DL courses; and 4) to
raise awareness of the advantages of teaching courses with the help
of a DL system in different departments and universities. The
proposed DL system can improve the quality of the student’s learning
experience and increase the effectiveness of that experience.
 | Evidence Combination in Enterprise Searches Project
Funded by School of Information Sciences Dean's Research Award
Project Duration: 2005-2006
|
The goals of this project are 1) to explore the
problem space of identifying and combining multiple sources of evidence to improve search effectiveness;
and 2) to use both CLEF and TREC experiments as realistic evaluation frameworks for testing the ideas and systems developed
for evidence identification and combination.
 | Smart Intermediary (SIM) Project |
Having reference librarians or other human intermediaries in the
information retrieval process is one of the great success stories in the
development of libraries and library science. However, with the widespread use of
the Internet and the Web, more and more searches are performed without the help of human
intermediaries. The goal of the SIM project is to transfer the knowledge
people have developed about the reference process into modern automatic retrieval processes,
and to design a retrieval system that would be a useful assistant when working
with librarians, and a smart automatic intermediary when acting alone.
 | Interactive Cross-Language Information Access (ICLIA) |
Even though advances in computer and network
technology make it possible for people to access information globally, this ability is still limited by the fact that
most of us lack proficiency
in foreign languages. Research on Cross-Language Information Retrieval (CLIR)
helps develop algorithms for identifying relevant information in foreign
languages, but many more difficult research problems must be solved before a
useful interactive cross-language information access system can be delivered. The
ICLIA research project aims to attack those problems, to develop tools
based on natural language processing techniques that help people access information
regardless of language, and eventually to develop an efficient, user-friendly
cross-language information access system.
 | High Accuracy Retrieval from Documents (HARD)
Project Duration: 2003-2005 |
With the Web as a document collection holding vast amounts
of information on almost every possible topic, the key to the success of a retrieval
process is not just to identify the relevant information, but to make the relevant
information easily accessible to the user. The HARD project tries to address this problem by bringing the user
into the loop. Our research interests in this project are mainly concentrated on
using the HARD framework to explore various means of interacting with users during
the search process, and on identifying chunks of text (called passages) rather than
whole documents to satisfy users' information needs.