Research

3/4/08


 

Goals and Methodology

My research addresses the challenges inherent in detecting and extracting relevant information from a large corpus, most of which is irrelevant to a particular need or interest.

My research focuses on the development of methodologies to enable people to access relevant information regardless of their personal experience or proficiency in foreign languages. It crosses media barriers (text, image, audio, etc.), and informs the design and development of advanced information retrieval (IR) systems. With such IR systems, users would be able to look for relevant information comfortably and effectively, to search for documents in multiple languages without necessarily knowing those languages, and to access relevant information in various forms of media, including text, image, audio, and video.

I approach my research goal from an interdisciplinary perspective. I believe that computer science research, especially information retrieval, can and should leverage several thousand years of knowledge about organizing, accessing, and managing information. This legacy strongly shapes our vision for and understanding of libraries and other information centers, and it can be leveraged to further improve the computer-aided information access process.

Research Projects

Collaborative Project: User-centric, Adaptive and Collaborative Information Filtering

Funded by NSF 2007-2009, Partner: CMU CLAIR group

Working with our CMU partner, we aim to develop new technologies for Adaptive Filtering (AF).

GALE Project: Adaptive, Robust, and Distilled Information Access

Funded by DARPA 2005-2010

We propose to develop and demonstrate an integrated set of techniques for building an adaptive, robust, and distilled information access module for the Distillation Engine. Our module will be able to handle electronic newswire, transcriptions of broadcast news, telephone conversations, and talk shows, and web documents from newsgroups and weblogs, regardless of whether they are originally in English, Chinese, Arabic, or surprise languages.

DiLearn Project: A Digital Library for Learning Digital Library

Funded by University of Pittsburgh, Provost's Innovation of Teaching Award

Project Duration: 2005-2006

This project will provide students with an interactive, integrated, and active learning environment. The goals of the proposed project are: 1) to review and organize current research and practice in the digital library (DL) area, with the aim of helping students quickly understand and master the core concepts, challenges, and approaches of the field; 2) to design and develop a digital library that stores the outcome of our study in goal 1 and, more importantly, provides easy access to the stored information; 3) to evaluate rigorously the usefulness of the developed DL in the context of teaching DL courses; and 4) to raise awareness, across departments and universities, of the advantages of teaching courses with the help of a DL system. The proposed DL system can improve the quality and effectiveness of students' learning experience.

Evidence Combination in Enterprise Searches Project

Funded by School of Information Sciences Dean's Research Award

Project Duration: 2005-2006  

The goals of this project are 1) to explore the problem space of identifying and combining multiple sources of evidence to improve search effectiveness, and 2) to use both CLEF and TREC experiments as realistic evaluation frameworks for testing the ideas and systems developed for evidence identification and combination.

Smart Intermediary (SIM) Project

Having reference librarians or other human intermediaries in the information retrieval process is one of the great success stories in the development of libraries and library science. However, with the wide use of the Internet and the Web, more and more searches are performed without the help of human intermediaries. The goal of the SIM project is to transfer the knowledge that people have developed about the reference process into modern automatic retrieval processes, and to design a retrieval system that is a useful assistant when working with librarians and a smart automatic intermediary when acting alone.

Interactive Cross-Language Information Access (ICLIA)

Even though advances in computer and network technology make it possible for people to access information globally, that ability is still limited by the fact that most of us lack proficiency in foreign languages. Research on Cross-Language Information Retrieval (CLIR) helps develop algorithms for identifying relevant information in foreign languages, but many more difficult research problems must be solved before a useful interactive cross-language information access system can be delivered. The ICLIA research project aims to tackle those problems, to develop tools based on natural language processing techniques that help people access information regardless of language, and eventually to develop an efficient, user-friendly cross-language information access system.

High Accuracy Retrieval from Documents (HARD)

Project Duration: 2003-2005

With the Web as a document collection holding a vast amount of information on almost every possible topic, the key to a successful retrieval process is not merely to identify the relevant information, but to make that information easily accessible to the user. The HARD project tries to address this problem by bringing the user into the loop. Our research interests in this project concentrate on using the HARD framework to explore various means of interacting with users during the search process, and on identifying chunks of text (called passages), rather than whole documents, to satisfy users' information needs.

This site was last updated 3/4/08