Title: Extending CLARANS to High-dimensional
Spaces
When: Monday, February 23, 2004
12:00 noon
Where: IS Rm 503 (5th Floor Conference
Hall)
Who: Denis Nkweteyim
Abstract: Cluster analysis is a data
mining activity that aims at finding structure in data
by grouping together, or clustering, data items that are
highly similar to each other, and very dissimilar from
data items in other clusters. Traditional approaches to
clustering are limited by the fact that they are only efficient
when dealing with small amounts of data, typically with
very few dimensions. Although more recent approaches
can deal with large datasets, most of them are limited to low-dimensional
data. In information retrieval and related domains, documents are usually
represented in high-dimensional spaces, so clustering approaches that deal
with high-dimensional data would be useful.
In this talk, I will present the results of a simulation
study that shows that the CLARANS clustering algorithm,
traditionally used to cluster low-dimensional data, can
be extended to cluster high-dimensional data too. Comparisons
will be made with the PAM and CLARA algorithms, which form
the basis of CLARANS.
Speaker Bio: Denis Nkweteyim is a Ph.D.
candidate in the Information Science program of the School
of Information Sciences, University of Pittsburgh. He holds
a BS degree in Electrical & Electronic Engineering from
the University of Wales, Cardiff, and an MS degree in Digital
Communication Systems from Loughborough University. His
current research interests are in the area of personalization
systems.