Mihael Ankerst, Gabi Kastenmüller, Hans-Peter Kriegel, Thomas Seidl
Abstract
In molecular databases, structural classification is a basic task
that can be successfully approached by nearest neighbor methods.
The underlying similarity models consider spatial properties such
as shape and extension as well as thematic attributes. We introduce
3D shape histograms as an intuitive and powerful approach to model
similarity
for solid objects such as molecules. Errors of measurement, sampling,
and numerical rounding may result in small displacements of atomic
coordinates. These effects may be handled by using quadratic form
distance functions. Efficient processing of similarity queries
based on quadratic forms is supported by a filter-refinement architecture.
Experiments on our 3D protein database demonstrate a high classification
accuracy of more than 90% and good performance of the technique.
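
To illustrate why a quadratic form distance tolerates small displacements, the following minimal sketch compares it with the Euclidean distance on toy one-dimensional histograms. It is an illustration under stated assumptions, not the implementation evaluated in the experiments: the bin centers, the histograms, the parameter sigma, and the choice of similarity matrix entries a_ij = exp(-sigma * |c_i - c_j|) are all hypothetical, as are the function names.

```python
import math


def similarity_matrix(centers, sigma=1.0):
    """Assumed similarity matrix: a_ij = exp(-sigma * |c_i - c_j|).

    Bins whose centers are close together get entries near 1,
    distant bins get entries near 0.
    """
    return [[math.exp(-sigma * abs(ci - cj)) for cj in centers]
            for ci in centers]


def quadratic_form_distance(x, y, A):
    """Compute sqrt((x - y)^T A (x - y)) for histograms x and y."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    total = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0))


def euclidean_distance(x, y):
    """Plain Euclidean distance, which ignores cross-bin similarity."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# Hypothetical toy data: four bins with centers at 0, 1, 2, 3.
centers = [0.0, 1.0, 2.0, 3.0]
A = similarity_matrix(centers)

base = [1.0, 0.0, 0.0, 0.0]
near = [0.0, 1.0, 0.0, 0.0]  # mass displaced into the adjacent bin
far = [0.0, 0.0, 0.0, 1.0]   # mass displaced into a distant bin

print("Euclidean:     ", euclidean_distance(base, near),
      euclidean_distance(base, far))
print("Quadratic form:", quadratic_form_distance(base, near, A),
      quadratic_form_distance(base, far, A))
```

In this toy setting the Euclidean distance rates both displacements as equally dissimilar to the base histogram, whereas the quadratic form distance rates the displacement into the adjacent bin as the smaller one. This is the behavior wanted when atomic coordinates shift slightly across bin boundaries.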