
CAREER: Efficient I/O for Modern Database Applications
Sponsor: National Science Foundation
This material is based upon work supported by the National Science Foundation
under Grant No. IIS-9985019.
Project Summary
Recent trends in hardware development and data-intensive applications have
resulted in a performance bottleneck for storing and retrieving data. The
goal of this project is to develop a broad class of innovative techniques
to alleviate the I/O bottleneck for modern database applications. The project
focuses on data-intensive applications that handle multi-dimensional and
multimedia data. The research has two major directions. The first is the
development of declustering schemes for the efficient execution of range
and nearest-neighbor queries over large multi-dimensional datasets under
realistic assumptions such as non-constant disk I/O times and non-uniform
data and query distributions. The second addresses the storage and content-based
retrieval of multiple-quality, multimedia documents. This component of
the project investigates integrated techniques for placement, scheduling,
migration, and reliability of continuous media data on secondary and tertiary
storage. The approach is to design, develop, implement, and test the schemes
on real datasets. In this manner the effects of the simplifying assumptions
typically made to ease analysis or development can be identified and addressed.
The project will result in a collection of new techniques as well as a prototype
implementation and test results on real applications. These will be made
available for public access over the World Wide Web. The expected impact
is improved performance for a broad class of applications. The education
component aims to integrate I/O related issues for modern systems into
the graduate curriculum. This involves the development of new web-based
tools and projects that will enable students to understand and experiment
with I/O issues and solutions, and that will facilitate distance learning.
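To illustrate what a declustering scheme does, the sketch below uses the
classical Disk Modulo assignment (grid cell to disk by the sum of its
coordinates modulo the number of disks). It is not one of the schemes
developed in this project; the function names and the uniform grid of
equal-cost cells are assumptions made only for the example. The parallel
cost of a range query is taken to be the largest number of cells that any
single disk must read, which can be compared with the ideal of an even
spread over all disks.

```python
from collections import Counter
from itertools import product


def disk_modulo(cell, num_disks):
    """Assign a grid cell (a tuple of coordinates) to a disk.

    Classical Disk Modulo rule: sum of the coordinates modulo the disk count.
    """
    return sum(cell) % num_disks


def query_cost(lo, hi, num_disks):
    """Parallel cost of a range query under Disk Modulo.

    lo and hi are the inclusive corner cells of the query rectangle.
    The cost is the maximum number of cells fetched from any one disk,
    assuming (for illustration only) that every cell costs the same to read.
    """
    ranges = [range(a, b + 1) for a, b in zip(lo, hi)]
    load = Counter(disk_modulo(cell, num_disks) for cell in product(*ranges))
    return max(load.values())


def optimal_cost(lo, hi, num_disks):
    """Lower bound on the cost: ceil(cells / disks), a perfectly even spread."""
    cells = 1
    for a, b in zip(lo, hi):
        cells *= b - a + 1
    return -(-cells // num_disks)  # ceiling division


if __name__ == "__main__":
    # A 2 x 2 query on 4 disks: cells map to disks 0, 1, 1, 2.
    print(query_cost((0, 0), (1, 1), 4), optimal_cost((0, 0), (1, 1), 4))
```

In this small case the 2 x 2 query costs two cell reads on one disk although
four disks are available, while a 1 x 4 query along a row is spread
perfectly. Gaps of this kind, together with the equal-cost assumption that
real disks do not satisfy, are what make the choice of declustering scheme
matter.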
Goals, Objectives, and Targeted Activities
The goals of the project are to develop novel I/O management techniques
for large-scale multimedia and multi-dimensional data. The activities of
the project have centered on the development of data placement, migration,
and indexing techniques for video and multi-dimensional spatio-temporal
data. In particular:
1) We have developed data management schemes for very large amounts
of video on hierarchical storage. These include a novel caching scheme
for secondary storage that, when coupled with replication on tertiary
storage, yields significant reductions in start-up latency for continuous
multimedia objects such as video. We have also developed a new placement
scheme for tertiary storage that takes into account relationships between
objects to reduce expensive swapping of media. Evaluation to date has been
based on simulation.
2) We are currently investigating data placement schemes for the efficient
retrieval of multi-resolution video from disks. Alternative schemes have
been developed and are being tested using a simulation setup built on top
of available disk simulators. In the coming months, we will evaluate the
proposed schemes, first through simulation and later through implementation
on the testbed.
3) We have developed two new indexing techniques for spatio-temporal data
to efficiently process large numbers of concurrent, ongoing queries over
moving objects. In the upcoming months we will build upon our earlier work
and investigate more efficient I/O management techniques for the moving
objects environment.
Publications
Disclaimer
Any opinions, findings and conclusions or recommendations expressed in this
material are those of the author(s) and do not necessarily reflect
the views of the National Science Foundation
(NSF).
Last Modified by Sunil Prabhakar
on 19th June 2001.