CAREER: Efficient I/O for Modern Database Applications

Sponsor: National Science Foundation

This material is based upon work supported by the National Science Foundation under Grant No. IIS-9985019.

Principal Investigator: Sunil Prabhakar

Graduate Students: Dmitri V. Kalashnikov, Yuni Xia, Rahul Chari, Jiangtao Li

Project Summary

Recent trends in hardware development and data-intensive applications have resulted in a performance bottleneck for storing and retrieving data. The goal of this project is to develop a broad class of innovative techniques to alleviate the I/O bottleneck for modern database applications. The project focuses on data-intensive applications that handle multi-dimensional and multimedia data. The research has two major directions. The first is the development of declustering schemes for the efficient execution of range and nearest-neighbor queries over large multi-dimensional datasets under realistic assumptions such as non-constant disk I/O times and non-uniform data and query distributions. The second addresses the storage and content-based retrieval of multiple-quality multimedia documents. This component of the project investigates integrated techniques for placement, scheduling, migration, and reliability of continuous media data on secondary and tertiary storage. The approach is to design, develop, implement, and test the schemes on real datasets. In this manner, the effects of the simplifying assumptions typically made to ease analysis or development can be identified and addressed. The project will result in a collection of new techniques as well as a prototype implementation and test results on real applications. These will be made available for public access over the World Wide Web. The expected impact is improved performance for a broad class of applications. The education component aims to integrate I/O-related issues for modern systems into the graduate curriculum. This involves the development of new web-based tools and projects that will enable students to understand and experiment with I/O issues and solutions, and that will facilitate distance learning.
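To make the declustering problem concrete, the following is a minimal sketch of the classic Disk Modulo scheme, a textbook baseline and not one of the schemes developed in this project. It assumes the data space is partitioned into a grid of cells, that each cell costs one unit of I/O time (exactly the kind of simplifying assumption the project aims to relax), and that disks are read in parallel, so a range query costs as much as its busiest disk. The function names disk_modulo and range_query_cost are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import product


def disk_modulo(cell, num_disks):
    """Assign a grid cell (a tuple of integer coordinates) to a disk."""
    return sum(cell) % num_disks


def range_query_cost(low, high, num_disks):
    """Return (cost, optimal) for the range query covering cells low..high.

    Both corners are inclusive. cost is the largest number of cells that any
    single disk must read; optimal is ceil(total cells / num_disks), the best
    any declustering scheme could achieve under the unit-cost assumption.
    """
    ranges = [range(lo, hi + 1) for lo, hi in zip(low, high)]
    load = Counter(disk_modulo(cell, num_disks) for cell in product(*ranges))
    total = sum(load.values())
    cost = max(load.values())
    optimal = -(-total // num_disks)  # ceiling division
    return cost, optimal


if __name__ == "__main__":
    # A 4x4 query over 4 disks is spread evenly across the disks.
    print(range_query_cost((0, 0), (3, 3), 4))
    # A 2x2 query over 4 disks places two of its cells on the same disk.
    print(range_query_cost((0, 0), (1, 1), 4))
```

Under these assumptions the 4x4 query is balanced, while the small 2x2 query takes twice as long as the ideal. Gaps of this kind, together with non-uniform data and non-constant I/O times, are what more refined declustering schemes try to close.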
 

Goals, Objectives, and Targeted Activities

The goal of the project is to develop novel I/O management techniques for large-scale multimedia and multi-dimensional data. The activities of the project have centered on the development of data placement, migration, and indexing techniques for video and multi-dimensional spatio-temporal data. In particular:

1) Data management schemes for very large amounts of video on hierarchical storage. We have developed a novel caching scheme for secondary storage that, when coupled with replication on tertiary storage, yields significant reductions in start-up latency for continuous multimedia objects such as video. We have also developed a new placement scheme for tertiary storage that takes into account relationships between objects to reduce expensive swapping of media. Evaluation to date has been based on simulation.

2) Data placement schemes for the efficient retrieval of multi-resolution video from disks. This work is currently under investigation. Alternative schemes have been developed and are being tested using a simulation setup built on top of available disk simulators. In the coming months, we will evaluate the proposed schemes, first through simulation and later through implementation on the testbed.

3) Indexing techniques for spatio-temporal data. We have developed two new indexing techniques to efficiently process large numbers of concurrent, ongoing queries over moving objects. In the coming months, we will build upon this earlier work and investigate more efficient I/O management techniques for the moving-objects environment.

Publications
 

Disclaimer

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
 


Last Modified by Sunil Prabhakar on 19th June 2001.