Research Interests
In general, I'm interested in new database systems for non-traditional architectures and non-traditional data. Currently, I'm leading a team of students working on Database Systems for the Cloud and Generative AI, including Disaggregated Databases, Vector Databases, and Unified Databases.
Disaggregated Database Systems (for the Cloud)
- Memory-Disaggregated Databases
- Storage-Disaggregated Databases
- Disaggregated Databases with Multi-Masters
- Distributed Shared-Memory & Shared-Storage Databases
- [Papers: SIGMOD'24a, SIGMOD'24b, VLDBJ'24, VLDB'23, SIGMOD'23, ICDE'23]
- [Systems: We built OpenAurora, an open-source version of Amazon Aurora, based on PostgreSQL v13.0. OpenAurora is a cloud-native database prototype optimized for the storage-disaggregated infrastructure. We are currently working on memory disaggregation and multi-masters within OpenAurora. We hope it will be used by the broader database system research community.]
- [External Grants: NSF CAREER Award]
Vector Database Systems (for Large Language Models)
Unified Database Systems (for Structured/Unstructured Data and Generative AI)
- We believe that the era of generative AI calls for a unified database system that seamlessly integrates the management of structured and unstructured data while natively supporting GenAI capabilities and the surrounding GenAI ecosystem.
- Such a database will (1) efficiently support, at the very least, relational tables, text, documents, images, videos, vectors, and GenAI embedding, inference, and fine-tuning in real time; and (2) enable efficient processing of hybrid multimodal queries, which combine traditional SQL queries with new LLM queries, for advanced data analytics.
- Our goal is threefold: (1) build a unified data infrastructure that provides all the data needed for generative AI (including multimodal GenAI); (2) address critical limitations of LLMs, such as hallucination, lack of real-time data, and high costs; and (3) enable interesting queries that were not possible before (through the unification of data management and GenAI).
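To make the idea of a hybrid query concrete, here is a minimal sketch in Python. It is an illustration only and does not reflect how our systems are implemented: the `products` table, the three-dimensional toy embeddings, and the `hybrid_query` helper are all hypothetical, and SQLite plus a hand-written cosine similarity stand in for a real unified engine. The sketch applies an ordinary SQL predicate first and then ranks the surviving rows by similarity to a query embedding.

```python
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_query(conn, max_price, query_vec, k=2):
    """Hypothetical hybrid query: SQL filter, then vector ranking."""
    # Relational part: a traditional SQL predicate on a structured column.
    rows = conn.execute(
        "SELECT name, e0, e1, e2 FROM products WHERE price <= ?",
        (max_price,),
    ).fetchall()
    # Vector part: rank the filtered rows by similarity to the query embedding.
    scored = [(cosine(query_vec, (e0, e1, e2)), name) for name, e0, e1, e2 in rows]
    scored.sort(reverse=True)
    return [name for _score, name in scored[:k]]


# Toy data: made-up embeddings stored next to relational attributes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products ("
    "id INTEGER PRIMARY KEY, name TEXT, price REAL, e0 REAL, e1 REAL, e2 REAL)"
)
conn.executemany(
    "INSERT INTO products (name, price, e0, e1, e2) VALUES (?, ?, ?, ?, ?)",
    [
        ("red running shoes", 60.0, 0.9, 0.1, 0.0),
        ("blue running shoes", 120.0, 0.8, 0.2, 0.0),
        ("red rain jacket", 45.0, 0.1, 0.9, 0.0),
        ("trail sneakers", 55.0, 0.7, 0.3, 0.1),
    ],
)

# "Items under $100 that are most similar to this query embedding."
result = hybrid_query(conn, max_price=100.0, query_vec=(1.0, 0.0, 0.0), k=2)
print(result)
```

In this sketch the two steps run one after the other in application code. A unified system would instead let the optimizer plan the relational and vector parts together, for example by choosing whether to filter before or after the similarity search.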