Jianguo Wang

Assistant Professor

Department of Computer Science
Purdue University
West Lafayette, Indiana

Email: csjgwang@purdue.edu
Office: LWSN 1123H
Phone: (765) 496-0726





Biography

Jianguo Wang is currently a Tenure-Track Assistant Professor in the Department of Computer Science at Purdue University.

Prior to joining Purdue, he worked at Zilliz on Milvus, a purpose-built vector database system used in many data science applications, including ChatGPT. Before that, he worked at Amazon Web Services (AWS) on Amazon Aurora, a cloud-native database system. He also interned at Microsoft Research, Oracle, and Samsung, working on various database systems.

He obtained his PhD degree in Computer Science from the University of California, San Diego, his MPhil degree from The Hong Kong Polytechnic University, and his Bachelor's degree from Zhengzhou University, China.




Research Interests

In general, I'm interested in new database systems for non-traditional architectures and non-traditional data. Currently, I'm leading a team of students working on Database Systems for the Cloud and Generative AI, including Disaggregated Databases, Vector Databases, and Unified Databases.

  • Disaggregated Database Systems (for the Cloud)
    • Memory-Disaggregated Databases
    • Storage-Disaggregated Databases
    • Disaggregated Databases with Multi-Masters
    • Distributed Shared-Memory & Shared-Storage Databases
    • [Papers: SIGMOD'24a, SIGMOD'24b, VLDBJ'24, VLDB'23, SIGMOD'23, ICDE'23]
    • [Systems: We built OpenAurora, an open-source version of Amazon Aurora, based on PostgreSQL v13.0. OpenAurora is a cloud-native database prototype optimized for the storage-disaggregated infrastructure. We are currently working on memory disaggregation and multi-masters within OpenAurora. We hope it will be used by the broader database system research community.]
    • [External Grants: NSF CAREER Award]
  • Vector Database Systems (for Large Language Models)
  • Unified Database Systems (for Structured/Unstructured Data and Generative AI)
    • We believe that the era of generative AI calls for a unified database system that seamlessly integrates the management of structured and unstructured data, while also natively supporting GenAI capabilities and the GenAI ecosystem.
    • Such a database will (1) efficiently support, at the very least, relational tables, texts, documents, images, videos, vectors, and GenAI embedding/inference/finetuning in real time; and (2) enable efficient processing of hybrid multimodal queries, which combine traditional SQL queries and new LLM queries, for advanced data analytics.
    • Our goal is three-fold: (1) build a unified data infrastructure that provides all the data needed for generative AI (including multimodal GenAI); (2) address critical limitations of LLMs, such as hallucination, lack of real-time data, and high costs; and (3) enable interesting queries that were not possible before (through the unification of data management and GenAI).



Work Experience




Recent Publications




Team




Hiring

I'm always looking for highly motivated students in database systems (hiring info). Please feel free to contact me if you're interested.




Teaching




Honors and Awards




Services

Academic Services

University Services

National Services

  • NSF Panelist: 2024