Zhang earns NSF CAREER award

06-07-2024

Assistant Professor Tianyi Zhang won a National Science Foundation (NSF) CAREER award for his proposed work titled “Regularizing Large Language Models for Safe and Reliable Program Generation.”

Strengthening LLMs for Reliable Programming

Large Language Models (LLMs) represent a leap in artificial intelligence, offering the ability to process and generate text that closely resembles human language. Across diverse industries such as healthcare, automotive, and finance, LLMs can improve operations and drive innovation.

Whether LLMs are facilitating rapid analysis of medical literature, enabling the development of advanced driver assistance systems in autonomous vehicles, or empowering institutions with sophisticated data analysis for informed decision-making and risk management, their versatility and adaptability are poised to transform countless facets of modern society.

While LLMs show promise in generating code from natural language, they often produce code with functional errors, security vulnerabilities, sub-optimal implementations, and robustness issues. This raises concerns about software quality in the era of LLMs and may lead to severe consequences such as unreliable web services and data leakage.

Tianyi Zhang, assistant professor in the Department of Computer Science, won a National Science Foundation (NSF) CAREER award for his proposed work titled “Regularizing Large Language Models for Safe and Reliable Program Generation.” His project aims to address the challenges associated with using Large Language Models (LLMs) for program generation and to develop approaches that enhance the correctness, safety, and robustness of LLM-generated code.

Goals of the project

This project aims to enhance the reliability and effectiveness of LLMs for program generation by addressing the existing knowledge gap surrounding the errors they produce. By conducting an in-depth analysis of these errors, the project aims to uncover the symptoms and characteristics of code generation errors across different LLMs. This analysis will be guided by grounded theory, a research methodology that emphasizes building understanding from the ground up, rooted in empirical evidence.

This project will also develop root cause analysis methods to uncover the reasons behind the generation of these errors. By identifying the root causes, Zhang hopes to gain insights into the internal mechanisms of LLMs that contribute to error generation. Based on these insights, he aims to develop new targeted methods that leverage the internal states of LLMs to localize and mitigate code generation errors.

“Ideally, this approach will not only help us understand the inherent limitations of using LLMs in programming but also shed insights on developing new methods that precisely address these limitations beyond simple fine-tuning or prompting,” said Zhang.

He added, “By addressing these challenges, this could unlock the full potential of LLMs in program generation tasks, ultimately benefiting industries ranging from software development to automation and beyond.”

NSF CAREER Awards

NSF CAREER awards are the organization’s most prestigious awards given to junior faculty who embody the role of teacher-scholars through research, education, and the integration of those concepts within the mission of their organizations. CAREER awards support promising and talented researchers in building a foundation for a lifetime of leadership. The award reflects the project’s merit under the NSF statutory mission and its worthiness of financial support.

Tianyi Zhang is an assistant professor of computer science at Purdue University. He is affiliated with CERIAS, Purdue’s Center for Education and Research in Information Assurance and Security. Previously, he completed a postdoctoral fellowship at Harvard University. Zhang earned his PhD from the University of California, Los Angeles in 2019 and his bachelor's degree from Huazhong University of Science and Technology in 2013. His research interests include software engineering, human-computer interaction, and artificial intelligence. In particular, his research focuses on building interactive systems that improve programming productivity and reduce coding barriers using AI-based technologies.

About the Department of Computer Science at Purdue University

Founded in 1962, the Department of Computer Science was created to be an innovative base of knowledge in the emerging field of computing as the first degree-awarding program in the United States. The department continues to advance the computer science industry through research. U.S. News & World Report ranks Purdue CS #20 and #18 overall in graduate and undergraduate programs respectively, 6th in cybersecurity, 8th in software engineering, 13th in programming languages and systems, 15th in data analytics, and 18th in theory. Graduates of the program are able to solve complex and challenging problems in many fields. Our consistent success in an ever-changing landscape is reflected in the record undergraduate enrollment, increased faculty hiring, innovative research projects, and the creation of new academic programs. The increasing centrality of computer science in academic disciplines and society, and new research activities centered around data science, artificial intelligence, programming languages, theoretical computer science, machine learning, and cybersecurity, are the future focus of the department. cs.purdue.edu

Writer: Emily Kinsell, emily@purdue.edu
Source: Tianyi Zhang, tianyi@purdue.edu

Last Updated: Jul 7, 2025 6:14 PM
