Zhuo Zhang  张倬


Ph.D. student
Department of Computer Science
College of Science
Purdue University

Oops, your browser doesn't support this application.

I am a third year Ph.D. student at Purdue University, advised by Professor Xiangyu Zhang. I previously obtained my B.Sc. degree with Zhiyuan Honours from Shanghai Jiao Tong University (SJTU) in 2018.

I am also a core member of the CTF team 0ops . Sometimes I play with A*0*E, too.

Click to view my up-to-date CV and Twitter timeline.

I am a third year Ph.D. student at Purdue University, advised by Professor Xiangyu Zhang. I previously obtained my B.Sc. degree with Zhiyuan Honours from Shanghai Jiao Tong University (SJTU) in 2018.

I am also a core member of the CTF team 0ops. Sometimes I play with A*0*E, too.

Click to view my up-to-date CV.

My research interest mainly lies in software engineering and program analysis, especially for native code without sources.

Run the following command in a terminal (GNU):

$ echo "$(echo "ghNsgnm" | md5sum - | xxd -r -p | base64 | cut -c3-10)@purdue.edu" 

Or calculate it online:

If needed, click here to get my GPG key.

Please refer to my publications. If needed, click here to get my GPG key.
  • PMP: Cost-Effective Forced Execution with Probabilistic Memory Pre-Planning
    Wei You, Zhuo Zhang, Yonghwi Kwon, Yousra Aafer, Fei Peng, Yu Shi, Carson Makena Harmon, Xiangyu Zhang
    Proceedings of the 41th IEEE Symposiums on Security and Privacy (Oakland 2020)
    San Francisco, CA, May 2020
  • Key Words: Malware Analysis, Forced Execution, Probabilistic Program Analysis
    Abstract:
         Malware is a prominent security threat and exposing malware behavior is a critical challenge. Recent malware often has payload that is only released when certain conditions are satisfied. It is hence difficult to fully disclose the payload by simply executing the malware. In addition, malware samples may be equipped with cloaking techniques such as VM detectors that stop execution once detecting that the malware is being monitored. Forced execution is a highly effective method to penetrate malware self-protection and expose hidden behavior, by forcefully setting certain branch outcomes. However, an existing state-of-the-art forced execution technique X-Force is very heavy-weight, requiring tracing individual instructions, reasoning about pointer alias relations on-the-fly, and repairing invalid pointers by on demand memory allocation.
         We develop a light-weight and practical forced execution technique. Without losing analysis precision, it avoids tracking individual instructions and on demand allocation. Under our scheme, a forced execution is very similar to a native one. It features a novel memory pre-planning phase that pre-allocates a large memory buffer, and then initializes the buffer, and variables in the subject binary, with carefully crafted values in a random fashion before the real execution. The pre-planning is designed in such a way that dereferencing an invalid pointer has a very large chance to fall into the pre-allocated region and hence does not cause any exception, and semantically unrelated invalid pointer dereferences highly likely access disjoint (pre-allocated) memory regions, avoiding state corruptions with probabilistic guarantees. Our experiments show that our technique is 84 times faster than X-Force, has 6.5X and 10% fewer false positives and negatives for program dependence detection, respectively, and can expose 98% more malicious behaviors in 400 recent malware samples.

  • BDA: Practical Dependence Analysis for Binary Executables by Unbiased Whole-Program Path Sampling and Per-Path Abstract Interpretation
    Zhuo Zhang, Wei You, Guanhong Tao, Guannan Wei, Yonghwi Kwon, Xiangyu Zhang
    🏆   ACM SIGPLAN Distinguished Paper Award
    Proceedings of the ACM on Programming Languages Volume 3 Issue OOPSLA (OOPSLA 2019)
    Athens, Greece, October 2019
    [Supplementary Material]
  • Key Words: Path Sampling, Abstract Interpretation, Binary Analysis, Data Dependence
    Abstract:
         Binary program dependence analysis determines dependence between instructions and hence is important for many applications that have to deal with executables without any symbol information. A key challenge is to identify if multiple memory read/write instructions access the same memory location. The state-of-the-art solution is the value set analysis (VSA) that uses abstract interpretation to determine the set of addresses that are possibly accessed by memory instructions. However, VSA is conservative and hence leads to a large number of bogus dependences and then substantial false positives in downstream analyses such as malware behavior analysis. Furthermore, existing public VSA implementations have difficulty scaling to complex binaries.
         In this paper, we propose a new binary dependence analysis called BDA enabled by a randomized abstract interpretation technique. It features a novel whole program path sampling algorithm that is not biased by path length, and a per-path abstract interpretation avoiding precision loss caused by merging paths in traditional analyses. It also provides probabilistic guarantees. Our evaluation on SPECINT2000 programs shows that it can handle complex binaries such as gcc whereas VSA implementations from the-state-of-art platforms have difficulty producing results for many SPEC binaries. In addition, the dependences reported by BDA are 75 and 6 times smaller than Alto, a scalable binary dependence analysis tool, and VSA, respectively, with only 0.19% of true dependences observed during dynamic execution missed (by BDA). Applying BDA to call graph generation and malware analysis shows that BDA substantially supersedes the commercial tool IDA in recovering indirect call targets and outperforms a state-of-the-art malware analysis tool Cuckoo by disclosing 3 times more hidden payloads.

  • Probabilistic Disassembly
    Kenneth Miller, Yonghwi Kwon, Yi Sun, Zhuo Zhang, Xiangyu Zhang, Zhiqiang Lin
    Proceedings of the 41st ACM/IEEE International Conference on Software Engineering (ICSE 2019)
    Montréal, QC, Canada, May 2019
  • Key Words: Disassembly, Binary Rewrite, Probabilistic Program Analysis
    Abstract:
         Disassembling stripped binaries is a prominent challenge for binary analysis, due to the interleaving of code segments and data, and the difficulties of resolving control transfer targets of indirect calls and jumps. As a result, most existing disassemblers have both false positives (FP) and false negatives (FN). We observe that uncertainty is inevitable in disassembly due to the information loss during compilation and code generation. Therefore, we propose to model such uncertainty using probabilities and propose a novel disassembly technique, which computes a probability for each address in the code space, indicating its likelihood of being a true positive instruction. The probability is computed from a set of features that are reachable to an address, including control flow and data flow features. Our experiments with more than two thousands binaries show that our technique does not have any FN and has only 3.7% FP. In comparison, a state-of-the-art superset disassembly technique has 85% FP. A rewriter built on our disassembly can generate binaries that are only half of the size of those by superset disassembly and run 3% faster. While many widely-used disassemblers such as IDA and BAP suffer from missing function entries, our experiment also shows that even without any function entry information, our disassembler can still achieve 0 FN and 6.8% FP.

  • PMP: Cost-effective Forced Execution with Probabilistic Memory Pre-planning [slides]
    S&P 2020, Virtual Conference, May 2020
  • BDA: Practical Dependence Analysis for Binary Executables by Unbiased Whole-Program Path Sampling and Per-Path Abstract Interpretation [slides]
    SPLASH 2019 OOPSLA, Athens, Greece, October 2019
    Renmin University of China, Beijing, China, November 2019
  • Hello World: A Brief Introduction to Capture-The-Flag [slides]
    Renmin University of China, Beijing, China, November 2019

Academic Awards

  • OOPSLA 2019 Distinguished Paper Award, ACM SIGPLAN, October 2019
  • ACM Student Travel Award, ACM SIGPLAN Professional Activities Committee, October 2019
  • Zhiyuan Honor Degree of B.Sc. in Computer Science, Shanghai Jiao Tong University, June 2018
  • Natinoal Scholarship (Top 2%), Ministry of Education of China, November 2016

Selected Capture-The-Flag (CTF)

  • 1st place at the 40th IEEE S&P Celebration Scavenger Hunt
  • 4th place (w/ A*0*E) at DEFCON CTF 2018
  • 3rd place (w/ A*0*E) at DEFCON CTF 2017
  • Sub-reviewer
  • International Symposium on the Foundations of Software Engineering (FSE), 2019, 2020
    International Conference on Automated Software Engineering (ASE), 2020
    International Symposium on Software Testing and Analysis (ISSTA), 2020

August 2018 - Present

Mentor: Xiangyu Zhang
Research Assistant: Purdue University, West Lafayette, IN

August 2017 - June 2019

Mentor: Anton Kochkov
Collaborator: radeco ( ★)
Radare2-based binary analysis framework

June 2016 - July 2017

Mentor: Sen Nie
Research Intern: Keen Security Lab of Tencent, Shanghai, China
FREE-FALL: HACKING TESLA FROM WIRELESS TO CAN BUS  [video]   [writeup]   [slides]