Collections

Journal
2025-10-28

LMFuzz: Program repair fuzzing based on large language models

Lin, Renze; Wang, Ran; Hu, Guanghuan; +1
Automated Software Engineering
Abstract

Generating programs using large language models (LLMs) for fuzz testing has emerged as a significant testing methodology. While traditional fuzzers can produce correct programs, their effectiveness is limited by excessive constraints and restricted API combinations, resulting in insufficient coverage of the target system’s code and reducing testing efficiency. Unlike traditional methods, large language model-based fuzzers can generate more diverse code, effectively addressing key issues of conventional fuzzers. However, the lack of constraints on API combinations during the generation process often leads to reduced program validity. Therefore, a crucial challenge is to enhance the validity of generated code while maintaining its diversity. To address this issue, we propose a novel and universal fuzzer, LMFuzz. To ensure the fuzzer’s generation capability, we utilize a large language model as the primary generator and model the operator selection problem within the fuzzing loop as a multi-armed bandit problem. We introduce the Thompson Sampling algorithm to enhance both the diversity and validity of program generation. To improve the validity of the generated code, we incorporate a program repair loop that iteratively corrects the generated programs, thereby reducing errors caused by the lack of API combination constraints. Experimental results demonstrate that LMFuzz significantly surpasses existing state-of-the-art large language model-based fuzzers in terms of coverage and validity, and also exhibits notable advantages in generating diverse programs. Furthermore, LMFuzz has identified 24 bugs across five popular programming languages and their corresponding systems.
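The operator-selection idea in this abstract can be sketched as a Bernoulli multi-armed bandit solved with Thompson Sampling. This is a minimal illustration, not LMFuzz's implementation; the operator names and the success signal (e.g. "generated program was valid / increased coverage") are assumptions.

```python
import random

class ThompsonSampler:
    """Thompson Sampling over fuzzing operators, modeled as a
    Bernoulli multi-armed bandit (operator names are hypothetical)."""

    def __init__(self, operators):
        # Beta(1, 1) prior per operator: [alpha, beta] = [successes+1, failures+1]
        self.stats = {op: [1, 1] for op in operators}

    def select(self):
        # Draw one sample from each operator's Beta posterior; pick the largest.
        samples = {op: random.betavariate(a, b) for op, (a, b) in self.stats.items()}
        return max(samples, key=samples.get)

    def update(self, op, success):
        # success: True if the generated program was valid or gained coverage.
        if success:
            self.stats[op][0] += 1
        else:
            self.stats[op][1] += 1
```

In a fuzzing loop one would call `select()` to pick the next operator, run it, and feed the outcome back via `update()`; operators that keep producing valid, coverage-increasing programs are sampled more often while losing arms still get occasional exploration.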

Journal Data Ref: EP2JV2WH
Journal
2022-12-01

AMSFuzz: An adaptive mutation schedule for fuzzing

Zhao, Xiaoqi; Qu, Haipeng; Xu, Jianliang; +2
Expert Systems with Applications
Abstract

Mutation-based fuzzing is one of the most popular software testing techniques. After allocating a specific amount of energy (i.e., the number of test cases generated from the seed) to a seed, it applies existing mutation operators to continuously mutate the seed, generating new test cases that are fed into the target program to discover unexpected behaviors such as bugs, crashes, and vulnerabilities. However, the random selection of mutation operators and the sequential selection of mutation positions in existing fuzzers limit path discovery and bug detection. In this paper, a novel adaptive mutation schedule framework, AMSFuzz, is proposed. To address the random selection of mutation operators, AMSFuzz adaptively adjusts the probability distribution over mutation operators when selecting them. To address the sequential selection of mutation positions, seeds are dynamically sliced into segments of different sizes during fuzzing, giving more seeds the opportunity to mutate preferentially and improving fuzzing efficiency. AMSFuzz is implemented and evaluated on 12 real-world programs and the LAVA-M dataset. The results show that AMSFuzz substantially outperforms state-of-the-art fuzzers in terms of path discovery and bug detection. Additionally, AMSFuzz has detected 17 previously unknown bugs in several projects, 15 of which were assigned CVE IDs.
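The adaptive-adjustment idea can be illustrated with a simple reward-weighted distribution over operators. This is a sketch of the general idea, not AMSFuzz's actual update rule; the operator names, the path-count reward, and the smoothing term are assumptions.

```python
import random

class AdaptiveMutationSchedule:
    """Sketch of an adaptive mutation schedule: operator selection
    probabilities are re-weighted by how many new paths each operator
    has discovered so far (names and reward signal are hypothetical)."""

    def __init__(self, operators, smoothing=1.0):
        self.operators = list(operators)
        self.smoothing = smoothing                  # keeps every operator selectable
        self.new_paths = {op: 0 for op in operators}

    def probabilities(self):
        # Additive smoothing so unrewarded operators retain nonzero probability.
        weights = [self.new_paths[op] + self.smoothing for op in self.operators]
        total = sum(weights)
        return [w / total for w in weights]

    def choose(self):
        return random.choices(self.operators, weights=self.probabilities())[0]

    def reward(self, op, paths_found):
        # Called after executing a mutated test case against the target.
        self.new_paths[op] += paths_found
```

Compared with uniform random selection, productive operators are chosen more often as evidence accumulates, while the smoothing term preserves exploration of the rest.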

Journal Data Ref: D3UAX9ZU
Conference
2023

DARWIN: Survival of the Fittest Fuzzing Mutators

Jauernig, Patrick; Jakobovic, Domagoj; Picek, Stjepan; +2
Abstract

Fuzzing is an automated software testing technique broadly adopted by the industry. A popular variant is mutation-based fuzzing, which discovers a large number of bugs in practice. While the research community has studied mutation-based fuzzing for years now, the algorithms' interactions within the fuzzer are highly complex and can, together with the randomness in every instance of a fuzzer, lead to unpredictable effects. Most efforts to improve this fragile interaction focused on optimizing seed scheduling. However, real-world results like Google's FuzzBench highlight that these approaches do not consistently show improvements in practice. Another approach to improve the fuzzing process algorithmically is optimizing mutation scheduling. Unfortunately, existing mutation scheduling approaches also failed to convince because of missing real-world improvements or too many user-controlled parameters whose configuration requires expert knowledge about the target program. This leaves the challenging problem of cleverly processing test cases and achieving a measurable improvement unsolved. We present DARWIN, a novel mutation scheduler and the first to show fuzzing improvements in a realistic scenario without the need to introduce additional user-configurable parameters, opening this approach to the broad fuzzing community. DARWIN uses an Evolution Strategy to systematically optimize and adapt the probability distribution of the mutation operators during fuzzing. We implemented a prototype based on the popular general-purpose fuzzer AFL. DARWIN significantly outperforms the state-of-the-art mutation scheduler and the AFL baseline in our own coverage experiment, in FuzzBench, and by finding 15 out of 21 bugs the fastest in the MAGMA benchmark. Finally, DARWIN found 20 unique bugs (including one novel bug), 66% more than AFL, in widely-used real-world applications.
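The abstract's core mechanism, an Evolution Strategy adapting the probability distribution over mutation operators, can be sketched as a minimal (1+1)-ES. This is an illustration of the general technique, not DARWIN's algorithm; the fitness callable stands in for a real fuzzing campaign's coverage feedback.

```python
import random

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def evolve_distribution(fitness, dim=4, sigma=0.1, steps=50, rng=None):
    """Minimal (1+1)-Evolution-Strategy sketch: evolve a probability
    distribution over `dim` mutation operators, keeping a perturbed
    candidate only if it scores at least as well on `fitness`
    (a hypothetical stand-in for coverage gained during fuzzing)."""
    rng = rng or random.Random(0)
    parent = normalize([1.0] * dim)       # start from a uniform distribution
    best = fitness(parent)
    for _ in range(steps):
        # Perturb every weight with Gaussian noise, clamp, renormalize.
        child = normalize([max(1e-6, p + rng.gauss(0, sigma)) for p in parent])
        score = fitness(child)
        if score >= best:                 # survival of the fittest
            parent, best = child, score
    return parent
```

No user-tuned, target-specific parameters appear beyond the ES step size; that parameter-free quality for the end user is what the abstract highlights as DARWIN's practical advantage.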

Conference Data Ref: R3L4W5XB
Blog

FirmAgent: Leveraging Fuzzing to Assist LLM Agents with IoT Firmware Vulnerability Discovery

Abstract

No abstract available

Blog Data Ref: 88P9Y9UZ
Conference
2024-04-12

Fuzz4All: Universal Fuzzing with Large Language Models

Xia, Chunqiu Steven; Paltenghi, Matteo; Tian, Jia Le; +2
Abstract

Fuzzing has achieved tremendous success in discovering bugs and vulnerabilities in various software systems. Systems under test (SUTs) that take in programming or formal language as inputs, e.g., compilers, runtime engines, constraint solvers, and software libraries with accessible APIs, are especially important as they are fundamental building blocks of software development. However, existing fuzzers for such systems often target a specific language, and thus cannot be easily applied to other languages or even other versions of the same language. Moreover, the inputs generated by existing fuzzers are often limited to specific features of the input language, and thus can hardly reveal bugs related to other or new features. This paper presents Fuzz4All, the first fuzzer that is universal in the sense that it can target many different input languages and many different features of these languages. The key idea behind Fuzz4All is to leverage large language models (LLMs) as an input generation and mutation engine, which enables the approach to produce diverse and realistic inputs for any practically relevant language. To realize this potential, we present a novel autoprompting technique, which creates LLM prompts that are well-suited for fuzzing, and a novel LLM-powered fuzzing loop, which iteratively updates the prompt to create new fuzzing inputs. We evaluate Fuzz4All on nine systems under test that take in six different languages (C, C++, Go, SMT2, Java and Python) as inputs. The evaluation shows, across all six languages, that universal fuzzing achieves higher coverage than existing, language-specific fuzzers. Furthermore, Fuzz4All has identified 98 bugs in widely used systems, such as GCC, Clang, Z3, CVC5, OpenJDK, and the Qiskit quantum computing platform, with 64 bugs already confirmed by developers as previously unknown.
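The LLM-powered fuzzing loop described here can be sketched as follows. The `llm` and `oracle` callables are hypothetical stand-ins (for a model API and the system under test), and the three prompt-update instructions paraphrase the generate-new / mutate / semantically-equivalent strategies; none of this is Fuzz4All's actual code.

```python
import random

def fuzzing_loop(llm, initial_prompt, oracle, iterations=3):
    """Simplified sketch of an LLM-powered fuzzing loop: generate an
    input, run it against the SUT via `oracle`, then update the prompt
    with a randomly chosen continuation strategy for the next round."""
    prompt = initial_prompt
    results = []
    for _ in range(iterations):
        program = llm(prompt)              # generation step
        results.append(oracle(program))    # run against the SUT, record outcome
        # Update the prompt for the next iteration.
        strategy = random.choice([
            "Please create a new program.",
            "Please mutate the previous program.",
            "Please write a semantically equivalent program.",
        ])
        prompt = f"{initial_prompt}\n{program}\n{strategy}"
    return results
```

Because only the prompt text is language-specific, the same loop can drive fuzzing of a C compiler, an SMT solver, or a Python runtime, which is what makes the approach "universal".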

Conference Data Ref: Y3KAUUIV
Preprint
2023-04-04

Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT

Deng, Yinlin; Xia, Chunqiu Steven; Yang, Chenyuan; +3
Abstract

Deep Learning (DL) library bugs affect downstream DL applications, emphasizing the need for reliable systems. Generating valid input programs for fuzzing DL libraries is challenging due to the need for satisfying both language syntax/semantics and constraints for constructing valid computational graphs. Recently, the TitanFuzz work demonstrates that modern Large Language Models (LLMs) can be directly leveraged to implicitly learn all the constraints to generate valid DL programs for fuzzing. However, LLMs tend to generate ordinary programs following similar patterns seen in their massive training corpora, while fuzzing favors unusual inputs that cover edge cases or are unlikely to be manually produced. To fill this gap, this paper proposes FuzzGPT, the first technique to prime LLMs to synthesize unusual programs for fuzzing. FuzzGPT is built on the well-known hypothesis that historical bug-triggering programs may include rare/valuable code ingredients important for bug finding. Traditional techniques leveraging such historical information require intensive human effort to design dedicated generators and ensure the validity of generated programs. FuzzGPT demonstrates that this process can be fully automated via the intrinsic capabilities of LLMs (including fine-tuning and in-context learning), while being generalizable and applicable to challenging domains. While FuzzGPT can be applied with different LLMs, this paper focuses on the powerful GPT-style models: Codex and CodeGen. Moreover, FuzzGPT also shows the potential of directly leveraging the instruction-following capability of the recent ChatGPT for effective fuzzing. Evaluation on two popular DL libraries (PyTorch and TensorFlow) shows that FuzzGPT can substantially outperform TitanFuzz, detecting 76 bugs, with 49 already confirmed as previously unknown bugs, including 11 high-priority bugs or security vulnerabilities.
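The in-context learning variant described here amounts to assembling a few-shot prompt from historical bug-triggering snippets. A minimal sketch, with a hypothetical example corpus and prompt wording (not FuzzGPT's actual templates):

```python
def build_fewshot_prompt(bug_examples, target_api, k=2):
    """Sketch of FuzzGPT-style priming: prefix the prompt with up to k
    historical bug-triggering snippets so the LLM favors unusual,
    edge-case programs. `bug_examples` would come from a corpus of
    past bug reports (hypothetical data here)."""
    shots = "\n\n".join(
        f"# The following program triggers a bug:\n{ex}"
        for ex in bug_examples[:k]
    )
    return (
        f"{shots}\n\n"
        f"# Write an unusual program using {target_api} that may trigger a bug:\n"
    )
```

The same priming effect can alternatively be baked in via fine-tuning on the bug corpus, which is the other LLM capability the abstract mentions.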

Preprint Data Ref: VMBHEQR5