Collections

Document

No abstract available

2026-01-15

AgentGuardian: Learning Access Control Policies to Govern AI Agent Behavior

Abaev, Nadya, Klimov, Denis +4

Artificial intelligence (AI) agents are increasingly used in a variety of domains to automate tasks, interact with users, and make decisions based on data inputs. Ensuring that AI agents perform only authorized actions and handle inputs appropriately is essential for maintaining system integrity and preventing misuse. In this study, we introduce AgentGuardian, a novel security framework that governs and protects AI agent operations by enforcing context-aware access-control policies. During a controlled staging phase, the framework monitors execution traces to learn legitimate agent behaviors and input patterns. From this phase, it derives adaptive policies that regulate tool calls made by the agent, guided by both real-time input context and the control flow dependencies of multi-step agent actions. Evaluation across two real-world AI agent applications demonstrates that AgentGuardian effectively detects malicious or misleading inputs while preserving normal agent functionality. Moreover, its control-flow-based governance mechanism mitigates hallucination-driven errors and other orchestration-level malfunctions.

2025-04-18

Preprint

Large Language Models for Validating Network Protocol Parsers

Zheng, Mingwei, Xie, Danning +1

Network protocol parsers are essential for enabling correct and secure communication between devices. Bugs in these parsers can introduce critical vulnerabilities, including memory corruption, information leakage, and denial-of-service attacks. An intuitive way to assess parser correctness is to compare the implementation with its official protocol standard. However, this comparison is challenging because protocol standards are typically written in natural language, whereas implementations are in source code. Existing methods like model checking, fuzzing, and differential testing have been used to find parsing bugs, but they either require significant manual effort or ignore the protocol standards, limiting their ability to detect semantic violations. To enable more automated validation of parser implementations against protocol standards, we propose PARVAL, a multi-agent framework built on large language models (LLMs). PARVAL leverages the capabilities of LLMs to understand both natural language and code. It transforms both protocol standards and their implementations into a unified intermediate representation, referred to as format specifications, and performs a differential comparison to uncover inconsistencies. We evaluate PARVAL on the Bidirectional Forwarding Detection (BFD) protocol. Our experiments demonstrate that PARVAL successfully identifies inconsistencies between the implementation and its RFC standard, achieving a low false positive rate of 5.6%. PARVAL uncovers seven unique bugs, including five previously unknown issues.

October 27, 2025

Journal

OmniFuzz: A Multi-agent Reinforcement Learning Framework for Protocol-Aware Fuzzing in Power IoT Devices

Song, Yubo, Chen, Weiwei +6

Power IoT devices, as critical components in industrial control systems, often operate in heterogeneous environments and support multiple communication protocols such as Modbus TCP, EtherNet/IP, and Siemens S7. Unlike general embedded systems, these devices have strict real-time constraints, safety-critical characteristics, and a large protocol surface exposed to external networks, making them vulnerable to protocol-level attacks. However, most existing fuzzing tools only test individual protocols independently, making it difficult to detect protocol-stack-level or multi-interface vulnerabilities. To address this, we propose OmniFuzz, a protocol-aware fuzzing framework based on multi-agent reinforcement learning, specifically designed for power IoT devices. For the multi-protocol scenarios supported by the devices, the framework constructs a dedicated agent array for each protocol. Each agent mutates specific protocol fields through an independently learned policy network and collaborates via a shared value network, forming a directed multi-protocol concurrent testing mechanism. The framework incorporates a domain-specific reward function cluster (covering vulnerability severity, code path depth, and input diversity), which effectively improves testing efficiency and code coverage. OmniFuzz supports concurrent multi-protocol fuzzing during runtime, enabling comprehensive vulnerability discovery across concurrent heterogeneous protocol interfaces. Although the current implementation does not explicitly model inter-protocol behavior sequences, it lays the foundation for future exploration of cross-protocol attack paths. Experiments on real-world PLC devices from multiple vendors show that OmniFuzz outperforms baseline fuzzers by approximately 10% in terms of time to first vulnerability, exception triggering rate, and effective recognition rate. 
Through this framework, we discovered 5 high-risk buffer overflow vulnerabilities in the State Grid’s Smart-distribution-transformer-combine-terminal-unit, with relevant demonstration videos published on GitHub. Detailed descriptions of this will be provided in the discussion section.

J. Netw. Syst. Manage.

2025-08-19

Preprint

MultiFuzz: A Dense Retrieval-based Multi-Agent System for Network Protocol Fuzzing

Maklad, Youssef, Wael, Fares +3

Traditional protocol fuzzing techniques, such as those employed by AFL-based systems, often lack effectiveness due to a limited semantic understanding of complex protocol grammars and rigid seed mutation strategies. Recent works, such as ChatAFL, have integrated Large Language Models (LLMs) to guide protocol fuzzing and address these limitations, pushing protocol fuzzers to wider exploration of the protocol state space. But ChatAFL still faces issues like unreliable output, LLM hallucinations, and assumptions of LLM knowledge about protocol specifications. This paper introduces MultiFuzz, a novel dense retrieval-based multi-agent system designed to overcome these limitations by integrating semantic-aware context retrieval, specialized agents, and structured tool-assisted reasoning. MultiFuzz utilizes agentic chunks of protocol documentation (RFC Documents) to build embeddings in a vector database for a retrieval-augmented generation (RAG) pipeline, enabling agents to generate more reliable and structured outputs, enhancing the fuzzer in mutating protocol messages with enhanced state coverage and adherence to syntactic constraints. The framework decomposes the fuzzing process into modular groups of agents that collaborate through chain-of-thought reasoning to dynamically adapt fuzzing strategies based on the retrieved contextual knowledge. Experimental evaluations on the Real-Time Streaming Protocol (RTSP) demonstrate that MultiFuzz significantly improves branch coverage and explores deeper protocol states and transitions over state-of-the-art (SOTA) fuzzers such as NSFuzz, AFLNet, and ChatAFL. By combining dense retrieval, agentic coordination, and language model reasoning, MultiFuzz establishes a new paradigm in autonomous protocol fuzzing, offering a scalable and extensible foundation for future research in intelligent agentic-based fuzzing systems.

2025-10-28

Journal

LMFuzz: Program repair fuzzing based on large language models

Lin, Renze, Wang, Ran +2

Generating programs using large language models (LLMs) for fuzz testing has emerged as a significant testing methodology. While traditional fuzzers can produce correct programs, their effectiveness is limited by excessive constraints and restricted API combinations, resulting in insufficient coverage of the target system’s code and impacting testing efficiency. Unlike traditional methods, large language model based fuzzers can generate more diverse code, effectively addressing key issues of conventional fuzzers. However, the lack of constraints on API combinations during the generation process often leads to reduced program validity. Therefore, a crucial challenge is to enhance the validity of generated code while maintaining its diversity. To address this issue, we propose a novel and universal fuzzer, LMFuzz. To ensure the fuzzer’s generation capability, we utilize a large language model as the primary generator and model the operator selection problem within the fuzzing loop as a multi-armed bandit problem. We introduce the Thompson Sampling algorithm to enhance both the diversity and validity of program generation. To improve the validity of the generated code, we incorporate a program repair loop that iteratively corrects the generated programs, thereby reducing errors caused by the lack of API combination constraints. Experimental results demonstrate that LMFuzz significantly surpasses existing state-of-the-art large language model based fuzzers in terms of coverage and validity, and also exhibits notable advantages in generating diverse programs. Furthermore, LMFuzz has identified 24 bugs across five popular programming languages and their corresponding systems.

Automated Software Engineering

2022-12-01

Journal

AMSFuzz: An adaptive mutation schedule for fuzzing

Zhao, Xiaoqi, Qu, Haipeng +3

Mutation-based fuzzing is one of the most popular software testing techniques. After allocating a specific amount of energy (i.e., the number of testcases generated by the seed) for the seed, it uses existing mutation operators to continuously mutate the seed to generate new testcases and feed them into the target program to discover unexpected behaviors, such as bugs, crashes, and vulnerabilities. However, the random selection of mutation operators and sequential selection of mutation positions in existing fuzzers affect path discovery and bug detection. In this paper, a novel adaptive mutation schedule framework, AMSFuzz, is proposed. To address the random selection of mutation operators, AMSFuzz adaptively adjusts the probability distribution from which mutation operators are selected. To address the sequential selection of mutation positions, seeds are dynamically sliced into different sizes during the fuzzing process, giving more seeds the opportunity to be mutated preferentially and improving the efficiency of fuzzing. AMSFuzz is implemented and evaluated on 12 real-world programs and the LAVA-M dataset. The results show that AMSFuzz substantially outperforms state-of-the-art fuzzers in terms of path discovery and bug detection. Additionally, AMSFuzz has detected 17 previously unknown bugs in several projects, 15 of which were assigned CVE IDs.

Expert Systems with Applications

2023

Conference

DARWIN: Survival of the Fittest Fuzzing Mutators

Jauernig, Patrick, Jakobovic, Domagoj +3

Fuzzing is an automated software testing technique broadly adopted by the industry. A popular variant is mutation-based fuzzing, which discovers a large number of bugs in practice. While the research community has studied mutation-based fuzzing for years now, the algorithms' interactions within the fuzzer are highly complex and can, together with the randomness in every instance of a fuzzer, lead to unpredictable effects. Most efforts to improve this fragile interaction focused on optimizing seed scheduling. However, real-world results like Google's FuzzBench highlight that these approaches do not consistently show improvements in practice. Another approach to improve the fuzzing process algorithmically is optimizing mutation scheduling. Unfortunately, existing mutation scheduling approaches also failed to convince because of missing real-world improvements or too many user-controlled parameters whose configuration requires expert knowledge about the target program. This leaves the challenging problem of cleverly processing test cases and achieving a measurable improvement unsolved. We present DARWIN, a novel mutation scheduler and the first to show fuzzing improvements in a realistic scenario without the need to introduce additional user-configurable parameters, opening this approach to the broad fuzzing community. DARWIN uses an Evolution Strategy to systematically optimize and adapt the probability distribution of the mutation operators during fuzzing. We implemented a prototype based on the popular general-purpose fuzzer AFL. DARWIN significantly outperforms the state-of-the-art mutation scheduler and the AFL baseline in our own coverage experiment, in FuzzBench, and by finding 15 out of 21 bugs the fastest in the MAGMA benchmark. Finally, DARWIN found 20 unique bugs (including one novel bug), 66% more than AFL, in widely-used real-world applications.

Blog

FirmAgent: Leveraging Fuzzing to Assist LLM Agents with IoT Firmware Vulnerability Discovery

No abstract available

2024-04-12

Conference

Fuzz4All: Universal Fuzzing with Large Language Models

Xia, Chunqiu Steven, Paltenghi, Matteo +3

Fuzzing has achieved tremendous success in discovering bugs and vulnerabilities in various software systems. Systems under test (SUTs) that take in programming or formal language as inputs, e.g., compilers, runtime engines, constraint solvers, and software libraries with accessible APIs, are especially important as they are fundamental building blocks of software development. However, existing fuzzers for such systems often target a specific language, and thus cannot be easily applied to other languages or even other versions of the same language. Moreover, the inputs generated by existing fuzzers are often limited to specific features of the input language, and thus can hardly reveal bugs related to other or new features. This paper presents Fuzz4All, the first fuzzer that is universal in the sense that it can target many different input languages and many different features of these languages. The key idea behind Fuzz4All is to leverage large language models (LLMs) as an input generation and mutation engine, which enables the approach to produce diverse and realistic inputs for any practically relevant language. To realize this potential, we present a novel autoprompting technique, which creates LLM prompts that are well-suited for fuzzing, and a novel LLM-powered fuzzing loop, which iteratively updates the prompt to create new fuzzing inputs. We evaluate Fuzz4All on nine systems under test that take in six different languages (C, C++, Go, SMT2, Java and Python) as inputs. The evaluation shows, across all six languages, that universal fuzzing achieves higher coverage than existing, language-specific fuzzers. Furthermore, Fuzz4All has identified 98 bugs in widely used systems, such as GCC, Clang, Z3, CVC5, OpenJDK, and the Qiskit quantum computing platform, with 64 bugs already confirmed by developers as previously unknown.

2023-04-04

Preprint

Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT

Deng, Yinlin, Xia, Chunqiu Steven +4

Deep Learning (DL) library bugs affect downstream DL applications, emphasizing the need for reliable systems. Generating valid input programs for fuzzing DL libraries is challenging due to the need for satisfying both language syntax/semantics and constraints for constructing valid computational graphs. Recently, the TitanFuzz work demonstrates that modern Large Language Models (LLMs) can be directly leveraged to implicitly learn all the constraints to generate valid DL programs for fuzzing. However, LLMs tend to generate ordinary programs following similar patterns seen in their massive training corpora, while fuzzing favors unusual inputs that cover edge cases or are unlikely to be manually produced. To fill this gap, this paper proposes FuzzGPT, the first technique to prime LLMs to synthesize unusual programs for fuzzing. FuzzGPT is built on the well-known hypothesis that historical bug-triggering programs may include rare/valuable code ingredients important for bug finding. Traditional techniques leveraging such historical information require intensive human efforts to design dedicated generators and ensure the validity of generated programs. FuzzGPT demonstrates that this process can be fully automated via the intrinsic capabilities of LLMs (including fine-tuning and in-context learning), while being generalizable and applicable to challenging domains. While FuzzGPT can be applied with different LLMs, this paper focuses on the powerful GPT-style models: Codex and CodeGen. Moreover, FuzzGPT also shows the potential of directly leveraging the instruct-following capability of the recent ChatGPT for effective fuzzing. Evaluation on two popular DL libraries (PyTorch and TensorFlow) shows that FuzzGPT can substantially outperform TitanFuzz, detecting 76 bugs, with 49 already confirmed as previously unknown bugs, including 11 high-priority bugs or security vulnerabilities.

2024

Journal

Machine Learning-Based Fuzz Testing Techniques: A Survey

Zhang, Ao, Zhang, Yiying +3

Fuzz testing is a vulnerability discovery technique that tests the robustness of target programs by providing them with unconventional data. With the rapid increase in software quantity, scale and complexity, traditional fuzzing has revealed issues such as incomplete logic coverage, low automation level and insufficient test cases. Machine learning, with its exceptional capabilities in data analysis and classification prediction, presents a promising approach for improving fuzzing. This paper investigates the latest research results in fuzzing and provides a systematic review of machine learning-based fuzzing techniques. Firstly, by outlining the workflow of fuzzing, it summarizes the optimization of different stages of fuzzing using machine learning. Specifically, it focuses on the application of machine learning in the preprocessing phase, test case generation phase, input selection phase and result analysis phase. Secondly, it focuses on the optimization methods of machine learning in the process of mutation, generation and filtering of test cases and compares and analyzes their technical principles. Furthermore, it analyzes the performance gains brought by applying machine learning techniques to fuzzing, mainly including coverage, vulnerability detection capability, efficiency and effectiveness of test cases. Lastly, it concludes by summarizing the challenges and difficulties in combining machine learning with fuzzing and presents prospects for future trends in this field.

FLUSH+RELOAD: A High Resolution, Low Noise, L3 Cache Side-Channel Attack

Yarom, Yuval, Falkner, Katrina

No abstract available

January 2018

Preprint

Spectre Attacks: Exploiting Speculative Execution

Kocher, Paul, Genkin, Daniel +8

Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try to guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access the victim's memory and registers, and can perform operations with measurable side effects. Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, static analysis, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing/side-channel attacks. These attacks represent a serious threat to actual systems, since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices. While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak.

January 2018

Preprint

Meltdown

Lipp, Moritz, Schwarz, Michael +8

The security of computer systems fundamentally relies on memory isolation, e.g., kernel address ranges are marked as non-accessible and are protected from user access. In this paper, we present Meltdown. Meltdown exploits side effects of out-of-order execution on modern processors to read arbitrary kernel-memory locations including personal data and passwords. Out-of-order execution is an indispensable performance feature and present in a wide range of modern processors. The attack works on different Intel microarchitectures since at least 2010 and potentially other processors are affected. The root cause of Meltdown is the hardware. The attack is independent of the operating system, and it does not rely on any software vulnerabilities. Meltdown breaks all security assumptions given by address space isolation as well as paravirtualized environments and, thus, every security mechanism building upon this foundation. On affected systems, Meltdown enables an adversary to read memory of other processes or virtual machines in the cloud without any permissions or privileges, affecting millions of customers and virtually every user of a personal computer. We show that the KAISER defense mechanism for KASLR has the important (but inadvertent) side effect of impeding Meltdown. We stress that KAISER must be deployed immediately to prevent large-scale exploitation of this severe information leakage.

Artwork

映像研には手を出すな！公式ガイド映像研活動報告

小学館

No abstract available

Journal

return-to-csu: A New Method to Bypass 64-bit Linux ASLR

Hector, Dr

No abstract available

2000

Standard

PDF reference: Adobe portable document format, version 1.3

Adobe Systems

No abstract available

Addison-Wesley

2009

Book

琢石成器——Windows 环境下 32 位汇编语言程序设计

罗云彬

No abstract available

电子工业出版社

2014

Book

逆向工程核心原理

李承远

No abstract available

人民邮电出版社

2023

Book

x86 汇编语言——从实模式到保护模式

李忠, 王晓波 +1

No abstract available

电子工业出版社

2019

Book

汇编语言（第 4 版）

王爽

No abstract available

清华大学出版社

2019

Book

C 程序设计语言

Brian W. Kernighan, Dennis M. Ritchie

No abstract available

机械工业出版社

2018

Book

高效能人士的七个习惯

Stephen R. Covey

No abstract available

中国青年出版社

2019

Book

Introduction to Linear Algebra

Gilbert Strang

No abstract available

Wellesley-Cambridge Press

2018

Book

理想国

Πλάτων, Plato

No abstract available

研究出版社

2012

Book

思考，快与慢

Daniel Kahneman

No abstract available

中信出版集团

Book

凉宫春日的剧场

谷川流

No abstract available

百花川文艺出版社

Book

呼啸山庄

Emily Bronte

No abstract available

2012

Book

Different Seasons

Stephen King

No abstract available

Hodder

2014

Book

人类群星闪耀时

Stefan Zweig

No abstract available

中国友谊出版公司

2021

Book

哲学小史：西方哲学四十讲

Nigel Warburton

No abstract available

北京出版社

2025

Book

飞鸟集

Rabindranath Tagore

No abstract available

西苑出版社

2023

Book

Thomas' Calculus

Joel R. Hass, Christopher D. Heil +1

No abstract available

Pearson

2024

Book

C Primer Plus

Stephen Prata

No abstract available

Pearson

2025

Book

物联网安全漏洞挖掘实战

崔洪权

No abstract available

人民邮电出版社

Book

天紀

倪海厦

No abstract available

Book

少有人走的路

M. Scott Peck

No abstract available

北京联合出版公司

Book

悲惨世界

Hugo, V.

No abstract available

云南人民出版社

Book

西西弗神话

Albert Camus

No abstract available

重庆大学出版社

Book

鼠疫

Albert Camus

No abstract available

重庆大学出版社

2021

Book

西方哲学史

Bertrand Russell

No abstract available

万卷出版公司

Book

程序员的自我修养：链接、装载与库

俞甲子, 石凡 +1

No abstract available

电子工业出版社

2016

Book

深入理解计算机系统

Randal E. Bryant, David R. O’Hallaron

No abstract available

机械工业出版社

2016

Book

操作系统真象还原

郑纲

No abstract available

人民邮电出版社

2020

Book

CTF竞赛权威指南（Pwn篇）

杨超

No abstract available

电子工业出版社

2017

Book

Linux 二进制分析

Ryan O'Neill

No abstract available

人民邮电出版社

Book

局外人

Albert Camus

No abstract available

重庆大学出版社

May 2014

Conference

Hacking Blind

Bittau, Andrea, Belay, Adam +3

We show that it is possible to write remote stack buffer overflow exploits without possessing a copy of the target binary or source code, against services that restart after a crash. This makes it possible to hack proprietary closed-binary services, or open-source servers manually compiled and installed from source where the binary remains unknown to the attacker. Traditional techniques are usually paired against a particular binary and distribution where the hacker knows the location of useful gadgets for Return Oriented Programming (ROP). Our Blind ROP (BROP) attack instead remotely finds enough ROP gadgets to perform a write system call and transfers the vulnerable binary over the network, after which an exploit can be completed using known techniques. This is accomplished by leaking a single bit of information based on whether a process crashed or not when given a particular input string. BROP requires a stack vulnerability and a service that restarts after a crash. We implemented Braille, a fully automated exploit that yielded a shell in under 4,000 requests (20 minutes) against a contemporary nginx vulnerability, yaSSL + MySQL, and a toy proprietary server written by a colleague. The attack works against modern 64-bit Linux with address space layout randomization (ASLR), no-execute page protection (NX) and stack canaries.

August 2023

Book

智能汽车网络安全权威指南（下册）

李程

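《智能汽车网络安全权威指南》 was written by the head of the security team at a well-known Chinese electric-vehicle manufacturer together with core members of that team. Guided by the principle of "shifting security left," it gives a comprehensive and thorough account of automotive cybersecurity organized around nine core topics: security compliance, security standards, security systems, security testing, secure development, security operations, network attack and defense, threat assessment, and autonomous-driving security. It is a standard work in the field of automotive cybersecurity. This is Volume 2 (Chapters 11 to 21). It summarizes in detail the attack mindset and methods of car hackers, and lists common attack techniques and defensive measures from both the automotive cybersecurity architecture perspective and the vehicle function and application perspective. It systematically explains cybersecurity strategies that cover the whole vehicle development cycle, and it gives a forward-looking treatment of advanced driver-assistance security and vehicle charging-network security.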

机械工业出版社

July 2023

Book

智能汽车网络安全权威指南（上册）

李程, 陈楠 +1

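《智能汽车网络安全权威指南》 was written by the head of the security team at a well-known Chinese electric-vehicle manufacturer together with core members of that team. Guided by the principle of "shifting security left," it gives a comprehensive and thorough account of automotive cybersecurity organized around nine core topics: security compliance, security standards, security systems, security testing, secure development, security operations, network attack and defense, threat assessment, and autonomous-driving security. It is a standard work in the field of automotive cybersecurity. This is Volume 1 (Chapters 1 to 10). It traces the development of automotive safety and the three major safety themes of functional safety, safety of the intended functionality, and cybersecurity. It explains in detail the vehicle's network composition, network communication protocols, and electrical/electronic architecture, as well as cybersecurity from the architecture and function perspectives. It gives particular attention to the compliance framework for automotive cybersecurity. From an attack-and-defense perspective, it explains how a hacker can break into a car with no barrier to entry, and how to use cybersecurity testing tools for a range of scenarios.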

机械工业出版社

April 2013

Book