Layer 1 (L1) protocols are the backbone of blockchain systems: they define the consensus algorithms, transaction formats, and data structures that everything else builds on. Testing them, however, is complex and challenging because their components are intricate and interdependent. In this article, we discuss the difficulties of testing L1 protocols and explore how fuzzing can help discover and mitigate security vulnerabilities in the protocol layer of blockchain systems.
Fuzzing is an automated testing technique that applies to blockchain protocols and smart contracts alike. It generates a very large number of semi-random inputs and feeds them to the system under test. The goal is to find security vulnerabilities by triggering unexpected or invalid behavior in that system.
Fuzzing is an amazing tool for layer 1 audits, especially when testing a virtual machine, as it automates many of the steps needed to check the security and stability of L1s. The best part is that the data is not as random as it seems. A fuzzer uses code coverage to see which inputs reach new code paths and which do not, then keeps and mutates the productive ones. By doing this, it can uncover really valuable insights that are very hard to spot with the human eye.
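To make the coverage feedback loop concrete, here is a minimal sketch in Python. It is not how Hacken-Fuzz or OSS-Fuzz is implemented; the `target` function, its planted bug, and the hand-rolled branch tracking are all assumptions made for illustration. Real fuzzers get coverage from compiler or runtime instrumentation.

```python
import random

class Crash(Exception):
    """Raised by the toy target when it reaches its planted bug."""

def target(data: bytes, coverage: set) -> None:
    """Hypothetical parser; every branch it reaches is recorded in `coverage`."""
    coverage.add("entry")
    if len(data) > 0 and data[0] == ord("F"):
        coverage.add("F")
        if len(data) > 1 and data[1] == ord("U"):
            coverage.add("FU")
            if len(data) > 2 and data[2] == ord("Z"):
                coverage.add("FUZ")
                raise Crash("planted bug reached")

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte, or append one, to get a nearby input."""
    buf = bytearray(data)
    if buf and rng.random() < 0.7:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(max_iterations: int, seed: int = 0):
    """Return (crashing_input, iterations_used), or (None, max_iterations)."""
    rng = random.Random(seed)
    corpus = [b""]   # inputs worth mutating further
    seen = set()     # every branch reached so far
    for i in range(max_iterations):
        candidate = mutate(rng.choice(corpus), rng)
        cov = set()
        try:
            target(candidate, cov)
        except Crash:
            return candidate, i + 1
        if not cov <= seen:          # the input reached a new branch: keep it
            seen |= cov
            corpus.append(candidate)
    return None, max_iterations

if __name__ == "__main__":
    print(fuzz(200_000))
```

Purely random three-byte inputs would hit the planted bug about once in 16.7 million tries. Because each input that reaches a new branch is kept and mutated further, the loop only has to guess one byte at a time.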
For smart contracts, fuzzing involves analyzing the ABI or bytecode, generating randomized valid inputs for their functions, executing the contract with these inputs, and examining the results to identify and report any weaknesses.
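The input-generation step can be sketched as follows. This is a simplified illustration, assuming a hypothetical two-function ABI fragment and support for only three argument types. Executing the contract and examining the results are left out, because they depend on the EVM tooling in use.

```python
import random

# Hypothetical, simplified fragment in the style of Solidity's JSON ABI.
ABI = [
    {"type": "function", "name": "transfer",
     "inputs": [{"name": "to", "type": "address"},
                {"name": "amount", "type": "uint256"}]},
    {"type": "function", "name": "setPaused",
     "inputs": [{"name": "paused", "type": "bool"}]},
]

def random_value(abi_type: str, rng: random.Random):
    """Return a random value that is valid for a small subset of ABI types."""
    if abi_type.startswith("uint"):
        bits = int(abi_type[4:] or 256)
        upper = 2**bits - 1
        # Bias towards boundary values, where overflow bugs tend to hide.
        return rng.choice([0, 1, upper - 1, upper, rng.randrange(upper + 1)])
    if abi_type == "bool":
        return rng.random() < 0.5
    if abi_type == "address":
        return "0x" + rng.getrandbits(160).to_bytes(20, "big").hex()
    raise ValueError(f"unsupported ABI type in this sketch: {abi_type}")

def random_call(abi, rng: random.Random):
    """Pick a function from the ABI and build type-correct arguments for it."""
    functions = [entry for entry in abi if entry["type"] == "function"]
    fn = rng.choice(functions)
    args = [random_value(param["type"], rng) for param in fn["inputs"]]
    return fn["name"], args

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(random_call(ABI, rng))
```

Generating arguments from the declared types keeps the inputs valid enough to get past argument decoding, so the fuzzing budget is spent on the contract's own logic.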
Fuzzing is a powerful testing method, especially when used at scale, but it requires a lot of resources and time. Building a fuzz target is the biggest challenge because it requires a lot of manual work.
To ensure high consistency and save us some time, we’ve automated target building with a tool called Hacken-Fuzz.
Our Hacken-Fuzz tool automates the whole process of setting up a fuzz target. It works with C, C++, Rust, Go, JavaScript, JVM languages, Swift, and Python. Hacken-Fuzz is effective because it runs on 50 servers at once, thanks to our infrastructure based on OSS-Fuzz.
With the right infrastructure, you can do amazing things with fuzzing, for example, detect code errors that are practically impossible to find manually. These include critical weaknesses that result in node crashes, such as infinite loops, buffer overflows, and uncontrolled memory allocations. Finding and resolving these vulnerabilities helps protect blockchains from a wide range of protocol layer attacks, such as long-range, 51%, and race attacks.
For smart contracts, fuzzing can be especially helpful in the areas of integer overflows/underflows, denial-of-service possibilities, tokenomics verification, gas-related issues, delegate call verification, and reentrancy.
Fuzzing can be particularly useful in executing large, time-consuming operations or verifying manual analysis to identify false positives and overlooked issues. While fuzzing is a valuable testing tool, it should not be considered a comprehensive substitute for a thorough audit.
At Hacken, we believe that fuzzing is an effective method for exploring security vulnerabilities in L1 protocols for the following reasons.
Simulating user behavior. Fuzzing can simulate real-world scenarios by generating random inputs that mimic user behavior. This means that fuzzing can test the system’s robustness against various attacks, including those that are difficult to anticipate.
Finding hidden vulnerabilities. Fuzzing can find security vulnerabilities that other testing methods might miss. For example, manual testing or static analysis might not detect certain logic errors or race conditions that only surface under extreme inputs.
Scalability in L1. Fuzzing can scale to handle the complexity of L1 protocols with their interdependent components. Fuzzing can generate many inputs and explore various execution paths, increasing the chance of finding security vulnerabilities.
Hacken fuzzing proves its worth in our daily work as auditors. I will share some details from a blockchain protocol audit for one of our clients, Minima. Minima is a Java project building a blockchain for smartphones entirely from scratch. They developed everything on their own, including a virtual machine and a language for it. We found a dozen critical issues with the fuzzer, which were later verified manually. Our client fixed all the vulnerabilities, so we can safely discuss them here.
As a result, we used fuzzing to uncover the following critical vulnerabilities:
See the full audit report with detailed descriptions and processes here.
These were critical errors (i.e., dangerous enough to crash the virtual machine). They could not have been found manually, because only a machine can take testing to the extreme, such as concatenating strings until they take up 2 GB of memory or feeding in numbers of effectively unbounded size.
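The string-concatenation case can be illustrated with a toy example. The sketch below is not Minima's virtual machine or scripting language; the three-opcode stack machine, the opcode names, and the 1 MiB budget are assumptions chosen to show how a harness can turn runaway memory growth into a reportable finding instead of an out-of-memory crash.

```python
import random

MAX_STRING_BYTES = 1 << 20  # assumed 1 MiB budget for this sketch

class ResourceLimitExceeded(Exception):
    """Raised when a program tries to build a string larger than the budget."""

def run(program, max_bytes=MAX_STRING_BYTES):
    """Interpret opcodes on a stack of strings, enforcing a size budget."""
    stack = []
    for op in program:
        if op == "PUSH":
            stack.append("a")
        elif op == "DUP" and stack:
            stack.append(stack[-1])
        elif op == "CONCAT" and len(stack) >= 2:
            right, left = stack.pop(), stack.pop()
            if len(left) + len(right) > max_bytes:
                raise ResourceLimitExceeded(len(left) + len(right))
            stack.append(left + right)
        # Opcodes that cannot execute on the current stack are skipped.
    return stack

def fuzz_vm(iterations, program_len=64, max_bytes=MAX_STRING_BYTES, seed=0):
    """Run random programs and collect those that exceed the budget."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        program = [rng.choice(["PUSH", "DUP", "CONCAT"]) for _ in range(program_len)]
        try:
            run(program, max_bytes)
        except ResourceLimitExceeded:
            findings.append(program)
    return findings

if __name__ == "__main__":
    # Each DUP + CONCAT pair doubles the string, so growth is exponential.
    doubling = ["PUSH"] + ["DUP", "CONCAT"] * 31
    try:
        run(doubling)
    except ResourceLimitExceeded as exc:
        print("budget exceeded while building", exc.args[0], "bytes")
```

Each `DUP` and `CONCAT` pair doubles the string, so 31 pairs would turn a single byte into 2 GB if nothing stopped them. A human tester is unlikely to type that program, but a fuzzer that keeps inputs producing longer strings can drift towards it.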
Despite all the advantages of fuzzing, it’s not a magic pill; you need a full audit and a knowledgeable developer to review and verify code errors. On top of that, the fuzzer won’t find all the errors to begin with. Here are some of the most pressing challenges of fuzzing:
You need to fix issues as they are discovered, or else fuzzing won't find anything else. Once a fuzzer finds a critical issue, it will stumble on the same issue again and again instead of revealing the others the codebase may have. So you fix each detected item as you go, and then you fuzz again. You may consider this a problem, but it is also a good incentive to actually resolve the weaknesses that were found.
Problematic with complex bugs. Fuzzing is hard to apply to bugs that require many steps to execute. For example, if a bug requires three transactions to execute, fuzzing might not be able to detect it because generating such inputs is hard.
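One common way to approach multi-step bugs is to fuzz sequences of transactions rather than single inputs. The sketch below assumes a hypothetical `ToyChannel` component whose planted bug only triggers after three transactions in the right order. The class, the transaction names, and the bug are all made up for illustration.

```python
import random

TX_TYPES = ["open", "fund", "close", "withdraw"]

class ToyChannel:
    """Hypothetical stateful component with a bug that needs three transactions."""

    def __init__(self):
        self.opened = False
        self.closed = False
        self.balance = 0

    def apply(self, tx):
        if tx == "open":
            self.opened = True
        elif tx == "fund" and self.opened and not self.closed:
            self.balance += 1
        elif tx == "close" and self.opened:
            self.closed = True
        elif tx == "withdraw" and self.closed:
            # Planted bug: only reachable after "open" and then "close".
            raise RuntimeError("withdraw on a closed channel")

def replay(sequence):
    """Run a sequence against a fresh state; return True if it crashes."""
    channel = ToyChannel()
    try:
        for tx in sequence:
            channel.apply(tx)
    except RuntimeError:
        return True
    return False

def fuzz_sequences(iterations, max_len=8, seed=0):
    """Return the first random transaction sequence that crashes, or None."""
    rng = random.Random(seed)
    for _ in range(iterations):
        length = rng.randint(1, max_len)
        sequence = [rng.choice(TX_TYPES) for _ in range(length)]
        if replay(sequence):
            return sequence
    return None

if __name__ == "__main__":
    print(fuzz_sequences(5000))
```

No single transaction crashes a fresh `ToyChannel`, so a one-input fuzzer would never see the bug. With four transaction types the search space is tiny. In a real protocol, where each transaction also carries fields and signatures, the number of possible sequences grows much faster, which is why such bugs are hard for a fuzzer to reach.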
Time-consuming. Generating many inputs and exploring various execution paths can take a lot of time, especially for complex L1 protocols. Also, we can't know in advance how long a given type of vulnerability will take to surface.
May miss logic errors (unless specifically designed for it). A fuzzer reports what its harness can observe, such as crashes, hangs, or failed assertions. If the system lacks proper error handling or explicit checks that flag an incorrect state, fuzzing might be unable to detect logic errors that lead to unexpected behavior.
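One way to make a logic error visible to a fuzzer is to check an invariant after every input. The following sketch assumes a hypothetical `transfer` function with a planted bug that never crashes. The only thing that exposes it is the harness's check that the total supply stays constant.

```python
import random

INITIAL = {"alice": 100, "bob": 100, "carol": 100}

def transfer(balances, sender, receiver, amount):
    """Hypothetical ledger update with a planted logic bug; it never crashes."""
    if amount <= 0 or balances[sender] < amount:
        return dict(balances)  # rejected: state unchanged
    new_sender = balances[sender] - amount
    new_receiver = balances[receiver] + amount
    updated = dict(balances)
    updated[sender] = new_sender
    updated[receiver] = new_receiver  # bug: a self-transfer mints `amount`
    return updated

def fuzz_invariant(iterations, seed=0):
    """Return the first (sender, receiver, amount) that changes total supply."""
    rng = random.Random(seed)
    accounts = list(INITIAL)
    supply = sum(INITIAL.values())
    for _ in range(iterations):
        sender = rng.choice(accounts)
        receiver = rng.choice(accounts)
        amount = rng.randint(-10, 150)
        after = transfer(INITIAL, sender, receiver, amount)
        if sum(after.values()) != supply:  # the oracle: supply must be conserved
            return sender, receiver, amount
    return None

if __name__ == "__main__":
    print(fuzz_invariant(2000))
```

Without the supply check, every call to `transfer` returns normally and the fuzzer has nothing to report. The invariant is what "designing the fuzzer for logic errors" amounts to here.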
Problematic with stateful components. Fuzzing stateful components requires keeping track of the system's state, which can be challenging, especially for complex L1 protocols. It is hard to tell whether the fuzzer can reach the state needed to detect an issue.
Problematic with multi-threaded applications. Finally, fuzzing is hard to apply to multi-threaded applications as it requires keeping track of the system’s concurrency, which can be challenging, especially for complex L1 protocols.
Fuzzing is an amazing tool for blockchain audits that's especially useful when testing a virtual machine, as it automates many processes. No single person can beat a fuzzer at semi-random testing. Nevertheless, you still need 50-100+ servers to make it effective and, of course, the expertise to set up the target, but thankfully we at Hacken have built the necessary capabilities.
Yet, don’t get me wrong: fuzzing doesn’t substitute for manual review, it only makes it more effective. You still need a knowledgeable, experienced, and trusted blockchain auditor to verify issues found during fuzz testing, detect issues that cannot be found by random input generation, and recommend appropriate code fixes for all errors.