Finding Bugs Faster Than Hackers

USC Information Sciences Institute researchers have developed a novel approach to quickly identify security vulnerabilities.

by Julia Cohen

August 8, 2022

binary code with an error — Photo credit: andriano_cz/Getty Images

Malware, viruses, spyware, bots and more! Hackers have many tools at their disposal to ruin your day through your vulnerable technology. As we become increasingly dependent on internet-connected products (e.g., phones, computers, smart home devices), and everything from toasters to toothbrushes can be connected to the internet, we must be ever vigilant against malicious attacks.

Preventing such attacks is the goal of a group of researchers in the Binary Analysis and Systems Security (BASS) group at USC Viterbi’s Information Sciences Institute (ISI). They will be presenting their new paper — written in collaboration with Arizona State University, Cisco Systems Inc. and EURECOM — at the upcoming 31st Annual USENIX Security Symposium, one of the premier conferences in the cybersecurity space, held August 10-12 in Boston, Mass.

“This paper is about vulnerability discovery, which is finding security bugs in software that attackers or hackers could exploit to get control of remote systems, leak information, or any number of bad things,” said co-author and co-advisor Christophe Hauser, a research computer scientist at ISI and research lead.

Co-author Nicolaas Weideman adds that, in particular, it’s about automated vulnerability discovery. “Because computer programs are so large and complicated these days, we’d like to automatically detect these vulnerabilities instead of having a human expert analyzing the program to find them.”

Searching for bugs in the zeros and ones

The paper proposes a novel technique for automated vulnerability discovery at the binary level. Hauser explains, “One of the specificities of this research is that we analyzed software not at the source code level, but we actually analyzed it at the binary level, the executable code. These are instructions that talk directly to the machine, they’re not instructions meant for humans to understand.”

Current state-of-the-art binary program analysis approaches are limited by inherent trade-offs between accuracy and scalability. Static vulnerability detection techniques (the analysis of a program without actually running it) are limited in how accurate they can be, while dynamic vulnerability detection techniques (the analysis of a program while it’s running) are difficult to scale up to large programs and are therefore slow to apply broadly.

Introducing ARBITER

In their paper, the researchers propose a hybrid method that uses both static and dynamic vulnerability detection techniques to improve the precision of the former and the scalability of the latter. The team implemented their technique, creating a prototype called ARBITER, and found that they could make several advancements in the automatic analysis of binary code.
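To make the idea of combining a cheap static pass with a costlier dynamic one concrete, here is a minimal toy sketch in Python. It is not ARBITER's algorithm and does not analyze real binaries: the `CopySite` model, the function names and the sample data are all hypothetical, invented for illustration. The static step over-approximates by flagging every copy whose size depends on input, and the dynamic step runs only those flagged candidates on concrete inputs to look for an actual overflow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class CopySite:
    """Hypothetical model of a call that copies size(n) bytes into a fixed buffer."""
    name: str
    buffer_len: int
    size: Callable[[int], int]  # bytes copied, given attacker-controlled input n
    constant_size: bool         # True when the size ignores the input entirely


def static_filter(sites: Iterable[CopySite]) -> List[CopySite]:
    """Cheap pass: keep every site whose size depends on input.

    This over-approximates, so safe sites (false positives) can slip through.
    """
    return [s for s in sites if not s.constant_size]


def dynamic_confirm(site: CopySite, inputs: Iterable[int] = range(256)) -> Optional[int]:
    """Costly pass: try concrete inputs and return the first one that overflows."""
    for n in inputs:
        if site.size(n) > site.buffer_len:
            return n
    return None


def find_overflows(sites: Iterable[CopySite]) -> Dict[str, int]:
    """Run the costly check only on candidates that survive the cheap filter."""
    confirmed: Dict[str, int] = {}
    for site in static_filter(sites):
        witness = dynamic_confirm(site)
        if witness is not None:
            confirmed[site.name] = witness
    return confirmed


# Assumed sample data: three copy sites, each writing into a 64-byte buffer.
SITES = [
    CopySite("fixed_header", 64, lambda n: 16, True),
    CopySite("clamped_copy", 64, lambda n: min(n, 64), False),
    CopySite("unchecked_copy", 64, lambda n: n, False),
]

if __name__ == "__main__":
    print(find_overflows(SITES))
```

In this toy setup the static filter discards `fixed_header` without running anything, and the dynamic check then clears `clamped_copy` as a false positive, so only `unchecked_copy` is reported. That division of labor is the general appeal of a hybrid design: the expensive step runs on far fewer candidates, and it removes false alarms the cheap step cannot rule out.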

Hauser said, “This improves the security of software by giving security analysts the ability to scale up, so we can essentially detect security bugs that hackers could try to exploit before they find them. ARBITER can find bugs fast so that they can be fixed quickly by the developers, which means more security.”

They demonstrated the effectiveness of ARBITER with a large-scale evaluation on four common vulnerability classes. Covering four classes is notable in itself. Weideman said, “In the past when static and dynamic execution were combined it was for only one very specific type of vulnerability. ARBITER, on the other hand, allows us to specify multiple vulnerabilities.”

ARBITER takes on the real world

The team put ARBITER to the test in a real-world application. “Essentially what we did was analyze all the packages of one of the most common Linux distributions. It’s an operating system that’s used in servers and desktops all around the world,” Hauser said. He continued, “So this is not just a research prototype that we tried on a small scale experiment in the corner of a lab somewhere; we’ve applied it to large pieces of software that people use every day.”

Why Linux? Weideman said, “Linux is free, which is ideal for repeatability. It allows anyone to set up the experimental environment and verify the results. There are also many open source programs created for Linux. Even though ARBITER only uses the binary instructions, the source code is available for us to verify the results.”

When ARBITER ran on the Linux distribution’s packages, it did in fact find vulnerabilities. Weideman said, “Now that these vulnerabilities have been discovered, they get reported to the developers who then fix the vulnerabilities to secure the software, so I would say ARBITER has already made an impact in the real world.”

What’s next for ARBITER?

These results pave the road for future research in this area. When asked what the next steps will be, Hauser responded, “There are still things we need to scale up for because the models that we’re using are sometimes hitting hard limits, theoretical limits that we can’t go beyond unless we try to approach things slightly differently.”

One approach is to use recent advances in artificial intelligence — in particular, machine learning models — as a way to bring additional, external knowledge into the computation. “One way to do it is to actually look at it in a more probabilistic manner, for example, using machine learning to help push these limits further. By leveraging machine learning we can automatically determine the best soundness trade-offs,” said Hauser. “That is part of future research, in fact, we are actually exploring these directions at the BASS group.”

Hauser, Weideman and the research team are part of the BASS research group (Binary Analysis and Systems Security) in the Networking and Cybersecurity division at ISI. Their research focuses on binary program analysis for automated and semi-automated reverse engineering and vulnerability discovery, as well as other aspects of systems security. As they are doing with future research around ARBITER, the BASS group often leverages machine learning where appropriate through collaboration with ISI’s Artificial Intelligence division.

The paper will be presented at the upcoming 31st Annual USENIX Security Symposium, which has an acceptance rate of 14.5%, down from last year’s 18.7%. This year only 79 of the 546 papers submitted were accepted. Says Hauser, “USENIX Security is one of the top four conferences in systems security.”

Last updated on May 16th, 2024