Blacklists assemble: Aggregating blacklists for accuracy

Abstract

IP address blacklists are a useful defense against various cyberattacks. Because they contain IP addresses of known offenders, they can be used to preventively filter unwanted traffic and reduce the load on more resource-intensive defenses. Yet blacklists today suffer from several drawbacks. First, they are compiled and updated using proprietary methods, so it is hard to evaluate the accuracy and freshness of their information. Second, blacklists often focus on a single attack type, e.g., spam, while compromised machines are constantly and indiscriminately reused for many attacks. Finally, blacklists contain IP addresses, which lowers their accuracy in networks that use dynamic addressing.
We propose BLAG, a sophisticated approach to select, aggregate, and selectively expand only the accurate pieces of information from multiple blacklists. BLAG calculates the accuracy of each blacklist over regions of the address space, and uses recommendation systems to select the most reputable and accurate pieces of information to aggregate into its master blacklist. This aggregation increases recall by 3–14% compared to the best-performing blacklist, while preserving high specificity. After aggregation, BLAG identifies networks that have dynamic addressing or a high degree of mismanagement. IP addresses from such networks are selectively expanded into /24 prefixes. This further increases offender detection by 293–411%, with minimal loss in specificity. Overall, BLAG achieves high specificity (85–89%) and high recall (26–61%), which makes it a promising approach for blacklist generation.
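The aggregate-then-expand idea and the two reported metrics can be illustrated with a minimal Python sketch. It is not BLAG's implementation. The recommendation-system scoring is not modeled, and a plain union of the input lists stands in for it. All function names, the `dynamic_prefixes` input, and the sample addresses are assumptions made for illustration.

```python
import ipaddress


def expand_to_slash24(addr):
    """Return the /24 prefix that contains the given IPv4 address."""
    return ipaddress.ip_network(f"{addr}/24", strict=False)


def build_master(blacklists, dynamic_prefixes):
    """Aggregate several blacklists and selectively expand entries.

    blacklists: dict mapping a (hypothetical) list name to a set of IPv4 strings.
    dynamic_prefixes: prefixes assumed to use dynamic addressing or to be
        mismanaged; how BLAG identifies them is not modeled here.
    Returns (addresses, prefixes): single addresses kept as-is, and /24
    prefixes produced by expansion.
    """
    dynamic = [ipaddress.ip_network(p) for p in dynamic_prefixes]
    addresses, prefixes = set(), set()
    for entries in blacklists.values():  # plain union, a stand-in for scoring
        for addr in entries:
            ip = ipaddress.ip_address(addr)
            if any(ip in net for net in dynamic):
                prefixes.add(expand_to_slash24(addr))
            else:
                addresses.add(ip)
    return addresses, prefixes


def is_blocked(addr, addresses, prefixes):
    """True if the address is listed directly or falls in an expanded prefix."""
    ip = ipaddress.ip_address(addr)
    return ip in addresses or any(ip in net for net in prefixes)


def recall_and_specificity(addresses, prefixes, offenders, legitimate):
    """Recall = blocked offenders / all offenders;
    specificity = unblocked legitimate sources / all legitimate sources."""
    tp = sum(is_blocked(a, addresses, prefixes) for a in offenders)
    tn = sum(not is_blocked(a, addresses, prefixes) for a in legitimate)
    return tp / len(offenders), tn / len(legitimate)


if __name__ == "__main__":
    lists = {"spam": {"203.0.113.7"}, "ddos": {"198.51.100.20"}}
    addrs, nets = build_master(lists, ["203.0.113.0/24"])
    offenders = ["203.0.113.99", "198.51.100.20", "192.0.2.1"]
    legitimate = ["203.0.113.5", "198.51.100.21"]
    print(recall_and_specificity(addrs, nets, offenders, legitimate))
```

In this toy input, expansion catches an offender (`203.0.113.99`) that no list named directly, but it also blocks a legitimate source in the same /24. That is the recall versus specificity trade-off the abstract describes.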

Date: 2018
Authors: Sivaramakrishnan Ramanathan, Jelena Mirkovic, Minlan Yu
Publisher: Technical Report ISI-TR-730, Information Sciences Institute
