LANDER:long flows D1 2 days-20090605

From Predict

README version: 2617, last modified: 2012-03-16.

This file describes the trace dataset "long_flows_D1_2_days-20090605"
provided by the LANDER project. The most recent version of this file can be
found on-line at
http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D1_2_days-20090605.

LANDER Metadata

(http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D1_2_days-20090605/landermeta)

+--------------------------------------------------------------------------------------------------------+
|dataSetName               |long_flows_D1_2_days-20090605                                                |
|--------------------------+-----------------------------------------------------------------------------|
|status                    |usc-web-and-predict                                                          |
|--------------------------+-----------------------------------------------------------------------------|
|shortDesc                 |2 days of IP flows in 2009.                                                  |
|--------------------------+-----------------------------------------------------------------------------|
|longDesc                  |This dataset contains IP flow records spanning two days in a modified Argus  |
|                          |format. The durations of flows are ranging from seconds to hours and days.   |
|                          |Different durations of flows are organized into different directories, and   |
|                          |the flow duration increases exponentially. All IP addresses in this dataset  |
|                          |are host-only anonymized.                                                    |
|--------------------------+-----------------------------------------------------------------------------|
|datasetCategory           |Traffic Flow Data                                                            |
|--------------------------+-----------------------------------------------------------------------------|
|datasetSubCategory        |Long-lived Flow Summarization-Host Only Anon                                 |
|--------------------------+-----------------------------------------------------------------------------|
|requestReviewRequired     |true                                                                         |
|--------------------------+-----------------------------------------------------------------------------|
|productReviewRequired     |false                                                                        |
|--------------------------+-----------------------------------------------------------------------------|
|ongoingMeasurement        |false                                                                        |
|--------------------------+-----------------------------------------------------------------------------|
|collectionStartDate       |2009-06-05                                                                   |
|--------------------------+-----------------------------------------------------------------------------|
|collectionStartTime       |12:56:38                                                                     |
|--------------------------+-----------------------------------------------------------------------------|
|collectionEndDate         |2009-06-07                                                                   |
|--------------------------+-----------------------------------------------------------------------------|
|collectionEndTime         |07:37:12                                                                     |
|--------------------------+-----------------------------------------------------------------------------|
|availabilityStartDate     |                                                                             |
|--------------------------+-----------------------------------------------------------------------------|
|availabilityStartTime     |                                                                             |
|--------------------------+-----------------------------------------------------------------------------|
|availabilityEndDate       |                                                                             |
|--------------------------+-----------------------------------------------------------------------------|
|availabilityEndTime       |                                                                             |
|--------------------------+-----------------------------------------------------------------------------|
|anonymization             |true                                                                         |
|--------------------------+-----------------------------------------------------------------------------|
|archivingAllowed          |                                                                             |
|--------------------------+-----------------------------------------------------------------------------|
|keywords                  |host-only-ip-anonymization, long-flows, flow-statistics                      |
|--------------------------+-----------------------------------------------------------------------------|
|format                    |text                                                                         |
|--------------------------+-----------------------------------------------------------------------------|
|access                    |https                                                                        |
|--------------------------+-----------------------------------------------------------------------------|
|hostName                  |USC-LANDER                                                                   |
|--------------------------+-----------------------------------------------------------------------------|
|privateAccessInstructions |See http://www.isi.edu/ant/traces/index.html#getting_datasets for information|
|                          |on obtaining this dataset.                                                   |
|                          |See                                                                          |
|                          |http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D1_2_days-20090605   |
|                          |for details on this dataset.                                                 |
+--------------------------------------------------------------------------------------------------------+

Dataset Contents

This dataset contains IP flow records spanning two days, from 2009-06-05 to
2009-06-07, in a modified Argus format. The durations of flows range from
seconds to hours and days. Flows of different durations are organized into
different directories, as shown below.

    r0/    level 0 flow files, up to 10 minutes long
    r1/    level 1 flow files, up to 20 minutes long
    r2/    level 2 flow files, up to 40 minutes long
    ...
    r8/    level 8 flow files, up to 2 days long

The flow duration increases exponentially, with a base duration of 10
minutes. Two level i flow files (numbered 2n and 2n+1) are merged into a
level i+1 file (numbered n).

All IP addresses are prefix-preserving, host-only anonymized, so the top 24
bits of each address are unchanged and the lowest 8 bits are scrambled.

We also compress all the flow files using bzip2.
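
The level scheme described above (a 10 minute base that doubles per level,
and files 2n and 2n+1 merging into file n) can be sketched in Python. This is
only an illustration of the rule as stated; the function names are
hypothetical and are not part of any LANDER tool.

```python
from typing import Tuple

BASE_MINUTES = 10  # duration covered by a level 0 file, per the text
MAX_LEVEL = 8      # highest level directory listed (r8/)


def level_duration_minutes(level: int) -> int:
    """Upper bound on flow duration at the given level, in minutes."""
    if not 0 <= level <= MAX_LEVEL:
        raise ValueError(f"level must be in 0..{MAX_LEVEL}, got {level}")
    # The duration doubles with every level: 10, 20, 40, ...
    return BASE_MINUTES * 2 ** level


def merged_into(n: int) -> int:
    """Number of the level i+1 file that receives level i file number n."""
    # Files 2n and 2n+1 both map to n, so integer division by 2 suffices.
    return n // 2


def merged_from(n: int) -> Tuple[int, int]:
    """Numbers of the two level i files merged into level i+1 file n."""
    return (2 * n, 2 * n + 1)


# Print the duration bound for every level directory r0/ .. r8/.
for level in range(MAX_LEVEL + 1):
    minutes = level_duration_minutes(level)
    print(f"r{level}/  up to {minutes} minutes ({minutes / 60:.1f} hours)")
```

Under this rule level 8 corresponds to 2560 minutes, about 42.7 hours, which
the directory listing rounds to 2 days.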
The flow record format is (after uncompression):

    start_timestamp end_timestamp sourceIP.sourcePort protocol destinationIP.destinationPort num_packets num_bytes state sigma_bytes_square bytes_avg N_timebins

The last three fields are used to calculate the burstiness of a flow, which
is defined from the variance of bytes over time bins of 10 minutes. In Python
notation, with N = N_timebins:

    burstiness = math.sqrt(sigma_bytes_square/N - bytes_avg*bytes_avg)

Because of the square root, this value is the standard deviation of bytes
per time bin.

A sample record:

    20090606:02:15:48.049447 20090606:03:37:13.873638 194.177.210.209.41157 udp 224.2.127.195.sapv1 3822 1048400 INT 12133392238 10920.8333333 96

For a longer description of our dataset, please see here.

Citation

If you use this trace to conduct additional research, please cite it as:

    Long-lived Internet flows, PREDICT ID:
    USC-LANDER/long_flows_D1_2_days-20090605. Traces taken 2009-06-05 to
    2009-06-07. Provided by the USC/LANDER project
    http://www.isi.edu/ant/lander.

Results Using This Dataset

Traces similar to this one have been used in the following previously
published work:

 * Lin Quan and John Heidemann. On the Characteristics and Reasons of
   Long-lived Internet Flows. In Proceedings of the ACM Internet Measurement
   Conference, p. 444-450. Melbourne, Australia, ACM. November, 2010.
   http://www.isi.edu/~johnh/PAPERS/Quan10a.html.

User Annotations

The fully anonymized version of this dataset is
LANDER:long_flows_D1_2_days-anonymized-20090605.
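
A note on reading the records: the record format and burstiness formula given
under Dataset Contents can be sketched in Python as follows. This is a
minimal sketch, not an official LANDER tool. The names FlowRecord,
parse_record, burstiness and read_flows are hypothetical, the timestamp
layout is inferred from the sample record, and the code assumes that each
endpoint is written as "address.port" with the port (a number or a service
name such as sapv1) after the last dot.

```python
import bz2
import math
from datetime import datetime
from typing import Iterator, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    start: datetime
    end: datetime
    src_ip: str
    src_port: str  # kept as text: ports may appear as service names
    protocol: str
    dst_ip: str
    dst_port: str
    num_packets: int
    num_bytes: int
    state: str
    sigma_bytes_square: float
    bytes_avg: float
    n_timebins: int


def parse_timestamp(text: str) -> datetime:
    # Layout inferred from the sample record: YYYYMMDD:HH:MM:SS.microseconds
    return datetime.strptime(text, "%Y%m%d:%H:%M:%S.%f")


def split_endpoint(endpoint: str) -> Tuple[str, str]:
    # Assumption: the port is whatever follows the last dot.
    ip, _, port = endpoint.rpartition(".")
    return ip, port


def parse_record(line: str) -> FlowRecord:
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    src_ip, src_port = split_endpoint(fields[2])
    dst_ip, dst_port = split_endpoint(fields[4])
    return FlowRecord(
        parse_timestamp(fields[0]), parse_timestamp(fields[1]),
        src_ip, src_port, fields[3], dst_ip, dst_port,
        int(fields[5]), int(fields[6]), fields[7],
        float(fields[8]), float(fields[9]), int(fields[10]),
    )


def burstiness(rec: FlowRecord) -> float:
    # sqrt(sigma_bytes_square/N - bytes_avg^2), with N = N_timebins.
    variance = rec.sigma_bytes_square / rec.n_timebins - rec.bytes_avg ** 2
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


def read_flows(path: str) -> Iterator[FlowRecord]:
    # `path` is a hypothetical file name; bz2.open decompresses on the fly.
    with bz2.open(path, "rt") as handle:
        for line in handle:
            if line.strip():
                yield parse_record(line)


SAMPLE = ("20090606:02:15:48.049447 20090606:03:37:13.873638 "
          "194.177.210.209.41157 udp 224.2.127.195.sapv1 "
          "3822 1048400 INT 12133392238 10920.8333333 96")

record = parse_record(SAMPLE)
print(record.src_ip, record.dst_port, burstiness(record))
```

For the sample record the formula gives a burstiness of about 2669 bytes per
time bin. The sample is also consistent with reading N as N_timebins, since
bytes_avg multiplied by N_timebins matches num_bytes.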

Some statistics:

    number of source IPs (in flows longer than 640 minutes) = 1170
    number of destination IPs (in flows longer than 640 minutes) = 683
    number of flows (longer than 640 minutes) = 1441

Categories:

 * LANDER:PredictCategory:Traffic Flow Data
 * LANDER:PredictCategory:Traffic Flow Data/Long-lived Flow Summarization-Host Only Anon
 * LANDER
 * LANDER:Datasets
 * LANDER:Datasets:TrafficFlowData
 * Datasets