LANDER:long flows D1 2 days-anonymized-20090605

From Predict

README version: 2612, last modified: 2012-03-16.

This file describes the trace dataset "long_flows_D1_2_days-anonymized-20090605"
provided by the LANDER project. The most recent version of this file can be
found on-line at
http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D1_2_days-anonymized-20090605.

LANDER Metadata

(http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D1_2_days-anonymized-20090605/landermeta)

+-------------------------------------------------------------------------------------------------------------------+
|dataSetName               |long_flows_D1_2_days-anonymized-20090605                                                |
|--------------------------+----------------------------------------------------------------------------------------|
|status                    |usc-web-and-predict                                                                     |
|--------------------------+----------------------------------------------------------------------------------------|
|shortDesc                 |2 days of IP flows in 2009.                                                             |
|--------------------------+----------------------------------------------------------------------------------------|
|longDesc                  |This dataset contains IP flow records spanning two days in a modified Argus format. The |
|                          |durations of flows are ranging from seconds to hours and days. Different durations of   |
|                          |flows are organized into different directories, and the flow duration increases         |
|                          |exponentially. All IP addresses in this dataset are fully anonymized.                   |
|--------------------------+----------------------------------------------------------------------------------------|
|datasetCategory           |Traffic Flow Data                                                                       |
|--------------------------+----------------------------------------------------------------------------------------|
|datasetSubCategory        |Long-lived Flow Summarization-Full IP Anon                                              |
|--------------------------+----------------------------------------------------------------------------------------|
|requestReviewRequired     |true                                                                                    |
|--------------------------+----------------------------------------------------------------------------------------|
|productReviewRequired     |false                                                                                   |
|--------------------------+----------------------------------------------------------------------------------------|
|ongoingMeasurement        |false                                                                                   |
|--------------------------+----------------------------------------------------------------------------------------|
|collectionStartDate       |2009-06-05                                                                              |
|--------------------------+----------------------------------------------------------------------------------------|
|collectionStartTime       |12:56:38                                                                                |
|--------------------------+----------------------------------------------------------------------------------------|
|collectionEndDate         |2009-06-07                                                                              |
|--------------------------+----------------------------------------------------------------------------------------|
|collectionEndTime         |07:37:12                                                                                |
|--------------------------+----------------------------------------------------------------------------------------|
|availabilityStartDate     |                                                                                        |
|--------------------------+----------------------------------------------------------------------------------------|
|availabilityStartTime     |                                                                                        |
|--------------------------+----------------------------------------------------------------------------------------|
|availabilityEndDate       |                                                                                        |
|--------------------------+----------------------------------------------------------------------------------------|
|availabilityEndTime       |                                                                                        |
|--------------------------+----------------------------------------------------------------------------------------|
|anonymization             |true                                                                                    |
|--------------------------+----------------------------------------------------------------------------------------|
|archivingAllowed          |                                                                                        |
|--------------------------+----------------------------------------------------------------------------------------|
|keywords                  |full-ip-anonymization, long-flows, flow-statistics                                      |
|--------------------------+----------------------------------------------------------------------------------------|
|format                    |text                                                                                    |
|--------------------------+----------------------------------------------------------------------------------------|
|access                    |https                                                                                   |
|--------------------------+----------------------------------------------------------------------------------------|
|hostName                  |USC-LANDER                                                                              |
|--------------------------+----------------------------------------------------------------------------------------|
|privateAccessInstructions |See http://www.isi.edu/ant/traces/index.html#getting_datasets for information on        |
|                          |obtaining this dataset.                                                                 |
|                          |See                                                                                     |
|                          |http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D1_2_days-anonymized-20090605   |
|                          |for details on this dataset.                                                            |
+-------------------------------------------------------------------------------------------------------------------+

Dataset Contents

This dataset contains IP flow records spanning two days, from 2009-06-05 to
2009-06-07, in a modified Argus format. The durations of flows range from
seconds to hours and days. Different durations of flows are organized into
different directories, as shown below.

    r0/    level 0 flow files, up to 10 minutes long
    r1/    level 1 flow files, up to 20 minutes long
    r2/    level 2 flow files, up to 40 minutes long
    ...
    r8/    level 8 flow files, up to 2 days long

The flow duration increases exponentially, with a base duration of 10 minutes.
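
As a rough illustration of the directory layout above, the sketch below
computes the upper duration bound for each level. It assumes that the bound
doubles at every level, including the levels r3/ through r7/ that the listing
elides; this matches the stated 10, 20 and 40 minutes and gives 2560 minutes
(about 42.7 hours) for r8/, which the listing rounds to "2 days". The function
name max_duration_minutes is a placeholder of this sketch and is not part of
the dataset.

```python
BASE_MINUTES = 10  # base duration stated in this README
MAX_LEVEL = 8      # highest directory listed (r8/)


def max_duration_minutes(level: int) -> int:
    """Upper bound, in minutes, on flow duration for directory r<level>/.

    Assumption: the bound doubles at each level (10, 20, 40, ...).
    """
    if not 0 <= level <= MAX_LEVEL:
        raise ValueError("this dataset lists levels r0 through r8 only")
    return BASE_MINUTES * 2 ** level


# Print the assumed bound for every level directory.
for level in range(MAX_LEVEL + 1):
    print(f"r{level}/  up to {max_duration_minutes(level)} minutes")
```
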
Two level i flow files (numbered 2n and 2n+1) are merged into a level i+1 file
(numbered n). All IP addresses are fully anonymized. All flow files are
compressed with bzip2.

The flow record format is (after uncompression):

    start_timestamp end_timestamp sourceIP.sourcePort protocol destinationIP.destinationPort num_packets num_bytes state sigma_bytes_square bytes_avg N_timebins

The last three fields are used to calculate the burstiness of a flow, which is
defined from the variance of bytes over a time bin of 10 minutes:

    burstiness = math.sqrt(sigma_bytes_square/N - bytes_avg*bytes_avg)

Here N is N_timebins. Because the formula takes the square root of the
variance, the resulting value is the standard deviation of bytes per time bin.

A sample record:

    20090606:02:15:48.049447 20090606:03:37:13.873638 194.177.210.209.41157 udp 224.2.127.195.sapv1 3822 1048400 INT 12133392238 10920.8333333 96

For a longer description of our dataset, please see here.

Citation

If you use this trace to conduct additional research, please cite it as:

    Long-lived Internet flows, PREDICT ID: USC-LANDER/long_flows_D1_2_days-anonymized-20090605.
    Traces taken 2009-06-05 to 2009-06-07.
    Provided by the USC/LANDER project http://www.isi.edu/ant/lander.

Results Using This Dataset

Traces similar to this one have been used in the following previously
published work:

  * Lin Quan and John Heidemann. On the Characteristics and Reasons of
    Long-lived Internet Flows. In Proceedings of the ACM Internet Measurement
    Conference, pp. 444-450. Melbourne, Australia, ACM. November, 2010.
    http://www.isi.edu/~johnh/PAPERS/Quan10a.html.
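
For readers who want to process the records themselves, the following is a
minimal sketch of parsing one line of the flow record format given under
Dataset Contents and evaluating the burstiness formula. It is not the LANDER
project's own tooling. The timestamp pattern is inferred from the sample
record, the endpoint is split at its last dot (the port may be a service name
such as sapv1), and the names FlowRecord, parse_record and burstiness are
placeholders of this sketch.

```python
import math
from datetime import datetime
from typing import NamedTuple, Tuple

# Assumption: timestamp layout inferred from the sample record.
TIME_FORMAT = "%Y%m%d:%H:%M:%S.%f"


class FlowRecord(NamedTuple):
    start: datetime
    end: datetime
    src_ip: str
    src_port: str
    protocol: str
    dst_ip: str
    dst_port: str
    num_packets: int
    num_bytes: int
    state: str
    sigma_bytes_square: float
    bytes_avg: float
    n_timebins: int


def split_endpoint(endpoint: str) -> Tuple[str, str]:
    """Split 'IP.port' at the last dot; the port may be a service name."""
    ip, port = endpoint.rsplit(".", 1)
    return ip, port


def parse_record(line: str) -> FlowRecord:
    """Parse one whitespace-separated flow record (11 fields)."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    src_ip, src_port = split_endpoint(fields[2])
    dst_ip, dst_port = split_endpoint(fields[4])
    return FlowRecord(
        start=datetime.strptime(fields[0], TIME_FORMAT),
        end=datetime.strptime(fields[1], TIME_FORMAT),
        src_ip=src_ip,
        src_port=src_port,
        protocol=fields[3],
        dst_ip=dst_ip,
        dst_port=dst_port,
        num_packets=int(fields[5]),
        num_bytes=int(fields[6]),
        state=fields[7],
        sigma_bytes_square=float(fields[8]),
        bytes_avg=float(fields[9]),
        n_timebins=int(fields[10]),
    )


def burstiness(rec: FlowRecord) -> float:
    """sqrt(sigma_bytes_square/N - bytes_avg*bytes_avg), with N = N_timebins."""
    variance = rec.sigma_bytes_square / rec.n_timebins - rec.bytes_avg ** 2
    # Guard added by this sketch: clamp tiny negative values caused by rounding.
    return math.sqrt(max(variance, 0.0))


SAMPLE = (
    "20090606:02:15:48.049447 20090606:03:37:13.873638 "
    "194.177.210.209.41157 udp 224.2.127.195.sapv1 "
    "3822 1048400 INT 12133392238 10920.8333333 96"
)

rec = parse_record(SAMPLE)
print(rec.src_ip, rec.dst_port, (rec.end - rec.start).total_seconds())
print(burstiness(rec))
```

Since the flow files are bzip2-compressed, they can be read line by line with
bz2.open(path, "rt") from the Python standard library, passing each line to
parse_record.
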
User Annotations

The host-only anonymized version of this dataset is
LANDER:long_flows_D1_2_days-20090605.

Some statistics:

    number of source IPs (in flows longer than 640 minutes)      = 1170
    number of destination IPs (in flows longer than 640 minutes) = 683
    number of flows (longer than 640 minutes)                    = 1441