LANDER:long flows D8 2 weeks-anonymized-20100221

README version: 2621, last modified: 2012-03-16.

This file describes the trace dataset "long_flows_D8_2_weeks-anonymized-20100221"
provided by the LANDER project.

The most recent version of this file can be found on-line at
http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D8_2_weeks-anonymized-20100221.

LANDER Metadata

(http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D8_2_weeks-anonymized-20100221/landermeta)

+--------------------------------------------------------------------------------------------------------------------+
|dataSetName               |long_flows_D8_2_weeks-anonymized-20100221                                                |
|--------------------------+-----------------------------------------------------------------------------------------|
|status                    |usc-web-and-predict                                                                      |
|--------------------------+-----------------------------------------------------------------------------------------|
|shortDesc                 |2 weeks of IP flows in 2010.                                                             |
|--------------------------+-----------------------------------------------------------------------------------------|
|longDesc                  |This dataset contains IP flow records spanning two weeks in a modified Argus format. The |
|                          |durations of flows range from seconds to hours and weeks. Different durations of flows   |
|                          |are organized into different directories, and the flow duration increases exponentially. |
|                          |All IP addresses in this dataset are fully anonymized.                                   |
|--------------------------+-----------------------------------------------------------------------------------------|
|datasetCategory           |Traffic Flow Data                                                                        |
|--------------------------+-----------------------------------------------------------------------------------------|
|datasetSubCategory        |Long-lived Flow Summarization-Full IP Anon                                               |
|--------------------------+-----------------------------------------------------------------------------------------|
|requestReviewRequired     |true                                                                                     |
|--------------------------+-----------------------------------------------------------------------------------------|
|productReviewRequired     |false                                                                                    |
|--------------------------+-----------------------------------------------------------------------------------------|
|ongoingMeasurement        |false                                                                                    |
|--------------------------+-----------------------------------------------------------------------------------------|
|collectionStartDate       |2010-02-21                                                                               |
|--------------------------+-----------------------------------------------------------------------------------------|
|collectionStartTime       |19:56:45                                                                                 |
|--------------------------+-----------------------------------------------------------------------------------------|
|collectionEndDate         |2010-03-08                                                                               |
|--------------------------+-----------------------------------------------------------------------------------------|
|collectionEndTime         |01:17:32                                                                                 |
|--------------------------+-----------------------------------------------------------------------------------------|
|availabilityStartDate     |                                                                                         |
|--------------------------+-----------------------------------------------------------------------------------------|
|availabilityStartTime     |                                                                                         |
|--------------------------+-----------------------------------------------------------------------------------------|
|availabilityEndDate       |                                                                                         |
|--------------------------+-----------------------------------------------------------------------------------------|
|availabilityEndTime       |                                                                                         |
|--------------------------+-----------------------------------------------------------------------------------------|
|anonymization             |true                                                                                     |
|--------------------------+-----------------------------------------------------------------------------------------|
|archivingAllowed          |                                                                                         |
|--------------------------+-----------------------------------------------------------------------------------------|
|keywords                  |full-ip-anonymization, long-flows, flow-statistics                                       |
|--------------------------+-----------------------------------------------------------------------------------------|
|format                    |text                                                                                     |
|--------------------------+-----------------------------------------------------------------------------------------|
|access                    |https                                                                                    |
|--------------------------+-----------------------------------------------------------------------------------------|
|hostName                  |USC-LANDER                                                                               |
|--------------------------+-----------------------------------------------------------------------------------------|
|privateAccessInstructions |See http://www.isi.edu/ant/traces/index.html#getting_datasets for information on         |
|                          |obtaining this dataset.                                                                  |
|                          |See                                                                                      |
|                          |http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D8_2_weeks-anonymized-20100221   |
|                          |for details on this dataset.                                                             |
+--------------------------------------------------------------------------------------------------------------------+

Dataset Contents

This dataset contains IP flow records spanning two weeks, from 2010-02-21 to
2010-03-08, in a modified Argus format. Flow durations range from seconds to
days and weeks. Flows are organized into directories by duration, as shown
below.

    r0/    level 0 flow files, up to 10 minutes long
    r1/    level 1 flow files, up to 20 minutes long
    r2/    level 2 flow files, up to 40 minutes long
    ...
    r11/   level 11 flow files, up to 2 weeks long

The maximum flow duration doubles at each level, starting from a base duration
of 10 minutes, so level i holds flows up to 10 * 2^i minutes long (level 11:
20480 minutes, just over 2 weeks).
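
As a rough illustration of the directory scheme above, the following Python sketch lists the duration ceiling of each level. It assumes that the number in each directory name is the level, that the ceiling is exactly 10 * 2^level minutes, and that the directories run from r0/ to r11/. The function name max_duration_minutes is made up for this example and is not part of any dataset tooling.

```python
BASE_MINUTES = 10          # ceiling of level 0, taken from the r0/ entry
MINUTES_PER_DAY = 24 * 60


def max_duration_minutes(level: int) -> int:
    """Longest flow duration (minutes) held at a level, assuming exact doubling."""
    if level < 0:
        raise ValueError("level must be non-negative")
    return BASE_MINUTES * 2 ** level


if __name__ == "__main__":
    # Assumed range of directories: r0/ through r11/.
    for level in range(12):
        minutes = max_duration_minutes(level)
        days = minutes / MINUTES_PER_DAY
        print(f"r{level}/: up to {minutes} minutes ({days:.2f} days)")
```

Under these assumptions r9/ tops out at 5120 minutes and r11/ at 20480 minutes, slightly more than 14 days.
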
Two level i flow files (numbered 2n and 2n+1) are merged into a level i+1 file
(numbered n). All IP addresses are fully anonymized. All flow files are
compressed with bzip2.

The flow record format (after decompression) is:

    start_timestamp end_timestamp sourceIP.sourcePort protocol destinationIP.destinationPort num_packets num_bytes state sigma_bytes_square bytes_avg N_timebins

The last three fields are used to calculate the burstiness of a flow, which is
defined from the variance of bytes across time bins of 10 minutes. The formula
takes the square root of that variance, i.e. the standard deviation:

    burstiness = math.sqrt(sigma_bytes_square/N_timebins - bytes_avg*bytes_avg)

A sample record:

    20100225:09:16:32.927457 20100228:22:37:04.393181 2001:660:3001:401*.32776 udp ff7e:230:2001:660*.10010 1051662 120330435 INT 1983992704601 14364.3828339 8377

For a longer description of our dataset, please see here.

Citation

If you use this trace to conduct additional research, please cite it as:

    Long-lived Internet flows, PREDICT ID:
    USC-LANDER/long_flows_D8_2_weeks-anonymized-20100221.
    Traces taken 2010-02-21 to 2010-03-08.
    Provided by the USC/LANDER project http://www.isi.edu/ant/lander.

Results Using This Dataset

Traces similar to this one have been used in the following previously
published work:

  * Lin Quan and John Heidemann. On the Characteristics and Reasons of
    Long-lived Internet Flows. In Proceedings of the ACM Internet Measurement
    Conference, p. 444-450. Melbourne, Australia, ACM. November, 2010.
    http://www.isi.edu/~johnh/PAPERS/Quan10a.html.
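
To illustrate the flow record format and the burstiness formula given under Dataset Contents, here is a minimal Python sketch that parses one record and evaluates the formula. It rests on several assumptions that this README does not confirm: every record has exactly 11 whitespace-separated fields, each endpoint is "address.port" with the port after the last dot, and N in the formula is the N_timebins field. The names FlowRecord, split_endpoint, parse_record and burstiness are invented for this example.

```python
import math
from typing import NamedTuple, Tuple


class FlowRecord(NamedTuple):
    start: str
    end: str
    src_ip: str
    src_port: str      # kept as text; the README does not say ports are always numeric
    protocol: str
    dst_ip: str
    dst_port: str
    num_packets: int
    num_bytes: int
    state: str
    sigma_bytes_square: float
    bytes_avg: float
    n_timebins: int


def split_endpoint(endpoint: str) -> Tuple[str, str]:
    """Split 'address.port' on the last dot (assumed layout)."""
    address, _, port = endpoint.rpartition(".")
    return address, port


def parse_record(line: str) -> FlowRecord:
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}")
    src_ip, src_port = split_endpoint(fields[2])
    dst_ip, dst_port = split_endpoint(fields[4])
    return FlowRecord(
        start=fields[0], end=fields[1],
        src_ip=src_ip, src_port=src_port,
        protocol=fields[3],
        dst_ip=dst_ip, dst_port=dst_port,
        num_packets=int(fields[5]), num_bytes=int(fields[6]),
        state=fields[7],
        sigma_bytes_square=float(fields[8]),
        bytes_avg=float(fields[9]),
        n_timebins=int(fields[10]),
    )


def burstiness(record: FlowRecord) -> float:
    """sqrt(sigma_bytes_square / N_timebins - bytes_avg**2), per the README formula."""
    if record.n_timebins <= 0:
        raise ValueError("N_timebins must be positive")
    variance = record.sigma_bytes_square / record.n_timebins - record.bytes_avg ** 2
    # Clamp at zero (an addition in this sketch) in case rounding gives a tiny negative.
    return math.sqrt(max(variance, 0.0))


SAMPLE = (
    "20100225:09:16:32.927457 20100228:22:37:04.393181 "
    "2001:660:3001:401*.32776 udp ff7e:230:2001:660*.10010 "
    "1051662 120330435 INT 1983992704601 14364.3828339 8377"
)

if __name__ == "__main__":
    rec = parse_record(SAMPLE)
    print(rec.src_ip, rec.src_port, rec.protocol, round(burstiness(rec), 1))
```

For the sample record, bytes_avg matches num_bytes divided by N_timebins, which supports reading N as N_timebins, and the formula evaluates to about 5523.
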
User Annotations

The host-only anonymized version of this dataset is
LANDER:long_flows_D8_2_weeks-20100221.

Some statistics:

    number of source IPs (in flows longer than 5120 minutes) = 1164
    number of destination IPs (in flows longer than 5120 minutes) = 860
    number of flows (longer than 5120 minutes) = 1845
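
Counts like the statistics listed under User Annotations could be recomputed along the following lines. This is only a sketch: the file path is a placeholder (this README does not give the file names inside the rN/ directories), timestamps are assumed to always follow the YYYYMMDD:HH:MM:SS.ffffff pattern of the sample record, and nothing here is confirmed to be the procedure that produced the published numbers. The function name summarize_long_flows is invented for this example.

```python
import bz2
from datetime import datetime, timedelta

# Assumed timestamp layout, inferred from the sample record.
TIMESTAMP_FORMAT = "%Y%m%d:%H:%M:%S.%f"


def flow_duration(start: str, end: str) -> timedelta:
    """Duration between two record timestamps."""
    return (datetime.strptime(end, TIMESTAMP_FORMAT)
            - datetime.strptime(start, TIMESTAMP_FORMAT))


def summarize_long_flows(path: str, min_minutes: int = 5120) -> dict:
    """Count flows longer than min_minutes and their distinct endpoints."""
    threshold = timedelta(minutes=min_minutes)
    sources, destinations = set(), set()
    flows = 0
    # bz2.open in text mode decompresses the file line by line.
    with bz2.open(path, "rt") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 11:
                continue  # skip lines that do not look like flow records
            if flow_duration(fields[0], fields[1]) <= threshold:
                continue
            # Address is everything before the last dot of 'address.port'.
            sources.add(fields[2].rpartition(".")[0])
            destinations.add(fields[4].rpartition(".")[0])
            flows += 1
    return {
        "flows": flows,
        "source_ips": len(sources),
        "destination_ips": len(destinations),
    }
```

A call such as summarize_long_flows("path/to/flow_file.bz2") returns the three counts for one file; the path shown is hypothetical, and totals across several files would need the sets merged before counting.
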