LANDER:long flows D8 2 weeks-20100221

README version: 2619, last modified: 2012-03-16.

This file describes the trace dataset "long_flows_D8_2_weeks-20100221"
provided by the LANDER project. The most recent version of this file can be
found on-line at
http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D8_2_weeks-20100221.

LANDER Metadata
(http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D8_2_weeks-20100221/landermeta)

+---------------------------------------------------------------------------------------------------------+
|dataSetName               |long_flows_D8_2_weeks-20100221                                                |
|--------------------------+------------------------------------------------------------------------------|
|status                    |usc-web-and-predict                                                           |
|--------------------------+------------------------------------------------------------------------------|
|shortDesc                 |2 weeks of IP flows in 2010.                                                  |
|--------------------------+------------------------------------------------------------------------------|
|longDesc                  |This dataset contains IP flow records spanning two weeks in a modified Argus  |
|                          |format. The durations of flows are ranging from seconds to hours and weeks.   |
|                          |Different durations of flows are organized into different directories, and the|
|                          |flow duration increases exponentially. All IP addresses in this dataset are   |
|                          |host-only anonymized.                                                         |
|--------------------------+------------------------------------------------------------------------------|
|datasetCategory           |Traffic Flow Data                                                             |
|--------------------------+------------------------------------------------------------------------------|
|datasetSubCategory        |Long-lived Flow Summarization-Host Only Anon                                  |
|--------------------------+------------------------------------------------------------------------------|
|requestReviewRequired     |true                                                                          |
|--------------------------+------------------------------------------------------------------------------|
|productReviewRequired     |false                                                                         |
|--------------------------+------------------------------------------------------------------------------|
|ongoingMeasurement        |false                                                                         |
|--------------------------+------------------------------------------------------------------------------|
|collectionStartDate       |2010-02-21                                                                    |
|--------------------------+------------------------------------------------------------------------------|
|collectionStartTime       |19:56:45                                                                      |
|--------------------------+------------------------------------------------------------------------------|
|collectionEndDate         |2010-03-08                                                                    |
|--------------------------+------------------------------------------------------------------------------|
|collectionEndTime         |01:17:32                                                                      |
|--------------------------+------------------------------------------------------------------------------|
|availabilityStartDate     |                                                                              |
|--------------------------+------------------------------------------------------------------------------|
|availabilityStartTime     |                                                                              |
|--------------------------+------------------------------------------------------------------------------|
|availabilityEndDate       |                                                                              |
|--------------------------+------------------------------------------------------------------------------|
|availabilityEndTime       |                                                                              |
|--------------------------+------------------------------------------------------------------------------|
|anonymization             |true                                                                          |
|--------------------------+------------------------------------------------------------------------------|
|archivingAllowed          |                                                                              |
|--------------------------+------------------------------------------------------------------------------|
|keywords                  |host-only-ip-anonymization, long-flows, flow-statistics                       |
|--------------------------+------------------------------------------------------------------------------|
|format                    |text                                                                          |
|--------------------------+------------------------------------------------------------------------------|
|access                    |https                                                                         |
|--------------------------+------------------------------------------------------------------------------|
|hostName                  |USC-LANDER                                                                    |
|--------------------------+------------------------------------------------------------------------------|
|privateAccessInstructions |See http://www.isi.edu/ant/traces/index.html#getting_datasets for information |
|                          |on obtaining this dataset.                                                    |
|                          |See                                                                           |
|                          |http://wiki.isi.edu/predict/index.php/LANDER:long_flows_D8_2_weeks-20100221   |
|                          |for details on this dataset.                                                  |
+---------------------------------------------------------------------------------------------------------+

Dataset Contents

This dataset contains IP flow records spanning two weeks, from 2010-02-21 to
2010-03-08, in a modified Argus format. The durations of flows range from
seconds to days and weeks. Flows of different durations are organized into
different directories, as shown below.

    r0/     level 0 flow files, up to 10 minutes long
    r1/     level 1 flow files, up to 20 minutes long
    r2/     level 2 flow files, up to 40 minutes long
    ...
    r11/    level 11 flow files, up to 2 weeks long

The flow duration increases exponentially, with a base duration of 10 minutes.
Two level i flow files (numbered 2n and 2n+1) are merged into a level i+1 file
(numbered n).

All IP addresses are prefix-preserving, host-only anonymized, so the top 24
bits are unchanged and the lowest 8 bits are scrambled. All flow files are
also compressed with bzip2.

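As a minimal sketch of the directory arithmetic described above, the following Python assumes that the doubling pattern shown for r0 through r2 holds for every level, so that level i covers flows of up to 10 x 2^i minutes, and that a level i+1 file numbered n is built from the level i files numbered 2n and 2n+1. The function names are illustrative only and are not part of any LANDER tooling.

```python
from typing import Tuple

BASE_MINUTES = 10  # duration bound of level 0, as stated in this README


def level_max_minutes(level: int) -> int:
    """Upper bound on flow duration (in minutes) for a level, assuming doubling."""
    if level < 0:
        raise ValueError("level must be non-negative")
    return BASE_MINUTES * 2 ** level


def source_file_numbers(n: int) -> Tuple[int, int]:
    """Numbers of the two level-i files merged into level-(i+1) file n."""
    return (2 * n, 2 * n + 1)


def merged_file_number(n: int) -> int:
    """Number of the level-(i+1) file that level-i file n is merged into."""
    return n // 2


if __name__ == "__main__":
    for level in range(12):  # directories r0 through r11
        minutes = level_max_minutes(level)
        print(f"r{level}/: up to {minutes} minutes ({minutes / 1440:.2f} days)")
```

Under this assumption r11 works out to 20480 minutes, roughly 14.2 days, which is consistent with the two-week span of the trace.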
The flow record format is (after decompression):

    start_timestamp end_timestamp sourceIP.sourcePort protocol destinationIP.destinationPort num_packets num_bytes state sigma_bytes_square bytes_avg N_timebins

The last three fields are used to calculate the burstiness of a flow, which is
defined as variance of bytes over a time bin of 10 minutes:

    burstiness = math.sqrt(sigma_bytes_square/N - bytes_avg*bytes_avg)

where N is N_timebins.

A sample record:

    20100225:09:16:32.927457 20100228:22:37:04.393181 2001:660:3001:401*.32776 udp ff7e:230:2001:660*.10010 1051662 120330435 INT 1983992704601 14364.3828339 8377

For a longer description of our dataset, please see here.

Citation

If you use this trace to conduct additional research, please cite it as:

    Long-lived Internet flows, PREDICT ID:
    USC-LANDER/long_flows_D8_2_weeks-20100221. Traces taken 2010-02-21 to
    2010-03-08. Provided by the USC/LANDER project
    http://www.isi.edu/ant/lander.

Results Using This Dataset

Traces similar to this one have been used in the following previously
published work:

  * Lin Quan and John Heidemann. On the Characteristics and Reasons of
    Long-lived Internet Flows. In Proceedings of the ACM Internet Measurement
    Conference, p. 444-450. Melbourne, Australia, ACM. November, 2010.
    http://www.isi.edu/~johnh/PAPERS/Quan10a.html.

User Annotations

The fully anonymized version of this dataset is
LANDER:long_flows_D8_2_weeks-anonymized-20100221.

Some statistics:

    number of source IPs (in flows longer than 5120 minutes) = 1164
    number of destination IPs (in flows longer than 5120 minutes) = 860
    number of flows (longer than 5120 minutes) = 1845
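
The record format and burstiness formula given under Dataset Contents can be exercised with a short Python sketch. It assumes that fields are whitespace-separated exactly as in the sample record, that timestamps follow the YYYYMMDD:HH:MM:SS.ffffff pattern seen there, and that N in the formula is the N_timebins field. FlowRecord, parse_record, and burstiness are hypothetical names chosen for illustration, not part of any LANDER or Argus tool, and the clamp to zero before the square root is an added guard against floating-point rounding.

```python
import math
from datetime import datetime
from typing import NamedTuple

TS_FORMAT = "%Y%m%d:%H:%M:%S.%f"  # assumed from the sample record

# Sample record copied from this README.
SAMPLE = (
    "20100225:09:16:32.927457 20100228:22:37:04.393181 "
    "2001:660:3001:401*.32776 udp ff7e:230:2001:660*.10010 "
    "1051662 120330435 INT 1983992704601 14364.3828339 8377"
)


class FlowRecord(NamedTuple):
    start: datetime
    end: datetime
    src: str            # sourceIP.sourcePort, kept as written
    protocol: str
    dst: str            # destinationIP.destinationPort, kept as written
    num_packets: int
    num_bytes: int
    state: str
    sigma_bytes_square: float
    bytes_avg: float
    n_timebins: int


def parse_record(line: str) -> FlowRecord:
    """Split one uncompressed record line into its eleven fields."""
    f = line.split()
    if len(f) != 11:
        raise ValueError(f"expected 11 fields, got {len(f)}")
    return FlowRecord(
        start=datetime.strptime(f[0], TS_FORMAT),
        end=datetime.strptime(f[1], TS_FORMAT),
        src=f[2],
        protocol=f[3],
        dst=f[4],
        num_packets=int(f[5]),
        num_bytes=int(f[6]),
        state=f[7],
        sigma_bytes_square=float(f[8]),
        bytes_avg=float(f[9]),
        n_timebins=int(f[10]),
    )


def burstiness(rec: FlowRecord) -> float:
    """sqrt(sigma_bytes_square/N - bytes_avg^2), with N taken as N_timebins."""
    variance = rec.sigma_bytes_square / rec.n_timebins - rec.bytes_avg ** 2
    return math.sqrt(max(variance, 0.0))  # clamp guards against rounding error


if __name__ == "__main__":
    rec = parse_record(SAMPLE)
    print("duration:", rec.end - rec.start)
    print("burstiness:", burstiness(rec))
```

For the sample record, bytes_avg agrees with num_bytes divided by N_timebins, which supports reading N as the N_timebins field.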