Suppose we read the sentence, "George W. Bush met with Vladimir Putin in Moscow." We don't know exactly how long that meeting lasted, but we do get some temporal information from the sentence. We know the meeting lasted more than ten seconds and less than one year. As we guess narrower and narrower bounds, our chances of being correct go down. Just how accurately can we make duration judgments like this? How much agreement can we expect among people? Will it be possible to extract this kind of information from text automatically?
The goal of the project is to extract such implicit and vague typical durations of events from text automatically. This information matters for applications that extract the time course of events from news. For example, whether two events overlap or are in sequence often depends very much on their durations. If a war started yesterday, we can be pretty sure it is still going on today. If a hurricane started last year, we can be sure it is over by now.
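The war and hurricane examples can be made concrete with a small sketch. It assumes each event type comes with a lower and an upper bound on its typical duration; the function name `still_ongoing` and the bounds used for wars and hurricanes are hypothetical values chosen for illustration, not figures from the annotated data.

```python
from datetime import datetime, timedelta

def still_ongoing(start, now, min_duration, max_duration):
    """Classify an event that began at `start` as seen from `now`,
    given assumed lower and upper bounds on its typical duration."""
    elapsed = now - start
    if elapsed < timedelta(0):
        return "not started"
    if elapsed < min_duration:
        return "ongoing"      # shorter than the shortest plausible duration
    if elapsed > max_duration:
        return "over"         # longer than the longest plausible duration
    return "uncertain"        # inside the bounds: could go either way

# Hypothetical duration bounds, for illustration only.
WAR = (timedelta(days=7), timedelta(days=365 * 10))
HURRICANE = (timedelta(hours=12), timedelta(days=30))

now = datetime(2006, 6, 1)
war_status = still_ongoing(now - timedelta(days=1), now, *WAR)
hurricane_status = still_ongoing(now - timedelta(days=365), now, *HURRICANE)
print(war_status, hurricane_status)
```

The "uncertain" case is where narrower duration bounds would help most: the tighter the bounds, the more often a temporal reasoner can commit to an answer.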
We have developed annotation guidelines to minimize discrepant judgments and annotated all 48 non-Wall-Street-Journal (non-WSJ) news articles (2132 events), as well as 10 WSJ articles (156 events), from the TimeBank corpus; we have developed a method for measuring inter-annotator agreement when the judgments are intervals on a scale; and we have shown that machine learning techniques applied to this data yield coarse-grained event duration information, considerably outperforming a baseline and approaching human performance.
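To illustrate why interval judgments need their own agreement measure, here is a minimal sketch of one simple option: the overlap ratio of two duration intervals on a logarithmic scale. This is a stand-in chosen for illustration, not the project's agreement method, which is not described here. The function name `log_interval_overlap` and the two annotator judgments are hypothetical.

```python
import math

def log_interval_overlap(a, b):
    """Overlap ratio of two duration intervals, given as (low, high) in seconds,
    measured on a log scale. Returns 1.0 for identical intervals and 0.0 for
    disjoint ones."""
    lo_a, hi_a = math.log(a[0]), math.log(a[1])
    lo_b, hi_b = math.log(b[0]), math.log(b[1])
    # Length of the shared part (zero if the intervals do not meet).
    intersection = max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    # Length of the full span from the smallest low to the largest high.
    span = max(hi_a, hi_b) - min(lo_a, lo_b)
    return intersection / span if span > 0 else 1.0

# Hypothetical judgments for one event:
annotator_1 = (10 * 60, 2 * 3600)   # 10 minutes to 2 hours
annotator_2 = (30 * 60, 4 * 3600)   # 30 minutes to 4 hours
score = log_interval_overlap(annotator_1, annotator_2)
print(round(score, 3))
```

Working on a log scale reflects the intuition that the difference between 10 seconds and 1 minute matters about as much as the difference between 10 hours and 60 hours.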
ABC19980108.1830.0711.tmldur.xml
ABC19980114.1830.0611.tmldur.xml
ABC19980120.1830.0957.tmldur.xml
ABC19980304.1830.1636.tmldur.xml
AP900816-0139.tmldur.xml
APW19980213.1310.tmldur.xml
APW19980213.1320.tmldur.xml
APW19980213.1380.tmldur.xml
APW19980219.0476.tmldur.xml
APW19980227.0468.tmldur.xml
APW19980227.0476.tmldur.xml
APW19980227.0489.tmldur.xml
APW19980227.0494.tmldur.xml
APW19980301.0720.tmldur.xml
APW19980306.1001.tmldur.xml
APW19980308.0201.tmldur.xml
APW19980322.0749.tmldur.xml
APW19980418.0210.tmldur.xml
APW19980501.0480.tmldur.xml
APW19980526.1320.tmldur.xml
APW19980626.0364.tmldur.xml
CNN19980126.1600.1104.tmldur.xml
CNN19980213.2130.0155.tmldur.xml
CNN19980222.1130.0084.tmldur.xml
CNN19980223.1130.0960.tmldur.xml
CNN19980227.2130.0067.tmldur.xml
ea980120.1830.0071.tmldur.xml
ea980120.1830.0456.tmldur.xml
ed980111.1130.0089.tmldur.xml
NYT19980206.0460.tmldur.xml
NYT19980206.0466.tmldur.xml
NYT19980212.0019.tmldur.xml
NYT19980402.0453.tmldur.xml
NYT19980424.0421.tmldur.xml
PRI19980115.2000.0186.tmldur.xml
PRI19980121.2000.2591.tmldur.xml
PRI19980205.2000.1890.tmldur.xml
PRI19980205.2000.1998.tmldur.xml
PRI19980213.2000.0313.tmldur.xml
PRI19980216.2000.0170.tmldur.xml
PRI19980303.2000.2550.tmldur.xml
PRI19980306.2000.1675.tmldur.xml
SJMN91-06338157.tmldur.xml
VOA19980303.1600.0917.tmldur.xml
VOA19980303.1600.2745.tmldur.xml
VOA19980305.1800.2603.tmldur.xml
VOA19980331.1700.1533.tmldur.xml
VOA19980501.1800.0355.tmldur.xml
wsj_0006.tmldur.xml
wsj_0026.tmldur.xml
wsj_1025.tmldur.xml
wsj_1031.tmldur.xml
wsj_1035.tmldur.xml
wsj_1038.tmldur.xml
wsj_1039.tmldur.xml
wsj_1040.tmldur.xml
wsj_1042.tmldur.xml
wsj_1073.tmldur.xml