| Name | The RST Corpus |
| Number of documents/texts in the corpus | 385 (347 training; 38 test) |
| Number of words in the corpus | 176,383 |
| Avg. # words/text | 458 |
| Avg. # elementary discourse units/text | 57 |
| High-level description of the corpus | README |
| Annotation Manual | tagging-ref-manual.pdf |
| Number of documents that were double-tagged | 53 (13.8%) |
| Discourse units | Clauses and smaller. |
| Number of discourse units | 21,789 |
| Avg. # words/discourse unit | 8.1 |
| Corpus samples | Click here |
| Related utilities | Click here |
| How can I obtain the corpus? | From LDC |
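
The averages and the percentage in the table follow from the raw counts. As a quick consistency check, the sketch below recomputes them in Python. The variable names and the `derived_stats` helper are illustrative assumptions and are not part of the corpus distribution; only the numbers come from the table.

```python
# Raw counts copied from the table above.
n_docs = 385                 # documents/texts in the corpus
n_train, n_test = 347, 38    # training / test split
n_words = 176_383            # words in the corpus
n_edus = 21_789              # discourse units
n_double_tagged = 53         # documents that were double-tagged


def derived_stats():
    """Recompute the table's derived figures from the raw counts (hypothetical helper)."""
    return {
        "words_per_text": round(n_words / n_docs),
        "edus_per_text": round(n_edus / n_docs),
        "words_per_edu": round(n_words / n_edus, 1),
        "double_tagged_pct": round(100 * n_double_tagged / n_docs, 1),
    }


if __name__ == "__main__":
    # The split should account for every document.
    assert n_train + n_test == n_docs
    for name, value in derived_stats().items():
        print(f"{name}: {value}")
```

Rounded to the precision the table uses, the recomputed values agree with the reported 458 words per text, 57 discourse units per text, 8.1 words per discourse unit, and 13.8% double-tagged documents.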