Language Data

L32 - The Yahoo News Annotated Comments Corpus, version 1.0 (47MB)

The dataset contains comment threads posted in response to online news articles. We annotated the dataset at the comment level and the thread level. The annotations cover 6 dimensions of individual comments and 3 dimensions of threads as a whole. The coding was done by professional, trained editors and by untrained crowdsourced workers. The corpus contains annotations for a novel collection of 2.4k threads and 9.2k comments from Yahoo News and for 1k threads from the Internet Argument Corpus.