Webscope | Yahoo Labs

L13 - Yahoo! Search Query Tiny Sample (41 K)

This dataset contains a random sample of 4496 queries submitted to Yahoo's US search engine in January 2009. For privacy reasons, the query set includes only queries that were issued by at least three different users and that consist solely of letters of the English alphabet, digit sequences no longer than four digits, and punctuation characters. The query set contains no user information and does not preserve temporal aspects of the query log. The total size of the dataset is 41 K.
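The privacy filter described above can be sketched roughly as follows. This is an illustrative approximation only, not Yahoo's actual procedure: the (user_id, query) log format, the function names, the use of ASCII punctuation, and the decision to allow whitespace between terms are all assumptions made for the example.

```python
import re
import string
from collections import defaultdict

MIN_USERS = 3    # a query must come from at least three different users
MAX_DIGITS = 4   # digit sequences may be at most four digits long

# Assumption: "punctuation" means ASCII punctuation, and whitespace is allowed.
_ALLOWED = re.compile(r"[A-Za-z0-9\s" + re.escape(string.punctuation) + r"]+")
# Any run of five or more consecutive digits disqualifies the query.
_LONG_DIGITS = re.compile(r"[0-9]{%d,}" % (MAX_DIGITS + 1))


def is_allowed(query: str) -> bool:
    """Return True if the query uses only permitted characters and short digit runs."""
    if not _ALLOWED.fullmatch(query):
        return False
    return _LONG_DIGITS.search(query) is None


def filter_queries(log):
    """Filter a hypothetical log of (user_id, query) pairs.

    The result is a sorted list of query strings, so it carries neither
    user identifiers nor the original ordering of the log.
    """
    users_by_query = defaultdict(set)
    for user_id, query in log:
        users_by_query[query].add(user_id)
    return sorted(
        query
        for query, users in users_by_query.items()
        if len(users) >= MIN_USERS and is_allowed(query)
    )


if __name__ == "__main__":
    sample_log = [
        ("u1", "weather 2009"), ("u2", "weather 2009"), ("u3", "weather 2009"),
        ("u1", "call 5551234"), ("u2", "call 5551234"), ("u3", "call 5551234"),
        ("u1", "rare query"),
    ]
    print(filter_queries(sample_log))
```

Returning a sorted list of distinct strings is one simple way to drop both user identifiers and temporal order; the published dataset may have been produced differently.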

Language Data

L13 - Yahoo! Search Query Tiny Sample (41 K)

L12 - Yahoo! Search Popularity by Location for Websites on Politician and Athletes (14 M)

L15 - Yahoo! Search queries that share clicked URLs with TREC queries, version 1.0 (33 K)

L2 - Metadata Extracted from Publicly Available Web Pages, version 1.0 (1.5 GB & 700 MB)

L1 - Yahoo! N-Grams, version 2.0 (multi part) (Hosted on AWS)

L3 - Yahoo! Semantically Annotated Snapshot of the English Wikipedia, version 1.0 (multi part)

L4 - Yahoo! Answers Manner Questions, version 1.0 (102 MB)

L5 - Yahoo! Answers Manner Questions, version 2.0 (104 MB)

L6 - Yahoo! Answers Comprehensive Questions and Answers version 1.0 (multi part)

L8 - Yahoo! Search Query Logs for Nine Languages, version 1.0 (45 K)

L9 - Yahoo! Answers Question Types Sample of 1000, version 1.0 (14 K)

L16 - Yahoo! Answers Query to Questions (1.5 MB)

L18 - Anonymized Yahoo! Search Logs with Relevance Judgments (1.3 Gbyte)

L11 - HTML Forms Extracted from Publicly Available Webpages, version 1.0 (50Gb+) (Hosted on AWS)

L19 - Yahoo! Answers browsing behavior, version 1.0 (166Gb) (Hosted on AWS)

L20 - Yahoo! Answers browsing behavior, version 1.0 (166Gb) (Hosted on AWS)

L21 - Yahoo! Answers Query To Questions, version 2.0 (24K)

L22 - Yahoo! News Sessions Content, version 1.0 (16 MB)

L23 - Yahoo Answers Synthetic Questions, version 1.0

L24 - Yahoo Search Query Log To Entities, version 1.0 (1.7MB)

L25 - Yahoo N-Gram Representations, version 2.0 (2.6Gb) (Hosted on AWS)

L26 - Yahoo! Answers consisting of questions asked in French, version 1.0 (3.8Gb) (Hosted on AWS)

L27 - Yahoo Answers Factoids Queries, version 1.0 (3.5MB)

L28 - Yahoo Answers Query Treebank, version 1.0 (456KB)

L29 - Yahoo Answers Novelty Based Answer Ranking, version 1.0 (331KB)

L30 - Model files for Fast Entity Linker, version 1.0 (5.2G) (Hosted on AWS)

L31 - Questions on Yahoo Answers labeled as either informational or conversational, version 1.0 (766KB)

L32 - The Yahoo News Annotated Comments Corpus, version 1.0 (47MB)

L33 - Yahoo News Ranked Multi-label Corpus, version 1.0 (59MB)
