We have various types of data available to share. The datasets are grouped into Ratings, Language, Graph, Advertising and Market Data, and Computing Systems, along with an appendix of other relevant data and resources available via the Yahoo! Developer Network.

Language Data

These types of datasets can be utilized to research information retrieval and natural language processing algorithms. Yahoo! is interested in improved search and information retrieval.

View Datasets

Graph and Social Data

These types of datasets can be utilized to research matrix, graph, clustering, and machine learning algorithms. Yahoo! is interested in a better understanding of social networks.

View Datasets

Ratings and Classification Data

These types of datasets can be utilized to research collaborative filtering, recommender systems, and machine learning algorithms. Yahoo! is interested in research on providing recommendations and more personalized, relevant content to users.

View Datasets

Advertising and Market Data

These types of datasets can be utilized to research behavior and incentives in auctions and markets. Yahoo! is interested in the design of advertising systems and Yahoo! marketplaces.

View Datasets

Competition Data

These types of datasets were utilized in a competition event with academics and researchers.

View Datasets

Computing Systems Data

This type of data can be used to analyze the behavior and performance of different types of computer systems architectures, including distributed systems and networks. Yahoo! is interested in research on distributed and cloud computing, and on understanding and optimizing performance in such systems.

View Datasets

Image Data

This type of data can be used to analyze images and tags and is useful for image processing research.

View Datasets