Datasets

The Yahoo Webscope Program is a reference library of interesting and scientifically useful datasets for non-commercial use by academics and other scientists.

All datasets have been reviewed to conform to Yahoo's data protection standards, including strict controls on privacy. We have a number of datasets that we are excited to share with you.

Yahoo is pleased to make these datasets available to researchers who are advancing the state of knowledge and understanding in web sciences. The datasets are only available for academic use by faculty and university researchers who agree to the Data Sharing Agreement.



To be eligible to receive Webscope data, unless otherwise specified for a particular dataset, you must:
  • Be a faculty member, research employee or student at an accredited university
  • Send the data request from an email address at an accredited university's .edu domain (or, for international universities, the university's own domain name)
Unless otherwise specified for a particular dataset, we are not able to share data with:
  • Commercial entities
  • Employees of commercial entities with a university appointment
  • Research institutions not affiliated with a research university



Findings

Latest Publications

  • A simple method for unsupervised anomaly detection: An application to Web time series data
  • Design of a Large-Scale Anomaly Detection Algorithm for Time Series in the context of Wireless Network Operations
  • A Comparative Study of the Quality between Formality Style Transfer of Sentences in Swedish and English, leveraging the BERT model
  • Layered Graph Embedding for Entity Recommendation using Wikipedia in the Yahoo! Knowledge Graph

