Software & Data



The following datasets are available for research purposes:

  • A sample of 195,849 posts from the LiveJournal blog service, annotated with valence and arousal and used in the ‘Seeing stars of valence and arousal in blog posts’ paper, is available upon request.
  • Human-annotated forum discussions from the ‘Predicting emotional responses to long informal text’ paper are available upon request.
  • BBC News forum posts: 2,594,745 comments from selected BBC News forums and approx. 1,000 human-classified sentiment strengths, each with a positive strength of 1-5 and a negative strength of 1-5. Each classification is the average of three human classifiers' ratings. To access the data, please visit the CyberEmotions website.
  • Digg post comments: 1,646,153 comments on Digg posts (typically highlighting news or technology stories) and approx. 1,000 human-classified sentiment strengths, each with a positive strength of 1-5 and a negative strength of 1-5. Each classification is the average of three human classifiers' ratings.

To access the data, please visit the CyberEmotions website.