MVSA: Sentiment Analysis on Multi-view Social Data – Multimedia Communications Research Laboratory

New: We are pleased to release a new MVSA dataset with more tweets and annotations. In the new dataset, each tweet is annotated by three annotators; we call this dataset MVSA-multiple. The original MVSA dataset used in [1], in which each tweet has only one label, is now called MVSA-single.
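
Because each MVSA-multiple tweet carries three annotations, users often need one label per tweet. The sketch below shows one common way to get it, a strict majority vote. This is an assumption for illustration, not a protocol stated on this page; the function name `majority_label` and the label strings are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_label(labels: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    # A strict majority means more than half of the votes (2 of 3 here).
    return label if count * 2 > len(labels) else None


# Hypothetical annotations from three annotators for two tweets.
print(majority_label(["positive", "positive", "neutral"]))
print(majority_label(["positive", "neutral", "negative"]))
```

Tweets on which all three annotators disagree return `None` here, so the caller can decide whether to drop them or resolve them in some other way.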

Overview

There is increasing interest in understanding users' attitudes or sentiment towards a specific topic (e.g., a brand) from the large repository of opinion-rich data on the Web. While great effort has been devoted to single media, either text or image, few attempts have been made at the joint analysis of multi-view data, which is becoming a prevalent form in social media. To promote research on this interesting and important problem, we introduce a multi-view sentiment analysis dataset (MVSA) consisting of image-text pairs collected from Twitter with manual annotations. The dataset can be used as a valuable benchmark for both single-view and multi-view sentiment analysis.

[Figure: examples labeled Positive, Neutral, and Negative]

The Dataset

MVSA-multiple can be downloaded from OneDrive or BaiduYun.

MVSA-single can be downloaded from OneDrive or BaiduYun.

We provide the following information:

Original image-text pairs collected from Twitter.
Annotations for both text and image.

Please contact Dr. Shiai Zhu (zshiai@gmail.com) if you have any problems with our dataset.

Pipeline for sentiment analysis

We adopt standard statistical learning methods for single-view and multi-view sentiment analysis.
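
This page does not say which learners or fusion strategy are used, so the sketch below is only an illustration of the multi-view idea. It assumes early fusion (concatenating a text feature vector with an image feature vector) and uses a nearest-centroid classifier as a stand-in learner. All function names, feature values, and labels are hypothetical toy data.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence


def fuse(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Early fusion: concatenate the text and image feature vectors."""
    return list(text_vec) + list(image_vec)


def fit_centroids(X: Sequence[Sequence[float]], y: Sequence[str]) -> Dict[str, List[float]]:
    """Compute the mean feature vector of each class."""
    groups = defaultdict(list)
    for vec, label in zip(X, y):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def predict(centroids: Dict[str, List[float]], vec: Sequence[float]) -> str:
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Toy training set: (text features, image features, sentiment label).
train = [
    ([0.9, 0.1], [0.8], "positive"),
    ([0.8, 0.2], [0.7], "positive"),
    ([0.1, 0.9], [0.2], "negative"),
    ([0.2, 0.8], [0.1], "negative"),
]
X = [fuse(t, i) for t, i, _ in train]
y = [label for _, _, label in train]
centroids = fit_centroids(X, y)

print(predict(centroids, fuse([0.85, 0.15], [0.9])))
```

A single-view baseline follows from the same code by passing only the text vectors or only the image vectors as `X` instead of the fused ones.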

Some useful links for extracting visual features, from low-level to mid-level, are listed below:

Classemes: http://vlg.cs.dartmouth.edu/projects/vlg_extractor/vlg_extractor/Home.html

Aesthetic: http://www.ee.columbia.edu/~subh/Software.php

SentiBank, Attribute, BoVW, Color Histogram, Gist and LBP: http://www.ee.columbia.edu/ln/dvmm/vso/download/sentibank.html

Please cite our paper if the datasets are helpful to your research:

[1] T. Niu, S. A. Zhu, L. Pang and A. El Saddik, Sentiment Analysis on Multi-view Social Data, MultiMedia Modeling (MMM), pp. 15-27, Miami, 2016.

@inproceedings{MVSA,
  author = {Teng Niu and Shiai Zhu and Lei Pang and Abdulmotaleb El{-}Saddik},
  title = {Sentiment Analysis on Multi-View Social Data},
  booktitle = {MultiMedia Modeling},
  pages = {15--27},
  year = {2016},
}