Joint Aspect and Polarity Classification for Aspect-based Sentiment Analysis with End-to-End Neural Networks
Task Definition
Given a set of aspect categories \(A\) and a text:
1. Which aspects \(a \in A\) are mentioned? (aspect detection)
2. What is the text's sentiment with regard to each of them? (polarity classification)
Approaches:
- Pipeline: use the results of (1) as input for (2)
- End-to-end: solve (1) and (2) jointly
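The difference between the two approaches can be illustrated with a minimal sketch. It assumes one particular joint encoding, namely one class per aspect category chosen from a polarity set extended by an "n/a" label for aspects that are not mentioned. The label order, the function names, and the `detect`/`classify` callables are hypothetical and are not taken from the authors' implementation.

```python
from typing import Callable, Dict, List

# Assumed label set: "n/a" marks an aspect that is not mentioned in the text.
POLARITIES = ["n/a", "negative", "neutral", "positive"]


def encode_joint(annotation: Dict[str, str], aspects: List[str]) -> List[int]:
    """End-to-end target: one class index per aspect category."""
    return [POLARITIES.index(annotation.get(a, "n/a")) for a in aspects]


def decode_joint(labels: List[int], aspects: List[str]) -> Dict[str, str]:
    """Recover {aspect: polarity}, dropping aspects labelled "n/a"."""
    return {a: POLARITIES[i] for a, i in zip(aspects, labels) if POLARITIES[i] != "n/a"}


def pipeline(
    text: str,
    detect: Callable[[str], List[str]],
    classify: Callable[[str, str], str],
) -> Dict[str, str]:
    """Pipeline: step (1) proposes aspects, step (2) classifies each of them."""
    return {aspect: classify(text, aspect) for aspect in detect(text)}


aspects = ["General", "Punctuality", "Ticket purchase"]
gold = {"Ticket purchase": "negative", "General": "positive"}

encoded = encode_joint(gold, aspects)
decoded = decode_joint(encoded, aspects)
```

In the pipeline variant, an aspect missed by `detect` can never receive a polarity, whereas the joint encoding lets a single model decide presence and polarity at once.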
Data
- GermEval 2017: customer feedback about Deutsche Bahn AG on social media
- ~26K documents
- 19 aspect categories:
General 61.5% | Atmosphere 6.1% | Design 0.2% | Gastronomic offer 0.2% |
Information 1.7% | Website 1.0% | Comfort 0.7% | Customer Support 2.4% |
Connectivity 1.5% | Luggage 0.1% | Occupancy 1.3% | Ticket purchase 3.0% |
Toilets 0.2% | Punctuality 7.1% | Accessibility 0.3% | Security 2.2% |
Image 0.2% | QR code < 0.1% | Traveling with children 0.2% | |
Example
German: Alle so "Yeah, Streik beendet" Bahn so "Okay, dafür werden dann natürlich die Tickets teurer" Alle so "Können wir wieder Streik haben?"
Translation: Everybody's like "Yeah, the strike is over" Bahn goes "Okay, but in return the tickets will of course get more expensive" Everybody's like "Can we have the strike back?"
Detected Aspect | Corresponding Polarity |
---|---|
Ticket purchase | negative |
General | positive |
Joint Model Architecture

Model Comparison
Model | Embeddings | dev | test 1 | test 2 |
---|---|---|---|---|
Pipeline LSTM | + word2vec | .350 | .297 | .342 |
End-to-end LSTM | + word2vec | .378 | .315 | .383 |
Pipeline CNN | + word2vec | .350 | .298 | .343 |
End-to-end CNN | + word2vec | .400 | .319 | .388 |
Pipeline LSTM | + glove | .350 | .297 | .342 |
End-to-end LSTM | + glove | .378 | .315 | .384 |
Pipeline CNN | + glove | .350 | .298 | .342 |
End-to-end CNN | + glove | .415 | .315 | .390 |
Pipeline LSTM | + fasttext | .350 | .297 | .342 |
End-to-end LSTM | + fasttext | .378 | .315 | .384 |
Pipeline CNN | + fasttext | .342 | .295 | .342 |
End-to-end CNN | + fasttext | .511 | .423 | .465 |
majority class baseline | | - | .315 | .384 |
GermEval baseline | | - | .322 | .389 |
GermEval best submission | | - | .354 | .401 |
Table 1: Micro-averaged F1-scores for aspect + sentiment task
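As a reading aid for the metric, the following is a minimal sketch of micro-averaged F1 over (aspect, polarity) pairs. It assumes that a prediction counts as correct only if both the aspect and its polarity match the gold annotation. The official GermEval evaluation script may differ in detail, and the function name and toy data here are illustrative only.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (aspect, polarity)


def micro_f1(gold: List[Set[Pair]], pred: List[Set[Pair]]) -> float:
    """Pool true/false positives and false negatives over all documents."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: two documents with gold and predicted (aspect, polarity) pairs.
gold_pairs = [
    {("Ticket purchase", "negative"), ("General", "positive")},
    {("Punctuality", "negative")},
]
pred_pairs = [
    {("General", "positive")},
    {("Punctuality", "negative"), ("General", "neutral")},
]

score = micro_f1(gold_pairs, pred_pairs)  # tp=2, fp=1, fn=1
```

Because counts are pooled before averaging, frequent categories such as General dominate the score, which helps explain why the majority class baseline is already competitive.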
Contributions and Results
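- The joint model of aspect detection and polarity classification outperforms the pipeline systems in every configuration in Table 1
- Subword information is important for embedding learning (especially for German): the end-to-end CNN with fasttext embeddings reaches .423 / .465 on test 1 / test 2, compared with at most .319 / .390 with word2vec or glove
- New state of the art in aspect-based sentiment classification without given gold aspects (.423 / .465 vs. .354 / .401 for the best GermEval submission)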
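The following sketch illustrates why subword information can matter for a compounding language like German. It assumes the character n-gram scheme used by fastText's subword model (the word wrapped in `<` and `>`, n-grams of length 3 to 6, plus the whole wrapped word). The poster does not state the authors' embedding settings, so the function name and the parameter values are assumptions.

```python
from typing import Set


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> Set[str]:
    """Character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = {wrapped}  # the full wrapped word is kept as its own unit
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            grams.add(wrapped[i:i + n])
    return grams


# A compound shares many n-grams with its head word, so a rare or unseen
# compound can still obtain a meaningful vector from its subword units.
shared = char_ngrams("Bahn") & char_ngrams("Bahnhof")
```

Here `shared` contains units such as `<Bah` and `Bahn`, so "Bahnhof" (train station) is tied to "Bahn" (train, railway) even if the compound is rare in the training data.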