Sentiment Analysis of Tweets from Kaggle Twitter Dataset
Statement: Given Tweet Content and an Entity, the task is to judge the sentiment of the Tweet Content toward the Entity. There are 3 classes in this dataset: Positive, Negative and Neutral. Messages not relevant to the entity (i.e. Irrelevant) are classified as Neutral.
Data Cleaning | Preprocessing | EDA | Defining NN Architecture | Training | Prediction | Evaluation
- Delete nans
- Lower Text
- Remove urls
- Remove punctuation
- Expand contractions (why'd -> why would)
- Remove mentions (@user hey -> hey)
- Remove hashtag symbols (#sometrend -> sometrend)
- Remove double spaces
- Decode emojis
- Remove stopwords (how is the weather -> weather)
- Remove numbers (my id 882244 -> my id)
- Delete nans
- Lemmatize (the boy's cars are different colors -> the boy car be differ color)
- Texts vectorized with TF-IDF vectorizer
- Categorical features one-hot-encoded
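
The cleaning steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: `clean_tweet`, `CONTRACTIONS` and `STOPWORDS` are hypothetical names, and the two lookup tables are deliberately tiny stand-ins for a full contraction map and stopword list. Punctuation is stripped after contractions, mentions and hashtags are handled, on the assumption that the apostrophe, `@` and `#` must still be present for those steps to match. Emoji decoding and lemmatization are left out because they need third-party libraries.

```python
import re

# Hypothetical, deliberately small lookup tables (illustration only).
CONTRACTIONS = {"why'd": "why would", "don't": "do not", "it's": "it is"}
STOPWORDS = {"how", "is", "the", "a", "an", "of", "to", "and"}


def clean_tweet(text: str) -> str:
    """Apply a subset of the cleaning steps listed above to one tweet."""
    text = text.lower()                                   # lower text
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)    # remove urls
    for short, full in CONTRACTIONS.items():              # expand contractions
        text = text.replace(short, full)
    text = re.sub(r"@\w+", " ", text)                     # remove mentions
    text = text.replace("#", " ")                         # keep the hashtag word, drop '#'
    text = re.sub(r"[^\w\s]", " ", text)                  # remove punctuation
    text = re.sub(r"\d+", " ", text)                      # remove numbers
    tokens = [t for t in text.split() if t not in STOPWORDS]  # remove stopwords
    return " ".join(tokens)                               # also collapses double spaces


for raw in ["@user hey", "#sometrend", "How is the weather", "my id 882244"]:
    print(repr(raw), "->", repr(clean_tweet(raw)))
```

A tweet that consists only of removable content (a bare number, for example) comes back as an empty string. Dropping such rows is one plausible reason for the second "Delete nans" step, though that reading is an assumption.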
NNSentimentClassifier(
  (softmax): Softmax(dim=1)
  (dropout): Dropout(p=0.2, inplace=False)
  (model): Sequential(
    (0): Linear(in_features=8032, out_features=1000, bias=True)
    (1): ReLU()
    (2): Dropout(p=0.2, inplace=False)
    (3): Linear(in_features=1000, out_features=100, bias=True)
    (4): Tanh()
    (5): Dropout(p=0.2, inplace=False)
    (6): Linear(in_features=100, out_features=1000, bias=True)
    (7): ReLU()
    (8): Dropout(p=0.2, inplace=False)
    (9): Linear(in_features=1000, out_features=10, bias=True)
    (10): ReLU()
    (11): Dropout(p=0.2, inplace=False)
    (12): Linear(in_features=10, out_features=4, bias=True)
  )
)
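
To get a feel for the size of this network, the trainable parameters of the five `Linear` layers can be counted by hand. The sketch below only uses the layer shapes printed above. `LINEAR_LAYERS` and `count_parameters` are hypothetical names introduced for the example, and the activation, dropout and softmax modules are skipped because they hold no trainable weights.

```python
# (in_features, out_features) of each Linear layer, copied from the printed model.
LINEAR_LAYERS = [(8032, 1000), (1000, 100), (100, 1000), (1000, 10), (10, 4)]


def count_parameters(layers):
    """Sum weights (in * out) and biases (out) over all Linear layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in layers)


per_layer = [n_in * n_out + n_out for n_in, n_out in LINEAR_LAYERS]
total = count_parameters(LINEAR_LAYERS)

print(per_layer)  # [8033000, 100100, 101000, 10010, 44]
print(total)      # 8244154
```

The first layer alone accounts for roughly 97% of the parameters, so the width of the input vector (8032 features) dominates the model size.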
classes = { 'Irrelevant': 0, 'Negative': 1, 'Neutral': 2, 'Positive': 3 }
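
The mapping above keeps Irrelevant as its own index, matching the 4 outputs of the last `Linear` layer. A minimal, dependency-free sketch of turning one row of 4 output scores back into a label could look like this. `index_to_label`, `softmax` and `decode` are hypothetical helpers, the scores in the usage line are made up, and whether the model's forward pass already applies its `Softmax` module is not stated here, so the sketch applies softmax itself (the predicted label is the same either way, since softmax preserves the ordering of the scores).

```python
import math

classes = {'Irrelevant': 0, 'Negative': 1, 'Neutral': 2, 'Positive': 3}

# Invert the mapping so a predicted index can be turned back into a label.
index_to_label = {idx: label for label, idx in classes.items()}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(scores):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return index_to_label[best], probs[best]


# Made-up scores for one tweet, for illustration only.
label, prob = decode([0.1, 2.0, -1.0, 0.5])
print(label, round(prob, 3))
```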
- https://www.kaggle.com/code/katearb/sentiment-analysis-in-twitter-93-test-acc
- https://www.kaggle.com/code/vaishnavi28krishna/twitter-analysis-using-dt-and-rfdtc
- https://www.kaggle.com/code/parisrohan/text-feature-cleaning-generation-model-building
- https://www.kaggle.com/code/tanujdhiman/twitter-sentiment-analysis
- https://www.kaggle.com/code/cameronwatts/bag-of-words-sentiment-analysis-with-keras-task
- https://www.lexalytics.com/technology/sentiment-analysis/
- https://towardsdatascience.com/sentiment-analysis-concept-analysis-and-applications-6c94d6f58c17