Libraries Used
A pretrained Hugging Face Transformers model is used for sentiment prediction
!pip install transformers
from transformers import AutoModelForSequenceClassification
from transformers import TFAutoModelForSequenceClassification
from transformers import AutoTokenizer
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from scipy.special import softmax
import nltk
from nltk.corpus import stopwords
from nltk import FreqDist
nltk.download('stopwords')
Preparing Tokenizer and Model
# Load the tokenizer and the classifier from the Hugging Face Hub
MODEL = "cardiffnlp/twitter-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
# Save a local copy of the model in a directory with the same name as the Hub ID
model.save_pretrained(MODEL)
airasia_tweets = pd.read_csv("cleaned_airasia_tweets_1.csv")
airasia_tweets
Sentiment Analysis
Tokenize each tweet and predict its sentiment
labels = []
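for text in airasia_tweets['Text']:
    # Truncate to the model's maximum input length so overlong text does not raise an error
    encoded_input = tokenizer(text, return_tensors='pt', truncation=True)
    output = model(**encoded_input)
    # output[0] holds the logits; take the single row and convert it to NumPy
    scores = output[0][0].detach().numpy()
    scores = softmax(scores)
    # Sort class indices from most to least probable
    ranking = np.argsort(scores)
    ranking = ranking[::-1]
    # Class indices: 0 = Negative, 1 = Neutral, 2 = Positive
    if ranking[0] == 0:
        labels.append("Negative")
    elif ranking[0] == 1:
        labels.append("Neutral")
    elif ranking[0] == 2:
        labels.append("Positive")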
labels
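The scoring step in the loop can be illustrated without loading the model. The following is a minimal sketch in plain Python, assuming the class order used above (0 = Negative, 1 = Neutral, 2 = Positive). The helper names `softmax_py` and `label_from_logits` and the example logits are hypothetical and are not part of the notebook.

```python
import math

# Class order assumed from the loop above: 0 = Negative, 1 = Neutral, 2 = Positive
LABELS = ["Negative", "Neutral", "Positive"]

def softmax_py(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_from_logits(logits):
    """Return the most probable label together with its probability."""
    probs = softmax_py(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Hypothetical logits for one tweet (made-up numbers, not model output)
example_logits = [2.0, 0.5, -1.0]
label, confidence = label_from_logits(example_logits)
print(label, round(confidence, 3))
```

Returning the probability alongside the label keeps the confidence of each prediction available, which the loop above discards once the label is chosen.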
Assign the sentiment labels to the tweets
airasia_tweets['Sentiment'] = labels
airasia_tweets.head()
print(airasia_tweets.columns)
Exporting Tweets with Sentiment
# index=False keeps the row index from being written as an extra unnamed column
airasia_tweets.to_csv("airasia_tweets_sentiment_1.csv", index=False)
Visualizing Tweets Sentiment
Count plot by sentiment
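One possible way to produce such a count plot is sketched below. It is an assumption, not the notebook's original plotting code: the helper names `sentiment_counts` and `plot_sentiment_counts`, the fixed label order, and the sample labels are all hypothetical.

```python
from collections import Counter

# Fixed display order, assumed from the labels produced earlier in the notebook
SENTIMENT_ORDER = ["Negative", "Neutral", "Positive"]

def sentiment_counts(labels):
    """Count tweets per sentiment, in a fixed order, with 0 for absent classes."""
    counts = Counter(labels)
    return {name: counts.get(name, 0) for name in SENTIMENT_ORDER}

def plot_sentiment_counts(counts):
    """Draw a bar chart of the counts with matplotlib and return the figure."""
    # Imported here so the counting helper stays usable without matplotlib
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    ax.bar(list(counts.keys()), list(counts.values()))
    ax.set_xlabel("Sentiment")
    ax.set_ylabel("Number of tweets")
    ax.set_title("Tweets by sentiment")
    return fig

# Hypothetical labels standing in for airasia_tweets['Sentiment']
sample_labels = ["Negative", "Positive", "Negative", "Neutral", "Negative"]
counts = sentiment_counts(sample_labels)
print(counts)
```

In the notebook, `plot_sentiment_counts(sentiment_counts(airasia_tweets['Sentiment']))` would draw the chart for the real data.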
filtered_negative_sentiments = airasia_tweets[airasia_tweets['Sentiment'] == "Negative"].head()
First 5 negative tweets
filtered_negative_sentiments
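Note that `head()` returns the first five matching rows in file order, not the five most strongly negative ones. A minimal sketch of ranking by strength follows, assuming the negative-class probability (`scores[0]` in the prediction loop) had been stored for each tweet. The notebook as written does not keep it, and the function name `top_negative` and the sample data are hypothetical.

```python
def top_negative(tweets, negative_probs, n=5):
    """Return the n (tweet, probability) pairs with the highest negative probability."""
    paired = zip(tweets, negative_probs)
    ranked = sorted(paired, key=lambda pair: pair[1], reverse=True)
    return ranked[:n]

# Hypothetical tweets and negative-class probabilities (made-up values)
sample_tweets = ["flight delayed again", "great crew", "lost my baggage", "ok trip"]
sample_probs = [0.91, 0.03, 0.97, 0.40]
worst = top_negative(sample_tweets, sample_probs, n=2)
print(worst)
```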