import calendar from collections import Counter import enchant import h3 import matplotlib.pyplot as plt import numpy as np import nltk from nltk.corpus import stopwords from nltk.stem import WordNetLemmatizer import os import pandas as pd import plotly.express as px import plotly.figure_factory as ff import plotly.graph_objects as go from plotly.offline import init_notebook_mode, iplot import random import re import seaborn as sns from sklearn import metrics from sklearn.ensemble import RandomForestRegressor from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer from sklearn.linear_model import LogisticRegression from sklearn.metrics import accuracy_score, confusion_matrix, classification_report from sklearn.model_selection import train_test_split, GridSearchCV from sklearn.svm import SVC import textblob import sys from textblob.sentiments import NaiveBayesAnalyzer from textblob import TextBlob from textblob import Word from wordcloud import WordCloud, STOPWORDS import warnings warnings.filterwarnings('ignore')

About this project.

This was a quick project to help me get to the bottom of specific product issues according to Threads users. By utilising some NLP techniques (in a relatively unorthodox way) I was able to get the information "straight from the horses' mouth", hone in on some insightful comments surrounding the usefulness of the app and other assorted helpful tidbits of user information. In all, I believe a lot of the users were reasonably accurate with their feedback in that the app wasn't living up to all of the hype. The product was as well designed and well implemented as a user would normally expect from a Meta product, although, as a previously-shelved Meta project it did seem as though the app was hurriedly revived and released as a means of capitalising on Twitter's recent (at the time of my creating this project) user mass-exodus.

It was nice to win a Kaggle medal with this project but admittedly it was a side project created 'not so carefully' around some other projects which resulted in somewhat of a bitter sweet win, considering the actual professionalism, time and effort invested in some of my other projects! Anyway, without further ado.....

nltk.download('averaged_perceptron_tagger') nltk.download('punkt') nltk.download('stopwords') nltk.download('wordnet') nltk.download('vader_lexicon')

df = pd.read_csv("/work/threads_reviews.csv")

A peek at the data.

As with many internet procured, text-based dataframes there are no real surprises here, there is a standard date column, a review date, a source and the review itself, all with no missing features:

In the categorical description we see Google Play as the top download source (of two distinct sources) The most commonly occurring word across review descriptions is 'Good', and the review date with the highest frequency is the 6th July 2023, Threads' release date (at 1755):

Datetime features.

Assigning string representations of datetime values (day, month, year) while keeping the time of day:

df['date'] = pd.to_datetime(df['review_date']) df['year'] = df['date'].dt.year df['month'] = df['date'].dt.month df['day'] = df['date'].dt.day df['day_name'] = df['date'].dt.dayofweek df['day_name'] = df['day_name'].apply(lambda x: calendar.day_name[x]) df['month'] = df['month'].apply(lambda x: calendar.month_name[x])

df['date'] = df['date'].astype(str) df['date'] = df['date'].str.split(' ').str[-1]

df.rename(columns={'date': 'time'}, inplace=True) df.rename(columns={'review_description': 'review'}, inplace=True) df = df.drop(['review_date'], axis=1)

Splitting the time at the colons, dropping the seconds, keeping the 24hr format and converting to int (without rounding):

df['time'] = df['time'].str.replace(':', '') df['time'] = df['time'].str[:4] df['time'] = df['time'].apply(lambda x: '{0:0>4}'.format(x)) df['time'] = df['time'].astype(int)

df = df.reindex(sorted(df.columns), axis=1)

One look at the month column shows all data were collected in the month of July, so I will drop that column as well as 'year' to save memory.

df = df.drop(['month', 'year'], axis=1)

df['review'] = df['review'].str.lower()

Analysis.

First, a look at the ratings distribution.

• A little over 3/4 of the data is attributed to five-star and one-star reviews; 47% five-star and 29.6% one-star.

• Four-star reviews make up 9.9%, three-star reviews make up 7.8%, and two-star reviews make up 5.5%.

App download sources in the dataset and their value counts shows exactly where this product has made particular gains, with 30.27K downloads from the Google Play Store compared to 2650 downloads from the App Store, making Android users 90%-92% more likely to use this product than iPhone users. An interesting statistic considering 29% of the global handset market share belongs to the iPhone, a figure that doesn't reflect too well with regard to downloads for this app. One reason for this could be the distribution of Instagram users by nationality and the OS market share for that user's country (statistics for Instagram users and Android users shows both India & Latin America among the top of both lists).

• Average ratings by source shows the majority of ratings (93%) were given via the Google Play Store. The Play Store is responsible for 66% of the total reviews holding the five-star label. The four-star review distribution from the Play Store is 11%. Three-star reviews account for 6%. Two-star reviews account for 3% and one-star reviews total 8%. • The Apple App Store accounts for 7% of the total reviews, translating to 3% of the total five-star review distribution being attributed to IOS users. Four-star, three-star and one-star reviews each hold 1% of the data, and two-star reviews hold less than 1%. • As found across a few of my data projects and having dealt with customer reviews / feedback for 25+ years, users are less likely to leave 2* reviews due to the fact they will either cut the company (or product) in question some slack with a 3* review, or opt to dunk on it completely by way of a 1* review, there is rarely a middle ground. That pattern is visible here once more.

Review distribution by weekday.

Release day is responsible for the majority of reviews, a figure of 12.4K. Friday sees 4.4K less reviews, Saturday sees 3.5K less than that. I can't say if this is a reflection of Threads' userbase, this is simply the review count. But this is something that I will look into (sentiment etc.).

Language Analysis.

Following a similar trend with some other NLP projects in my back-catalogue, the average word length for good reviews is much shorter than that of the negative reviews. The outlier here is the 1* review rating where the average word length is almost the same as the 4* reviews:

Creating polarity analysis columns.

def get_polarity(text): return TextBlob(text).sentiment.polarity def get_subjectivity(text): return TextBlob(text).sentiment.subjectivity df['subjectivity'] = df['review'].apply(get_subjectivity) df['polarity'] = df['review'].apply(get_polarity)

def get_analysis(score): if score < 0: return 'Negative' elif score == 0: return 'Neutral' else: return 'Positive' df['analysis'] = df['polarity'].apply(get_analysis)

stop = stopwords.words('english') df['review'] = df['review'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stop)]))

Sentiment by day.

• The sunburst chart shows the majority of reviews being written on release day, with the review volume diminishing as the month progresses. Negative sentiment isn't too bad across the board, sitting at 4% on release day, dropping to 3% the following day, then 2%, remaining at a steady 1% for the duration. • Positive sentiment is very good on days 1 and 2, seeing 24% and 14% respectively, followed by a steady drop through 7%, 4%, 3%, then 2%. • The Neutral reviews see more of a stable pattern, with 14% on day 1, 9% on day 2, dropping steadily from 5% to 1% for the duration. • So with the 10% drop in positive sentiment on day 2 (from day 1), it's safe to say the honeymoon period was short and sharp. Although the relatively steady neutral sentiment and comparatively paltry negative sentiment shows that users were still reasonably happy with the app once the initial hype wore off.

On to the latter half of the month now, we see some different patterns. Chief of which is visible on the 16th of the month where positive sentiment settles at 3%, remaining there for the rest of the month. That being said, user review counts drop to under 10% on the same day and from what I can see, there seems to be a notable theme here, I can't say for certain but I do think that positive sentiment would be better had there been more users. Such is the nature of data. But, what is baking my noodle right now is thus; why are 15% of users leaving 7% majority positive reviews and 3% negative, where 8% (a shade over half as many users) are leaving less than half the positive review percent rate and almost the same negative review percent rate? Keen to find out.

Quick sidenote: I saw df.day == '5' in the sunburst chart which I thought could be a mistake due to Threads being released on the 6th of the month. Further analysis shows the minimum time for the 5th of the month being 2253, late evening. Comparing the size of the data for the 5th vs. the size of data for the 6th tells me that these few users or journos managed to get a hold of the app pre-release, or were simply somehow 'first-up, best-dressed'. A semi-educated assumption could be that the timestamps are in Eastern Time and the first users were somewhere in South America or middle / west U.S.A.. I am just spit-balling but this is worth keeping in mind for any readers who spotted the 5th in the data.

Statistical description for the 5th July:

Statistical description for the 6th July:

The ten most positive words from launch day include 'awesome', 'best Twitter', 'superb experience', 'excellent work' and 'perfect:

And the worst of the bunch on launch day includes 'idiot app', 'worst app of the century', 'Mark Zuckerberg made app iPhone users insane'. Hrm:

The positive polarity from the 16th where we saw a slowdown in positive reviews / user reviews in general sees 'excellent' (x2), ' perfect', 'superb' and 'best' (x2) / 'best application':

The negative polarity from the 16th is a bit brutal, so developers of a sensitive nature may want to look away now. Reviews including the word 'disgusting' (x2 ?!), and four reviews / phrases including the word 'boring'. It's looking like the initial excitement really did wear off for some:

A look at the subjectivity analysis for the 16th shows a pretty healthy mix of reviews, reviews containing the word 'beautiful', 'nice' (x4), plus a couple of user complaints about data collection and deleting the app:

I won't subject you to the complete work involved in analysing the text, but one funny thing that struck me is how the reviews containing the word 'boring' didn't start popping up in droves until around the 14th, iirc. There was the odd one or two here and there (especially on days 8 and 10), but they began to come in thick & fast after the 15th. And there were none in the negative echelons of the polarity analysis on day 1. Tough crowd.

Due to this, I will check how many instances of the word' boring' there are in the dataset:

df['review'].str.contains('boring').sum()

Versus the most occurring words in the dataset. Things still look pretty good across the board, no pun intended:

Trimming some of those chaff words off and assigning the topmost common words to the day names:

freq = ['app', 'see','im', 'use', 'like']

df['reviews_new'] = df.review.apply(lambda x: " ".join(x for x in x.split() if x not in freq))

keywords_by_day = df.groupby('day_name')['reviews_new'].apply(lambda x: pd.Series(x.str.split(expand=True).stack().value_counts().head(5))).reset_index()

As already seen in some snippets, overall it's a great outlook by day, with users mostly mentioning the app itself as well as Twitter, 'good', 'nice', and a few mentions of Instagram:

It seems the comparison to Twitter is the most prominent conversation across all days of the week, meaning things have gone to plan with the product release:

Tf-idf analysis.

vectoriser= TfidfVectorizer(max_features=1000, lowercase=True, analyzer='word', stop_words= 'english',ngram_range=(1,1)) X = vectoriser.fit_transform(df['review']) df_2 = pd.DataFrame(X[0].T.todense(), index=vectoriser.get_feature_names_out(), columns=["tf_idf"])

'Phones', 'choice', 'developers', 'meh' and 'complete' are the top five most reoccurring words in the Tf-idf corpus. 'Meh' doesn't sound fantastic but let's not forget the fact that systems like Threads have existed for a long time and some users may have been expecting more than it actually offers. 'Developers' could go either way:

Keyword extraction & keyword analysis.

stopwords = list(stopwords.words('english')) df.review = df.review.apply(lambda x: ' '.join([word for word in x.split() if word not in (stopwords)]))

df['review'] = df['review'].apply(lambda x: re.sub(r'(?<=[.,])(?=[^\s])', r' ', x)) df['review'] = df['review'].apply(lambda x: re.sub('[^a-zA-Z\s]', '', x.lower())) df['review'] = df['review'].apply(lambda x: re.sub(r'\n', '', x)) df['review'] = df['review'].apply(lambda x: re.sub(r'http\S+', '', x))

Aside from words raised by the tf-idf analysis, I was eyeballing the dataframe earlier and saw two instances of the word 'however, which could precede negative implications given the positive polarity so far. I think is a pretty good place to start.

df['however'] = df['review'].apply(lambda x: 1 if 'however' in x else 0)

df.however.value_counts()

Words such as 'annoying', 'disappointed' & 'crashes' feature here. To troubleshoot from a dev perspective this is handy information, people can't seamlessly switch accounts, the users' FB id is showing as incorrect, Threads apparently feels like an unfinished product and there is some amount of censorship according to one user:

Analysis of reviews including the word 'developer' as per the tf-idf analysis.

df['devs'] = df['review'].apply(lambda x: 1 if 'developers' in x else 0)

df.devs.value_counts()

First is a bit of a wise-ass calling the developers out for copying Twitter's ideas, something i've seen a lot of (this is kiiind of an obvious thing the end user could have anticipated on their own). The login issue rears its head again at least twice, there's a complaint about tracking info, functionality and reliability. A complaint re: a lack of a .gif library which is a bit strange from a company that bought a gif platform for a ten-figure sum or thereabouts. There are also reviews mentioning endless ads:

Backing the keyword ('meh') up with some objectivity here. One user apparently would have given four stars if there was an option for the user to remove suggested content form their feed... that isn't a huge app issue, more along the lines of personal preference. I see a complaint about the UI (objects overlapping), but also I see users saying the app works well & is a good Twitter alternative especially for the one user who was "angry at Musk's behaviour":

Returning some reviews containing the keyword 'choice'.

A couple more user preferences here once again but also a couple of reviews pertaining to the 'algorithmic timeline choices'; one user complaining about random users' posts appearing in their feed, which may or may not be an algorithmic issue. There is also something about removing freedom of choice as well as a review mirroring I saw in media reports re: Meta not allowing the user to delete their Threads account without simultaneously deleting their Insta account (which could be something to do with the freedom of choice as well).

Deep breath... Reviews containing the keyword 'complete'. (It could be positive!?):

Okay... 'completely useless, uninstalling', 'completely unable to use', 'incomplete, cannot even scroll', 'complete worst app', 'nefarious data scraper' (that user has good gut instincts), 'completely broken' due to error, and a comment from one user who can't see who they're following:

At the tail of this dataframe are some reviews that contain some more nods to the user being presented with other, random user accounts which the user doesn't particularly want to follow, including the words 'laggy', 'glitchy', plus something else about 'data mining'. There are some good reviews here from people saying it's a good Twitter copy, a good Twitter substitute, and that it's a 'complete app':

Comment polarity by source (Google Play / Apple App Store).

Positive polarity per the iPhone app:

"A great app, easy to use, clean, smooth", and again, "A better Twitter alternative", etcetera:

Negative polarity per the iPhone app:

One iPhone user isn't happy with having to create an Insta account to use Threads, one user isn't happy with the tracking, etc.:

Positive polarity per the Android app:

Good new features, great new features, 'love the app', professional addition to Instagram, clean, intuitive, easy to navigate, aesthetically pleasing and simple:

Negative polarity per the Android app:

A lot shorter and to the point. Parsing the average age by handset make and model here would be a good idea also.

A word cloud containing the most positive words in the entire review column.

Again, 'nice', 'good', 'perfect', 'feature', 'awesome', 'excellent' and 'best' are clearly visible here:

A word cloud containing the 100 most negative words in the entire review column.

And 'annoying', 'bug', 'worst app', 'stupid', 'boring', 'glitch' and 'useless' are all present here:

So, a lot of information that one can make a few precise conclusions from.

Conclusions.

Consumers are always going to flock to a new product as a result of brand loyalty as well as personal interest in the new product. Although I have seen some evidence to suggest that a few people were expecting more from this app. I experienced it myself when Facebook released their dating platform that didn't turn out to be the Match.com killer that it both could have been and which I thought it was going to be. I feel that people have a lot of faith in big tech developers to deliver a product that goes beyond the norm, which I guess is still high praise even considering the negative reviews that reflect a user's disappointment once the realisation that the app or product in question is just another 'copy', or a run-of-the-mill equivalent of a product they were initially expecting eventually dawns. Those expectations have historically been based on consumer perception of that company's ability to produce a quality product which is no bad thing; the hard-hitting reviews can be treated as a learning experience that will result in the company delivering a more polished product in future, or they can be pushed aside as nothing more than a user putting the company on a pedestal. Ascertaining which path to follow could be difficult judging only by these reviews, and, "What I would do in their situation" isn't for me to say here. When all is said & done, bugs can be fixed relatively easily, ML recommendation models can be fine-tuned to better fit the user's interests while keeping the interests of the company's partners in mind, and on a lighter note, the positive feedback will always be thing come-what-may.

My own personal boggle is: How long can a company such as this continue to suffer the same release day woes and data collection / censorship accusations before the proverbial sleeping giant awakens and begins to affect the majority of end users as opposed to the minority? The general gist of some forecasting group analyses over time appear to mirror the sentiments in these reviews with regard to certain similar apps such as FB dating, FB's gaming platform (a project canned at the final hurdle which could have been an incredible offering), and more recently the Metaverse, which I admittedly have some faith in. The shortfalls are nothing new and don't differ too wildly to other companies analysed in the forecasting groups, and looking at things from a positive angle again, the good ideas are always there, the timing of Threads' release tells me the competitive strategy is there, the developers are also world-class, but the execution does seem to miss the mark from time to time.

This project is only for fun so this is by no means a recommendation, I am not being paid by any company to wax lyrical on any corporate issues, and due to the nature of forecasting groups I can sometimes speak freely in a manner deemed as 'unprofessional' in some circles, these are purely some day off musings regarding an ongoing trend that I could never put my finger on with this company both as a user and an analyst. The technology consumer market primarily consists of people who expect to be entertained, they expect to be shown something new and exciting, and not even giving any credence to what appears to be the users' (sometimes) impossibly high standards in the present-day, product delivery sometimes tends to just.... fall a bit short. Unfortunately, as witnessed in the analysis today, this appears to be one of those occasions.

.css-15w88e5{color:var(--chakra-colors-fg-neutral-primary);font-weight:inherit;letter-spacing:-0.09px;}About this project.