Profitable App Profiles for the App Store and Google Play Markets
Our aim in this project is to find mobile app profiles that are profitable for the App Store and Google Play markets. We're working as data analysts for a company that builds Android and iOS mobile apps, and our job is to enable our team of developers to make data-driven decisions about the kinds of apps they build.
At our company, we only build apps that are free to download and install, and our main source of revenue is in-app ads. This means that our revenue for any given app depends mostly on the number of people who use it. Our goal for this project is to analyze data to help our developers understand what kinds of apps are likely to attract more users.
Opening and Exploring the Data
As of September 2018, there were approximately 2 million iOS apps available on the App Store, and 2.1 million Android apps on Google Play.
Deleting Wrong Data
The Google Play data set has a [dedicated discussion](https://www.kaggle.com/lava18/google-play-store-apps/discussion) section, and we can see that [one of the discussions](https://www.kaggle.com/lava18/google-play-store-apps/discussion/66015) outlines an error for row 10472. Let's print this row and compare it against the header and another row that is correct.
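As a minimal sketch of this check, we can use a tiny made-up data set instead of the real file. The rows, the column names, and the position of the faulty row are all hypothetical, and we assume the error shows up as a missing column, which makes the row shorter than the header.

```python
# Hypothetical miniature data set: a header plus three rows.
header = ["App", "Category", "Rating"]
rows = [
    ["Photo Editor", "ART_AND_DESIGN", "4.1"],
    ["Broken Entry", "19"],  # assumed error: the category value is missing
    ["Sketch Pad", "ART_AND_DESIGN", "4.5"],
]

bad_index = 1  # hypothetical position of the faulty row

print(header)
print(rows[bad_index])  # the faulty row
print(rows[0])          # a correct row for comparison

# Delete the row only if it really is malformed, so that running
# this cell twice does not remove a valid row by mistake.
if len(rows[bad_index]) != len(header):
    del rows[bad_index]

print(len(rows))
```

Guarding the `del` statement with a length check is a design choice: in a notebook, cells are often re-run, and an unguarded delete would remove a different, valid row on the second run.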
Removing Duplicate Entries
Part One
If we explore the Google Play data set long enough, we'll find that some apps have more than one entry. For instance, the application Instagram has four entries:
In total, there are 1,181 cases where an app occurs more than once:
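One way to arrive at such a count is sketched below on a short, hypothetical list of app names. We assume that each occurrence of a name after the first counts as one duplicate case; the names and the variables `unique_apps` and `duplicate_apps` are illustrative only.

```python
# Hypothetical app names; in the real data set the name would be
# one column of each row.
app_names = ["Instagram", "Maps", "Instagram", "Chess", "Maps", "Instagram"]

seen = set()
unique_apps = []
duplicate_apps = []

for name in app_names:
    if name in seen:
        duplicate_apps.append(name)  # second and later occurrences
    else:
        seen.add(name)
        unique_apps.append(name)

print("Duplicate entries:", len(duplicate_apps))
print("Unique apps:", len(unique_apps))
```

Using a set for the membership test keeps each lookup fast even when the data set has thousands of rows.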
Part Two
Let's start by building the dictionary.
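The text here does not spell out what the dictionary holds, so the following is only one plausible design, shown as an assumption: the dictionary maps each app name to the highest review count seen for it, and a second pass keeps a single row per app. The rows, the review counts, and the names `reviews_max` and `clean_rows` are hypothetical.

```python
# Hypothetical rows of (app name, number of reviews).
rows = [
    ("Instagram", 100),
    ("Instagram", 250),
    ("Maps", 40),
    ("Instagram", 90),
]

# Assumed rule: for each app, remember the highest review count.
reviews_max = {}
for name, n_reviews in rows:
    if name not in reviews_max or n_reviews > reviews_max[name]:
        reviews_max[name] = n_reviews

# Second pass: keep one row per app, the one matching the recorded maximum.
clean_rows = []
already_added = set()
for name, n_reviews in rows:
    if n_reviews == reviews_max[name] and name not in already_added:
        clean_rows.append((name, n_reviews))
        already_added.add(name)

print(clean_rows)
```

The `already_added` set matters when two entries for the same app share the same maximum review count; without it, both would be kept.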
Removing Non-English Apps
Part One
Part Two
To minimize the impact of data loss, we'll only remove an app if its name has more than three non-ASCII characters:
Isolating the Free Apps
We're left with 8,864 Android apps and 3,222 iOS apps, which should be enough for our analysis.
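The filtering step itself can be sketched as follows, on hypothetical rows. The price formats used here (`"0"`, `"0.0"`, and a leading `$` for paid apps) are assumptions about how prices might be recorded, and `is_free` is an illustrative helper, not a function from the data sets.

```python
# Hypothetical rows of (app name, price string).
android_rows = [("Chess", "0"), ("Pro Camera", "$4.99"), ("Notes", "0")]
ios_rows = [("Puzzle", "0.0"), ("Planner", "2.99")]

def is_free(price):
    """Treat a price as free when it parses to zero after dropping a leading '$'."""
    return float(price.lstrip("$")) == 0.0

android_free = [row for row in android_rows if is_free(row[1])]
ios_free = [row for row in ios_rows if is_free(row[1])]

print(len(android_free), len(ios_free))
```

Parsing the price as a number, rather than comparing strings, means that `"0"` and `"0.0"` are both recognized as free even if the two stores format prices differently.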