Data Preprocessing Project Unit 1
After every step done to read and preprocess a dataset is the time to apply all the knowledge.
Exploratory Data Analysis: Game reviews from user on Steam
Steam is the world's most popular PC Gaming hub, with over 6,000 games and a community of millions of gamers. With a massive collection that includes everything from AAA blockbusters to small indie titles, great discovery tools are a highly valuable asset for Steam. How can we make them better? YES! Reviewing their games
First we are going to analyze the data set in order to understand it.
Here, we describe and see if there is some null variable on the data set
Data cleaning
In the review section, we can see that the only problem that this dataset has is the null values on the review column. Let's make something about it.
Let's make tidy this dataset
Requirement already satisfied: pandas-profiling==2.7.1 in /root/venv/lib/python3.7/site-packages (2.7.1)
Requirement already satisfied: ipywidgets>=7.5.1 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (7.6.5)
Requirement already satisfied: scipy>=1.4.1 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (1.7.2)
Requirement already satisfied: confuse>=1.0.0 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (1.7.0)
Requirement already satisfied: pandas!=1.0.0,!=1.0.1,!=1.0.2,>=0.25.3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (1.2.5)
Requirement already satisfied: numpy>=1.16.0 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (1.19.5)
Requirement already satisfied: joblib in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (1.1.0)
Requirement already satisfied: htmlmin>=0.1.12 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (0.1.12)
Requirement already satisfied: phik>=0.9.10 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (0.12.0)
Requirement already satisfied: missingno>=0.4.2 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (0.5.0)
Requirement already satisfied: tangled-up-in-unicode>=0.0.4 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (0.2.0)
Requirement already satisfied: requests>=2.23.0 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (2.26.0)
Requirement already satisfied: visions[type_image_path]==0.4.1 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (0.4.1)
Requirement already satisfied: matplotlib>=3.2.0 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (3.4.3)
Requirement already satisfied: tqdm>=4.43.0 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (4.62.3)
Requirement already satisfied: jinja2>=2.11.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (3.0.3)
Requirement already satisfied: astropy>=4.0 in /root/venv/lib/python3.7/site-packages (from pandas-profiling==2.7.1) (4.3.1)
Requirement already satisfied: jupyterlab-widgets>=1.0.0; python_version >= "3.6" in /root/venv/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (1.0.2)
Requirement already satisfied: ipython-genutils~=0.2.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.2.0)
Requirement already satisfied: ipykernel>=4.5.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (5.5.5)
Requirement already satisfied: traitlets>=4.3.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (4.3.3)
Requirement already satisfied: ipython>=4.0.0; python_version >= "3.3" in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (7.29.0)
Requirement already satisfied: nbformat>=4.2.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (5.1.3)
Requirement already satisfied: widgetsnbextension~=3.5.0 in /root/venv/lib/python3.7/site-packages (from ipywidgets>=7.5.1->pandas-profiling==2.7.1) (3.5.2)
Requirement already satisfied: pyyaml in /shared-libs/python3.7/py/lib/python3.7/site-packages (from confuse>=1.0.0->pandas-profiling==2.7.1) (6.0)
Requirement already satisfied: python-dateutil>=2.7.3 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from pandas!=1.0.0,!=1.0.1,!=1.0.2,>=0.25.3->pandas-profiling==2.7.1) (2.8.2)
Requirement already satisfied: pytz>=2017.3 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from pandas!=1.0.0,!=1.0.1,!=1.0.2,>=0.25.3->pandas-profiling==2.7.1) (2021.3)
Requirement already satisfied: seaborn in /shared-libs/python3.7/py/lib/python3.7/site-packages (from missingno>=0.4.2->pandas-profiling==2.7.1) (0.11.2)
Requirement already satisfied: certifi>=2017.4.17 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from requests>=2.23.0->pandas-profiling==2.7.1) (2021.10.8)
Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from requests>=2.23.0->pandas-profiling==2.7.1) (2.0.7)
Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from requests>=2.23.0->pandas-profiling==2.7.1) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from requests>=2.23.0->pandas-profiling==2.7.1) (1.26.7)
Requirement already satisfied: networkx>=2.4 in /root/venv/lib/python3.7/site-packages (from visions[type_image_path]==0.4.1->pandas-profiling==2.7.1) (2.6.3)
Requirement already satisfied: attrs>=19.3.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from visions[type_image_path]==0.4.1->pandas-profiling==2.7.1) (21.2.0)
Requirement already satisfied: imagehash; extra == "type_image_path" in /root/venv/lib/python3.7/site-packages (from visions[type_image_path]==0.4.1->pandas-profiling==2.7.1) (4.2.1)
Requirement already satisfied: Pillow; extra == "type_image_path" in /shared-libs/python3.7/py/lib/python3.7/site-packages (from visions[type_image_path]==0.4.1->pandas-profiling==2.7.1) (8.4.0)
Requirement already satisfied: cycler>=0.10 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling==2.7.1) (0.11.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /shared-libs/python3.7/py/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling==2.7.1) (1.3.2)
Requirement already satisfied: pyparsing>=2.2.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from matplotlib>=3.2.0->pandas-profiling==2.7.1) (2.4.7)
Requirement already satisfied: MarkupSafe>=2.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from jinja2>=2.11.1->pandas-profiling==2.7.1) (2.0.1)
Requirement already satisfied: importlib-metadata; python_version == "3.7" in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from astropy>=4.0->pandas-profiling==2.7.1) (4.8.2)
Requirement already satisfied: pyerfa>=1.7.3 in /root/venv/lib/python3.7/site-packages (from astropy>=4.0->pandas-profiling==2.7.1) (2.0.0.1)
Requirement already satisfied: tornado>=4.2 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipykernel>=4.5.1->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (6.1)
Requirement already satisfied: jupyter-client in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipykernel>=4.5.1->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (6.1.12)
Requirement already satisfied: six in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from traitlets>=4.3.1->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (1.16.0)
Requirement already satisfied: decorator in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from traitlets>=4.3.1->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (5.1.0)
Requirement already satisfied: pickleshare in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.7.5)
Requirement already satisfied: pygments in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (2.10.0)
Requirement already satisfied: prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (3.0.22)
Requirement already satisfied: matplotlib-inline in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.1.3)
Requirement already satisfied: setuptools>=18.5 in /root/venv/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (47.1.0)
Requirement already satisfied: jedi>=0.16 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.17.2)
Requirement already satisfied: pexpect>4.3; sys_platform != "win32" in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (4.8.0)
Requirement already satisfied: backcall in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.2.0)
Requirement already satisfied: jupyter-core in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbformat>=4.2.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (4.7.1)
Requirement already satisfied: jsonschema!=2.5.0,>=2.4 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbformat>=4.2.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (3.2.0)
Requirement already satisfied: notebook>=4.4.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (6.3.0)
Requirement already satisfied: PyWavelets in /root/venv/lib/python3.7/site-packages (from imagehash; extra == "type_image_path"->visions[type_image_path]==0.4.1->pandas-profiling==2.7.1) (1.2.0)
Requirement already satisfied: zipp>=0.5 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from importlib-metadata; python_version == "3.7"->astropy>=4.0->pandas-profiling==2.7.1) (3.6.0)
Requirement already satisfied: typing-extensions>=3.6.4; python_version < "3.8" in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from importlib-metadata; python_version == "3.7"->astropy>=4.0->pandas-profiling==2.7.1) (3.10.0.2)
Requirement already satisfied: pyzmq>=13 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from jupyter-client->ipykernel>=4.5.1->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (22.3.0)
Requirement already satisfied: wcwidth in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from prompt-toolkit!=3.0.0,!=3.0.1,<3.1.0,>=2.0.0->ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.2.5)
Requirement already satisfied: parso<0.8.0,>=0.7.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from jedi>=0.16->ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.7.1)
Requirement already satisfied: ptyprocess>=0.5 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from pexpect>4.3; sys_platform != "win32"->ipython>=4.0.0; python_version >= "3.3"->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.7.0)
Requirement already satisfied: pyrsistent>=0.14.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from jsonschema!=2.5.0,>=2.4->nbformat>=4.2.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.18.0)
Requirement already satisfied: Send2Trash>=1.5.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (1.8.0)
Requirement already satisfied: prometheus-client in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.12.0)
Requirement already satisfied: terminado>=0.8.3 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.12.1)
Requirement already satisfied: nbconvert==6.0.7 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (6.0.7)
Requirement already satisfied: argon2-cffi in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (21.1.0)
Requirement already satisfied: testpath in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.5.0)
Requirement already satisfied: pandocfilters>=1.4.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (1.5.0)
Requirement already satisfied: bleach in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (4.1.0)
Requirement already satisfied: entrypoints>=0.2.2 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.3)
Requirement already satisfied: jupyterlab-pygments in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.1.2)
Requirement already satisfied: nbclient<0.6.0,>=0.5.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.5.8)
Requirement already satisfied: mistune<2,>=0.8.1 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.8.4)
Requirement already satisfied: defusedxml in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.7.1)
Requirement already satisfied: cffi>=1.0.0 in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (1.15.0)
Requirement already satisfied: webencodings in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from bleach->nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (0.5.1)
Requirement already satisfied: packaging in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from bleach->nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (21.2)
Requirement already satisfied: nest-asyncio in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from nbclient<0.6.0,>=0.5.0->nbconvert==6.0.7->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (1.5.1)
Requirement already satisfied: pycparser in /shared-libs/python3.7/py-core/lib/python3.7/site-packages (from cffi>=1.0.0->argon2-cffi->notebook>=4.4.1->widgetsnbextension~=3.5.0->ipywidgets>=7.5.1->pandas-profiling==2.7.1) (2.21)
WARNING: You are using pip version 20.1.1; however, version 21.3.1 is available.
You should consider upgrading via the '/root/venv/bin/python -m pip install --upgrade pip' command.
IPyWidgets are not supported
IPyWidgets are not supported
IPyWidgets are not supported
Conclusions
After this multiple processes, we can conclude that the user trend to judge the games during the early access games, and even if they are released they do not change the comment (based on the commentary date and the release date of the game), making report difficult and with a lack of sufficient information in order to deliver to the developer the information. Nevertheless, is important to understand what does the community wants and try to improve the user experience of each game, and which games have the biggest problem.
Project Part II: Visualization
After eliminated the null data, is time to get some order in the data set and make understandable the information.
This data set only have 48 games of the 6k+ that the platform has. Nevertheless, it is important to see if which game has the biggest
IPyWidgets are not supported
IPyWidgets are not supported
IPyWidgets are not supported