# Python Practice 1 Week 10 - DataFrames

- Import libraries and DataFrame
- Create a pandas DataFrame from the file `pokeman.csv` by using the pandas command `read_csv`; print out the first 6 lines
- Print out the data types of each feature using `.dtypes`
- Print out the column (feature) names using `.columns`
- Create a pandas Series for the feature `Speed`; print out its type to check
- Create a NumPy array for the feature `Speed` from the DataFrame; print out its type to check
- Make NumPy arrays from the features `Attack` and `Defense` and do a scatter plot using matplotlib; add labels to the axes
- Create a new DataFrame from the original one by dropping the feature "Type 2"; print out the first few lines to check
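
One possible way to work through the steps above is sketched here. Because `pokeman.csv` is not reproduced in this document, the sketch reads a few made-up stand-in rows from a string; the row values and the `Name` column are assumptions, while the other column names come from the task list. In the notebook itself you would pass the file name to `read_csv` instead.

```python
import io

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in rows (NOT the real data); in the notebook use
# pd.read_csv("pokeman.csv") instead of reading from this string.
csv_text = """Name,Type 1,Type 2,Attack,Defense,Speed
A,Grass,Poison,49,49,45
B,Fire,,52,43,65
C,Water,,48,65,43
"""
df = pd.read_csv(io.StringIO(csv_text))

print(df.head(6))     # first 6 lines (fewer here, since the toy data is short)
print(df.dtypes)      # data type of each feature
print(df.columns)     # column (feature) names

# A single column selected with [] is a pandas Series.
speed_series = df["Speed"]
print(type(speed_series))

# .values returns the underlying NumPy array.
speed_array = df["Speed"].values
print(type(speed_array))

# Scatter plot of Defense against Attack with labeled axes.
attack = df["Attack"].values
defense = df["Defense"].values
fig, ax = plt.subplots()
ax.scatter(attack, defense)
ax.set_xlabel("Attack")
ax.set_ylabel("Defense")

# drop returns a new DataFrame; the original df is left unchanged.
df_no_type2 = df.drop(columns=["Type 2"])
print(df_no_type2.head())
```

Note that `drop` does not modify `df` in place by default, which is why the result is stored under a new name.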

### Data file - Pokémon data

The file `pokeman.csv` contains combat statistics for the original 151 Pokémon characters.

# Python Practice Week 10 - Visualization using DataFrames and Seaborn

- Import libraries and DataFrame
- Create a pandas DataFrame by using the pandas command `read_csv` as in the previous practice notebook; print out the first 6 lines
- Add a white grid to the background of all remaining plots using `set_style`
- Make a scatter plot using Seaborn's `relplot` of Defense statistics (y-axis) vs Attack statistics
- Repeat the previous plot but use color to indicate Type 1 using `hue =`
- Make a category plot using `catplot` of Defense statistics (y-axis) vs Type 1 (non-numerical data). If you can't read the x-labels, rotate them using `plt.xticks(rotation=-45)`
- Make a bar graph of Defense statistics for Type 1
- Make a violin plot of the Defense data for Type 1
- Redo the previous violin plot but change the palette to 'prism' using `palette =` and change the size using `height=`
- Overlaying plots: overlay the previous violin plot with the actual points. To do this, (1) increase the figure size using `plt.figure(figsize = (10,6) )`; (2) create the violin plot and set `inner = None` to get rid of the bars inside the violin plot; (3) rotate the x-axis labels for readability; (4) create a swarmplot for the points and set `color='k'` to draw the points in black; (5) add the title "Defense Data for Type 1"
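
The plotting steps above might look roughly like the following sketch. The rows are made-up stand-in values rather than the contents of `pokeman.csv`. Passing `hue="Type 1"` together with `legend=False` alongside `palette` is an assumption made here, since recent Seaborn versions warn when `palette` is given without `hue`; the task itself only mentions `palette =` and `height=`.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up stand-in rows (NOT the real data); in the notebook use
# pd.read_csv("pokeman.csv") as in the previous practice.
df = pd.DataFrame({
    "Type 1": ["Grass", "Grass", "Grass", "Fire", "Fire", "Fire",
               "Water", "Water", "Water"],
    "Attack": [49, 62, 82, 52, 64, 84, 48, 63, 83],
    "Defense": [49, 63, 83, 43, 58, 78, 65, 80, 100],
})
print(df.head(6))

# White grid background for all remaining plots.
sns.set_style("whitegrid")

# Scatter plot of Defense (y-axis) vs Attack, then colored by Type 1.
sns.relplot(data=df, x="Attack", y="Defense")
sns.relplot(data=df, x="Attack", y="Defense", hue="Type 1")

# Category plot against the non-numerical feature Type 1.
sns.catplot(data=df, x="Type 1", y="Defense")
plt.xticks(rotation=-45)  # rotate x-labels if they overlap

# Bar graph and violin plot of Defense for each Type 1.
sns.catplot(data=df, x="Type 1", y="Defense", kind="bar")
sns.catplot(data=df, x="Type 1", y="Defense", kind="violin")

# Same violin plot with the 'prism' palette and a larger figure.
# hue/legend are an assumption added to avoid a palette-without-hue warning.
sns.catplot(data=df, x="Type 1", y="Defense", kind="violin",
            hue="Type 1", palette="prism", legend=False, height=6)

# Overlay: violin plot without inner bars, plus the actual points in black.
plt.figure(figsize=(10, 6))
sns.violinplot(data=df, x="Type 1", y="Defense", inner=None)
plt.xticks(rotation=-45)
sns.swarmplot(data=df, x="Type 1", y="Defense", color="k")
plt.title("Defense Data for Type 1")
overlay_ax = plt.gca()  # both plots are drawn on this same Axes
```

The overlay uses the axes-level functions `violinplot` and `swarmplot` rather than `catplot`, because `catplot` creates a new figure each time it is called and so cannot draw on top of an existing plot.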

### Data file - Pokémon data

The file `pokeman.csv` contains combat statistics for the original 151 Pokémon characters.

# Week 10 Practice 3 - Linear Regression with scikit-learn

In this notebook you load a CSV file of medical insurance records into a DataFrame. We want to see whether the yearly insurance charges can be modeled using the BMI data alone. In other words: given a BMI, what is the predicted yearly insurance charge?

- Import libraries
- Import the Linear Regression model from scikit-learn
- Read in the data file 'insurance.csv' with pandas `read_csv()` to create a DataFrame; print out some lines
- Set the background grid for Seaborn
- Use Seaborn to create a scatter plot of charges vs BMI with the feature smoker indicated by color
- Get NumPy arrays for `BMI` and `Charges` from the DataFrame (use `df.feature_name.values` with the correct feature name)
- Use NumPy `reshape()` to make the arrays n by 1 (two-dimensional) instead of one-dimensional with n entries; `reshape` needs two arguments: the array to reshape and the size to change it to, such as `(n,1)`
- Create the model using the `LinearRegression` function in scikit-learn
- Fit the data using `.fit()`, whose arguments are the x and y data (here the "BMI" and "Charges" arrays)
- Write out the equation of the line fitting the data using `.intercept_[0]` and `.coef_[0,0]`
- Plot the data and the line using Seaborn's `regplot`
- Predict the insurance costs for a person with a BMI of 31.7; round the answer to the nearest cent; compare with the plot
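
The array and model steps above could be sketched as follows (the plotting steps are left out). The data is a made-up stand-in for `insurance.csv`, chosen to lie exactly on the line charges = 200 * bmi - 1000 so the fit is easy to check by hand. The lowercase column names `bmi` and `charges` are assumptions; use whatever names appear in your DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up stand-in rows (NOT the real data); in the notebook use
# pd.read_csv("insurance.csv"). Column names here are assumptions.
df = pd.DataFrame({
    "bmi": [20.0, 25.0, 30.0, 35.0, 40.0],
    "charges": [3000.0, 4000.0, 5000.0, 6000.0, 7000.0],
})
print(df.head())

# .values gives one-dimensional arrays of length n.
bmi = df.bmi.values
charges = df.charges.values
n = len(bmi)

# scikit-learn expects the x data as a two-dimensional (n, 1) array.
X = np.reshape(bmi, (n, 1))
y = np.reshape(charges, (n, 1))

# Create the model and fit it to the data.
model = LinearRegression()
model.fit(X, y)

# Because y is (n, 1), intercept_ has shape (1,) and coef_ has shape (1, 1).
intercept = model.intercept_[0]
slope = model.coef_[0, 0]
print(f"charges = {intercept:.2f} + {slope:.2f} * bmi")

# predict also needs a two-dimensional input, here a single row.
predicted = model.predict(np.array([[31.7]]))[0, 0]
print(round(predicted, 2))  # rounded to the nearest cent
```

The indexing `.intercept_[0]` and `.coef_[0,0]` only works because `y` was reshaped to n by 1; with a one-dimensional `y`, `intercept_` would be a plain number and `coef_` a one-dimensional array.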