Homework 2: Data Cleaning Basics
In this homework, we will go over a couple of concepts we learned in Lecture 2! Everything in this homework was covered in lecture, so feel free to reference the slides, and remember that you can also Google!
Lecture slides: https://docs.google.com/presentation/d/1TIML5REJKThU4XVHe28m2-zy-NGByi-9Y8upwGGPGrs/edit#slide=id.g215ffca71a4_0_1
0  0  4.8
1  1  6.4
2  2  5.2
3  3  8.1
4  4  4.6
5  5  5.4
6  6  nan
7  7  6.5
8  8  6.9
9  9  4.2
Question 1:
 0   0  4.8
 1   1  6.4
 2   2  5.2
 7   7  6.5
 8   8  6.9
13  13  6.6
14  14  7.1
15  15  6.5
17  17  6.4
21  21  5.2
Question 2:
 0   0  4.8
 1   1  6.4
 2   2  5.2
 7   7  6.5
 8   8  6.9
13  13  6.6
14  14  7.1
15  15  6.5
17  17  6.4
21  21  5.2
Question 3:
 1   1  6.4
 2   2  5.2
 7   7  6.5
 8   8  6.9
13  13  6.6
14  14  7.1
15  15  6.5
17  17  6.4
21  21  5.2
25  25  5.1
Question 4:
# For this question, please respond with a text cell outlining the steps involved in Data Cleaning.
# Please also include steps for what you would do with outliers, missing/NA values and wrong type values.
- For outliers, investigate them first, then possibly remove them.
- For missing/NA values or wrong values, I could try various methods to replace them, such as the mean, the median, or linear regression.
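The ideas in the answer above can be sketched in pandas. This is only a minimal sketch under assumptions: the column name "score" and all of the values are made up for illustration, the 1.5 * IQR rule is one common way to flag outliers (not necessarily the one from lecture), and the median is used as the fill value, although the mean or a regression-based estimate could be swapped in.

```python
import pandas as pd

# Hypothetical data: the column name "score" and these values are illustrative only.
df = pd.DataFrame(
    {"score": ["4.8", "6.4", "5.2", "oops", None, "6.5", "60.0", "4.2"]}
)

# Wrong-type values: coerce to numeric; anything that cannot be parsed becomes NaN.
df["score"] = pd.to_numeric(df["score"], errors="coerce")

# Outliers: flag values outside 1.5 * IQR so they can be investigated first.
q1, q3 = df["score"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["score"] < q1 - 1.5 * iqr) | (df["score"] > q3 + 1.5 * iqr)
print(df[is_outlier])  # look at the flagged rows before deciding to drop them

# Here we assume the flagged rows were judged to be errors and remove them.
cleaned = df.loc[~is_outlier].copy()

# Missing/NA values: fill with the median of the remaining values.
median = cleaned["score"].median()
cleaned["score"] = cleaned["score"].fillna(median)
print(cleaned)
```

Filling after the outliers are removed keeps an extreme value from influencing the fill value. This matters more for the mean than for the median.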
CONGRATS! You've finished the coding part of your homework. For the last part of your homework, include a summary of 5 things you learned from this week's DSS article:
ChatGPT could be a great study buddy or essay reviewer.
Banning AI is not a solution because people always try to game the system.
Using AI while in school will prepare students for the real world.
Teachers can use ChatGPT to improve their materials.
There are other AIs that do the same as ChatGPT, and there will be more of them. Banning is not the solution.