# Project_1

# Analyzing the IMDb Movie Dataset

The objective of this project is to analyze the IMDb Movie Dataset to uncover trends in ratings, genres, and revenues, and to answer the given questions.

# TASK-1

Code Commentary : df.shape[0] returns the number of rows in the DataFrame df.
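A minimal sketch of the row count described above. The three-row DataFrame is a hypothetical stand-in for the real dataset, and the 'Title' and 'Rating' column names are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the IMDb dataset (not the real data).
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Rating": [7.1, 6.4, 8.0],
})

# df.shape is a (rows, columns) tuple, so index 0 is the row count.
n_rows = df.shape[0]
print(n_rows)
```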

# TASK-2

Code Commentary : The function df.describe() provides summary statistics for numerical columns in the DataFrame df, including count, mean, standard deviation, minimum, quartiles, and maximum values.
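As a rough illustration of what df.describe() returns, the sketch below runs it on a small made-up DataFrame. The values and the 'Title' and 'Rating' columns are assumptions. By default the text column is left out of the summary.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has many more columns.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Rating": [6.0, 7.0, 8.0],
    "Revenue_millions": [10.0, 20.0, 60.0],
})

# Summary statistics for the numerical columns only.
summary = df.describe()
print(summary)
```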

# TASK-3

Code Commentary : The sum function is used to calculate the total number of missing values in the 'Revenue_millions' column of the DataFrame df.
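One common way to count missing values, sketched on made-up data. That the original cell used isnull() before sum() is an assumption.

```python
import pandas as pd

# Hypothetical revenue values with two missing entries.
df = pd.DataFrame({
    "Revenue_millions": [12.5, float("nan"), 300.0, float("nan")],
})

# isnull() marks missing entries as True; sum() counts the True values.
missing = df["Revenue_millions"].isnull().sum()
print(missing)
```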

# TASK-4

Code Commentary : Counts the rows of the DataFrame df in which the 'Revenue_millions' value is greater than 75, by summing the Boolean result of the comparison.
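The counting step can be sketched as follows, on made-up values. Note that a missing revenue and a revenue of exactly 75 both fail the comparison.

```python
import pandas as pd

# Hypothetical revenue values, including a missing one and a boundary case.
df = pd.DataFrame({
    "Revenue_millions": [50.0, 80.0, 120.0, float("nan"), 75.0],
})

# The comparison yields a Boolean Series; summing it counts the True rows.
above_75 = (df["Revenue_millions"] > 75).sum()
print(above_75)
```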

# TASK-5

Code Commentary : Uses the count function to count the movies in the DataFrame df that have a revenue greater than 50 million and a rating less than 7.
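A sketch of combining the two conditions, assuming the rating column is named 'Rating' and the title column 'Title' (neither name is confirmed by the text). The data is made up.

```python
import pandas as pd

# Hypothetical rows; column names other than 'Revenue_millions' are assumed.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Revenue_millions": [60.0, 40.0, 90.0, 55.0],
    "Rating": [6.5, 6.0, 7.5, 6.9],
})

# Each condition needs its own parentheses because & binds tighter than > and <.
mask = (df["Revenue_millions"] > 50) & (df["Rating"] < 7)

# count() returns the number of non-missing titles among the matching rows.
n_movies = df.loc[mask, "Title"].count()
print(n_movies)
```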

# TASK-6

Code Commentary : Calculates the total revenue generated by movies released in the year 2015 in the DataFrame df.
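The filter-then-sum pattern might look like this. The data is made up, and the 'Year' column name is taken from the Task 9 commentary.

```python
import pandas as pd

# Hypothetical rows for illustration only.
df = pd.DataFrame({
    "Year": [2015, 2014, 2015, 2016],
    "Revenue_millions": [100.0, 50.0, 25.5, 10.0],
})

# Select the revenue of the 2015 rows, then add it up.
total_2015 = df.loc[df["Year"] == 2015, "Revenue_millions"].sum()
print(total_2015)
```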

# TASK-7

Code Commentary : Computes the average rating of adventure genre movies released in the year 2015 in the DataFrame df.
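A sketch of the genre-and-year filter. It assumes a 'Genre' column holding comma-separated genre names and a 'Rating' column. Both are assumptions about the dataset layout, and the rows are made up.

```python
import pandas as pd

# Hypothetical rows; 'Genre' is assumed to be a comma-separated string.
df = pd.DataFrame({
    "Genre": ["Action,Adventure", "Horror", "Adventure,Drama", "Adventure"],
    "Year": [2015, 2015, 2015, 2014],
    "Rating": [7.0, 5.0, 8.0, 9.0],
})

# str.contains matches 'Adventure' anywhere in the genre string.
is_adventure = df["Genre"].str.contains("Adventure")
avg_rating = df.loc[is_adventure & (df["Year"] == 2015), "Rating"].mean()
print(avg_rating)
```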

# TASK-8

Code Commentary : Uses the mean function to calculate the average runtime in minutes for the movies indexed from 75 to 150 in the DataFrame df.
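A sketch of the slice, assuming the runtime column is named 'Runtime_minutes' and that index 150 is meant to be included. Positional slicing with iloc excludes the stop value, so the slice ends at 151. Label slicing with loc includes both ends. The data is synthetic.

```python
import pandas as pd

# Synthetic runtimes 90, 91, ..., 289 for 200 hypothetical movies.
df = pd.DataFrame({"Runtime_minutes": range(90, 290)})

# Positions 75 through 150 inclusive (iloc excludes the stop value 151).
avg_runtime = df["Runtime_minutes"].iloc[75:151].mean()

# With the default integer index, label slicing gives the same rows.
avg_runtime_loc = df.loc[75:150, "Runtime_minutes"].mean()
print(avg_runtime, avg_runtime_loc)
```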

# TASK-9

Code Commentary : The groupby function groups the DataFrame df by the "Year" column, the "Revenue_millions" values are summed for each year, and the results are shown in descending order of total revenue.
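The group, sum, and sort steps can be sketched as follows on made-up data.

```python
import pandas as pd

# Hypothetical rows for illustration only.
df = pd.DataFrame({
    "Year": [2014, 2015, 2014, 2016],
    "Revenue_millions": [10.0, 50.0, 20.0, 40.0],
})

# Total revenue per year, largest first.
revenue_by_year = (
    df.groupby("Year")["Revenue_millions"]
    .sum()
    .sort_values(ascending=False)
)
print(revenue_by_year)
```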

# TASK-10

Code Commentary : The iloc indexer selects every 10th row from index 10 to index 59 in the DataFrame df (rows 10, 20, 30, 40, and 50), and the maximum value of the "Revenue_millions" column within this subset is then calculated.
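A sketch of the stepped slice on synthetic data. The exact slice used in the original cell is not shown, so 10:60:10 is an assumption that matches the description.

```python
import pandas as pd

# Synthetic, non-monotonic revenue values for 100 hypothetical movies.
df = pd.DataFrame({
    "Revenue_millions": [float((i * 7) % 50) for i in range(100)],
})

# start=10, stop=60 (excluded), step=10 -> positions 10, 20, 30, 40, 50.
subset = df.iloc[10:60:10]
max_revenue = subset["Revenue_millions"].max()
print(subset)
print(max_revenue)
```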

# TASK-11

Code Commentary : Uses the count function to count the movies in the DataFrame df that belong to the genres Adventure, Action, Horror, or Crime.
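One possible way to match any of several genres, assuming a comma-separated 'Genre' column and a 'Title' column (both assumed names) and made-up rows. The pattern is a regular expression in which | means "or".

```python
import pandas as pd

# Hypothetical rows; 'Genre' is assumed to be a comma-separated string.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"],
    "Genre": ["Action,Thriller", "Comedy", "Horror", "Drama,Crime", "Romance"],
})

# True for rows whose genre string mentions at least one of the four genres.
mask = df["Genre"].str.contains("Adventure|Action|Horror|Crime")
n_matching = df.loc[mask, "Title"].count()
print(n_matching)
```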

# TASK-12

Code Commentary : Calculates the mean (average) rating of movies belonging to the Horror genre in the DataFrame df.

# TASK-13

Code Commentary : This code first counts the number of movies directed by "Billy Ray" and stores it in the variable cond. It then prints this count. Next, it retrieves the years of the movies directed by "Billy Ray" and prints them.
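The two steps can be sketched as follows. The 'Director' column name is an assumption, and the rows and years are made up rather than taken from the dataset.

```python
import pandas as pd

# Hypothetical rows; the years are illustrative, not real release years.
df = pd.DataFrame({
    "Director": ["Billy Ray", "J.J. Abrams", "Billy Ray"],
    "Year": [2010, 2015, 2012],
})

# Number of movies directed by Billy Ray.
cond = (df["Director"] == "Billy Ray").sum()
print(cond)

# Release years of those movies.
years = df.loc[df["Director"] == "Billy Ray", "Year"]
print(years.tolist())
```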

# TASK-14

Code Commentary : The first line of code counts the number of movies released between 2012 and 2014, stores the count in variable x, and prints it. The code then determines the genre that appears first alphabetically among the movies released in that time frame, stores it in variable y, and prints it as the most released genre.
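The alphabetically first genre and the most frequent genre are not necessarily the same. The sketch below computes both on made-up data so the two can be compared. The 'Genre' column name and the use of between and value_counts are assumptions, not the original cell.

```python
import pandas as pd

# Hypothetical rows for illustration only.
df = pd.DataFrame({
    "Year": [2011, 2012, 2013, 2014, 2014, 2015],
    "Genre": ["Drama", "Drama", "Action", "Drama", "Comedy", "Action"],
})

# between() includes both endpoints by default.
in_range = df[df["Year"].between(2012, 2014)]
x = in_range.shape[0]

# Alphabetically first genre in the period.
first_alphabetical = in_range["Genre"].min()

# Most frequently occurring genre in the period.
most_released = in_range["Genre"].value_counts().idxmax()
print(x, first_alphabetical, most_released)
```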

# TASK-15

Code Commentary : This code prints the titles of the movie(s) with the maximum number of votes and retrieves their corresponding genres.
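A sketch of looking up the row or rows with the most votes, assuming columns named 'Title', 'Genre', and 'Votes'. The data is made up and includes a tie to show why more than one title can be returned.

```python
import pandas as pd

# Hypothetical rows; two movies tie for the highest vote count.
df = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C"],
    "Genre": ["Drama", "Action,Sci-Fi", "Comedy"],
    "Votes": [100, 500, 500],
})

# Keep every row whose vote count equals the maximum.
top = df.loc[df["Votes"] == df["Votes"].max(), ["Title", "Genre"]]
print(top)
```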

# TASK-16

Code Commentary : The code first identifies the director whose movie grossed the highest revenue, stores the director's name in variable x, and prints it. The answer is "J.J. Abrams". It then calculates the total revenue generated by the director "J.J. Abrams", stores it in variable y, and prints the total revenue.
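The two steps can be sketched on made-up data with placeholder director names. The 'Director' column name is an assumption, and the figures are not the real results.

```python
import pandas as pd

# Hypothetical rows with placeholder director names.
df = pd.DataFrame({
    "Director": ["Director A", "Director B", "Director A", "Director B"],
    "Revenue_millions": [100.0, 900.0, 300.0, 50.0],
})

# Director(s) of the single highest-grossing movie.
x = df.loc[df["Revenue_millions"] == df["Revenue_millions"].max(), "Director"]
print(x.tolist())

# Total revenue across all movies by that director (or those directors).
y = df.loc[df["Director"].isin(x), "Revenue_millions"].sum()
print(y)
```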

# TASK-17

Code Commentary : This code snippet calculates each movie's revenue as a percentage of the total revenue for its genre and year, assigns it to a new column named '%_Revenue', and then looks up and prints the percentage revenue of the movie 'Split'.
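One way to compute a share within each group is groupby with transform, sketched below. Whether the original cell used transform is an assumption, and the revenue figures here are made up, so the printed percentage is not the real answer for 'Split'.

```python
import pandas as pd

# Hypothetical rows; the revenue values are illustrative only.
df = pd.DataFrame({
    "Title": ["Split", "Other Horror Movie", "Some Comedy"],
    "Genre": ["Horror", "Horror", "Comedy"],
    "Year": [2016, 2016, 2016],
    "Revenue_millions": [30.0, 90.0, 50.0],
})

# transform("sum") returns the group total aligned to every original row.
group_total = df.groupby(["Genre", "Year"])["Revenue_millions"].transform("sum")
df["%_Revenue"] = df["Revenue_millions"] / group_total * 100

# Look up the share for one movie by title.
split_pct = df.loc[df["Title"] == "Split", "%_Revenue"].iloc[0]
print(split_pct)
```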