QAM-II Project Report
Hollywood Rules
Importing Libraries
Loading and Pre-processing the Data
Set the Stage
Question. 1
To obtain an initial overview of the data, calculate the minimum, average, and maximum values of the following variables: opening gross, total U.S. gross, total non-U.S. gross, and opening theatres. How many of the movies in the data set are comedies and how many movies are R-rated?
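A minimal sketch of the overview statistics, assuming each variable is available as a plain list of numbers and that genre and rating are stored as strings. The helper names and the sample values are hypothetical and are not taken from the data set.

```python
import statistics

def summarize(values):
    """Return the minimum, mean, and maximum of a numeric column."""
    return {
        "min": min(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }

def count_matching(labels, target):
    """Count rows whose label equals the target (e.g. rating == 'R')."""
    return sum(1 for label in labels if label == target)

# Hypothetical values for illustration only.
opening_gross = [12_000_000, 30_000_000, 45_000_000]
ratings = ["R", "PG-13", "R"]

overview = summarize(opening_gross)
r_rated = count_matching(ratings, "R")
print(overview, r_rated)
```

The same two helpers would be applied to each of the four variables and to the genre column.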
Question. 2
Michael London (of Sideways fame) declared in The Hollywood Reporter, “The studio business historically returns around 12 percent a year.” Griffith knew any investor would want justification for such a statement.
a. Calculate the U.S. return on investment (ROI) (simply defined as the difference of total U.S. box-office gross and budget divided by budget, ignoring any form of discounting) for each movie in the data set.
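The ROI definition above can be sketched as a small function. The function name and the example figures are hypothetical; only the formula comes from the question.

```python
def us_roi(total_us_gross: float, budget: float) -> float:
    """Undiscounted U.S. ROI: (total U.S. gross - budget) / budget."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return (total_us_gross - budget) / budget

# Hypothetical movie: $100M budget, $150M total U.S. gross.
example_roi = us_roi(total_us_gross=150_000_000, budget=100_000_000)
print(example_roi)  # 0.5, i.e. a 50 percent return
```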
b. Provide a 95 percent confidence interval for the mean U.S. ROI of movies.
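One way to sketch the interval, assuming the ROIs are held in a list. This version uses the normal critical value (about 1.96) as a large-sample approximation because the standard library has no t distribution; with a small sample, a t critical value with n - 1 degrees of freedom would be the usual choice. The sample ROIs are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Large-sample confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(n)  # sample std dev / sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

# Hypothetical ROIs for illustration only.
low, high = mean_confidence_interval([0.1, 0.2, 0.3, 0.4, 0.5])
print(low, high)
```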
c. Show that the mean U.S. ROI is significantly larger than the 12 percent London cited.
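The claim can be framed as a one-sided test of H0: mean ROI <= 0.12 against H1: mean ROI > 0.12. The sketch below is an illustration under assumptions, not the report's own computation: it approximates the p-value with the normal distribution (reasonable for large samples), and the ROIs are hypothetical.

```python
import math
import statistics

def one_sided_mean_test(values, benchmark):
    """Test H0: mean <= benchmark vs H1: mean > benchmark.

    Returns the test statistic and an upper-tail p-value based on the
    normal approximation.
    """
    n = len(values)
    std_err = statistics.stdev(values) / math.sqrt(n)
    t_stat = (statistics.mean(values) - benchmark) / std_err
    p_value = 1.0 - statistics.NormalDist().cdf(t_stat)
    return t_stat, p_value

# Hypothetical ROIs for illustration only.
t_stat, p_value = one_sided_mean_test([0.30, 0.45, 0.25, 0.50, 0.40], 0.12)
print(t_stat, p_value)
```

A small p-value rejects H0 in favor of a mean ROI above 12 percent.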
Before Production
Question. 3
While any genre can produce a blockbuster, Griffith suspected that some categories are more likely to do so than others. If he could stack the deck in his favor through storyline selection, he did not want to pass up the opportunity.
a. Compare the total U.S. box-office gross of movies from the comedy genre with movies from other genres. Is there a statistically significant difference between the total U.S. gross of comedies and non-comedy movies?
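A comparison of two group means can be sketched with Welch's unequal-variance t statistic. Whether the report used Welch's test or a pooled-variance test is not stated, so this choice is an assumption, and the sample values are hypothetical. The p-value would then come from a t distribution with the returned degrees of freedom.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    se_sq = va / na + vb / nb  # squared standard error of the mean difference
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_sq)
    df = se_sq ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t_stat, df

# Hypothetical gross figures (in $ millions) for illustration only.
comedies = [1, 2, 3, 4]
non_comedies = [2, 3, 4, 5]
t_stat, df = welch_t(comedies, non_comedies)
print(t_stat, df)
```

The same function applies to the ROI comparison in part (b) and to the R-rated comparison in Question 4.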
b. Griffith was not so sure about the results, because they were contrary to his gut feelings. Maybe higher revenue accompanied higher investments? Calculate additionally the difference of U.S. ROIs from movies of the comedy genre and of other movie genres. Is there a statistically significant difference between the U.S. ROIs?
ROI p-value: 0.0474 (significant at the 5 percent level). In this sample, comedies have a significantly different (higher) mean U.S. ROI than non-comedies.
Question. 4
Prevailing wisdom maintained that R-rated movies performed better than other movies.
a. Is there a statistically significant difference between the total U.S. gross of R-rated movies and movies with other ratings?
P-value: 0.3979. There is no statistically significant difference between the total U.S. gross of R-rated movies and that of movies with other ratings.
Question. 5
Believed to be among the preproduction factors driving success were budget (which expresses both the cost of the film and the quality of the actors as expressed by their fee), genre (comedy vs. non-comedy), MPAA rating (R-rated vs. other rating), and audiences’ familiarity with the story (whether the film is a sequel or an adaptation of a known story).
a. Based on the described beliefs, determine a sound regression model predicting total U.S. box-office gross of movies prior to production.
b. Drop all variables from the regression that are not significant at a 10 percent level of significance. Report the final regression.
c. Holding all other explanatory variables in your regression fixed, which movies have higher total U.S. gross, those that are a sequel or those that are not?
Refined Model: Budget and Sequel are significant.
Sequel Effect: Holding budget constant, a sequel adds approximately $30,500,000 to the Total U.S. Gross.
Before Opening Weekend
Question. 6
Griffith knew the age-old Hollywood wisdom that the opening weekend is absolutely critical for the overall commercial success of a movie. Therefore, both the release date (whether during the summer, on a U.S. holiday, or around Christmas) and the number of movie theatres in which a movie is shown during opening weekend are assumed to be very important. These factors are believed to strongly influence revenue during the opening weekend and, thereby, to have a strong impact on the overall commercial success of a movie.
a. Determine a sound regression model predicting opening weekend box-office gross revenue. Consider both the preproduction success factors as well as the factors describing the opening weekend.
b. Drop all variables from the regression that are not significant at a 10 percent level of significance. Report the final regression.
c. Carefully interpret the slope coefficient of each variable in the regression.
d. Suppose the number of movie theatres showing a movie on the opening weekend increases by one hundred. Provide a point estimate and a 95 percent confidence interval for the expected change in the opening weekend box-office revenue.
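Because the expected change is linear in the coefficient, both the point estimate and the interval bounds for the theatre coefficient scale by the size of the change (here 100). A minimal sketch follows; the coefficient, standard error, and critical value are hypothetical placeholders, not values from the fitted regression.

```python
def scaled_effect_interval(coef, std_err, critical_value, delta):
    """Point estimate and confidence interval for delta * coefficient."""
    point = delta * coef
    margin = abs(delta) * critical_value * std_err
    return point, point - margin, point + margin

# Hypothetical inputs: $5,000 per extra theatre, standard error $500,
# and a critical value of 1.96 (a t critical value would be used in practice).
point, lower, upper = scaled_effect_interval(
    coef=5_000, std_err=500, critical_value=1.96, delta=100
)
print(point, lower, upper)
```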
Question. 7
Griffith also knew the even stronger version of that age-old Hollywood wisdom which stated that 25 percent of a movie’s U.S. box-office gross revenue came in during the opening weekend. All this conventional wisdom made him curious to examine the relationship between total U.S. box-office gross and opening weekend box-office gross.
a. Run a simple linear regression predicting total U.S. box-office gross from opening weekend box-office gross.
b. If the stronger version of that age-old wisdom were true, that is, if indeed 25 percent of a movie’s U.S. box-office gross revenue came in during the opening weekend, what would the value of the slope coefficient in the linear regression model have to be?
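The implied slope follows from rearranging the 25 percent rule. Writing Opening for the opening weekend gross and Total for the total U.S. gross (symbols chosen here for illustration):

```latex
\text{Opening} = 0.25 \times \text{Total}
\;\Longrightarrow\;
\text{Total} = \frac{1}{0.25}\,\text{Opening} = 4 \times \text{Opening}
```

Under the rule, the regression of total gross on opening gross would therefore have a slope of 4 and an intercept of 0.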
c. Can the age-old wisdom be rejected based on the simple linear regression?
Since the p-value ($0.0001$) is less than the significance level of 0.05, we reject the age-old wisdom. The data suggests the multiplier is significantly different from 4 (specifically, it is lower, around 3.12).
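The test behind this conclusion compares the estimated slope with the hypothesized value of 4. A minimal sketch, using the normal distribution as a large-sample approximation of the t distribution; the estimate and standard error in the example are hypothetical and are not the report's regression output.

```python
import statistics

def slope_test(estimate, std_err, hypothesized):
    """Two-sided test of H0: slope == hypothesized (normal approximation)."""
    t_stat = (estimate - hypothesized) / std_err
    p_value = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value

# Hypothetical slope of 3.5 with standard error 0.25, tested against 4.
t_stat, p_value = slope_test(estimate=3.5, std_err=0.25, hypothesized=4.0)
print(t_stat, p_value)
```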
d. Critique the statistical analysis in part (c).
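Answer: The analysis in part (c) uses a model with an intercept term. However, the intercept is not statistically significant (p-value = 0.260, which is > 0.05). The age-old wisdom implies a line through the origin (total gross equals four times opening gross), so the slope test in part (c) is conditional on an intercept that the data do not support.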
e. Determine a sound regression model predicting total U.S. box-office gross from opening weekend box-office gross.
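If the intercept is dropped, as the critique in part (d) suggests, the least-squares slope for a line through the origin is the sum of x times y divided by the sum of x squared. The sketch below assumes that this no-intercept model is the intended one, which the report does not state explicitly; the data points are hypothetical.

```python
def slope_through_origin(x, y):
    """Least-squares slope for the model y = slope * x (no intercept)."""
    sum_xx = sum(xi * xi for xi in x)
    if sum_xx == 0:
        raise ValueError("x must contain at least one non-zero value")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum_xx

# Hypothetical opening and total gross figures (in $ millions) that
# follow the 25 percent rule exactly.
opening = [1, 2, 3]
total = [4, 8, 12]
print(slope_through_origin(opening, total))  # 4.0
```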
f. Examine the validity of the age-old wisdom using the new regression.
g. What proportion of the variation in total U.S. box-office gross revenue can be explained by variation in the opening weekend box-office gross revenue?
91.7% of the variation in total U.S. box-office gross revenue can be explained by variation in the opening weekend box-office gross revenue.
After Opening Weekend
Question. 8
Investors often wonder just how much influence press reviews have on box-office admissions. If Meyer turned out to be a Flags of Our Fathers fan who blamed the failure of his favorite film on the evil critics, how would Griffith respond?
a. Determine a sound regression model predicting total U.S. box-office gross revenue. Consider all factors known after the opening weekend, including those known before production and those known only before opening weekend, as well as the opening box-office gross and the critics’ opinion score.
b. Drop all variables from the regression that are not significant at a 10 percent level of significance. Report the final regression.
c. Consider a movie with the characteristics of Flags of Our Fathers. Using the regression from part (b), provide a point estimate and a 95 percent prediction interval for the total U.S. gross revenue of a movie with such characteristics.
d. Advise Griffith on how much he should be willing to invest in order to influence the critics to gain an extra ten points in the opinion score of a movie with the characteristics of Flags of Our Fathers, thereby earning such a film a score of eighty-nine points instead of seventy-nine points.
Advice: Griffith should be willing to invest up to $5.9 million in marketing or production improvements if he is confident it will raise the critics' score by 10 points (e.g., from 79 to 89).
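The arithmetic behind this advice is the critics' score coefficient multiplied by the change in score. In the sketch below, the per-point coefficient is backed out from the $5.9 million figure stated above and is not read from the regression output; the function names are hypothetical.

```python
def implied_coefficient(total_effect, points):
    """Per-point effect implied by a total effect over a score change."""
    return total_effect / points

def value_of_score_gain(coef_per_point, points):
    """Expected change in total U.S. gross for a given score change."""
    return coef_per_point * points

per_point = implied_coefficient(5_900_000, 10)
ceiling = value_of_score_gain(per_point, 10)
print(per_point, ceiling)  # 590000.0 5900000.0
```

This treats the point estimate as certain; a more cautious ceiling would use the lower bound of the coefficient's confidence interval.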
Question. 9
Griffith surmised that poor reviews affected the total U.S. box-office gross of comedies less strongly than the total U.S. gross of movies from other genres. In particular, he theorized that the critics’ opinion score had a significantly smaller influence on total U.S. box-office gross for comedies than for non-comedies.
a. Modify your regression from Question 8 to examine Griffith’s claim. Can you prove his theory?
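Critics' opinion score coefficient for comedies: $630,080.38 per point
Critics' opinion score coefficient for non-comedies: $743,938.78 per point
Although the estimated effect of the critics' opinion score is larger for non-comedies than for comedies, the difference is not statistically significant (p-value = 0.496). The data therefore do not support Griffith's theory.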
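One common way to obtain two slopes like these is to add an interaction between a comedy dummy and the critics' opinion score, so that the comedy slope equals the non-comedy slope plus the interaction coefficient. Assuming the regression was set up that way (the report does not say so explicitly), the interaction coefficient can be recovered from the two reported slopes, and the p-value of 0.496 would be the test of that coefficient against zero.

```python
def interaction_from_slopes(slope_comedy, slope_non_comedy):
    """Interaction coefficient implied by the two group-specific slopes."""
    return slope_comedy - slope_non_comedy

# Slopes as reported above, in dollars per opinion-score point.
interaction = interaction_from_slopes(630_080.38, 743_938.78)
print(round(interaction, 2))  # -113858.4
```

A negative interaction points in the direction Griffith expected, but with a p-value this large it cannot be distinguished from zero.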
Question. 10
Standard paychecks for A-list stars such as George Clooney or Brad Pitt are routinely on the order of $15 million or more. Producers hope that famous faces in a film will guarantee packed movie theatres. Griffith concluded that it is not really a large budget per se that has a strong positive effect on total U.S. box-office gross revenue. Instead the number of star movie actors in a film drives up total U.S. gross. He regretted that he did not have any data on the number of movie stars in his data set to examine his claim.
a. Consider a variable called “star power” that reports the number of A-list stars for a movie. If you had data for this variable and added it to your regression from Question 9, what would have to be true for the slope coefficient of star power and how would the slope coefficient of the budget variable have to change for Griffith’s conclusion to be correct?
Analysis: Budget is presumably positively correlated with Star Power (stars are expensive), so in the current regression Budget partly acts as a proxy for it. If Star Power is added:
Star Power coefficient: Would be positive and significant.
Budget coefficient: Would decrease, as the variance in revenue previously explained by "Budget" would now be captured by "Star Power".
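The reasoning above is the standard omitted-variable argument. In the simplest case, with Budget as the only other regressor and symbols chosen here for illustration, it can be written as:

```latex
\begin{aligned}
\text{Gross} &= \beta_0 + \beta_B\,\text{Budget} + \beta_S\,\text{StarPower} + \varepsilon \\
\text{StarPower} &= \delta_0 + \delta_1\,\text{Budget} + u \\
\mathbb{E}\big[\hat{b}_B\big] &= \beta_B + \beta_S\,\delta_1
\quad \text{(budget slope when StarPower is omitted)}
\end{aligned}
```

With a positive Star Power effect and a positive link between Star Power and Budget, the budget slope in the current regression is inflated by the product of the two. For Griffith's conclusion to hold, the Star Power slope would have to be positive and significant, and the budget slope would have to shrink toward zero and lose significance once Star Power enters the model.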