
## To what extent do daily screen time and outdoor play affect disruptiveness in children aged 2-5?

### Abstract

This investigation looks for a possible association between a child's disruptiveness and several potential causes, using a multiple regression model. Screen time spent watching TV and playing games was examined alongside the child's outdoor play time, gender and age. Unexpectedly, screen time had no statistically significant effect on disruptiveness, whereas outdoor play time did. This suggests that outdoor play time contributes to a child's social skills; however, more variables should be included to extend this investigation.

### Introduction

As technology becomes more prevalent, its drawbacks also become more noticeable. Children today have vastly different daily lives from children 20 years ago. Instead of the playground, video games have become their primary source of entertainment. This report studies the ways in which screen time affects a child's social abilities, or more precisely their disruptiveness. This is worth investigating because, although it is widely accepted that relying on a screen for entertainment is not ideal, the issue still has not received as much attention as it should. We study it using children's screen time spent watching television or playing video games, their gender, their age and their time spent outdoors; each of these is plotted against disruptiveness in order to draw conclusions. The final model showed that outdoor play time was the only variable with a statistically significant influence on a child's disruptiveness, which suggests that, regardless of screen time, time spent outdoors matters most.

### Hypothesis

### Data

The dataset comes from a cross-sectional study of children aged between 2 and 5. It describes their usual daily screen time, their usual daily outdoor hours and their ASBI scores for three different social skills. The dataset contains data from 575 families. For every child, the parents filled in a survey; there were different surveys for mothers and fathers. Our response variable is the ASBI score for disruptiveness, so we disregarded the other two ASBI scores. Our explanatory variables are the age and gender of the children, as well as their screen time and outdoor hours. In this report, we take two screen time variables into account: television/DVD viewing and computer/e-game/handheld game use.

Disruptiveness is measured with an ASBI score, which is based on how often a subject shows disruptive behaviour, such as bullying or teasing. It consists of 7 items, each rated on a three-point scale (almost never, sometimes, almost always). Scores can therefore range from 7 to 21, but most values in this dataset lie between 7 and 15, as can be seen in Fig. 1.
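The 7-to-21 range follows directly from the scoring scheme. The short sketch below assumes the three answers are coded 1, 2 and 3; this coding is not stated in the dataset description and is inferred from the stated range. The function name and the example answers are made up for illustration.

```python
# Assumed coding of the three-point scale (inferred from the 7-21 range).
SCALE = {"almost never": 1, "sometimes": 2, "almost always": 3}
N_ITEMS = 7


def asbi_disruptive_score(answers):
    """Sum the coded answers of the 7 disruptiveness items."""
    if len(answers) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} answers, got {len(answers)}")
    return sum(SCALE[answer] for answer in answers)


lowest = asbi_disruptive_score(["almost never"] * N_ITEMS)
highest = asbi_disruptive_score(["almost always"] * N_ITEMS)
# Hypothetical child: four "almost never" answers and three "sometimes".
example = asbi_disruptive_score(["almost never"] * 4 + ["sometimes"] * 3)
print(lowest, highest, example)  # 7 21 10
```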

The participants in the dataset were parents of children aged 2-5 who had not yet started school, so the observational units are the children. The research was conducted in Melbourne. We take the target population to be preschool children, more specifically those in developed countries. Access to different kinds of technology may vary between parts of the world, but if there is a link between behaviour and screen time or outdoor time, it should be applicable to children in different countries.

Fig 1: Scatterplots of TV screen time, gaming screen time, outdoor hours and age against the response variable

### Results

```
OLS Regression Results
==============================================================================
Dep. Variable: disruptiveness R-squared: 0.012
Model: OLS Adj. R-squared: 0.004
Method: Least Squares F-statistic: 1.435
Date: Sun, 06 Jun 2021 Prob (F-statistic): 0.210
Time: 15:14:42 Log-Likelihood: -1168.1
No. Observations: 575 AIC: 2348.
Df Residuals: 569 BIC: 2374.
Df Model: 5
Covariance Type: nonrobust
=========================================================================================
coef std err t P>|t| [0.025 0.975]
-----------------------------------------------------------------------------------------
Intercept 10.2312 0.345 29.671 0.000 9.554 10.908
gender[T.Male] 0.0854 0.157 0.545 0.586 -0.223 0.394
average_outdoor_hours -0.1020 0.042 -2.422 0.016 -0.185 -0.019
age 0.0390 0.083 0.471 0.638 -0.124 0.201
screen_time_tv 0.0383 0.063 0.607 0.544 -0.086 0.162
screen_time_game 0.1362 0.124 1.101 0.271 -0.107 0.379
==============================================================================
Omnibus: 24.380 Durbin-Watson: 1.968
Prob(Omnibus): 0.000 Jarque-Bera (JB): 26.487
Skew: 0.523 Prob(JB): 1.77e-06
Kurtosis: 3.113 Cond. No. 25.1
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
```

After fitting a linear model to all of the explanatory variables, we obtained an adjusted R^2 of 0.004. We then used backward selection: at each step we removed the variable with the highest p-value and refitted the model, using a significance level of α = 0.05. The first variable to be removed was age, with a p-value of 0.638. Gender was removed next, with a p-value of 0.570. Then we removed screen_time_tv, with a p-value of 0.557. Finally, we removed screen_time_game, which had a p-value of 0.161. This left us with only average_outdoor_hours, which has a statistically significant p-value of 0.038. The new adjusted R^2 value was 0.006.
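The elimination procedure described above can be sketched with the statsmodels formula API. This is a minimal illustration, not the code used for this report: the helper name `backward_select` is hypothetical, and the data frame is synthetic (random values that only borrow the column names), so its numbers will not match the summaries shown here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def backward_select(df, response, candidates, alpha=0.05):
    """Drop the least significant term until every remaining p-value <= alpha."""
    remaining = list(candidates)
    while remaining:
        formula = f"{response} ~ " + " + ".join(remaining)
        fit = smf.ols(formula, data=df).fit()
        pvalues = fit.pvalues.drop("Intercept")
        worst_term = pvalues.idxmax()
        if pvalues[worst_term] <= alpha:
            return fit, remaining
        # Term names such as "gender[T.Male]" map back to the column "gender".
        # This shortcut is only adequate for numeric or two-level columns.
        remaining.remove(worst_term.split("[")[0])
    # Nothing was significant: fall back to the intercept-only model.
    return smf.ols(f"{response} ~ 1", data=df).fit(), remaining


# Synthetic stand-in data (assumption): only outdoor hours affects the response.
rng = np.random.default_rng(0)
n = 575
data = pd.DataFrame({
    "gender": rng.choice(["Female", "Male"], size=n),
    "age": rng.uniform(2, 5, size=n),
    "screen_time_tv": rng.uniform(0, 4, size=n),
    "screen_time_game": rng.uniform(0, 2, size=n),
    "average_outdoor_hours": rng.uniform(0, 8, size=n),
})
data["disruptiveness"] = (
    12 - 0.5 * data["average_outdoor_hours"] + rng.normal(0, 1, size=n)
)

final_fit, kept = backward_select(
    data,
    "disruptiveness",
    ["gender", "age", "screen_time_tv", "screen_time_game", "average_outdoor_hours"],
)
print(kept)
print(final_fit.params)
```

Refitting after every removal matters because the p-values of the remaining variables change each time a term is dropped, which is why the p-values quoted in the text differ from those in the full-model summary.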

```
OLS Regression Results
==============================================================================
Dep. Variable: disruptiveness R-squared: 0.008
Model: OLS Adj. R-squared: 0.006
Method: Least Squares F-statistic: 4.338
Date: Sun, 06 Jun 2021 Prob (F-statistic): 0.0377
Time: 15:14:13 Log-Likelihood: -1169.6
No. Observations: 575 AIC: 2343.
Df Residuals: 573 BIC: 2352.
Df Model: 1
Covariance Type: nonrobust
=========================================================================================
coef std err t P>|t| [0.025 0.975]
-----------------------------------------------------------------------------------------
Intercept 10.4952 0.146 72.110 0.000 10.209 10.781
average_outdoor_hours -0.0832 0.040 -2.083 0.038 -0.162 -0.005
==============================================================================
Omnibus: 25.576 Durbin-Watson: 1.975
Prob(Omnibus): 0.000 Jarque-Bera (JB): 27.946
Skew: 0.537 Prob(JB): 8.54e-07
Kurtosis: 3.117 Cond. No. 7.24
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
```
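The R^2 and adjusted R^2 values in both summaries can be cross-checked from the reported F-statistics and degrees of freedom, using the standard identities for an OLS model with an intercept. The function names below are illustrative; the inputs are copied from the two summaries above.

```python
def r_squared_from_f(f_stat, df_model, df_resid):
    """R^2 = F*k / (F*k + df_resid) for an OLS model with an intercept."""
    return f_stat * df_model / (f_stat * df_model + df_resid)


def adjusted_r_squared(r2, n_obs, df_model):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - df_model - 1)


N_OBS = 575
# (F-statistic, Df Model, Df Residuals) as reported in the summaries.
models = {
    "full model": (1.435, 5, 569),
    "outdoor hours only": (4.338, 1, 573),
}

results = {}
for name, (f_stat, k, df_resid) in models.items():
    r2 = r_squared_from_f(f_stat, k, df_resid)
    results[name] = (round(r2, 3), round(adjusted_r_squared(r2, N_OBS, k), 3))
    print(name, results[name])
# full model (0.012, 0.004)
# outdoor hours only (0.008, 0.006)
```

Dropping four predictors lowers R^2 from 0.012 to 0.008, yet adjusted R^2 rises from 0.004 to 0.006, because the adjustment penalises every extra predictor that explains little of the variance.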

The intercept of our model is 10.4952 and the average_outdoor_hours coefficient is -0.0832. This gives us the following linear regression model:

$$\widehat{\text{disruptiveness}} = 10.4952 - 0.0832 \times \text{average\_outdoor\_hours}$$

Fig 2: Scatter plot showing disruptiveness scores against average outdoor hours and a line showing the linear regression model.

The linear model shown in Fig. 2 reveals a correlation between the two variables, although it is not a strong one. The correlation is negative, meaning that, according to our findings, more hours spent outdoors are associated with a slightly lower level of disruptiveness.
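To make the size of this effect concrete, the fitted coefficients can be plugged into a small prediction function. The coefficients are taken from the final summary; the function name and the chosen outdoor hours are illustrative values, not observations from the dataset.

```python
INTERCEPT = 10.4952
SLOPE = -0.0832  # change in predicted score per extra daily outdoor hour


def predict_disruptiveness(outdoor_hours):
    """Predicted ASBI disruptiveness score for a given number of daily outdoor hours."""
    return INTERCEPT + SLOPE * outdoor_hours


for hours in (0, 2, 4):
    print(hours, round(predict_disruptiveness(hours), 2))
# 0 10.5
# 2 10.33
# 4 10.16

# Extra daily outdoor hours needed to lower the predicted score by one point.
hours_per_point = -1 / SLOPE
print(round(hours_per_point, 1))  # 12.0
```

Under this model, roughly 12 additional outdoor hours per day would be needed to lower the predicted score by a single point on the 7-21 scale, which illustrates how shallow the slope is.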

Here, we examine the reliability of the linear model. The histogram shows a roughly normal distribution that is skewed to the right. This is also visible in the QQ-plot, where the values deviate more from the line on the left side. Other than that, the QQ-plot does not show any extreme outliers, and most of the values are close to the line. From these plots, we judge the linear model to be reasonably reliable, although a little skewed.
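The skew seen in the plots is also quantified in the model summary, which reports the skewness, kurtosis and Jarque-Bera statistic of the residuals. As a rough check, the Jarque-Bera statistic and its p-value can be recomputed from the rounded values in the final summary. The function names are illustrative, and the small difference from the reported 27.946 comes from using rounded inputs.

```python
import math


def jarque_bera(n_obs, skew, kurtosis):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with K the raw (non-excess) kurtosis."""
    return n_obs / 6 * (skew ** 2 + (kurtosis - 3) ** 2 / 4)


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-x / 2)


# Values from the final summary: 575 observations, Skew 0.537, Kurtosis 3.117.
jb = jarque_bera(575, 0.537, 3.117)
p_value = chi2_sf_2df(jb)
print(round(jb, 2), p_value)

# Using the reported statistic directly reproduces the reported Prob(JB).
print(chi2_sf_2df(27.946))
```

With 575 observations, even a modest skew produces a very small p-value, so the test agrees with the plots that the distribution is skewed to the right rather than exactly normal.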

### Conclusion

In this study, we attempted to find out to what extent daily screen time and outdoor play affect the level of disruptiveness in children aged 2 to 5. We examined five explanatory variables: age, gender, screen time for television, screen time for game devices and average hours spent outside. In the end, only outdoor hours had a statistically significant effect, so we removed all the other variables from the model. In our final linear regression model, the slope is not very steep. Outdoor hours alone are not a reliable predictor of disruptiveness (the adjusted R^2 is only 0.006), so there are likely other factors that come into play. Nevertheless, there is a slight negative correlation: more hours outside are associated with a slightly lower level of disruptiveness, and vice versa.

Clearly, the model has its limitations. There is only a slight correlation, which means that other factors would need to be examined in order to reliably predict disruptive behaviour in children. Additionally, the dataset is limited in a few ways. Firstly, the level of disruptiveness is measured using only 7 questions, resulting in scores between 7 and 21 points. Using more questions, and therefore a wider range of scores, might give a more accurate estimate of behaviour. Secondly, the scores are somewhat subjective. Parents might believe their child to be better behaved than they really are, or they might underestimate their child's screen time. Because the data was collected from questionnaires rather than from measurements, it might not be entirely accurate.