Analysis 4 Instructions
Degree Distributions and Preferential Attachment Models
Final version due in Laulima Saturday March 5th
75 points
In these problems we analyze natural networks by plotting degree distributions, fitting power laws to those distributions, and modeling with preferential attachment models. The objective is to identify real-world processes and constraints that may have generated the networks. We use the visualization and metrics techniques from earlier assignments to assist in the interpretation. The assignment builds on previous demonstrations and uses utility functions we have defined.
Template
You will use an .Rmd template I provide to construct your response. This template also has further guidance on response format, but does not have the detailed discussion below, so you should follow these instructions while working with the template.
Utilities
The template loads utility functions that you should include in your assignment folder. You may also write your own utility functions for plotting with legends and making metrics tables, but you are not required to do so. If you do, it is your choice whether to place them in the Utility folder or in-line in the .Rmd. I would put a definition in the .Rmd if I wanted the reader to be able to inspect it easily, and in the Utility folder otherwise.
A Caution Concerning Sampling Random Models
You will be sampling from random models in this assignment. Ideally we would take many samples and average the results before making claims about what the models predict. We aren’t doing this because students have varying programming backgrounds, and it would take a long time to run and compute metrics on (say) 100 samples of each model. If you take just one sample, it is likely typical, but it could be atypical. To avoid drawing unwarranted conclusions from an atypical sample, I suggest that you run the model a few times to get a feel for what it typically produces. Then, when writing up your results, do not hard-code the metrics in your text: use R variables to embed the actual values. This is because the numbers may change when Knit re-runs all the code to generate the html. So, for example, rather than saying “the average degree is 3.78”, assign the actual value to a variable in a code chunk:
```{r}
avgdeg <- mean(degree(g))
…
```
and embed in your prose a reference to the variable, like: “the average degree is `r avgdeg`”.
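The suggestion above to run a model a few times can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses igraph's `sample_pa()` as a stand-in for whichever model you are sampling, and the parameter values and the helper name `sample_metrics` are hypothetical, not part of the template.

```r
library(igraph)

set.seed(42)  # only so this sketch is repeatable; remove it to see fresh samples

# Assumed parameters, chosen for illustration only.
n_vertices <- 500
edges_per_step <- 2
n_runs <- 5

# Draw one graph from the model and return a few summary metrics.
# (sample_metrics is a hypothetical helper, not one of the course utilities.)
sample_metrics <- function() {
  g <- sample_pa(n_vertices, m = edges_per_step, directed = FALSE)
  c(max_degree    = max(degree(g)),
    transitivity  = transitivity(g),
    mean_distance = mean_distance(g))
}

# One row per run, one column per metric.
runs <- t(replicate(n_runs, sample_metrics()))
print(round(runs, 3))

# Smallest and largest value of each metric across the runs.
print(apply(runs, 2, range))
```

If a metric barely moves across runs, a single sample is probably safe to describe. If it varies a lot, say so in your write-up rather than treating one value as what the model predicts.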
Part A. Analyzing and Interpreting the Structure of a Real World Network (30 pts)
In this section, we use our analytic toolkit to characterize a natural network. This section revisits methods we learned in all three of the previous analysis assignments (visualization, metrics, and random models), but also adds interpretation of degree distribution plots and power law fits.