Downloading list of writers from the mentioned categories and all literary awards
Downloading the individual writer's pages
Reading the individual pages for the list of writers
Writers Network (Awards)
Since writers' Wikipedia pages rarely link to other writers, we decided to build the network from other shared attributes, such as birth decade, death decade, nationality, genre and awards. We experimented with these networks and eventually settled on the awards network alone, as it made the most sense conceptually: people who win the same award are plausibly connected through shared real-world circles, and they represent the qualities that the award's juries look for year after year. We chose to connect all individuals who received the same award, on the assumption that the more famous recipients an award has, the more prestigious it is and the more relevant it is to the writers network.
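The linking rule described above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the function name `build_award_network`, the input shape (a dict mapping each writer to a set of award names) and the toy data are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_award_network(writer_awards):
    """Return {writer: set of neighbours}, linking writers who share an award.

    `writer_awards` is assumed to map a writer's name to a set of award names.
    """
    # Group writers by award.
    recipients = defaultdict(set)
    for writer, awards in writer_awards.items():
        for award in awards:
            recipients[award].add(writer)

    # Every writer is a node, even if they share no award with anyone.
    adjacency = {writer: set() for writer in writer_awards}

    # Connect every pair of recipients of the same award.
    for winners in recipients.values():
        for a, b in combinations(sorted(winners), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Toy, made-up data for illustration only.
toy = {
    "Writer A": {"Award X", "Award Y"},
    "Writer B": {"Award X"},
    "Writer C": {"Award Y"},
    "Writer D": set(),
}
network = build_award_network(toy)
n_nodes = len(network)
# Each undirected link appears in two neighbour sets, so halve the total.
n_links = sum(len(neighbours) for neighbours in network.values()) // 2
```

Because every pair of recipients is linked, an award with n recipients contributes up to n(n-1)/2 links, so awards with many recipients dominate the link count.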
* What is the number of nodes and links in the network?
Most awarded writers
The above result looks plausible, as these authors are highly revered, famous, and have won numerous awards.
Most awarded writers from different genres
Degree distribution
Power law fit
Looking at the degree distribution and the power-law fit, the writers award network appears to be a small-world network. Hubs are present, but they are not large enough to significantly shorten the distances between writers. In this respect the network is more characteristic of a random network than of a hub-dominated one. This is plausible: award-winning writers span centuries, and the body of writers and books is too large for any single writer to have a dominant effect on the network. Instead, many writers across different genres act as medium-sized hubs.
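For readers who want to reproduce this kind of check, the sketch below computes an empirical degree distribution and a rough power-law exponent. It is not necessarily the fitting procedure used in this notebook: the function names and the default cutoff `k_min=1` are assumptions, and the exponent uses the standard approximate maximum-likelihood estimator for discrete data, alpha ≈ 1 + n / Σ ln(k_i / (k_min − 0.5)).

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return {degree: fraction of nodes with that degree}."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}


def powerlaw_alpha(degrees, k_min=1):
    """Approximate MLE of the power-law exponent for degrees >= k_min.

    k_min is an assumed cutoff; the estimate can be sensitive to it.
    """
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Toy degree sequence for illustration only.
toy_degrees = [1, 1, 2, 4]
dist = degree_distribution(toy_degrees)
alpha = powerlaw_alpha(toy_degrees)
```

Plotting `dist` on log-log axes next to the fitted line is a quick visual check; a roughly straight tail supports a power law, while a sharp drop-off suggests that large hubs are missing.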
Random Network with same degree distribution
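One way to obtain a random network with exactly the same degree sequence is to rewire the original network with repeated double-edge swaps. The sketch below is an assumption about how such a null model could be built, not the notebook's own code; `degree_preserving_shuffle` and `degree_sequence` are hypothetical helper names. NetworkX users could reach for `nx.double_edge_swap` or `nx.configuration_model` instead.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Return a Counter mapping each node to its degree."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


def degree_preserving_shuffle(edges, n_swaps, seed=0):
    """Randomise an undirected simple graph while keeping every node's degree.

    Picks two edges (a, b) and (c, d) and rewires them to (a, d) and (c, b),
    skipping swaps that would create a self-loop or a duplicate edge.
    """
    rng = random.Random(seed)
    edge_list = [tuple(sorted(e)) for e in edges]
    if len(edge_list) < 2:
        return edge_list
    edge_set = set(edge_list)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:  # try both orientations of the second edge
            c, d = d, c
        if a == d or c == b:  # would create a self-loop
            continue
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new1 in edge_set or new2 in edge_set or new1 == new2:
            continue  # would create a duplicate edge
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = new1, new2
        done += 1
    return edge_list


# Toy ring graph for illustration only.
ring = [(i, (i + 1) % 6) for i in range(6)]
shuffled = degree_preserving_shuffle(ring, n_swaps=10, seed=1)
```

Comparing statistics such as average path length or clustering between the real network and the shuffled one shows which properties follow from the degree sequence alone.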
Network Visualization
In the visualization above, playwrights are shown in green, poets in blue, fantasy writers in magenta and historical novelists in gold. Even from the visual representation alone, we can see that the network has no really large hubs, but rather many medium-sized ones. The poets also sit closer to the playwrights. The historical novelists seem to be scattered across the network, but a closer look shows that they are more closely associated with the fantasy authors. This makes sense, since historical novelists also create a fictional universe, with the setting drawn from human history.
Part 2: Word-clouds
Cleaning up the authors' wiki files and storing them in a new folder, "cleanFiles". Comment out the code in the cell below if you don't want to run it; in that case, just make sure that the "cleanText" folder is in the notebook's current directory.
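As a rough idea of what such a cleaning step can look like, the sketch below strips common wiki markup with regular expressions. It is an illustrative assumption, not the cell's actual code: the function names are hypothetical, the `*.txt` file pattern is a guess, the folders are passed in as parameters, and nested templates or tables are not handled.

```python
import re
from pathlib import Path


def clean_wikitext(raw):
    """Strip a few common wiki markup constructs and collapse whitespace."""
    text = re.sub(r"<ref[^>]*/>", " ", raw)  # self-closing references
    text = re.sub(r"<ref[^>]*>.*?</ref>", " ", text, flags=re.DOTALL)
    text = re.sub(r"\{\{[^{}]*\}\}", " ", text)  # non-nested templates
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]*)\]\]", r"\1", text)
    text = re.sub(r"={2,}", " ", text)  # heading markers
    text = re.sub(r"'{2,}", "", text)  # bold and italic quotes
    return re.sub(r"\s+", " ", text).strip()


def clean_folder(src_dir, out_dir):
    """Clean every .txt file in src_dir and write the result to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob("*.txt"):
        cleaned = clean_wikitext(path.read_text(encoding="utf-8"))
        (out / path.name).write_text(cleaned, encoding="utf-8")


# Made-up snippet for illustration only.
sample = "'''Jane Doe''' is a [[poet]] and [[novel|novelist]].<ref>src</ref> {{cite|x}}"
cleaned_sample = clean_wikitext(sample)
```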
A key observation from the word clouds of all the genres is how strongly they are dominated by the most iconic authors in each category. For playwrights, it's Harold Pinter, Agatha Christie and Anton Chekhov. For poets, it's interestingly Kendrick Lamar and Bob Dylan, who are singer-songwriters. For fantasy, it's JK Rowling and Roald Dahl. For historical novelists, it is Rudyard Kipling, Gore Vidal, Salman Rushdie, etc. This suggests that writers in these genres are heavily influenced by the most popular writers in their genre, and have probably read many of their books. From these word clouds, you could also conclude that to become a great writer in a particular genre, you first have to read the other authors in that genre, and only then can you create a new masterpiece of your own.
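To see why author names can dominate a genre's cloud, it helps to look at how word weights might be computed. The sketch below uses a simple TF-IDF weighting across genres; this weighting scheme, the tokenizer and the function name `tf_idf_by_genre` are assumptions for illustration, and the notebook may weight words differently.

```python
import math
import re
from collections import Counter


def tf_idf_by_genre(genre_texts):
    """Return {genre: {word: weight}} with weight = tf * ln(N / df).

    `genre_texts` is assumed to map a genre name to its concatenated text.
    Words that occur in every genre get weight 0 and drop out of the cloud.
    """
    tf = {
        genre: Counter(re.findall(r"[a-z']+", text.lower()))
        for genre, text in genre_texts.items()
    }
    # Document frequency: in how many genres does each word occur?
    df = Counter(word for counts in tf.values() for word in counts)
    n = len(genre_texts)
    return {
        genre: {word: count * math.log(n / df[word]) for word, count in counts.items()}
        for genre, counts in tf.items()
    }


# Made-up texts for illustration only.
toy_texts = {
    "poets": "verse verse writer",
    "playwrights": "stage writer",
}
weights = tf_idf_by_genre(toy_texts)
top_poet_word = max(weights["poets"], key=weights["poets"].get)
```

The resulting per-genre dictionaries can be passed to the `wordcloud` package's `WordCloud.generate_from_frequencies` after dropping zero-weight words. Names that are frequent in one genre and rare elsewhere receive the highest weights, which matches the pattern described above.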