Now that I had the dataset I wanted, I processed it by finding the most common words and writing them to a CSV file.
I had to do some research on how to write data to a CSV file, but other than that no issues came up.
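For anyone following along, here is a minimal sketch of what this step can look like using only the standard library. It is not the exact code from my notebook: the function names `top_words` and `write_counts_csv`, the simple regex tokenizer, the output file name, and the sample sentences are all assumptions made up for illustration.

```python
import csv
import re
from collections import Counter


def top_words(texts, n=50):
    """Return the n most common lowercase words across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Very simple tokenizer (an assumption): runs of letters and apostrophes.
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(n)


def write_counts_csv(rows, path):
    """Write (word, count) pairs to a CSV file with a header row."""
    # newline="" stops the csv module from adding blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["word", "count"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample text, standing in for the real dataset.
    sample = ["the fight the night", "the fight"]
    write_counts_csv(top_words(sample, n=50), "top_words.csv")
```

In practice you would probably also want to drop very common filler words before counting, otherwise they tend to dominate the top 50.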
After obtaining my CSV file with the top 50 most common words in my dataset, I was going to use a Python word cloud generator called word_cloud by GitHub user “Amueller”. It turned out that it only worked with Python 2, which I only found out whilst trying to install it. In the end, I settled for an online word cloud generator.
Click here to be taken to the IPython Notebook where I extracted the top 50 words from my dataset.
I had to manipulate the CSV file because the word cloud generator expects just words as input, not each word paired with its count. I used the following website as a guide on how to do this:
The resulting Excel file looked like this:
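As an aside, the same reshaping could also be done in Python instead of a spreadsheet. The sketch below is not what I actually did; it assumes a CSV with a header row followed by word and count columns, and it assumes the generator sizes each word by how often it appears in the pasted text, so each word is simply repeated `count` times. The function name `counts_to_plain_words` and the file name in the usage comment are hypothetical.

```python
import csv


def counts_to_plain_words(csv_path):
    """Read (word, count) rows and return text with each word repeated count times."""
    lines = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed rows
            word, count = row[0], int(row[1])
            lines.append(" ".join([word] * count))
    return "\n".join(lines)


# Example usage (hypothetical file name):
# print(counts_to_plain_words("top_words.csv"))
```

The output is plain text that can be pasted straight into a generator that only accepts words.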
Now I present to you my word cloud for UFC 197:
It’s been a long-ish journey, and in the next and final part of this project I will give an overview of my experience of doing the project and blogging alongside it.