In yesterday's post, I experimented with R packages for generating Twitter word clouds. In this post, I give some hints on how to proceed. I also refer to my GitHub repository, where you can find the complete program code. As examples, I have generated Twitter clouds for all members of the IBM staff with a Twitter account, for the department, and for the university account.
Steps for generating Twitter word clouds
1. Generate Twitter API key
For authentication, you need a Twitter API key. To get one, create an application in Twitter via https://apps.twitter.com/app/new. Creating a Twitter application is free, and you don't need to know the details of programming against the Twitter API: the R package twitteR takes care of that.
There are several tutorials on how to get the Twitter API key: see, for instance, this YouTube video or read the article on R-bloggers.
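Once you have the four credentials, authentication from R might look like the following minimal sketch. It assumes the keys are stored in environment variables; the variable names (TWITTER_CONSUMER_KEY and so on) are chosen here for illustration, and read_twitter_keys() is a hypothetical helper. setup_twitter_oauth() is the authentication function of the twitteR package.

```r
# Hypothetical helper: collect the four Twitter credentials from
# environment variables (the variable names are an assumption).
read_twitter_keys <- function() {
  vars <- c(consumer_key    = "TWITTER_CONSUMER_KEY",
            consumer_secret = "TWITTER_CONSUMER_SECRET",
            access_token    = "TWITTER_ACCESS_TOKEN",
            access_secret   = "TWITTER_ACCESS_SECRET")
  keys <- Sys.getenv(vars, unset = NA)
  names(keys) <- names(vars)
  if (anyNA(keys)) {
    stop("Missing credentials: ", paste(vars[is.na(keys)], collapse = ", "))
  }
  keys
}

# Authenticate only if all keys are set and twitteR is installed.
keys <- tryCatch(read_twitter_keys(), error = function(e) NULL)
if (!is.null(keys) && requireNamespace("twitteR", quietly = TRUE)) {
  twitteR::setup_twitter_oauth(keys[["consumer_key"]],
                               keys[["consumer_secret"]],
                               keys[["access_token"]],
                               keys[["access_secret"]])
}
```

Keeping the keys out of the script this way means you can share the program without sharing your credentials.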
2. Install R and copy the R word cloud program
If you haven't installed R yet, read one of the many tutorials, for instance: How to install R and a Brief Introduction to R. I also recommend installing RStudio as THE interactive integrated development environment (IDE) for R. (You must install R first, and RStudio after that.) If you want to do more with R than just producing the word cloud, you should read the (in my opinion) best and still very gentle introductory book by Hadley Wickham: R for Data Science. It is freely available on the internet!
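With R in place, you also need the packages the program relies on. The following is a minimal sketch that checks which of them are still missing. The list contains only the three packages named in this post; the complete program may need others, so treat the list as an assumption. missing_packages() is a hypothetical helper.

```r
# Packages named in this post; the full program may need others.
needed <- c("twitteR", "wordcloud", "RColorBrewer")

# Hypothetical helper: return those packages that are not yet installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  # Only print the command instead of installing silently.
  message("Run: install.packages(c(",
          paste0('"', to_install, '"', collapse = ", "), "))")
}
```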
You have to fill in your authentication keys and the user account for the word cloud. For instance, the line with my account would be:
user = 'pbaumgartner'
3. Experiment with the different parameters
The last task before you run the program is to adapt the parameters for your word cloud.
Twitter Word clouds: Setting parameters
# experiment with different settings of the parameters
library(wordcloud)                # provides wordcloud()
if (require(RColorBrewer)) {      # using color palette from RColorBrewer
  pal <- brewer.pal(9, "Blues")   # sequential color palette
  pal <- pal[-(1:4)]              # drop the lightest shades for a one-color (shaded) appearance
  wordcloud(                      # call the essential function
    words,                        # words used by this account
    freqs,                        # frequency of every word in this account
    scale = c(4.5, .3),           # range of word sizes (largest, smallest)
    min.freq = 6,                 # high (5+) if not many different words
    max.words = 200,              # use fewer (100) if the account is new
                                  # (< 500 tweets)
    random.order = FALSE,         # most important words in the center
    random.color = FALSE,         # color shades provided by RColorBrewer;
                                  # alternatively, drop RColorBrewer and set to TRUE
    rot.per = .15,                # proportion of words rotated by 90 degrees
    colors = pal)                 # use shaded color palette from RColorBrewer
}
As you can see, this is a little bit complex, as there are many different parameters. The best and fastest way to experiment is to duplicate the program snippet above and run it as a separate program. For this to work, it is essential that the hard work (text mining and transforming the data from the Twitter account) is already done and that all the variables are still in R's memory.
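For readers who wonder where words and freqs come from, here is a minimal base-R sketch of the text-mining step. It is not the code from my repository: tweets_to_freqs() is a hypothetical helper, and the sample tweets and the stop-word list are made up for illustration.

```r
# Hypothetical helper: turn a character vector of tweets into words and
# their frequencies, sorted from most to least frequent.
tweets_to_freqs <- function(tweets, stopwords = character(0)) {
  text <- tolower(tweets)
  text <- gsub("https?://[^[:space:]]+", " ", text)  # drop links
  text <- gsub("[^[:alpha:]@#_ ]", " ", text)        # drop digits and punctuation
  tokens <- unlist(strsplit(text, "[[:space:]]+"))
  # keep words of at least two letters that are not stop words
  tokens <- tokens[nchar(tokens) > 1 & !(tokens %in% stopwords)]
  counts <- sort(table(tokens), decreasing = TRUE)
  list(words = names(counts), freqs = as.integer(counts))
}

# Made-up sample tweets
tweets <- c("Word clouds with twitteR: https://example.org/post",
            "twitteR makes word clouds easy",
            "More clouds, more fun!")
res <- tweets_to_freqs(tweets, stopwords = c("with", "more"))
words <- res$words
freqs <- res$freqs
```

As long as words and freqs stay in memory, you can rerun the wordcloud() call with new parameters as often as you like without fetching the tweets again.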
Examples of Twitter word clouds
You can see a big difference in comparison with the clouds I published yesterday. This time I have adjusted the parameters so that all word clouds have a similar size and contain more or less the same amount of information. You can see the parameters I have used on this page.
The Twitter account of Wolfgang Rauter is a very new one, so he does not have many tweets yet (21). Therefore I had to tweak the parameters. Instead of using a minimum frequency of 5 (yesterday), I had to use 1 and limit the cloud to 100 words (yesterday: 200).
Another interesting tweaking example is the timeline of @donau_uni. The word 'presseaussendung' (yesterday) is very dominant (frequency = 240) and destroys the nice appearance of the word cloud. I could have deleted this word from the list; instead, I used a larger scale for the cloud, with the effect that the huge word 'presseaussendung' could not be displayed within the predefined limits.
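The adjustment for a new account can be written down as a small rule. This is only a sketch: cloud_params() is a hypothetical helper, the threshold of 500 tweets is an assumption taken from the comment in the parameter snippet, and the values are the ones mentioned in this paragraph.

```r
# Hypothetical helper: choose min.freq and max.words from the number of
# tweets. The 500-tweet threshold is an assumption based on the comment
# in the parameter snippet.
cloud_params <- function(n_tweets, threshold = 500) {
  if (n_tweets < threshold) {
    list(min.freq = 1, max.words = 100)   # new account: keep every word
  } else {
    list(min.freq = 5, max.words = 200)   # established account
  }
}

p <- cloud_params(21)   # the account discussed here has 21 tweets
# The result can then be passed on to wordcloud(), for instance:
# wordcloud(words, freqs, min.freq = p$min.freq, max.words = p$max.words)
```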
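The other option, deleting the dominant word from the list, takes only a few lines. A minimal sketch: drop_words() is a hypothetical helper, and apart from the 240 for 'presseaussendung' the words and frequencies below are made up for illustration.

```r
# Hypothetical helper: remove unwanted words together with their frequencies.
drop_words <- function(words, freqs, unwanted) {
  keep <- !(words %in% unwanted)
  list(words = words[keep], freqs = freqs[keep])
}

# Made-up example data (only the frequency 240 comes from this post)
words <- c("presseaussendung", "krems", "weiterbildung")
freqs <- c(240, 35, 28)

res <- drop_words(words, freqs, "presseaussendung")
# res$words and res$freqs can now be handed to wordcloud()
```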
Enjoy!