Machine Learning with R – Book recommendation

Do you want to understand what machine learning is about and how it works? And would you like, at the same time, a crash course on R, a freely available language and environment for statistical computing and graphics that provides a wide variety of statistical and graphical techniques? Then you should definitely read the book by Brett Lantz (2015): Machine Learning with R – Second Edition – Deliver Data Insights with R and Predictive Analytics (2nd revised edition). Packt Publishing. It is not cheap, at 53 € (paperback) or 33 € (eBook), but I have not regretted the purchase.

This book has opened a new world for me! I bought it to get some understanding of machine learning. The book delivers everything its title promises.

The author gives a very gentle introduction to key issues in statistics. Even simple things like the difference between mean and median are explained.

But the book is also a crash course on R. While reading, I could experiment with the data in the R environment.
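The mean/median distinction is a good first thing to try in R. Here is a minimal sketch with made-up numbers (the `salaries` vector is hypothetical and not taken from the book). It shows how a single outlier pulls the mean upward while the median hardly moves:

```r
# Hypothetical salaries in thousands; the last value is an outlier.
salaries <- c(32, 35, 36, 38, 40, 41, 250)

mean_salary   <- mean(salaries)    # arithmetic average, sensitive to the outlier
median_salary <- median(salaries)  # middle value of the sorted data, robust

print(c(mean = mean_salary, median = median_salary))
```

With these numbers the median is 38, while the mean is roughly 67.4, a value that six of the seven people never come close to earning.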


Learning machine learning with real data

What intrigued me most was that one could follow the data analysis hands-on with real data sets! (I didn't know previously that real data sets are freely available on the internet, for instance at the UCI Machine Learning Repository.) And all this could be done without previous knowledge of R.

I have to confess that I didn't completely understand some of the statistical details in the later chapters on my first reading. But I didn't expect that my first dive into the domain of machine learning would turn me into a professional data scientist. I got some understanding of the main concepts and now know where to go for further practice and to build up my skills for analysing big data.

Excellent teaching approach to machine learning

The book is also (almost) perfect from an educational point of view. After two introductory chapters (one about general features of machine learning and one about the first steps and general syntax of R) the next seven chapters follow the same outline:

  1. Providing a general understanding of the algorithm with its strengths and weaknesses: the most important formulas are explained and their effects demonstrated with some illustrative sample data. This gives you a qualitative understanding of the method.
  2. The chapter continues with a practical demonstration in the following order:
    1. Collecting data: where to get the data set, with references and an explanation of the data's structure.
    2. Exploring and preparing the data: every R command used to load or transform the data is explained and written out as code. The data and even these commands are provided in a .zip archive on GitHub.
    3. Training the model on the data
    4. Evaluating the model performance, looking for and discussing the false positives and false negatives including their effects in the real world.
    5. Improving the performance of the model.
  3. And finally a summary with lessons learned from this chapter.
  4. Like the first two chapters, the last three chapters have a different structure: they are dedicated to strategies for evaluating and improving model performance and to some other specialised issues in machine learning.
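The practical steps in the outline above can be sketched in a few lines of base R. This is only an illustration under my own assumptions, not code from the book: the built-in `mtcars` data set stands in for a downloaded one, logistic regression via `glm()` stands in for whichever algorithm a chapter covers, and the 0.5 cut-off is arbitrary. For brevity the model is evaluated on the same data it was trained on; a real analysis would hold out separate test data.

```r
# 1. Collecting data: mtcars ships with R (stand-in for a downloaded data set).
data(mtcars)

# 2. Exploring and preparing the data.
str(mtcars[, c("am", "wt", "hp")])
summary(mtcars$wt)

# 3. Training the model: logistic regression of transmission type (am) on weight.
model <- glm(am ~ wt, data = mtcars, family = binomial)

# 4. Evaluating: cross-tabulate predictions against the actual values
#    and count false positives and false negatives.
predicted <- as.integer(predict(model, type = "response") > 0.5)
confusion <- table(actual = mtcars$am, predicted = predicted)
print(confusion)
false_positives <- sum(predicted == 1 & mtcars$am == 0)
false_negatives <- sum(predicted == 0 & mtcars$am == 1)

# 5. Improving: add a second predictor and check whether accuracy changes.
model2     <- glm(am ~ wt + hp, data = mtcars, family = binomial)
predicted2 <- as.integer(predict(model2, type = "response") > 0.5)
accuracy   <- mean(predicted  == mtcars$am)
accuracy2  <- mean(predicted2 == mtcars$am)
print(c(wt_only = accuracy, wt_and_hp = accuracy2))
```

The confusion table is where the discussion of false positives and false negatives starts: which kind of error is worse depends on what the prediction is used for in the real world.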

Some suggestions for the third edition of Machine Learning with R

Above I wrote "(almost) perfect". Only three things were missing for me:

  1. Please provide a section with exercises and solutions for the next edition! This would be very important for the transfer from understanding to applicable skills.
  2. I would like to see one application in learning analytics with a real data set from the educational domain.
  3. And last but not least, there should be a final chapter: "Where to go from here".

But all in all: One of the best tutorial books I have read!

Machine Learning with R - Second Edition: Deliver Data Insights with R and Predictive Analytics by Brett Lantz
My rating: 5 of 5 stars

By Peter Baumgartner

For more than 30 years I have been preoccupied with eLearning/blended learning and (higher education) didactics. As a university professor, this interest resulted in 13 books, just over 200 articles, and 20 supervised dissertations. Now in retirement, I am increasingly also engaged with Open Science and Data Science Education.
