Spark 1.4 for RStudio

2015-07-14 Garrett Grolemund
Today’s guest post is written by Vincent Warmerdam of GoDataDriven and is reposted with Vincent’s permission. You can learn more about how to use SparkR with RStudio at the 2015 EARL Conference in Boston, November 2-4, where Vincent will be speaking live. This document contains a tutorial on how to provision a Spark cluster with RStudio. You will need a machine that can run bash scripts and a functioning account on AWS. Read more →

SparkR preview by Vincent Warmerdam

2015-05-28 Garrett Grolemund
This is a guest post by Vincent Warmerdam of GoDataDriven. SparkR preview in RStudio. Apache Spark is the hip new technology on the block. It lets you write scripts in a functional style, and the underlying engine runs iterative tasks very quickly on a cluster of machines. It has been benchmarked as faster than Hadoop for most machine learning use cases (by a factor of 10 to 100), and Spark will soon also support the R language. Read more →