Big data

sparklyr 0.9

2018-10-01 Javier Luraschi
Today we are excited to share that a new release of sparklyr is available on CRAN! This 0.9 release enables you to create Spark structured streams to process real-time data from many data sources using dplyr, SQL, pipelines, and arbitrary R code; monitor connection progress with upcoming RStudio Preview 1.2 features and support for properly interrupting Spark jobs from R; and use Kubernetes clusters with sparklyr to simplify deployment and maintenance. Read more →

See RStudio + sparklyr for big data at Strata + Hadoop World

2017-02-13 Roger Oberg
If big data is your thing, you use R, and you’re headed to Strata + Hadoop World in San Jose on March 13 and 14, you can experience in person how easy and practical it is to analyze big data with R and Spark. In a beginner-level talk by RStudio’s Edgar Ruiz and an intermediate-level workshop by Win-Vector’s John Mount, we cover the spectrum: what R is, what Spark is, how sparklyr works, and what is required to set up and tune a Spark cluster. Read more →

SparkR preview by Vincent Warmerdam

2015-05-28 Garrett Grolemund
This is a guest post by Vincent Warmerdam of koaning.io. SparkR preview in RStudio: Apache Spark is the hip new technology on the block. It allows you to write scripts in a functional style, and the technology behind it lets you run iterative tasks very quickly on a cluster of machines. It’s benchmarked to be quicker than Hadoop for most machine learning use cases (by a factor of 10 to 100), and soon Spark will also have support for the R language. Read more →