big data
sparklyr 1.3: Higher-order Functions, Avro and Custom Serializers
2020-07-16
Yitao Li

sparklyr 1.2: Foreach, Spark 3.0 and Databricks Connect
2020-05-06
Yitao Li

sparklyr 1.1: Foundations, Books, Lakes and Barriers
2020-01-29
Javier Luraschi

sparklyr 1.0: Apache Arrow, XGBoost, Broom and TFRecords
2019-03-15
Javier Luraschi

sparklyr 0.9: Streams and Kubernetes
2018-10-01
Javier Luraschi

See RStudio + sparklyr for big data at Strata + Hadoop World
2017-02-13
Roger Oberg
If big data is your thing, you use R, and you’re headed to Strata + Hadoop World in San Jose on March 13 and 14, you can experience in person how easy and practical it is to analyze big data with R and Spark.
In a beginner-level talk by RStudio’s Edgar Ruiz and an intermediate-level workshop by Win-Vector’s John Mount, we cover the spectrum: what R is, what Spark is, how sparklyr works, and what is required to set up and tune a Spark cluster.
SparkR preview by Vincent Warmerdam
2015-05-28
Garrett Grolemund
This is a guest post by Vincent Warmerdam of koaning.io.
SparkR preview in RStudio
Apache Spark is the hip new technology on the block. It lets you write scripts in a functional style, and the technology behind it runs iterative tasks very quickly on a cluster of machines. It has been benchmarked as faster than Hadoop for most machine learning use cases (by a factor of 10 to 100), and Spark will soon support the R language as well.