Distributed Computing

sparklyr 0.7

2018-01-29 Kevin Kuo
We are excited to share that sparklyr 0.7 is now available on CRAN! sparklyr provides an R interface to Apache Spark. It supports dplyr syntax for working with Spark DataFrames and exposes the full range of machine learning algorithms available in Spark. You can learn more about Apache Spark and sparklyr at spark.rstudio.com and in our new webinar series on Apache Spark. Features in this release: Adds support for ML Pipelines, which provide a uniform set of high-level APIs to help create, tune, and deploy machine learning pipelines at scale. Read more →

sparklyr 0.6

2017-07-31 Javier Luraschi
We’re excited to announce a new release of the sparklyr package, available on CRAN today! sparklyr 0.6 introduces new features to:
- Distribute R computations using spark_apply() to execute arbitrary R code across your Spark cluster. You can now use all of your favorite R packages and functions in a distributed context.
- Connect to External Data Sources using spark_read_source(), spark_write_source(), spark_read_jdbc() and spark_write_jdbc().
- Use the Latest Frameworks including dplyr 0.7, DBI 0. Read more →