sparklyr

R interface to Apache Spark™

  • Interact with Spark using familiar R interfaces, such as dplyr, broom, and DBI.

  • Gain access to Spark’s distributed Machine Learning libraries, Structured Streaming, and ML Pipelines from R.

  • Extend your toolbox by adding XGBoost, MLeap, H2O, and GraphFrames to your Spark plus R analysis.

  • Connect R wherever Spark runs: Hadoop, Mesos, Kubernetes, Standalone, and Livy.

  • Run distributed R code inside Spark.

Get Started

Welcome new users! Start here to learn how to install and use sparklyr.

Guides

“How-to” articles that show you how to do things such as connecting to AWS S3 buckets, handling streaming data, and creating ML Pipelines.

Deployment

Articles on Spark environments, including AWS EMR, Databricks, and Qubole.