Perform Tidymodels grid search tuning inside Spark

tune_grid_spark

Description

Perform Tidymodels grid search tuning inside Spark

Usage

tune_grid_spark(
  sc,
  object,
  preprocessor,
  resamples,
  ...,
  param_info = NULL,
  grid = 10,
  metrics = NULL,
  eval_time = NULL,
  control = NULL,
  no_tasks = NULL
)

Arguments

Arguments Description
sc A Spark Connection
object A parsnip model specification or an unfitted workflow(). Arguments to be tuned are marked with tune().
preprocessor A traditional model formula or a recipe created using recipes::recipe().
resamples An rset resampling object created from an rsample function, such as rsample::vfold_cv().
... Not currently used.
param_info A dials::parameters() object or NULL. If none is given, a parameters set is derived from other arguments. Passing this argument can be useful when parameter ranges need to be customized.
grid A data frame of tuning combinations or a positive integer. The data frame should have columns for each parameter being tuned and rows for tuning parameter candidates. An integer denotes the number of candidate parameter sets to be created automatically.
metrics A yardstick::metric_set(), or NULL to compute a standard set of metrics.
eval_time A numeric vector of time points where dynamic event time metrics should be computed (e.g., the time-dependent ROC curve). The values must be non-negative and should probably be no greater than the largest event time in the training set.
control An object used to modify the tuning process, likely created by control_grid().
no_tasks Number of parallel tasks (jobs) to request from Spark.
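
The param_info, grid and metrics arguments accept ordinary Tidymodels objects that can be built locally before anything is sent to Spark. The following is a minimal sketch, assuming a decision tree specification with the rpart engine; the model, the parameter range and the metric choice are illustrative and not taken from this page.

```r
library(tidymodels)

# Illustrative model specification with two parameters marked for tuning
spec <- decision_tree(cost_complexity = tune(), tree_depth = tune()) |>
  set_engine("rpart") |>
  set_mode("regression")

# param_info: start from the parameters found in the specification,
# then narrow the range of one of them
params <- extract_parameter_set_dials(spec) |>
  update(tree_depth = tree_depth(range = c(2L, 8L)))

# grid: a data frame with one column per tuned parameter and
# one row per candidate (three levels per parameter here)
custom_grid <- grid_regular(params, levels = 3)

# metrics: an explicit yardstick metric set instead of the defaults
reg_metrics <- metric_set(rmse, rsq)
```

These objects would then be passed as param_info = params, grid = custom_grid and metrics = reg_metrics.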

Details

The parsnip model or unfitted workflow(), the preprocessor, and the resamples are uploaded to the Spark cluster as R objects. The Spark cluster runs each tuning combination in parallel using R, Tidymodels, and any other modeling packages required by the given workflow. The results are then downloaded back to the local R session.
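
The flow described above can be sketched as follows. This is an illustration under stated assumptions, not a tested recipe: the master value "local", the mtcars data, the rpart engine and the helper tune_on_spark() are all hypothetical choices, and the Spark workers are assumed to have R, Tidymodels and rpart installed.

```r
library(sparklyr)
library(tidymodels)

# Everything below is built in the local R session first
spec <- decision_tree(cost_complexity = tune(), tree_depth = tune()) |>
  set_engine("rpart") |>
  set_mode("regression")

rec <- recipe(mpg ~ ., data = mtcars)

set.seed(123)
folds <- vfold_cv(mtcars, v = 5)

# Hypothetical helper (not part of sparklyr): opens a connection,
# runs the grid search on the cluster, and always disconnects afterwards
tune_on_spark <- function(master = "local") {
  sc <- spark_connect(master = master)
  on.exit(spark_disconnect(sc), add = TRUE)

  tune_grid_spark(
    sc = sc,
    object = spec,
    preprocessor = rec,
    resamples = folds,
    grid = 10
  )
}

# Requires a reachable Spark installation, so it is left commented out:
# results <- tune_on_spark()
```

The call mirrors the Usage signature; only sc is Spark-specific, and the remaining arguments are the same local objects one would prepare for a non-Spark grid search.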

Value

A tune_results object.
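
Assuming the returned object behaves like any other tune_results object, the usual helpers from the tune package should apply to it. The sketch below wraps three of them in a hypothetical helper, summarise_tuning(), which is not part of sparklyr or tune; the default metric name "rmse" is likewise only an example.

```r
library(tidymodels)

# Hypothetical helper: gathers the common summaries of a tuning result
summarise_tuning <- function(results, metric = "rmse") {
  list(
    # One row per candidate and metric, averaged over resamples
    metrics = collect_metrics(results),
    # The three best candidates according to the chosen metric
    top = show_best(results, metric = metric, n = 3),
    # The single best parameter combination
    best = select_best(results, metric = metric)
  )
}
```

For example, summarise_tuning(results)$best would give the parameter values to pass to finalize_workflow() or finalize_model().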