Split a data.frame into training and test sets.

data_split(
  data = get_data("german"),
  varname = "credit_risk",
  p_test = 0.2,
  p_quiz = 0.5
)

Arguments

data

data.frame

varname

string. name of the output variable in data

p_test

real. proportion of samples in the test set

p_quiz

real. proportion of test set samples assigned to the quiz set

Value

list with members

train

training set with output variable

test

test set without output variable

y_test

test set output variable

ind_quiz

indices of quiz samples in the test set
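
Examples

The members listed above can be illustrated with a minimal base R sketch. This is not the package's implementation: split_sketch and the toy data frame are hypothetical names introduced here, and the sketch assumes simple random sampling without stratification, with ind_quiz holding row positions within the test set.

```r
# Hypothetical stand-in for data_split(), for illustration only.
split_sketch <- function(data, varname, p_test = 0.2, p_quiz = 0.5) {
  n <- nrow(data)

  # Assumption: test rows are drawn uniformly at random, no stratification.
  ind_test <- sample.int(n, size = round(p_test * n))
  ind_train <- setdiff(seq_len(n), ind_test)

  test_full <- data[ind_test, , drop = FALSE]
  n_test <- nrow(test_full)

  # Assumption: quiz indices are positions within the test set (1..n_test).
  ind_quiz <- sort(sample.int(n_test, size = round(p_quiz * n_test)))

  list(
    # training rows keep the output variable
    train = data[ind_train, , drop = FALSE],
    # test rows with the output column removed
    test = test_full[, setdiff(names(test_full), varname), drop = FALSE],
    # output variable of the test rows, in the same order as test
    y_test = test_full[[varname]],
    ind_quiz = ind_quiz
  )
}

# Toy data standing in for get_data("german"); the column names are made up.
set.seed(1)
toy <- data.frame(
  x1 = rnorm(100),
  x2 = runif(100),
  y = factor(sample(c("good", "bad"), 100, replace = TRUE))
)

parts <- split_sketch(toy, varname = "y", p_test = 0.2, p_quiz = 0.5)
str(parts, max.level = 1)
```

With 100 rows and the default proportions, this sketch produces 80 training rows, 20 test rows, and 10 quiz indices. Under the stated assumption, parts$y_test[parts$ind_quiz] selects the quiz labels and parts$y_test[-parts$ind_quiz] the remaining test labels.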