Randomly splits a feature matrix or data.frame and its corresponding response vector into training and test subsets.
train_test_split(X, y, test_size = 0.2, seed = NULL)
X: A matrix or data.frame of features.
y: A vector of responses (numeric or factor). Its length must equal the number of rows of X.
test_size: Proportion of observations to use as the test set. A number in (0, 1). Default is 0.2 (80/20 split).
seed: An optional integer random seed for reproducibility. If NULL (default), the current RNG state is used.
A named list with four elements:
X_train: Training features (same type as X).
X_test: Test features (same type as X).
y_train: Training response.
y_test: Test response.
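The behaviour described above can be approximated in base R. The sketch below is illustrative only and is not the unifiedml implementation: the function name split_sketch is hypothetical, the element names y_train and y_test are assumed by analogy with X_train and X_test, and the use of round() to size the test set is an assumption, since the rounding rule is not documented here.

```r
# Hypothetical helper for illustration; not the package's actual code.
split_sketch <- function(X, y, test_size = 0.2, seed = NULL) {
  n <- NROW(X)
  stopifnot(length(y) == n, test_size > 0, test_size < 1)

  # Seed the RNG only when a seed is supplied; otherwise use the current state.
  if (!is.null(seed)) set.seed(seed)

  # Assumed rounding rule for the number of test observations.
  n_test <- round(n * test_size)
  stopifnot(n_test >= 1, n_test < n)

  # Draw test rows without replacement; the remaining rows form the training set.
  test_idx  <- sample.int(n, n_test)
  train_idx <- setdiff(seq_len(n), test_idx)

  # drop = FALSE keeps X a matrix or data.frame even with a single column.
  list(
    X_train = X[train_idx, , drop = FALSE],
    X_test  = X[test_idx, , drop = FALSE],
    y_train = y[train_idx],
    y_test  = y[test_idx]
  )
}

s <- split_sketch(iris[, 1:4], iris$Species, test_size = 0.3, seed = 42)
dim(s$X_train)
dim(s$X_test)
```

Because the rows are sampled uniformly and not stratified by y, class proportions in the two subsets of this sketch can differ from those of the full data.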
# matrix input
X <- as.matrix(iris[, 1:4])  # convert to a matrix so the example matches its label
y <- iris$Species
d <- unifiedml::train_test_split(X, y, test_size = 0.3, seed = 42)
dim(d$X_train) # 105 x 4
#> [1] 105 4
dim(d$X_test) # 45 x 4
#> [1] 45 4
# data.frame input
d2 <- unifiedml::train_test_split(iris[, 1:4], iris$Species, test_size = 0.2)
is.data.frame(d2$X_train) # TRUE
#> [1] TRUE