Get train/test splits of the phenotypic MET dataset for a CV1 cross-validation scheme, based on a user-defined number of random k-fold partitions. The list of train/test splits is built from the phenotypic data so that all phenotypes from the same line fall in the same fold. This scheme evaluates the prediction of new lines that have never been observed in any environment.

predict_cv1(pheno_data, nb_folds, reps, seed)

Arguments

pheno_data

data.frame Dataset containing the phenotypic outcome data as well as the predictor variables.

nb_folds

numeric Number of folds in the CV process. In CV1, lines (rather than individual records) are randomly assigned to folds, which ensures that all the records of a given line fall in the same fold.

reps

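numeric Number of repeats of the k-fold CV.

seed

Seed for the random number generator, so that the random assignment of lines to folds is reproducible.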
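The line-level fold assignment described for nb_folds can be illustrated with a minimal base R sketch. This is not the package's own code: the toy dataset and the column names geno_ID, IDenv and yield are assumptions made for illustration only.

```r
# Toy phenotypic data: 6 lines observed in 3 environments.
# Column names (geno_ID, IDenv, yield) are hypothetical.
pheno_data <- expand.grid(
  geno_ID = paste0("line", 1:6),
  IDenv = c("env1", "env2", "env3"),
  stringsAsFactors = FALSE
)
pheno_data$yield <- seq_len(nrow(pheno_data))

nb_folds <- 3
set.seed(42)

# Draw folds for the unique lines, not for the individual records.
lines <- unique(pheno_data$geno_ID)
fold_of_line <- sample(rep(seq_len(nb_folds), length.out = length(lines)))
names(fold_of_line) <- lines

# Propagate the line-level fold to every record of that line.
pheno_data$fold <- unname(fold_of_line[pheno_data$geno_ID])

# Count the number of distinct folds in which each line appears.
folds_per_line <- tapply(
  pheno_data$fold, pheno_data$geno_ID,
  function(x) length(unique(x))
)
```

Because folds are drawn once per line, folds_per_line equals 1 for every line, whatever the number of environments in which the line was phenotyped.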

Value

A cv_object containing nb_folds × reps elements. Each element is a split object with two elements:

training

data.frame Dataset with all observations for the training set.

test

data.frame Dataset with all observations for the test set.
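As a rough illustration of the structure of the returned value, the sketch below builds a plain list of nb_folds × reps training/test pairs in base R. The helper make_cv1_splits(), its line_col argument, and the toy column names are hypothetical and do not reproduce the package's implementation or its cv_object and split classes.

```r
# Hypothetical helper, not the package's implementation: returns a plain
# list of nb_folds x reps elements, each holding a training and a test set.
make_cv1_splits <- function(pheno_data, nb_folds, reps, seed,
                            line_col = "geno_ID") {
  set.seed(seed)
  lines <- unique(pheno_data[[line_col]])
  splits <- list()
  for (r in seq_len(reps)) {
    # New random assignment of lines to folds for each repeat.
    fold_of_line <- sample(rep(seq_len(nb_folds), length.out = length(lines)))
    for (k in seq_len(nb_folds)) {
      # All records of the lines in fold k form the test set.
      test_lines <- lines[fold_of_line == k]
      in_test <- pheno_data[[line_col]] %in% test_lines
      splits[[length(splits) + 1]] <- list(
        training = pheno_data[!in_test, , drop = FALSE],
        test = pheno_data[in_test, , drop = FALSE]
      )
    }
  }
  splits
}

# Toy data: 10 lines, each observed in 2 environments (assumed column names).
toy <- data.frame(
  geno_ID = rep(paste0("line", 1:10), each = 2),
  IDenv = rep(c("env1", "env2"), times = 10),
  yield = rnorm(20)
)

splits <- make_cv1_splits(toy, nb_folds = 5, reps = 2, seed = 1)
length(splits)  # 10, i.e. nb_folds x reps
```

In every element, no line is shared between training and test, which is the property that makes CV1 an evaluation of prediction for lines never observed in any environment.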

References

Jarquín D, Lemes da Silva C, Gaynor RC, Poland J, Fritz A, Howard R, Battenfield S, Crossa J (2017). “Increasing genomic-enabled prediction accuracy by modeling genotype × environment interactions in Kansas wheat.” The Plant Genome, 10(2), 1-15.

Jarquín D, Crossa J, Lacaze X, Du Cheyron P, Daucourt J, Lorgeou J, Piraux F, Guerreiro L, Pérez P, Calus M, others (2014). “A reaction norm model for genomic selection using high-dimensional genomic and environmental data.” Theoretical and Applied Genetics, 127(3), 595-607.

Author

Cathy C. Westhues cathy.jubin@uni-goettingen.de