Reduce data dimensionality by modeling genetic effects with the principal components (PCs) of the genomic relationship matrix.

Computes the genomic relationship matrix (G) after centering and scaling the genotype matrix of the training set, then computes the eigenvalues and eigenvectors of G. PCA is fitted on the training set only, and the same transformation is applied to the test set. The goal is to use a small number of principal components as predictors in prediction models, instead of all the markers.

apply_pcs_G_Add(split, geno, num_pcs = 200, ...)

Arguments

split

An object of class split, corresponding to one element of the total cv_object generated by one of the functions predict_cv0(), predict_cv00(), predict_cv1(), or predict_cv2(), and containing the following items:

training: data.frame Training dataset
test: data.frame Test dataset

geno

data.frame The geno element of an object of class METData.

num_pcs

integer Maximum number of principal components to extract. Default is 200.

Value

pc_values A data.frame whose first column, 'geno_ID', contains the names of all lines used in the study, and whose remaining columns contain the principal components. PCs for the lines in the test set are computed with the transformation fitted on the training set.
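
The procedure described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the toy marker matrix, the train/test split, the choice of three components, and the projection formula used for the test lines are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical toy data: 12 lines x 50 markers coded 0/1/2
geno <- matrix(sample(0:2, 12 * 50, replace = TRUE), nrow = 12,
               dimnames = list(paste0("line", 1:12), paste0("m", 1:50)))
train_ids <- rownames(geno)[1:9]
test_ids  <- rownames(geno)[10:12]

# Center and scale with statistics from the training set only
mu   <- colMeans(geno[train_ids, ])
sdv  <- apply(geno[train_ids, ], 2, sd)
keep <- sdv > 0  # drop markers that are monomorphic in the training set
Z_tr <- scale(geno[train_ids, keep], center = mu[keep], scale = sdv[keep])
Z_te <- scale(geno[test_ids,  keep], center = mu[keep], scale = sdv[keep])

# Genomic relationship matrix of the training lines and its eigendecomposition
m    <- ncol(Z_tr)
G_tr <- tcrossprod(Z_tr) / m
eig  <- eigen(G_tr, symmetric = TRUE)

num_pcs <- 3  # assumed small value for the toy example
U       <- eig$vectors[, 1:num_pcs, drop = FALSE]
lambda  <- eig$values[1:num_pcs]

# Training scores: eigenvectors scaled by the square root of the eigenvalues
pcs_tr <- U %*% diag(sqrt(lambda))

# Test scores: project the test-by-training relationships onto the same axes
G_te_tr <- tcrossprod(Z_te, Z_tr) / m
pcs_te  <- G_te_tr %*% U %*% diag(1 / sqrt(lambda))

# Assemble a data.frame shaped like the documented return value
scores <- rbind(pcs_tr, pcs_te)
dimnames(scores) <- list(NULL, paste0("PC", seq_len(num_pcs)))
pc_values <- data.frame(geno_ID = c(train_ids, test_ids), scores)
```

Applying the same projection to the training lines reproduces their own scores, which is why no information from the test lines enters the estimation of the components.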

Author

Cathy C. Westhues cathy.jubin@uni-goettingen.de