solarius R packagedat30 data sethelp(dat30, package = "solarius")
# load library
library(solarius)
# load data set
data(dat30)
# univariate polygenic model
mod1 <- solarPolygenic(trait1 ~ 1, dat30)
# bivariate polygenic model
mod2 <- solarPolygenic(trait1 + trait2 ~ 1, dat30,
polygenic.options = '-testrhoe -testrhog')
# specify directory with IBD matrices and run linkage model
mibddir <- system.file('extdata', 'solarOutput',
'solarMibdsCsv', package = 'solarius')
link <- solarMultipoint(trait1 ~ 1, dat30,
mibddir = mibddir, chr = 5)
# run association model in parallel
assoc <- solarAssoc(trait1 ~ 1, dat30, cores = 2,
snpcovdata = genocovdat30, snpmap = mapdat30)
Estimation of the heritability of trait1 (mod1 model) is 0.83 \(\pm\) 0.10 with p-value 1.20e-10.
Estimation of the genetic correlation between trait1 and trait2 (mod2 model) is 0.97 \(\pm\) 0.04 with p-value 2.25e-09.
Estimation of the environmental correlation between trait1 and trait2 (mod2 model) is 0.41 \(\pm\) 0.20 with p-value 0.14.
The highest LOD score in the linkage analysis of trait1 on Chromosome 5 is 3.56 at 3 cM position.
The only significant SNP in the association analysis of trait1 on a set of 100 (synthetic) SNPs is snp_86 with p-value 9.10e-05.
The model with covariates can be fitted in the following way.
mod0 <- solarPolygenic(trait1 ~ age + sex, dat30, covtest = TRUE)
print method applied to mod0 shows that none of the covariates is significant.
mod0
##
## Call: solarPolygenic(formula = trait1 ~ age + sex, data = dat30, covtest = TRUE)
##
## File polygenic.out:
## Pedigree: dat.ped
## Phenotypes: dat.phe
## Trait: trait1 Individuals: 174
##
## H2r is 0.8061621 p = 6.1167535e-10 (Significant)
## H2r Std. Error: 0.1100465
##
## age p = 0.9753082 (Not Sig., but fixed)
## sex p = 0.1455341 (Not Sig., but fixed)
##
## Proportion of Variance Due to All Final Covariates Is
## 0.0330070
##
## Loglikelihoods and chi's are in trait1/polygenic.logs.out
## Best model is named poly and null0
## Final models are named poly, spor, nocovar
## Initial sporadic and polygenic models are s0 and p0
## Constrained covariate models are named no<covariate name>
##
## Residual Kurtosis is -0.3603, within normal range
The test statistics and p-values for the covariates are stored in cf (ccovariate frame) slot.
mod0$cf
## covariate Estimate SE Chi pval
## 1 mean 7.7823346631 0.33112424 NA NA
## 2 age 0.0004591117 0.01483782 0.0010 0.9753082
## 3 sex -0.4559509166 0.31250456 2.1184 0.1455341
The estimations of heritabilities in mod0 and mod1 models are slightly different.
mod1
##
## Call: solarPolygenic(formula = trait1 ~ 1, data = dat30)
##
## File polygenic.out:
## Pedigree: dat.ped
## Phenotypes: dat.phe
## Trait: trait1 Individuals: 174
##
## H2r is 0.8342730 p = 1.2005208e-10 (Significant)
## H2r Std. Error: 0.1013512
##
##
## Loglikelihoods and chi's are in trait1/polygenic.logs.out
## Best model is named poly and null0
## Final models are named poly, spor
##
## Residual Kurtosis is -0.4279, within normal range
The print method applied to mod2 shows estimation of the main parameters of the model.
mod2
##
## Call: solarPolygenic(formula = trait1 + trait2 ~ 1, data = dat30, polygenic.options = "-testrhoe -testrhog")
##
## File polygenic.out:
## Pedigree: dat.ped
## Phenotypes: dat.phe
## Trait: trait1 trait2 Individuals: 174
##
## H2r(trait1) is 0.8218823
## H2r(trait1) Std. Error: 0.1053258
##
## H2r(trait2) is 0.6270026
## H2r(trait2) Std. Error: 0.1158107
##
## RhoE is 0.4120487 p = 0.1424023
## RhoE Std. Error: 0.1969379
##
## RhoG is 0.9728759
## RhoG Std. Error: 0.0374442
##
##
## RhoG different from zero p = 2.2463156e-09
## RhoG different from 1.0 p = 0.2098041
## Derived Estimate of RhoP is 0.8045957
##
##
## Loglikelihoods and chi's are in trait1.trait2/polygenic.logs.out
## Best model is named poly and null0
## Final models are named poly, spor
The same values printed above are extracted and saved into vcf (variance component frame) slot .
mod2$vcf
## varcomp Estimate SE zscore
## 1 h2r(trait1) 0.8218823 0.10532575 7.803242
## 2 e2(trait1) 0.1781177 0.10532575 1.691112
## 3 h2r(trait2) 0.6270026 0.11581068 5.414031
## 4 e2(trait2) 0.3729974 0.11581068 3.220751
## 5 rhog 0.9728759 0.03744417 25.982039
## 6 rhoe 0.4120487 0.19693793 2.092277
The information about the test statistic and p-values are given in lf (likelihood frame) slot.
mod2$lf
## model loglik Chi deg pval
## 1 sporadic -384.5535 NA NA NA
## 2 polygenic -363.8273 NA NA NA
## 3 rhoe0 -364.9032 2.151806 1 0.142402300000000
## 4 rhog0 -381.7010 35.747400 1 0.000000002246316
## 5 rhog1 NA 0.651412 1 0.209804100000000
Table of the results with LOD scores and marker information is stored in lodf (LOD scores frame) slost.
link$lodf
## chr pos LOD Loglike H2r H2q1
## 1 5 0 1.4888 -208.503 0.448464 0.403780
## 2 5 1 2.4232 -206.352 0.244560 0.614310
## 3 5 2 3.2759 -204.388 0.123184 0.742126
## 4 5 3 3.5569 -203.741 0.097187 0.763608
## 5 5 4 3.1610 -204.653 0.179701 0.668166
## 6 5 5 2.4435 -206.305 0.372614 0.466113
Summary method reports the maximum LOD score.
summary(link)
##
## Call: solarMultipoint(formula = trait1 ~ 1, data = dat30, mibddir = mibddir,
## chr = 5)
##
## Multipoint model
## * Number of used markers: 6
## * Number of passes: 1
## * Maximum LOD score: 3.56
## -- chr: 5
## -- position: 3 cM
The plot method graphically shows the results.
plot(link)
summary method applies the Bonferroni correction with a given value of alpha (the default value is 0.05).
summary(assoc, alpha = 0.05)
##
## Call: solarAssoc(formula = trait1 ~ 1, data = dat30, snpcovdata = genocovdat30,
## snpmap = mapdat30, cores = 2)
##
## Association model
## * Number of SNPs: 100
## * Input format: snpcovdata
## * Number of significal SNPs: 1 (Bonferroni correction with alpha 0.05)
## SNP NAv chi pSNP bSNP bSNPse Varexp est_maf
## 1: snp_86 174 15.30657 0.000091 0.917024 0.234391 0.049701 0.524075
## est_mac dosage_sd pos chr
## 1: 182.3781 0.582544 22883925 1
Table of the results is stored in snpf (SNP frame) slot.
head(assoc$snpf)
## SNP NAv chi pSNP bSNP bSNPse Varexp est_maf
## 1: snp_1 174 0.226982 0.633771 -0.128559 0.269840 0.004560 0.526714
## 2: snp_10 174 0.634540 0.425695 -0.214068 0.268733 0.000000 0.482020
## 3: snp_100 174 0.917648 0.338093 0.245690 0.256478 0.000000 0.506701
## 4: snp_11 174 2.094976 0.147784 -0.383326 0.264837 0.033637 0.497552
## 5: snp_12 174 0.007396 0.931466 -0.020153 0.234334 0.001120 0.503480
## 6: snp_13 174 1.455744 0.227608 0.283974 0.235362 0.013418 0.469853
## est_mac dosage_sd pos chr
## 1: 183.2965 0.547291 2105324 1
## 2: 167.7430 0.583412 2637674 1
## 3: 176.3319 0.588925 22914702 1
## 4: 173.1481 0.574507 2640705 1
## 5: 175.2110 0.596207 7447088 1
## 6: 163.5090 0.598493 7447236 1
The default plot method is Manhattan plot. The black dashed line shows the Bonferroni correction level at a given value of alpha (the default value is 0.05).
plotManh(assoc, alpha = 0.01)
This call is equavalent to plot(assoc, alpha = 0.01), as plot method applied to assoc object calls plotManh function.
The second plot available for assoc object is QQ-plot. The inflation parameter \(\lambda\) (median method) is computed and dipicted at the bottom of the figure.
plotQQ(assoc)
This call is equavalent to plot(assoc, "qq"), as plot method (applied to assoc object) with the second argument equal to "qq" calls plotQQ function.
The procedure of SNPs annotation is possible via rsnps R package (loaded by annotate function invisibly to the user).
tab <- annotate(assoc)
tab
## Null data.table (0 rows and 0 cols)
Annotation of the synthetic SNPs generated for dat30 data sets returns an empty data frame, since the names expected by rsnps R package must begin with rs symbols.
Meantime, annotation of the genome-wide significant SNPs from GWAS of FXIc phenotype in GAIT1 data set (real data) is accomplished as following.
snps <- c("rs710446", "rs4241824", "rs4253399")
tab <- annotate(snps)
tab
## Query Chromosome Marker Class Gene Alleles Major Minor MAF
## 1: rs710446 3 rs710446 snp KNG1 A/G A G 0.4153
## 2: rs4241824 4 rs4241824 snp F11 A/G A G 0.4129
## 3: rs4253399 4 rs4253399 snp F11 G/T T G 0.2638
## BP
## 1: 186742137
## 2: 186270632
## 3: 186266939
In this case annotate function takes a vector of characters as its first argument.
sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.3 LTS
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] solarius_0.3.1 rmarkdown_0.8 knitr_1.11 devtools_1.9.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.2 compiler_3.2.2 formatR_1.2.1 plyr_1.8.3
## [5] bitops_1.0-6 iterators_1.0.7 tools_3.2.2 digest_0.6.8
## [9] jsonlite_0.9.19 evaluate_0.8 memoise_0.2.1 gtable_0.1.2
## [13] foreach_1.4.2 curl_0.9.4 yaml_2.1.13 parallel_3.2.2
## [17] qqman_0.1.2 proto_0.3-10 stringr_1.0.0 httr_1.0.0
## [21] grid_3.2.2 data.table_1.9.6 R6_2.1.1 XML_3.98-1.3
## [25] ggplot2_1.0.1 reshape2_1.4.1 magrittr_1.5 scales_0.3.0
## [29] codetools_0.2-14 htmltools_0.2.6 rsnps_0.1.6 MASS_7.3-44
## [33] colorspace_1.2-6 labeling_0.3 stringi_1.0-1 RCurl_1.95-4.7
## [37] doParallel_1.0.8 munsell_0.4.2 chron_2.3-47