solarius
R packagedat30
data sethelp(dat30, package = "solarius")
# load library
library(solarius)
# load data set
data(dat30)
# univariate polygenic model
mod1 <- solarPolygenic(trait1 ~ 1, dat30)
# bivariate polygenic model
mod2 <- solarPolygenic(trait1 + trait2 ~ 1, dat30,
polygenic.options = '-testrhoe -testrhog')
# specify directory with IBD matrices and run linkage model
mibddir <- system.file('extdata', 'solarOutput',
'solarMibdsCsv', package = 'solarius')
link <- solarMultipoint(trait1 ~ 1, dat30,
mibddir = mibddir, chr = 5)
# run association model in parallel
assoc <- solarAssoc(trait1 ~ 1, dat30, cores = 2,
snpcovdata = genocovdat30, snpmap = mapdat30)
Estimation of the heritability of trait1
(mod1
model) is 0.83 \(\pm\) 0.10 with p-value 1.20e-10.
Estimation of the genetic correlation between trait1
and trait2
(mod2
model) is 0.97 \(\pm\) 0.04 with p-value 2.25e-09.
Estimation of the environmental correlation between trait1
and trait2
(mod2
model) is 0.41 \(\pm\) 0.20 with p-value 0.14.
The highest LOD score in the linkage analysis of trait1
on Chromosome 5 is 3.56 at 3 cM position.
The only significant SNP in the association analysis of trait1
on a set of 100 (synthetic) SNPs is snp_86 with p-value 9.10e-05.
The model with covariates can be fitted in the following way.
mod0 <- solarPolygenic(trait1 ~ age + sex, dat30, covtest = TRUE)
print
method applied to mod0
shows that none of the covariates is significant.
mod0
##
## Call: solarPolygenic(formula = trait1 ~ age + sex, data = dat30, covtest = TRUE)
##
## File polygenic.out:
## Pedigree: dat.ped
## Phenotypes: dat.phe
## Trait: trait1 Individuals: 174
##
## H2r is 0.8061621 p = 6.1167535e-10 (Significant)
## H2r Std. Error: 0.1100465
##
## age p = 0.9753082 (Not Sig., but fixed)
## sex p = 0.1455341 (Not Sig., but fixed)
##
## Proportion of Variance Due to All Final Covariates Is
## 0.0330070
##
## Loglikelihoods and chi's are in trait1/polygenic.logs.out
## Best model is named poly and null0
## Final models are named poly, spor, nocovar
## Initial sporadic and polygenic models are s0 and p0
## Constrained covariate models are named no<covariate name>
##
## Residual Kurtosis is -0.3603, within normal range
The test statistics and p-values for the covariates are stored in cf
(ccovariate frame) slot.
mod0$cf
## covariate Estimate SE Chi pval
## 1 mean 7.7823346631 0.33112424 NA NA
## 2 age 0.0004591117 0.01483782 0.0010 0.9753082
## 3 sex -0.4559509166 0.31250456 2.1184 0.1455341
The estimations of heritabilities in mod0
and mod1
models are slightly different.
mod1
##
## Call: solarPolygenic(formula = trait1 ~ 1, data = dat30)
##
## File polygenic.out:
## Pedigree: dat.ped
## Phenotypes: dat.phe
## Trait: trait1 Individuals: 174
##
## H2r is 0.8342730 p = 1.2005208e-10 (Significant)
## H2r Std. Error: 0.1013512
##
##
## Loglikelihoods and chi's are in trait1/polygenic.logs.out
## Best model is named poly and null0
## Final models are named poly, spor
##
## Residual Kurtosis is -0.4279, within normal range
The print
method applied to mod2
shows estimation of the main parameters of the model.
mod2
##
## Call: solarPolygenic(formula = trait1 + trait2 ~ 1, data = dat30, polygenic.options = "-testrhoe -testrhog")
##
## File polygenic.out:
## Pedigree: dat.ped
## Phenotypes: dat.phe
## Trait: trait1 trait2 Individuals: 174
##
## H2r(trait1) is 0.8218823
## H2r(trait1) Std. Error: 0.1053258
##
## H2r(trait2) is 0.6270026
## H2r(trait2) Std. Error: 0.1158107
##
## RhoE is 0.4120487 p = 0.1424023
## RhoE Std. Error: 0.1969379
##
## RhoG is 0.9728759
## RhoG Std. Error: 0.0374442
##
##
## RhoG different from zero p = 2.2463156e-09
## RhoG different from 1.0 p = 0.2098041
## Derived Estimate of RhoP is 0.8045957
##
##
## Loglikelihoods and chi's are in trait1.trait2/polygenic.logs.out
## Best model is named poly and null0
## Final models are named poly, spor
The same values printed above are extracted and saved into vcf
(variance component frame) slot .
mod2$vcf
## varcomp Estimate SE zscore
## 1 h2r(trait1) 0.8218823 0.10532575 7.803242
## 2 e2(trait1) 0.1781177 0.10532575 1.691112
## 3 h2r(trait2) 0.6270026 0.11581068 5.414031
## 4 e2(trait2) 0.3729974 0.11581068 3.220751
## 5 rhog 0.9728759 0.03744417 25.982039
## 6 rhoe 0.4120487 0.19693793 2.092277
The information about the test statistic and p-values are given in lf
(likelihood frame) slot.
mod2$lf
## model loglik Chi deg pval
## 1 sporadic -384.5535 NA NA NA
## 2 polygenic -363.8273 NA NA NA
## 3 rhoe0 -364.9032 2.151806 1 0.142402300000000
## 4 rhog0 -381.7010 35.747400 1 0.000000002246316
## 5 rhog1 NA 0.651412 1 0.209804100000000
Table of the results with LOD scores and marker information is stored in lodf
(LOD scores frame) slost.
link$lodf
## chr pos LOD Loglike H2r H2q1
## 1 5 0 1.4888 -208.503 0.448464 0.403780
## 2 5 1 2.4232 -206.352 0.244560 0.614310
## 3 5 2 3.2759 -204.388 0.123184 0.742126
## 4 5 3 3.5569 -203.741 0.097187 0.763608
## 5 5 4 3.1610 -204.653 0.179701 0.668166
## 6 5 5 2.4435 -206.305 0.372614 0.466113
Summary method reports the maximum LOD score.
summary(link)
##
## Call: solarMultipoint(formula = trait1 ~ 1, data = dat30, mibddir = mibddir,
## chr = 5)
##
## Multipoint model
## * Number of used markers: 6
## * Number of passes: 1
## * Maximum LOD score: 3.56
## -- chr: 5
## -- position: 3 cM
The plot method graphically shows the results.
plot(link)
summary
method applies the Bonferroni correction with a given value of alpha
(the default value is 0.05).
summary(assoc, alpha = 0.05)
##
## Call: solarAssoc(formula = trait1 ~ 1, data = dat30, snpcovdata = genocovdat30,
## snpmap = mapdat30, cores = 2)
##
## Association model
## * Number of SNPs: 100
## * Input format: snpcovdata
## * Number of significal SNPs: 1 (Bonferroni correction with alpha 0.05)
## SNP NAv chi pSNP bSNP bSNPse Varexp est_maf
## 1: snp_86 174 15.30657 0.000091 0.917024 0.234391 0.049701 0.524075
## est_mac dosage_sd pos chr
## 1: 182.3781 0.582544 22883925 1
Table of the results is stored in snpf
(SNP frame) slot.
head(assoc$snpf)
## SNP NAv chi pSNP bSNP bSNPse Varexp est_maf
## 1: snp_1 174 0.226982 0.633771 -0.128559 0.269840 0.004560 0.526714
## 2: snp_10 174 0.634540 0.425695 -0.214068 0.268733 0.000000 0.482020
## 3: snp_100 174 0.917648 0.338093 0.245690 0.256478 0.000000 0.506701
## 4: snp_11 174 2.094976 0.147784 -0.383326 0.264837 0.033637 0.497552
## 5: snp_12 174 0.007396 0.931466 -0.020153 0.234334 0.001120 0.503480
## 6: snp_13 174 1.455744 0.227608 0.283974 0.235362 0.013418 0.469853
## est_mac dosage_sd pos chr
## 1: 183.2965 0.547291 2105324 1
## 2: 167.7430 0.583412 2637674 1
## 3: 176.3319 0.588925 22914702 1
## 4: 173.1481 0.574507 2640705 1
## 5: 175.2110 0.596207 7447088 1
## 6: 163.5090 0.598493 7447236 1
The default plot method is Manhattan plot. The black dashed line shows the Bonferroni correction level at a given value of alpha
(the default value is 0.05).
plotManh(assoc, alpha = 0.01)
This call is equavalent to plot(assoc, alpha = 0.01)
, as plot
method applied to assoc
object calls plotManh
function.
The second plot available for assoc
object is QQ-plot. The inflation parameter \(\lambda\) (median method) is computed and dipicted at the bottom of the figure.
plotQQ(assoc)
This call is equavalent to plot(assoc, "qq")
, as plot
method (applied to assoc
object) with the second argument equal to "qq"
calls plotQQ
function.
The procedure of SNPs annotation is possible via rsnps
R package (loaded by annotate
function invisibly to the user).
tab <- annotate(assoc)
tab
## Null data.table (0 rows and 0 cols)
Annotation of the synthetic SNPs generated for dat30
data sets returns an empty data frame, since the names expected by rsnps
R package must begin with rs symbols.
Meantime, annotation of the genome-wide significant SNPs from GWAS of FXIc phenotype in GAIT1 data set (real data) is accomplished as following.
snps <- c("rs710446", "rs4241824", "rs4253399")
tab <- annotate(snps)
tab
## Query Chromosome Marker Class Gene Alleles Major Minor MAF
## 1: rs710446 3 rs710446 snp KNG1 A/G A G 0.4153
## 2: rs4241824 4 rs4241824 snp F11 A/G A G 0.4129
## 3: rs4253399 4 rs4253399 snp F11 G/T T G 0.2638
## BP
## 1: 186742137
## 2: 186270632
## 3: 186266939
In this case annotate
function takes a vector of characters as its first argument.
sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.3 LTS
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] solarius_0.3.1 rmarkdown_0.8 knitr_1.11 devtools_1.9.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.2 compiler_3.2.2 formatR_1.2.1 plyr_1.8.3
## [5] bitops_1.0-6 iterators_1.0.7 tools_3.2.2 digest_0.6.8
## [9] jsonlite_0.9.19 evaluate_0.8 memoise_0.2.1 gtable_0.1.2
## [13] foreach_1.4.2 curl_0.9.4 yaml_2.1.13 parallel_3.2.2
## [17] qqman_0.1.2 proto_0.3-10 stringr_1.0.0 httr_1.0.0
## [21] grid_3.2.2 data.table_1.9.6 R6_2.1.1 XML_3.98-1.3
## [25] ggplot2_1.0.1 reshape2_1.4.1 magrittr_1.5 scales_0.3.0
## [29] codetools_0.2-14 htmltools_0.2.6 rsnps_0.1.6 MASS_7.3-44
## [33] colorspace_1.2-6 labeling_0.3 stringi_1.0-1 RCurl_1.95-4.7
## [37] doParallel_1.0.8 munsell_0.4.2 chron_2.3-47