1 dat30 data set

help(dat30, package = "solarius")

2 All code in one block

# load library
library(solarius)

# load data set
data(dat30)

# univariate polygenic model
mod1 <- solarPolygenic(trait1 ~ 1, dat30)

# bivariate polygenic model
mod2 <- solarPolygenic(trait1 + trait2 ~ 1, dat30,
  polygenic.options = '-testrhoe -testrhog')

# specify directory with IBD matrices and run linkage model
mibddir <- system.file('extdata', 'solarOutput',
  'solarMibdsCsv', package = 'solarius') 
link <- solarMultipoint(trait1 ~ 1, dat30,
  mibddir = mibddir, chr = 5)

# run association model in parallel
assoc <- solarAssoc(trait1 ~ 1, dat30, cores = 2,
  snpcovdata = genocovdat30, snpmap = mapdat30)

3 Results

Estimation of the heritability of trait1 (mod1 model) is 0.83 \(\pm\) 0.10 with p-value 1.20e-10.

Estimation of the genetic correlation between trait1 and trait2 (mod2 model) is 0.97 \(\pm\) 0.04 with p-value 2.25e-09.

Estimation of the environmental correlation between trait1 and trait2 (mod2 model) is 0.41 \(\pm\) 0.20 with p-value 0.14.

The highest LOD score in the linkage analysis of trait1 on Chromosome 5 is 3.56 at 3 cM position.

The only significant SNP in the association analysis of trait1 on a set of 100 (synthetic) SNPs is snp_86 with p-value 9.10e-05.

4 Polygenic model (univariate)

The model with covariates can be fitted in the following way.

mod0 <- solarPolygenic(trait1 ~ age + sex, dat30, covtest = TRUE)

print method applied to mod0 shows that none of the covariates is significant.

mod0
## 
## Call: solarPolygenic(formula = trait1 ~ age + sex, data = dat30, covtest = TRUE)
## 
## File polygenic.out:
##  Pedigree:    dat.ped 
##  Phenotypes:  dat.phe 
##  Trait:       trait1                Individuals:  174 
##  
##           H2r is 0.8061621  p = 6.1167535e-10  (Significant) 
##         H2r Std. Error:  0.1100465 
##  
##                                       age  p = 0.9753082  (Not Sig., but fixed) 
##                                       sex  p = 0.1455341  (Not Sig., but fixed) 
##  
##  Proportion of Variance Due to All Final Covariates Is 
##                0.0330070 
##  
##  Loglikelihoods and chi's are in trait1/polygenic.logs.out 
##  Best model is named poly and null0 
##  Final models are named poly, spor, nocovar 
##  Initial sporadic and polygenic models are s0 and p0 
##  Constrained covariate models are named no<covariate name> 
##  
##  Residual Kurtosis is -0.3603, within normal range

The test statistics and p-values for the covariates are stored in cf (ccovariate frame) slot.

mod0$cf
##   covariate      Estimate         SE    Chi      pval
## 1      mean  7.7823346631 0.33112424     NA        NA
## 2       age  0.0004591117 0.01483782 0.0010 0.9753082
## 3       sex -0.4559509166 0.31250456 2.1184 0.1455341

The estimations of heritabilities in mod0 and mod1 models are slightly different.

mod1
## 
## Call: solarPolygenic(formula = trait1 ~ 1, data = dat30)
## 
## File polygenic.out:
##  Pedigree:    dat.ped 
##  Phenotypes:  dat.phe 
##  Trait:       trait1                Individuals:  174 
##  
##           H2r is 0.8342730  p = 1.2005208e-10  (Significant) 
##         H2r Std. Error:  0.1013512 
##  
##  
##  Loglikelihoods and chi's are in trait1/polygenic.logs.out 
##  Best model is named poly and null0 
##  Final models are named poly, spor 
##  
##  Residual Kurtosis is -0.4279, within normal range

5 Polygenic model (bivariate)

The print method applied to mod2 shows estimation of the main parameters of the model.

mod2
## 
## Call: solarPolygenic(formula = trait1 + trait2 ~ 1, data = dat30, polygenic.options = "-testrhoe -testrhog")
## 
## File polygenic.out:
##  Pedigree:    dat.ped 
##  Phenotypes:  dat.phe 
##  Trait:       trait1 trait2         Individuals:  174 
##  
##           H2r(trait1) is 0.8218823   
##         H2r(trait1) Std. Error:  0.1053258 
##  
##           H2r(trait2) is 0.6270026   
##         H2r(trait2) Std. Error:  0.1158107 
##  
##           RhoE is 0.4120487  p = 0.1424023 
##         RhoE Std. Error:  0.1969379 
##  
##           RhoG is 0.9728759 
##         RhoG Std. Error:  0.0374442 
##  
##  
##         RhoG different from zero  p = 2.2463156e-09 
##         RhoG different from 1.0   p = 0.2098041 
##         Derived Estimate of RhoP is 0.8045957 
##  
##  
##  Loglikelihoods and chi's are in trait1.trait2/polygenic.logs.out 
##  Best model is named poly and null0 
##  Final models are named poly, spor

The same values printed above are extracted and saved into vcf (variance component frame) slot .

mod2$vcf
##       varcomp  Estimate         SE    zscore
## 1 h2r(trait1) 0.8218823 0.10532575  7.803242
## 2  e2(trait1) 0.1781177 0.10532575  1.691112
## 3 h2r(trait2) 0.6270026 0.11581068  5.414031
## 4  e2(trait2) 0.3729974 0.11581068  3.220751
## 5        rhog 0.9728759 0.03744417 25.982039
## 6        rhoe 0.4120487 0.19693793  2.092277

The information about the test statistic and p-values are given in lf (likelihood frame) slot.

mod2$lf
##       model    loglik       Chi deg              pval
## 1  sporadic -384.5535        NA  NA                NA
## 2 polygenic -363.8273        NA  NA                NA
## 3     rhoe0 -364.9032  2.151806   1 0.142402300000000
## 4     rhog0 -381.7010 35.747400   1 0.000000002246316
## 5     rhog1        NA  0.651412   1 0.209804100000000

6 Linkage model

Table of the results with LOD scores and marker information is stored in lodf (LOD scores frame) slost.

link$lodf
##   chr pos    LOD  Loglike      H2r     H2q1
## 1   5   0 1.4888 -208.503 0.448464 0.403780
## 2   5   1 2.4232 -206.352 0.244560 0.614310
## 3   5   2 3.2759 -204.388 0.123184 0.742126
## 4   5   3 3.5569 -203.741 0.097187 0.763608
## 5   5   4 3.1610 -204.653 0.179701 0.668166
## 6   5   5 2.4435 -206.305 0.372614 0.466113

Summary method reports the maximum LOD score.

summary(link)
## 
## Call: solarMultipoint(formula = trait1 ~ 1, data = dat30, mibddir = mibddir, 
##     chr = 5)
## 
## Multipoint model
##  * Number of used markers: 6 
##  * Number of passes: 1 
##  * Maximum LOD score: 3.56 
##   -- chr: 5 
##   -- position: 3 cM

The plot method graphically shows the results.

plot(link)

7 Asssociation model

summary method applies the Bonferroni correction with a given value of alpha (the default value is 0.05).

summary(assoc, alpha = 0.05)
## 
## Call: solarAssoc(formula = trait1 ~ 1, data = dat30, snpcovdata = genocovdat30, 
##     snpmap = mapdat30, cores = 2)
## 
## Association model
##  * Number of SNPs: 100 
##  * Input format: snpcovdata 
##  * Number of significal SNPs: 1 (Bonferroni correction with alpha 0.05)
##       SNP NAv      chi     pSNP     bSNP   bSNPse   Varexp  est_maf
## 1: snp_86 174 15.30657 0.000091 0.917024 0.234391 0.049701 0.524075
##     est_mac dosage_sd      pos chr
## 1: 182.3781  0.582544 22883925   1

Table of the results is stored in snpf (SNP frame) slot.

head(assoc$snpf)
##        SNP NAv      chi     pSNP      bSNP   bSNPse   Varexp  est_maf
## 1:   snp_1 174 0.226982 0.633771 -0.128559 0.269840 0.004560 0.526714
## 2:  snp_10 174 0.634540 0.425695 -0.214068 0.268733 0.000000 0.482020
## 3: snp_100 174 0.917648 0.338093  0.245690 0.256478 0.000000 0.506701
## 4:  snp_11 174 2.094976 0.147784 -0.383326 0.264837 0.033637 0.497552
## 5:  snp_12 174 0.007396 0.931466 -0.020153 0.234334 0.001120 0.503480
## 6:  snp_13 174 1.455744 0.227608  0.283974 0.235362 0.013418 0.469853
##     est_mac dosage_sd      pos chr
## 1: 183.2965  0.547291  2105324   1
## 2: 167.7430  0.583412  2637674   1
## 3: 176.3319  0.588925 22914702   1
## 4: 173.1481  0.574507  2640705   1
## 5: 175.2110  0.596207  7447088   1
## 6: 163.5090  0.598493  7447236   1

The default plot method is Manhattan plot. The black dashed line shows the Bonferroni correction level at a given value of alpha (the default value is 0.05).

plotManh(assoc, alpha = 0.01)

This call is equavalent to plot(assoc, alpha = 0.01), as plot method applied to assoc object calls plotManh function.

The second plot available for assoc object is QQ-plot. The inflation parameter \(\lambda\) (median method) is computed and dipicted at the bottom of the figure.

plotQQ(assoc)

This call is equavalent to plot(assoc, "qq"), as plot method (applied to assoc object) with the second argument equal to "qq" calls plotQQ function.

The procedure of SNPs annotation is possible via rsnps R package (loaded by annotate function invisibly to the user).

tab <- annotate(assoc)
tab
## Null data.table (0 rows and 0 cols)

Annotation of the synthetic SNPs generated for dat30 data sets returns an empty data frame, since the names expected by rsnps R package must begin with rs symbols.

Meantime, annotation of the genome-wide significant SNPs from GWAS of FXIc phenotype in GAIT1 data set (real data) is accomplished as following.

snps <- c("rs710446", "rs4241824", "rs4253399")
tab <- annotate(snps)
tab
##        Query Chromosome    Marker Class Gene Alleles Major Minor    MAF
## 1:  rs710446          3  rs710446   snp KNG1     A/G     A     G 0.4153
## 2: rs4241824          4 rs4241824   snp  F11     A/G     A     G 0.4129
## 3: rs4253399          4 rs4253399   snp  F11     G/T     T     G 0.2638
##           BP
## 1: 186742137
## 2: 186270632
## 3: 186266939

In this case annotate function takes a vector of characters as its first argument.

8 R session info

sessionInfo()
## R version 3.2.2 (2015-08-14)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.3 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] solarius_0.3.1 rmarkdown_0.8  knitr_1.11     devtools_1.9.1
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.2      compiler_3.2.2   formatR_1.2.1    plyr_1.8.3      
##  [5] bitops_1.0-6     iterators_1.0.7  tools_3.2.2      digest_0.6.8    
##  [9] jsonlite_0.9.19  evaluate_0.8     memoise_0.2.1    gtable_0.1.2    
## [13] foreach_1.4.2    curl_0.9.4       yaml_2.1.13      parallel_3.2.2  
## [17] qqman_0.1.2      proto_0.3-10     stringr_1.0.0    httr_1.0.0      
## [21] grid_3.2.2       data.table_1.9.6 R6_2.1.1         XML_3.98-1.3    
## [25] ggplot2_1.0.1    reshape2_1.4.1   magrittr_1.5     scales_0.3.0    
## [29] codetools_0.2-14 htmltools_0.2.6  rsnps_0.1.6      MASS_7.3-44     
## [33] colorspace_1.2-6 labeling_0.3     stringi_1.0-1    RCurl_1.95-4.7  
## [37] doParallel_1.0.8 munsell_0.4.2    chron_2.3-47

9 License

This document is licensed under the Creative Commons Attribution 4.0 International Public License.

Creative Commons License