Jung-Ying Tzeng

SIMreg: Similarity Regression

NEW: SIMreg is now available in C for performing gene-environment interaction tests for quantitative and binary traits. See the download section at the end of the page.

SIMreg is available as an R package. The package contains a user manual and help text for the SIMreg functions. We also provide the shell scripts and instructions for performing similarity regression (SIMreg).

SIMreg is a tool for marker-set association analysis. Association analysis at the gene, pathway, or exon level (hereafter, marker-set analysis) holds great promise for evaluating modest etiological effects of genes with GWAS or sequence data. However, currently available methods target detection of either rare or common variants but not both, assume additive and same-direction effects for loci within a marker set, use test-based frameworks that cannot accommodate covariates such as population structure, and cannot assess interaction effects. SIMreg provides a flexible, powerful, and computationally efficient alternative for conducting marker-set analysis. The following features distinguish it from other methods.

  1. The method uses genetic similarity to aggregate information across markers, and incorporates adaptive weights depending on allele frequencies to accommodate rare and common variants.
  2. Collapsing information at the similarity level instead of the genotype level avoids the cancellation of signals from opposite etiological effects, and is applicable to any class of genetic variant without having to dichotomize the allele types.
  3. It is regression-based, naturally incorporates covariates, and is applicable to both observed and imputed (dosage) genotypes.
  4. We use a rigorous analytical derivation to demonstrate that collapsing information through similarity status explicitly captures the locus-locus interactions among all markers in a set.
  5. It provides a series of test statistics that can be used to assess (a) marginal genetic main effect (G test), (b) gene-environment interaction effects (GxE test), or (c) the joint effects of both types simultaneously. These tests do not require permutations to assess significance, and are fast to compute.
SIMreg is an extension of HSreg and incorporates all of its features and functions.
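
Features 1 and 2 above can be illustrated with a small sketch of how similarity-level aggregation might look. This is not the SIMreg implementation: the function names, the identity-by-state style score 2 - |g_i - g_j|, and the weight 1/sqrt(q(1-q)) are assumptions chosen for illustration only. Genotypes are coded as allele counts between 0 and 2, so imputed dosages fit the same code.

```python
from math import sqrt


def allele_frequencies(genotypes):
    """Estimate the frequency of the counted allele at each marker.

    genotypes: list of per-individual lists of allele counts or dosages in [0, 2].
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    return [sum(row[m] for row in genotypes) / (2.0 * n) for m in range(n_markers)]


def marker_weights(freqs):
    """Illustrative (assumed) weight 1/sqrt(q(1-q)): rarer variants get larger weights.

    Monomorphic markers carry no information and receive weight 0.
    """
    return [1.0 / sqrt(q * (1.0 - q)) if 0.0 < q < 1.0 else 0.0 for q in freqs]


def similarity_matrix(genotypes, weights):
    """Weighted sum over markers of the per-marker similarity 2 - |g_i - g_j|."""
    n = len(genotypes)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = sum(
                w * (2.0 - abs(a - b))
                for w, a, b in zip(weights, genotypes[i], genotypes[j])
            )
            S[i][j] = S[j][i] = s
    return S


# Toy data: 4 individuals, 3 markers; the value 1.6 stands in for an imputed dosage.
genotypes = [
    [0, 1, 2],
    [0, 1, 0],
    [1, 0, 1.6],
    [0, 2, 2],
]
freqs = allele_frequencies(genotypes)
weights = marker_weights(freqs)
S = similarity_matrix(genotypes, weights)
```

Because each marker contributes through the absolute difference between two individuals' genotypes, the sign of a variant's effect on the trait never enters the aggregation, which is the sense in which signals of opposite direction do not cancel.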

The methods implemented in this software are described in the following papers.

Haplotype-Based Association Analysis via Variance-Components Score Test  Tzeng and Zhang 2007 AJHG
Gene-Trait Similarity Regression for Multimarker-Based Association Analysis  Tzeng et al. 2009 Biometrics
Detecting gene and gene-environment effects of common and uncommon variants on quantitative traits: A marker-set approach using gene-trait similarity regression  Tzeng et al. 2011 AJHG
Assessing gene-environment interactions for common and rare variants with binary traits using gene-trait similarity regression  Zhao et al. 2015 Genetics


C package Download

(For genetic main effect tests and GxE tests with either binary or quantitative traits)

SIMreg C Package for Unix-based systems (readme file included).

R package Download

(For genetic main effect tests for binary or quantitative traits and GxE test for quantitative traits)

SIMreg; R Package User Manual.

For versions of R earlier than 2.14

SIMreg 1.31: R Package for Unix-based systems.

SIMreg 1.31: R Package for Windows.

For R 2.14 and later (including 3.0 and later)

SIMreg 1.32: R Package for Unix-based systems.

SIMreg 1.32: R Package for Windows.