SimHap is a comprehensive modelling framework and a multiple-imputation approach to haplotypic analysis of population-based data.


You will need to download both the SimHap R package relevant to your operating system and the Java-based installer.

Please refer to the User Manual (which we also suggest you download) for further details.


Uncertainty is inherent when inferring haplotypes for individuals with ambiguous phase, as is the case with phase-unknown genotype data.

SimHap uses biallelic SNP genotype data to impute haplotype frequencies at the individual level. SimHap also tests for haplotype associations with outcomes of interest while incorporating the uncertainty around inferred haplotypes into the modelling procedure.

SimHap allows both single-SNP and haplotype association analyses of Normal, binary, longitudinal and right-censored outcomes under a range of genetic models. The software can accommodate large data sets and can model genetic and environmental effects, including complex haplotype:environment interactions.

SimHap features cross-platform functionality via Java and a sophisticated graphical user interface (GUI), so you need not have a comprehensive knowledge of statistical modelling or command-line operation to perform complex analyses. The approach uses current expectation-maximisation (EM) based methods to estimate haplotypes from unphased genotype data [1] and incorporates multiple-imputation techniques to model haplotypic associations in population-based samples.
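The EM idea can be illustrated with a minimal Python sketch for the simplest case of two biallelic SNPs, where only the double heterozygote has ambiguous phase. This is not SimHap's code: the function name `em_two_snp`, the 0/1/2 allele-count coding and the uniform starting frequencies are assumptions made for illustration.

```python
from collections import Counter

def em_two_snp(genotypes, n_iter=200, tol=1e-10):
    """Estimate frequencies of haplotypes 00, 01, 10, 11 by EM.

    genotypes: list of (g1, g2) tuples, each entry the count (0, 1 or 2)
    of allele '1' at SNP 1 and SNP 2 (hypothetical coding).
    """
    haps = ["00", "01", "10", "11"]
    freqs = {h: 0.25 for h in haps}          # assumed uniform start
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)

    for _ in range(n_iter):
        expected = {h: 0.0 for h in haps}
        for (g1, g2), n in counts.items():
            if (g1, g2) == (1, 1):
                # E-step: the double heterozygote is either 00/11 or 01/10.
                p_cis = freqs["00"] * freqs["11"]
                p_trans = freqs["01"] * freqs["10"]
                total = p_cis + p_trans
                w = p_cis / total if total > 0 else 0.5
                expected["00"] += n * w
                expected["11"] += n * w
                expected["01"] += n * (1 - w)
                expected["10"] += n * (1 - w)
            else:
                # Phase is unambiguous: split each allele count over
                # the two chromosomes and read off both haplotypes.
                a1, b1 = g1 // 2, (g1 + 1) // 2
                a2, b2 = g2 // 2, (g2 + 1) // 2
                expected[f"{a1}{a2}"] += n
                expected[f"{b1}{b2}"] += n
        # M-step: new frequencies are expected counts per chromosome.
        new_freqs = {h: expected[h] / n_chrom for h in haps}
        change = max(abs(new_freqs[h] - freqs[h]) for h in haps)
        freqs = new_freqs
        if change < tol:
            break
    return freqs
```

For example, `em_two_snp([(0, 0), (2, 2), (1, 1)])` assigns the ambiguous individual almost entirely to the 00/11 configuration, because 01 and 10 are never observed unambiguously in that sample.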

SimHap will also perform association analysis on simpler epidemiological models, with or without the inclusion of single SNPs or haplotypes. The current implementation of SimHap utilises a package written for the statistical computing environment R [2] to resolve haplotypes and provide their posterior probabilities; all possible haplotype configurations are resolved for each individual within the program itself, and the posterior probability of each configuration is calculated.
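As a rough illustration of what resolving configurations involves, the Python sketch below enumerates every haplotype pair compatible with one unphased genotype and weights each pair by given haplotype frequencies. The function names, the 0/1/2 genotype coding and the Hardy-Weinberg weighting are assumptions for this example and are not taken from the SimHap package.

```python
from itertools import product

def compatible_pairs(genotype):
    """Return all unordered haplotype pairs consistent with a genotype.

    genotype: tuple of allele counts (0, 1 or 2), one per SNP.
    """
    per_site = []
    for g in genotype:
        if g == 0:
            per_site.append([(0, 0)])
        elif g == 2:
            per_site.append([(1, 1)])
        else:
            # Heterozygous site: the '1' allele may sit on either chromosome.
            per_site.append([(0, 1), (1, 0)])
    pairs = set()
    for combo in product(*per_site):
        h1 = "".join(str(a) for a, _ in combo)
        h2 = "".join(str(b) for _, b in combo)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)

def configuration_posteriors(genotype, freqs):
    """Posterior probability of each compatible pair, assuming
    Hardy-Weinberg proportions (an assumption of this sketch)."""
    weights = {}
    for h1, h2 in compatible_pairs(genotype):
        p = freqs.get(h1, 0.0) * freqs.get(h2, 0.0)
        if h1 != h2:
            p *= 2  # two possible parental orderings
        weights[(h1, h2)] = p
    total = sum(weights.values())
    if total == 0:
        return weights
    return {pair: w / total for pair, w in weights.items()}
```

A genotype with k heterozygous sites (k at least 1) has 2^(k-1) compatible pairs, so `compatible_pairs((1, 1))` returns two configurations and a genotype with a single heterozygous site returns one. The posterior probabilities sum to one for each individual.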

This information is then passed into a generalised linear modelling, linear mixed-effects or Cox proportional hazards framework, where association tests are performed using multiple imputation to account for the uncertainty around the inferred haplotypes.
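In a multiple-imputation analysis, the model is fitted once per imputed data set and the results are then combined. A common way to combine them is Rubin's rules, sketched below in Python. Whether SimHap pools its estimates in exactly this way is an assumption here (the User Manual is the authority), and the function name `pool_rubin` is hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: coefficient estimates, one per imputed data set.
    variances: squared standard errors, one per imputed data set.
    Returns (pooled estimate, total variance). Needs at least two imputations.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The between-imputation term is what carries the haplotype uncertainty into the final standard error: if every imputed data set gave the same estimate, the total variance would reduce to the average within-imputation variance.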

  1. Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol 1995; 12: 921-7.
  2. Ihaka R, Gentleman R. R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics 1996; 5(3): 299-314.


Further information

Contact details

For further information contact

How to reference SimHap

If you find this tool useful, or use any graphics it produces in publications, please cite:

Carter KW, McCaskie PA, Palmer LJ. SimHap GUI: An intuitive graphical user interface for genetic association analysis. BMC Bioinformatics 2008 Dec 25; 9(1): 557.

Version history

A full version history for SimHap is available