Introduction
These pages describe the standards and guidelines that deCODE applies to ensure high-quality reporting of genetic risk.
These guidelines correspond to the clinical validation and analytical validation steps of genetic testing: they explain our criteria for selecting genetic sequence variants (SNPs), our procedures for ensuring that these variants are measured correctly, and the basic statistical model used to calculate genetic risk.
Selection of risk associated genetic variants (SNPs)
Replication in independent populations
Genetic risk variants are most commonly discovered in so-called case-control studies, i.e. studies in which differences in the genetic code are observed between a set of patients and a set of healthy controls. It is important that the association of SNP markers with a particular disease be widely replicated in independent populations from different medical centers or countries. Otherwise, there will be concern that the initial observation is not applicable beyond the study population or, more likely, is a false-positive risk association. Since the risk estimate in each study always carries some uncertainty, a successful replication is defined as one in which the difference between the estimates lies within the statistical confidence intervals. Only studies that are statistically well powered and published in respected scientific journals are accepted. When results from multiple populations are combined, the odds-ratio is either derived using a standard statistical procedure based on the Mantel-Haenszel model, or the most informative study is taken as a reference.
Replication in at least two independent, equivalently defined populations of patients and controls is the most important principle of clinical validation.
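To illustrate how odds-ratios from several populations can be combined, the following is a minimal Python sketch of the Mantel-Haenszel pooled odds-ratio, together with a Woolf-type 95% confidence interval for a single study. The 2x2 counts are hypothetical, the function names are chosen for illustration only, and the replication check shown (the replication estimate falling inside the discovery study's interval) is one possible reading of the criterion above, not a description of deCODE's actual procedure.

```python
import math
from typing import List, Tuple

# One study as a 2x2 table (a, b, c, d):
#   a = cases carrying the risk allele,    b = cases not carrying it
#   c = controls carrying the risk allele, d = controls not carrying it
Table = Tuple[int, int, int, int]


def odds_ratio(table: Table) -> float:
    """Odds-ratio of a single 2x2 table."""
    a, b, c, d = table
    return (a * d) / (b * c)


def woolf_ci(table: Table, z: float = 1.96) -> Tuple[float, float]:
    """Approximate confidence interval (95% for z = 1.96) on the log scale."""
    a, b, c, d = table
    log_or = math.log(odds_ratio(table))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


def mantel_haenszel_or(tables: List[Table]) -> float:
    """Mantel-Haenszel pooled odds-ratio across independent studies."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Hypothetical counts for a discovery study and one replication study.
discovery = (120, 380, 90, 410)
replication = (60, 190, 50, 200)

low, high = woolf_ci(discovery)
# Assumed replication criterion: the replication estimate lies
# inside the discovery study's confidence interval.
replicated = low < odds_ratio(replication) < high
pooled = mantel_haenszel_or([discovery, replication])
print(f"replicated={replicated}, pooled OR={pooled:.3f}")
```

The pooled estimate weights each study by its size, so it always lies between the smallest and largest per-study odds-ratios.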
Ethnicity variability
Most of the disease-associated markers currently used to assess risk were first discovered and replicated in white populations of European descent, although some have also been replicated in other ethnic groups. Since the risk conferred by a given variant can differ substantially between ethnic groups, independent replication and risk assessment must be carried out for each ethnic group. Thus, for all diseases, ethnicity-specific reports should be issued based on markers that have been validated in the relevant major ethnic group, such as whites of European descent, Asians (East Asians), African-Americans, and Hispanic whites.
Identification of the disease associated risk allele
In various publications, the identity of the SNP risk allele can be ambiguous. This problem arises for certain SNP variant combinations (A/T and G/C) because of the reverse-complementary nature of DNA and the fact that the reference genome sequence may not have been stable when the SNP was defined. If the particular region of the genome has been reversed between the reference build used at the time of discovery and the build used at the time of application, chances are that the risk allele has been switched. The so-called TOP/BOT method, defined by Illumina, which uses the actual SNP polymorphism (together with the contextual sequence surrounding the SNP) to designate the allele and strand, provides a remedy to this problem. Nevertheless, selection of the correct risk allele is a major quality control step in the development of our genetic tests. We compare the allele frequencies reported in publications with those measured in our extensive proprietary research database and with those in the public HapMap database. The SNP alleles that we report, however, follow the dbSNP strand definition of the marker.
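The frequency comparison described above can be sketched in Python as follows. This is a simplified illustration, not deCODE's actual procedure: the function names, the tolerance of 0.05, and the example frequencies are all assumptions made for the sketch. The idea is that for an A/T or G/C SNP the allele letters alone cannot reveal the strand, so the published risk-allele frequency is matched against the frequencies observed in a reference dataset.

```python
from typing import Dict, Optional

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_strand_ambiguous(allele1: str, allele2: str) -> bool:
    """True for A/T and C/G SNPs, whose two alleles are each other's complement."""
    return COMPLEMENT[allele1] == allele2


def resolve_risk_allele(
    risk_allele: str,
    other_allele: str,
    published_freq: float,
    ref_freqs: Dict[str, float],
    tolerance: float = 0.05,  # assumed threshold, for illustration only
) -> Optional[str]:
    """Return the risk allele on the reference strand, or None if undecidable.

    ref_freqs maps each allele, as named on the reference strand,
    to its frequency in the reference dataset.
    """
    if not is_strand_ambiguous(risk_allele, other_allele):
        # The allele letters themselves reveal the strand.
        if risk_allele in ref_freqs:
            return risk_allele
        flipped = COMPLEMENT[risk_allele]
        return flipped if flipped in ref_freqs else None

    # Ambiguous SNP: decide by which reference allele has a matching frequency.
    same_strand = abs(published_freq - ref_freqs[risk_allele]) <= tolerance
    opposite_strand = abs(published_freq - ref_freqs[other_allele]) <= tolerance
    if same_strand and not opposite_strand:
        return risk_allele
    if opposite_strand and not same_strand:
        return other_allele
    return None  # frequencies near 0.5 (or matching neither) cannot settle it


# Hypothetical example: published risk allele "A" at frequency 0.30,
# while the reference dataset shows "A" at 0.70 and "T" at 0.30.
print(resolve_risk_allele("A", "T", 0.30, {"A": 0.70, "T": 0.30}))
```

When the allele frequency is close to 0.5, the comparison is uninformative and the sketch returns None, which is the case where sequence-context methods such as TOP/BOT are needed.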
Selection of surrogate markers
In some instances when SNP chips are used for genotyping, the disease-associated marker described in the literature is not present on the chip. Often, a surrogate marker that correlates strongly with the risk marker can be used to assess the risk. Only markers that correlate excellently with the reported disease-associated marker (r² greater than 0.98, based on the appropriate ethnic dataset) are used. Some publications also report specific odds-ratios for suboptimal surrogates. In such cases, these odds-ratios are used to determine the risk; otherwise, the odds-ratio of the surrogate is assumed to be identical to that of the reported marker. In selected diseases, deCODE uses a non-chip-based technology (e.g. the Nanogen Centaurus assays) to capture key risk SNPs that are not properly represented on the chip platform.
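As an illustration of the surrogate criterion, the Python sketch below computes r² between two biallelic markers from phased haplotypes and applies the 0.98 threshold stated above. The haplotype data, the 0/1 allele coding, and the function names are hypothetical. In practice, r² would be estimated from the appropriate ethnic reference dataset, possibly from unphased genotypes, which this sketch does not cover.

```python
from typing import List, Optional, Tuple

# A phased haplotype as (risk marker allele, surrogate allele), each coded 0 or 1.
Haplotype = Tuple[int, int]

R2_THRESHOLD = 0.98  # threshold stated in the text


def r_squared(haplotypes: List[Haplotype]) -> float:
    """Squared correlation (r^2) between two biallelic markers."""
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n   # allele-1 frequency, risk marker
    p_b = sum(h[1] for h in haplotypes) / n   # allele-1 frequency, surrogate
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n
    d = p_ab - p_a * p_b                      # linkage disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a marker is monomorphic")
    return d * d / denominator


def is_acceptable_surrogate(haplotypes: List[Haplotype]) -> bool:
    """Apply the r^2 > 0.98 criterion."""
    return r_squared(haplotypes) > R2_THRESHOLD


def surrogate_odds_ratio(risk_marker_or: float,
                         published_surrogate_or: Optional[float]) -> float:
    """Use the surrogate's own published odds-ratio when one exists,
    otherwise assume it equals that of the reported risk marker."""
    if published_surrogate_or is not None:
        return published_surrogate_or
    return risk_marker_or


# Hypothetical haplotype counts: perfect and slightly imperfect correlation.
perfect = [(1, 1)] * 30 + [(0, 0)] * 70
imperfect = [(1, 1)] * 28 + [(1, 0)] * 2 + [(0, 0)] * 70
print(is_acceptable_surrogate(perfect), is_acceptable_surrogate(imperfect))
```

Even two discordant haplotypes out of one hundred are enough in this example to pull r² well below the threshold, which shows how strict the 0.98 criterion is.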

