Release Notes

SNiPA v3.2 (March 2017)

Genome assembly: GRCh37.p13
Ensembl version: 87
1000 genomes: phase 3 version 5

This is only a minor update. It comprises updated variant-phenotype associations and annotations as contained in the Ensembl 87 release, as well as new data on mQTL and pQTL associations (see below).

Element associations/annotations

mQTL data

We updated SNiPA to contain new data from the metabolomics GWAS server and from two additional studies (Draisma et al. 2015, Long et al. 2017). SNiPA version 3.2 now includes more than half a million associations with metabolite concentrations from two biofluids (blood and urine).

pQTL data

We updated SNiPA to contain pQTL data from our proteomics GWAS server that is based on the largest pGWAS in blood to date (Suhre et al. 2017). SNiPA version 3.2 now includes almost 15,000 cis- and trans-associations with blood protein levels.

Variant conservation and deleteriousness scores

We updated SNiPA to contain the most recent version (v1.3) of the CADD score. We also used updated versions of phyloP and phastCons based on sequences of 100 vertebrate species with alignments to the current GRCh38 genome assembly (mapped back to GRCh37 using our variant-based mapping procedure).

Variant associations

Source N (unique) Reference
HGMD 61,489 (55,484) PMID: 24077912
dbGaP 40,253 (28,766) PMID: 17898773
ClinVar 190,165 (174,435) PMID: 24234437
OMIM variation 22,129 (20,501) http://omim.org/
UniProt 18,805 (17,575) PMID: 24253303
GWAS Catalog 26,193 (18,769) PMID: 19474294
DrugBank - 4.2 179 (169) PMID: 24203711




SNiPA v3.1 (October/November 2015)

Genome assembly: GRCh37.p13
Ensembl version: 82
1000 genomes: phase 3 version 5

This is only a minor update. It comprises updated variant-phenotype associations and annotations as contained in the Ensembl 82 release, as well as the new release of GTEx (see below). As of this version, we include the association data of the new GWAS Catalog at EMBL-EBI.

Element associations/annotations

GTEx eQTL associations (V6)

We updated SNiPA to contain the new V6 release of the GTEx project. SNiPA version 3.1 now includes about 20 million significant associations from GTEx across 44 tissues. Please refer to http://gtexportal.org/home/documentationPage for information on GTEx data release and publication policy.

Variant associations

Source N (unique) Reference
HGMD - 2015 53,420 (48,305) PMID: 24077912
dbGaP - October 9th, 2015 40,254 (28,767) PMID: 17898773
ClinVar - October 9th, 2015 156,160 (139,160) PMID: 24234437
OMIM variation - October 9th, 2015 19,878 (18,442) http://omim.org/
UniProt - October 9th, 2015 3,484 (3,219) PMID: 24253303
GWAS Catalog - October 9th, 2015 19,950 (18,769) PMID: 19474294
DrugBank - 4.2 179 (169) PMID: 24203711




SNiPA v3 (June 2015)

Genome assembly: GRCh37.p13
Ensembl version: 80
1000 genomes: phase 3 version 5

Bug fixes

eQTL mapping

There was a minor bug in our probe mapping script that mapped array address IDs to probe IDs. Affected data sets were the associations reported by Westra et al. and Innocenti et al. The bug was fixed and as of Ensembl version 80, the associations should all be reported correctly. We thank ME Reyes for reporting this bug!

CADD scores

The development of SNiPA started with 1000 genomes phase 1 version 3 data, for which we used the precompiled CADD data set for 1000 genomes. With 1000 genomes phase 3 version 5, the variant count more than doubled, and the "new" variants were not contained in the provided CADD data file. As a result, no CADD scores were available for a large number of variants. For SNiPA version 3, we downloaded the genome-wide data set from the CADD website and retrieved allele-matched scores for all variants contained in SNiPA. We thank AM Nissen for reporting this bug!
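The allele-matched lookup described above can be sketched as follows. This is only a minimal illustration, not SNiPA's actual pipeline. It assumes a tab-separated score file with the columns Chrom, Pos, Ref, Alt, RawScore and PHRED; the function names and all example values are hypothetical.

```python
import csv
import io

def load_cadd_scores(handle):
    """Index PHRED-scaled scores by (chrom, pos, ref, alt)."""
    scores = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip header and comment lines
        chrom, pos, ref, alt, _raw, phred = row[:6]
        scores[(chrom, int(pos), ref, alt)] = float(phred)
    return scores

def annotate(variants, scores):
    """Attach a score to each variant; None if no allele-matched entry exists."""
    return {v: scores.get(v) for v in variants}

# Hypothetical example data (positions and scores are invented).
example_file = io.StringIO(
    "#Chrom\tPos\tRef\tAlt\tRawScore\tPHRED\n"
    "1\t10001\tT\tA\t0.10\t4.2\n"
    "1\t10001\tT\tC\t0.25\t6.8\n"
)
cadd = load_cadd_scores(example_file)
result = annotate([("1", 10001, "T", "C"), ("1", 10002, "A", "G")], cadd)
```

Keying on the alternative allele as well as the position matters here: two substitutions at the same position generally receive different scores, so a position-only lookup could return the score of the wrong allele.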

New annotation data

GTEx eQTL associations

Several SNiPA users asked whether we could include the eQTL associations from the GTEx project. In SNiPA version 3, we included significant associations from GTEx release 4 data for 13 tissues. Please refer to http://gtexportal.org/home/documentationPage for information on GTEx data release and publication policy.

SnpEff effect impact

Ensembl's variant effect predictor (VEP) now includes SnpEff effect impact predictions. We added the prediction denoted as "effect impact" to the basic features table contained in SNiPAcards.

New features

Custom association maps

We were asked whether the association maps feature could be customized so that users can create their own figures of association results. We therefore added this as a new feature in the Association Maps module, including a howto and example input.

SNiPA annotation updates

SNiPA GeneBuild: Ngenes = 59,413 (based on GENCODE 22)
SNiPA RegBuild: Nelements = 1,471,812 (now includes FANTOM5 permissive promoters)

VEP-included utility versions:

PolyPhen: v2.2.2
SIFT: v5.2.2

Element associations/annotations

Gene associations/annotations - updated on June 3rd, 2015

Source N (unique) Reference
DECIPHER 1,829 (1,829) http://decipher.sanger.ac.uk/
OMIM gene 4,886 (4,882) http://omim.org/
OrphaNet 5,684 (5,684) http://orpha.net/

Variant associations

Source N (unique) Reference
HGMD - 2014.4 41,077 (36,807) PMID: 24077912
dbGaP - June 12th, 2015 41,426 (28,824) PMID: 17898773
ClinVar - June 12th, 2015 98,289 (85,835) PMID: 24234437
OMIM variation - June 12th, 2015 18,854 (17,586) http://omim.org/
UniProt - June 12th, 2015 3,399 (3,210) PMID: 24253303
GWAS Catalog - June 12th, 2015 17,703 (16,635) PMID: 19474294
DrugBank - 4.2 179 (169) PMID: 24203711




SNiPA v2.1 (January 2015)

Important notes for this release

Merged rs identifiers

We have included the variant identifiers contained in dbSNP's rsMergeArch table, so users can search for rs identifiers that were merged into newer rs numbers. SNiPA cards and the tooltips in the interactive plots now list these alias rs identifiers in addition to the one currently assigned to each variant. Also, a column labeled "RSALIAS" was added to the genomic data sets (available for download).
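The alias lookup can be illustrated with a small sketch. It assumes the merge table has already been reduced to (old identifier, current identifier) pairs; that simplified layout, the function name and the rs numbers below are hypothetical and are not taken from dbSNP or from SNiPA's code.

```python
def build_indexes(merge_pairs):
    """Build two lookups from (old_rs, current_rs) pairs.

    Returns (old_to_current, current_to_aliases).
    """
    old_to_current = {}
    current_to_aliases = {}
    for old_rs, current_rs in merge_pairs:
        old_to_current[old_rs] = current_rs
        current_to_aliases.setdefault(current_rs, []).append(old_rs)
    return old_to_current, current_to_aliases

# Invented identifiers, for illustration only.
pairs = [("rs111", "rs100"), ("rs222", "rs100")]
old_to_current, current_to_aliases = build_indexes(pairs)

# A search for a merged identifier resolves to the current one;
# identifiers that were never merged resolve to themselves.
searched = "rs222"
current = old_to_current.get(searched, searched)
aliases = current_to_aliases.get(current, [])
```

The second lookup corresponds to what an alias column such as "RSALIAS" would hold: all older identifiers attached to the currently assigned one.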

Linkage Disequilibrium Plot

Users may optionally specify variants that should be highlighted in the linkage disequilibrium plots.




SNiPA v2 (November 2014)

Genome assembly: GRCh37
Ensembl version: 77
1000 genomes: phase 3 version 5

Important notes for this release

Genome assembly

SNiPA's data model is fully position-based. It would therefore be possible to update to the new GRCh38 genome assembly. However, as most annotation sets contained in SNiPA are not yet available for this assembly, we decided to stick to the old but fully annotated GRCh37 assembly in this version. This has some implications, which are described in the following sections.

SNiPA gene build

The current Ensembl gene build (GENCODE 21) is based on GRCh38; a mapping to GRCh37 is not provided. We updated gene information as known from the previous SNiPA version, used the UCSC liftOver tool (command line executable) to convert genome coordinates, and retained all genes that could be mapped to the old genome assembly. The new SNiPA gene build contains 59,006 entries (Ndiff to SNiPA v1 = +1,764).
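A coordinate conversion of this kind might be prepared as in the sketch below. The liftOver argument order (input file, chain file, output file, unmapped file) follows the tool's command line usage; the file names, the choice of chain file and the helper functions are assumptions made for illustration, not the files or scripts used for SNiPA.

```python
def to_bed_line(chrom, start, end, name):
    """Format one feature as a 4-column BED line (0-based, half-open coordinates).

    UCSC chain files use chr-prefixed sequence names, so the prefix is added here.
    """
    return f"chr{chrom}\t{start}\t{end}\t{name}"

def liftover_command(bed_in, chain, bed_out, unmapped):
    """Argument list for: liftOver oldFile map.chain newFile unMapped"""
    return ["liftOver", bed_in, chain, bed_out, unmapped]

# Hypothetical gene record and file names.
bed_line = to_bed_line("1", 100, 200, "GENE_A")
cmd = liftover_command(
    "genes_grch38.bed",
    "hg38ToHg19.over.chain.gz",
    "genes_grch37.bed",
    "genes_unmapped.bed",
)
# To run the conversion where the liftOver executable is installed:
# import subprocess; subprocess.run(cmd, check=True)
```

Features written to the unmapped file are the ones that would be dropped under a "retain everything that can be mapped" rule.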

SNiPA regulatory build

The current Ensembl regulatory build (ENCODE) is based on GRCh38; a mapping to GRCh37 is not provided for batch retrieval. We updated information on regulatory elements from ENCODE as known from the previous SNiPA version, used the UCSC liftOver tool (command line executable) to convert genome coordinates, and retained all elements that could be mapped to the old genome assembly. Combined with the known datasets for promoter and enhancer regions, the new SNiPA regulatory build contains 1,127,068 elements (Ndiff to SNiPA v1 = -109,063). The lower number of regulatory elements is due to the new regulatory build of Ensembl, in which several regulatory clusters have been merged.

SNiPA variant set

We have used the new 1000 genomes release (phase 3 version 5) in SNiPA v2. This release has more than doubled the number of available variants. We want to point the user to the data usage policy of the 1000 genomes project.

dbSNP identifiers are not yet completely included in the 1000 genomes data files, but autosomal variants have already been integrated in dbSNP build 142. We downloaded dbSNP mapped to GRCh37 and merged both data sets. As X-chromosomal 1000 genomes markers were released after dbSNP 142 was created, we only supply those variants which could be mapped to a dbSNP rs-identifier.
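Merging the two data sets amounts to a join on variant coordinates. The sketch below is a simplified illustration: the join key (chromosome, position, reference allele, alternative allele), the function name and the example records are assumptions, and the actual matching rules used for SNiPA are not specified here.

```python
def merge_with_dbsnp(kg_variants, dbsnp):
    """Split 1000 genomes variants into those with and without an rs identifier.

    kg_variants: iterable of (chrom, pos, ref, alt) tuples
    dbsnp: dict mapping (chrom, pos, ref, alt) -> rs identifier
    """
    matched, unmatched = {}, []
    for key in kg_variants:
        rs_id = dbsnp.get(key)
        if rs_id is None:
            unmatched.append(key)
        else:
            matched[key] = rs_id
    return matched, unmatched

# Invented example records.
dbsnp = {("X", 5000, "G", "A"): "rs300"}
kg = [("X", 5000, "G", "A"), ("X", 6000, "C", "T")]
matched, unmatched = merge_with_dbsnp(kg, dbsnp)
```

In this example the second X-chromosomal variant has no rs identifier and would therefore not be supplied.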

population rs-count
African (AFR) 39,581,182
American (AMR) 26,474,088
East Asian (EAS) 22,128,163
European (EUR) 22,541,970
South Asian (SAS) 24,854,259
total (unique) 78,471,927

SNiPA allows all annotation releases to be combined with all variant sets. However, annotation data of SNiPA v1 (Ensembl version 75) is not available for the new variant set. Therefore, if you combine 1000 genomes phase 3 version 5 data with Ensembl 75 annotations, variants not contained in phase 1 version 3 will show no annotations.

SNiPA annotation updates

The Ensembl VEP tool only features full annotation data for the new GRCh38 genome assembly. Therefore, SNiPA's annotation workflow had to be adjusted to provide both all the annotation data from the previous release and simultaneously a mapping to the newest gene and regulatory builds of Ensembl. To achieve that, we did effect predictions for both assemblies. Our custom annotation program then merged the VEP output for both assemblies using the variant mapping provided by dbSNP build 142 to again yield a full SNiPA build for GRCh37.
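The merge step can be pictured as follows. This is a schematic sketch only: the dictionary layout, the field names, the rule that GRCh38 values take precedence for shared fields, and the example values are all assumptions, not a description of the custom annotation program.

```python
def merge_annotations(ann37, ann38, map_37_to_38):
    """Combine per-variant annotations from two assemblies, keyed by GRCh37 variants.

    ann37: dict GRCh37 variant key -> dict of annotation fields
    ann38: dict GRCh38 variant key -> dict of annotation fields
    map_37_to_38: dict GRCh37 variant key -> GRCh38 variant key
    """
    merged = {}
    for key37, fields37 in ann37.items():
        combined = dict(fields37)
        key38 = map_37_to_38.get(key37)
        if key38 in ann38:
            # Assumption: GRCh38 values win where both assemblies provide a field.
            combined.update(ann38[key38])
        merged[key37] = combined
    return merged

# Invented example: one variant with a mapping, one without.
ann37 = {"1:1000:A:G": {"sift": "tolerated"}, "1:2000:C:T": {"sift": "deleterious"}}
ann38 = {"1:1500:A:G": {"gene": "GENE_A"}}
mapping = {"1:1000:A:G": "1:1500:A:G"}
merged = merge_annotations(ann37, ann38, mapping)
```

Variants without a mapping keep only their GRCh37 annotations, so the result still covers every variant of the GRCh37 build.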

All phenotype annotation sets have been updated to Ensembl version 77. OrphaData and the GWAS Catalog were accessed on October 20th, 2014.

Variant associations & annotations:

Source N (unique) Reference
HGMD 35,326 (31,770) PMID: 24077912
dbGaP 41,426 (28,824) PMID: 17898773
ClinVar 89,522 (87,923) PMID: 24234437
OMIM variation 9,595 (8,968) http://omim.org/
UniProt 3,573 (3,366) PMID: 24253303
GWAS Catalog 16,342 (15,343) PMID: 19474294
DrugBank 4.0 179 (169) PMID: 24203711

Gene associations:

Source N (unique) Reference
DECIPHER 1,795 (1,795) http://decipher.sanger.ac.uk/
OMIM gene 5,055 (5,051) http://omim.org/
OrphaNet 5,684 (5,684) http://orpha.net/

SNiPA data release

We have decided to make all data contained in SNiPA available to the community. Please refer to the README for details on folder structure and data formats.

Data access




SNiPA v1 (July 2014)

Genome assembly: GRCh37
Ensembl version: 75
1000 genomes: phase 1 version 3

This is the original release of SNiPA as described in the original publication and the documentation.

SNiPA variant set

This version contains all bi-allelic variants present in 1000 genomes project, phase 1 version 3. These are the variant counts for the individual superpopulations:

population rs-count
African (AFR) 25,837,142
American (AMR) 20,097,916
Asian (ASN) 15,012,236
European (EUR) 17,361,202

EnsEMBL release 75

SNiPA includes a total of 57,241 genes with 195,586 associated transcripts and 104,763 protein products, as well as more than 500,000 regulatory feature clusters, some of which are associated with JASPAR motifs.

Variant associations & annotations:

Source N (unique) Reference
HGMD 93,758 (86,491) PMID: 24077912
dbGaP 41,181 (33,819) PMID: 17898773
ClinVar 47,315 (44,141) PMID: 24234437
OMIM variation 19,911 (19,184) http://omim.org/
UniProt 5,055 (4,850) PMID: 24253303
GWAS Catalog 15,500 (14,513) PMID: 19474294
DrugBank 4.0 179 (169) PMID: 24203711

Gene associations:

Source N (unique) Reference
DECIPHER 2,144 (2,143) http://decipher.sanger.ac.uk/
OMIM gene 5,775 (5,775) http://omim.org/
OrphaNet 5,705 (5,675) http://orpha.net/