TogoVar datasets

Human genome reference sequence

Variant datasets derived from individual genome data in Japanese Genotype-phenotype Archive (JGA)

Dataset name | Version/Last update | Sample size | Number of detected variants | Number of variants after the exclusion | Author
JGA-NGS | 2018/06/01 | 125 | 13,338,968 | 4,679,025 | NBDC
JGA-SNP | 2018/06/01 | 183,884 | 1,966,919 | 1,249,724 | NBDC

Note that variants with five or fewer alternative alleles were excluded from the JGA-NGS and JGA-SNP datasets.
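The exclusion described in the note can be sketched as a simple threshold filter. This is only an illustration under one reading of the note, namely that a variant is dropped when its alternative allele is observed five times or fewer across the samples. The `Variant` type, its `alt_allele_count` field, and the `exclude_rare` function are hypothetical names introduced here, not part of TogoVar.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    alt_allele_count: int  # hypothetical field: observed copies of the ALT allele


def exclude_rare(variants: Iterable[Variant], threshold: int = 5) -> List[Variant]:
    """Keep only variants whose alternative allele count exceeds the threshold.

    Assumption: "5 alternative alleles or less" is read as an allele count
    of 5 or fewer, so a count of exactly 5 is excluded.
    """
    return [v for v in variants if v.alt_allele_count > threshold]


# Made-up example records, not taken from any dataset.
example = [
    Variant("1", 1000, "A", "G", 3),
    Variant("1", 2000, "C", "T", 5),
    Variant("1", 3000, "G", "GA", 6),
]
kept = exclude_rare(example)
print([v.pos for v in kept])
```

With these made-up records, only the variant at position 3000 passes the filter, because counts of 3 and 5 both fall at or below the threshold.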

Variant datasets generated by third parties

Dataset name | Version/Last update | Sample size | Number of variants | Number of variant sites | Author
ClinVar | 2019/06/24 | | 443,213 | | NCBI
Exome Aggregation Consortium (ExAC) | Release 1 (2017/02/27) | 60,706 | 10,195,872 | 9,362,319 | Broad Institute
Human Genetic Variation Database (HGVD) | Version 2.30 (2017/08/02) | 1,208 | 554,461 | 501,556 | Kyoto University
ToMMo 3.5KJPNv2 Allele Frequency Panel (3.5KJPN) | v20181105open; Unfiltered | 3,552 | 64,675,495 | 60,816,012 | Tohoku Medical Megabank Organization

Note 1: TogoVar contains only those ClinVar variants that are included in the VCF file and whose GRCh37 positions have been determined.

Note 2: 3.5KJPNv2 consists of SNVs and INDELs on the autosomes, chrX (PAR1+PAR2+XTR) and chrMT. See Summary of ToMMo 3.5KJPNv2 (v20181105; Unfiltered).

Datasets other than those for variants

Dataset name | Version/Last update | Description | Author
Colil | 2018/01/29 | Information on citation relationships in life sciences literature | DBCLS
PubTator | 2018/04/23 | Information on papers in which the names of variants appear | NCBI
TogoGenome | hg19+Ensembl96 | Various information on genomes | DBCLS

Tools for data processing

Name | Ver. | Description | Author
bcftools | | Normalize indels and split multiallelic sites into biallelic variants | Genome Research Ltd.
BioReT | | Execute programs for variant discovery from NGS data in proper order | Amelieff
Variant Effect Predictor (VEP) | Ver.96 | Add annotations like gene names or consequences to variants | EMBL-EBI
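To make the bcftools entry more concrete, the following is a minimal sketch of what "splitting multiallelic sites into biallelic variants" means for a VCF-style record. It is not how bcftools is implemented. The function names are hypothetical, the trimming step only removes bases shared by REF and ALT, and the left-alignment against a reference sequence that full indel normalization requires is deliberately omitted.

```python
from typing import List, Tuple

Record = Tuple[str, int, str, str]  # (CHROM, POS, REF, ALT)


def trim_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Trim bases shared by REF and ALT, keeping at least one base in each."""
    # Remove the shared suffix first; POS is unaffected.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then remove the shared prefix; each removed base shifts POS by one.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def split_multiallelic(chrom: str, pos: int, ref: str, alt_field: str) -> List[Record]:
    """Turn one record with comma-separated ALT alleles into biallelic records."""
    records = []
    for alt in alt_field.split(","):
        new_pos, new_ref, new_alt = trim_alleles(pos, ref, alt)
        records.append((chrom, new_pos, new_ref, new_alt))
    return records


# Made-up multiallelic site: one deletion and one SNV sharing the same REF.
print(split_multiallelic("1", 100, "CAG", "CG,TAG"))
# [('1', 100, 'CA', 'C'), ('1', 100, 'C', 'T')]
```

The sketch ignores genotype and INFO fields, which a real tool must also rewrite per allele when a site is split.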