TogoVar datasets

Variant frequency datasets for which you can apply to the NBDC human databases∗2 for use of the individual-level data∗1

Click the links in the "Included controlled-access datasets" column to apply for use of individual-level data.

| Variant dataset name | Analysis method | Target population | Healthy subjects | Affected subjects | Sample size | Number of variants (# of sites) | Included controlled-access datasets |
| --- | --- | --- | --- | --- | --- | --- | --- |
| GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel | WGS | Japanese | | | 7,609 | 95,863,463 (90,280,248) | 6 datasets |
| JGA-NGS | WES | Japanese | | | 125 | 4,679,025 | 7 datasets |
| JGA-SNP | SNP-Chip | Japanese | | | 183,884 | 1,249,724 | 3 datasets |

∗1: FASTQ/BAM/CEL files and/or lists of genotype data, etc.
∗2: Japanese Genotype-phenotype Archive (JGA) / AMED Genome group sharing Database (AGD)

Other variant datasets

| Variant dataset name | Analysis method | Target population | Healthy subjects | Affected subjects | Sample size | Number of variants (# of sites) | Author | Version/Last updated |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ClinVar | - | Mixed | | | | 674,792 | NCBI | 2020/06/04 |
| Exome Aggregation Consortium (ExAC) | WES | Mixed | | | 60,706 | 10,195,872 (9,362,319) | Broad Institute | Release 1 (2017/02/27) |
| Human Genetic Variation Database (HGVD) | WES | Japanese | | | 1,208 | 554,461 (501,556) | Kyoto University | Version 2.30 (2017/08/02) |
| ToMMo 4.7KJPN Allele Frequency Panel (4.7KJPN) | WGS | Japanese | | | 4,773 | 74,494,394 (70,101,710) | Tohoku Medical Megabank Organization | v20190826 |

Note 1: TogoVar includes only the ClinVar variants that are in the VCF file and whose GRCh37 positions have been determined.
Note 2: 4.7KJPN consists of SNVs (autosomes, chrX (PAR1+PAR2+XTR) and chrMT) and INDELs (autosomes and chrX (PAR1+PAR2+XTR)). See Summary of ToMMo 4.7KJPN.
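
The tables above list both a number of variants and a number of sites for several datasets. The sketch below shows one way the two figures can differ. It assumes that a "site" is one VCF-style record and that each alternate allele at a site counts as one variant; the page itself does not define the terms. The function name `count_sites_and_variants` and the example records are hypothetical and are not taken from any dataset listed here.

```python
from typing import Iterable, List, Tuple

# One VCF-style site: (chromosome, 1-based position, REF allele, ALT alleles).
Site = Tuple[str, int, str, List[str]]


def count_sites_and_variants(sites: Iterable[Site]) -> Tuple[int, int]:
    """Return (number of sites, number of variants).

    Assumption: every record is one site, and every ALT allele
    listed at that site is counted as one variant.
    """
    n_sites = 0
    n_variants = 0
    for _chrom, _pos, _ref, alts in sites:
        n_sites += 1
        n_variants += len(alts)
    return n_sites, n_variants


# Made-up records for illustration only.
example_sites = [
    ("1", 1000, "A", ["G"]),        # biallelic SNV: 1 site, 1 variant
    ("1", 2000, "C", ["T", "G"]),   # multiallelic SNV: 1 site, 2 variants
    ("2", 3000, "AT", ["A"]),       # deletion: 1 site, 1 variant
]

n_sites, n_variants = count_sites_and_variants(example_sites)
```

Under this assumption, a dataset with multiallelic sites reports more variants than sites, which matches the direction of the differences in the tables.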

Non-variant datasets

| Dataset name | Version/Last updated | Description | Author |
| --- | --- | --- | --- |
| Colil | 2019/01/25 | Information on citation relationships in life sciences literature | DBCLS |
| GRCh37.p13 | 2013/06/28 | Human genome reference sequence | GRC |
| HGNC symbol report | 2020/03/30 | Approved human gene nomenclature and associated gene information | HGNC |
| LitVar | Obtained by API | Information on papers in which the names of variants appear | NCBI |
| PubTatorCentral | 2020/05/09 | Information on papers in which the names of variants appear | NCBI |
| TogoGenome | | Comprehensive information on genomes | DBCLS |

Tools for data processing

| Name | Ver. | Description | Author |
| --- | --- | --- | --- |
| bcftools | | Normalize indels and split multiallelic sites into biallelic variants | Genome Research Ltd. |
| BioReT | | Execute programs for variant discovery from NGS data in proper order | Amelieff |
| Variant Effect Predictor (VEP) | Ensembl release 100 | Add annotations like gene names or consequences to variants | EMBL-EBI |
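
To illustrate what splitting multiallelic sites into biallelic variants involves, here is a minimal Python sketch. It is not bcftools' implementation and not TogoVar's pipeline: it only splits the alternate alleles and trims bases shared by REF and ALT, and it omits the left-alignment against the reference sequence that a full indel normalization also needs. The function names `trim_alleles` and `split_multiallelic` and the example record are hypothetical.

```python
from typing import List, Tuple


def trim_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Remove bases shared by REF and ALT, keeping at least one base in each.

    Shared trailing bases are removed first, then shared leading bases;
    the position moves right by one for every leading base removed.
    """
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def split_multiallelic(
    chrom: str, pos: int, ref: str, alts: List[str]
) -> List[Tuple[str, int, str, str]]:
    """Turn one site with several ALT alleles into one biallelic record per ALT."""
    return [(chrom, *trim_alleles(pos, ref, alt)) for alt in alts]


# Made-up multiallelic site with two deletions and one insertion.
records = split_multiallelic("1", 100, "CTT", ["C", "CT", "CTTT"])
for record in records:
    print(record)
```

Trimming after the split matters because an allele pair such as `CTT`/`CT` describes the same one-base deletion as `CT`/`C`, and only a consistent representation lets the same variant be matched across datasets.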