TogoVar datasets

Variant frequency datasets for which you can apply to the NBDC Human Database∗2 for use of individual-level data∗1

Click the links in the "Included controlled-access datasets" column to apply for use of individual-level data.

| Variant dataset name | Analysis method | Target population | Healthy subjects | Affected subjects | Sample size | Number of variants (# of sites) | Included controlled-access datasets |
|---|---|---|---|---|---|---|---|
| GEM Japan Whole Genome Aggregation (GEM-J WGA) Panel | WGS | Japanese | | | 7,609 | 95,863,463 (90,280,248) | 6 datasets |
| JGA-NGS | WES | Japanese | | | 125 | 4,679,025 | 7 datasets |
| JGA-SNP | SNP-Chip | Japanese | | | 183,884 | 1,249,724 | 3 datasets |

∗1: FASTQ/BAM/CEL files and/or lists of genotype data, etc.
∗2: Japanese Genotype-phenotype Archive (JGA) / AMED Genome group sharing Database (AGD)

Other variant datasets

| Variant dataset name | Analysis method | Target population | Healthy subjects | Affected subjects | Sample size | Number of variants (# of sites) | Author | Version/Last updated |
|---|---|---|---|---|---|---|---|---|
| ClinVar | - | Mixed | | | | 674,792 | NCBI | 2020/06/04 |
| Exome Aggregation Consortium (ExAC) | WES | Mixed | | | 60,706 | 10,195,872 (9,362,319) | Broad Institute | Release 1 (2017/02/27) |
| Human Genetic Variation Database (HGVD) | WES | Japanese | | | 1,208 | 554,461 (501,556) | Kyoto University | Version 2.30 (2017/08/02) |
| ToMMo 4.7KJPN Allele Frequency Panel (4.7KJPN) | WGS | Japanese | | | 4,773 | 74,494,394 (70,101,710) | Tohoku Medical Megabank Organization | v20190826 |

Note 1: TogoVar contains only those ClinVar variants that are in the VCF file and whose GRCh37 positions have been determined.
Note 2: 4.7KJPN consists of SNVs (autosomes, chrX (PAR1+PAR2+XTR) and chrMT) and INDELs (autosomes and chrX (PAR1+PAR2+XTR)). See Summary of ToMMo 4.7KJPN.

Non-variant datasets

| Dataset name | Version/Last update | Description | Author |
|---|---|---|---|
| Colil | 2019/01/25 | Information on citation relationships in life sciences literature | DBCLS |
| GRCh37.p13 | 2013/06/28 | Human genome reference sequence | GRC |
| HGNC symbol report | 2020/03/30 | Approved human gene nomenclature and associated gene information | HGNC |
| LitVar | Obtained by API | Information on papers in which the names of variants appear | NCBI |
| PubTatorCentral | 2020/05/09 | Information on papers in which the names of variants appear | NCBI |
| TogoGenome | | Comprehensive information on genomes | DBCLS |

Tools for data processing

| Name | Ver. | Description | Author |
|---|---|---|---|
| bcftools | | Normalize indels and split multiallelic sites into biallelic variants | Genome Research Ltd. |
| BioReT | | Execute programs for variant discovery from NGS data in proper order | Amelieff |
| Variant Effect Predictor (VEP) | Ensembl release 100 | Add annotations like gene names or consequences to variants | EMBL-EBI |
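
The bcftools step listed above (splitting multiallelic sites into biallelic variants and normalizing indels) can be illustrated with a small Python sketch. This is not TogoVar's pipeline and not how bcftools is implemented: the function names `trim_alleles` and `split_multiallelic` are hypothetical, and the sketch only trims bases shared by REF and ALT. It does not left-align indels against the reference sequence, which bcftools does using the reference FASTA. The closing count also assumes that "# of sites" in the tables refers to distinct positions before splitting, which this page does not state explicitly.

```python
from typing import List, Tuple

# A biallelic variant: (chromosome, 1-based position, REF allele, ALT allele)
Variant = Tuple[str, int, str, str]


def trim_alleles(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Trim bases shared by REF and ALT, keeping at least one base in each."""
    # Trim the shared suffix first; the position does not change.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the shared prefix, advancing the position for each base removed.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def split_multiallelic(chrom: str, pos: int, ref: str, alts: List[str]) -> List[Variant]:
    """Turn one multiallelic record into one trimmed biallelic variant per ALT."""
    return [(chrom, *trim_alleles(pos, ref, alt)) for alt in alts]


# Hypothetical records: (chrom, pos, REF, list of ALT alleles)
records = [
    ("1", 100, "CAG", ["CA", "CAGG", "TAG"]),
    ("1", 200, "A", ["G"]),
]

variants = [v for rec in records for v in split_multiallelic(*rec)]
n_sites = len({(chrom, pos) for chrom, pos, _, _ in records})

for v in variants:
    print(v)
print(f"{len(variants)} variants at {n_sites} sites")
```

Under that assumption, a site with several alternate alleles contributes several variants but only one site, which would explain why the variant count in the tables is larger than the site count.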