dbNSFP version 4.9a

Release:
	August 8, 2024
	
Major sources:
	Variant determination:
		Gencode release 29/Ensembl 94, released October, 2018 (hg38)
	Functional predictions:
		SIFT ensembl 66, released Jan, 2015 http://provean.jcvi.org/index.php
		SIFT4G 2.4, released Nov. 1, 2016 http://sift.bii.a-star.edu.sg/sift4g/public//Homo_sapiens/
		PROVEAN 1.1 ensembl 66, released Jan, 2015 http://provean.jcvi.org/index.php
		Polyphen-2 v2.2.2, released Feb, 2012 http://genetics.bwh.harvard.edu/pph2/
		LRT, released November, 2009 http://www.genetics.wustl.edu/jflab/lrt_query.html
		MutationTaster 2, data retrieved in 2015 http://www.mutationtaster.org/
		MutationAssessor release 3, http://mutationassessor.org/
		FATHMM v2.3, http://fathmm.biocompute.org.uk
		fathmm-MKL, http://fathmm.biocompute.org.uk/fathmmMKL.htm
		fathmm-XF, http://fathmm.biocompute.org.uk/fathmm-xf/
		CADD v1.7, http://cadd.gs.washington.edu/
		VEST v4.0, http://karchinlab.org/apps/appVest.html
		fitCons v1.01, http://compgen.bscb.cornell.edu/fitCons/
		LINSIGHT, http://compgen.cshl.edu/~yihuang/LINSIGHT/
		DANN, https://cbcl.ics.uci.edu/public_data/DANN/
		MetaSVM and MetaLR, doi: 10.1093/hmg/ddu733
		GenoCanyon v1.0.3, http://genocanyon.med.yale.edu/index.html
		Eigen & Eigen PC v1.1, http://www.columbia.edu/~ii2135/eigen.html
		M-CAP v1.3, http://bejerano.stanford.edu/MCAP/
		REVEL release May 3, 2021, https://sites.google.com/site/revelgenomics/
		MutPred v1.2, http://mutpred.mutdb.org/
		MVP 1.0, https://github.com/ShenLab/missense
		MPC release1, ftp://ftp.broadinstitute.org/pub/ExAC_release/release1/regional_missense_constraint/
		PrimateAI, https://github.com/Illumina/PrimateAI
		deogen2, https://deogen2.mutaframe.com/
		ALoFT 1.0, http://aloft.gersteinlab.org/
		BayesDel v1, http://fengbj-laboratory.org/BayesDel/BayesDel.html
		ClinPred, https://sites.google.com/site/clinpred/home
		LIST-S2 Release: 2019_10, https://precomputed.list-s2.msl.ubc.ca/
		MetaRNN v1.0, http://www.liulab.science/metarnn.html
		gMVP, https://github.com/ShenLab/gMVP/
		VARITY, http://varity.varianteffect.org/
		ESM1b, https://huggingface.co/spaces/ntranoslab/esm_variants/tree/main
		EVE, https://evemodel.org/
		AlphaMissense, https://console.cloud.google.com/storage/browser/dm_alphamissense
		PHACTboost, https://github.com/CompGenomeLab/PHACTboost
		MutFormer, https://github.com/WGLab/mutformer
		MutScore, https://iob-genetic.shinyapps.io/mutscore/
	Conservation scores:
		phyloP100way_vertebrate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP100way/
		phyloP470way_mammalian (hg38) https://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP470way/
		phyloP17way_primate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP17way/
		phastCons100way_vertebrate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons100way/
		phastCons470way_mammalian (hg38) https://hgdownload.cse.ucsc.edu/goldenpath/hg38/phastCons470way/
		phastCons17way_primate (hg38) http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phastCons17way/
		GERP++ http://mendel.stanford.edu/SidowLab/downloads/gerp/
		GERP_91_mammals https://ftp.ensembl.org/pub/current_compara/conservation_scores/91_mammals.gerp_conservation_score/
		SiPhy https://www.broadinstitute.org/mammals-models/29-mammals-project-supplementary-info
		bStatistic http://cadd.gs.washington.edu/
	Other variant annotation sources:
		Interpro v71 http://www.ebi.ac.uk/interpro/
		1000 Genomes project http://www.1000genomes.org/
		ESP http://evs.gs.washington.edu/EVS/
		dbSNP b156 (hg38) https://ftp.ncbi.nih.gov/snp/archive/b156/VCF/GCF_000001405.40.gz
		clinvar release 20240805 (hg38) ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/
		ExAC v0.3 http://exac.broadinstitute.org/
		gnomAD exome v4.0.0 http://gnomad.broadinstitute.org/downloads
		gnomAD genome v4.0.0 http://gnomad.broadinstitute.org/downloads
		ALFA (Allele Frequency Aggregator) release 2 https://www.ncbi.nlm.nih.gov/snp/docs/gsr/alfa/
		UK10K COHORT http://www.uk10k.org/studies/cohorts.html
		Ancestral alleles (hg38) ftp://ftp.ensembl.org/pub/release-93/fasta/ancestral_alleles/homo_sapiens_ancestor_GRCh38.tar.gz
		Altai Neanderthal genotypes: http://cdna.eva.mpg.de/neandertal/Vindija/VCF/Altai/
		Denisova genotypes: http://cdna.eva.mpg.de/neandertal/Vindija/VCF/Denisova/
		Vindija33.19 genotypes: http://cdna.eva.mpg.de/neandertal/Vindija/VCF/Vindija33.19/
		Chagyrskaya genotype: http://cdna.eva.mpg.de/neandertal/Chagyrskaya/VCF/
		GTEx v8 https://www.gtexportal.org/home/datasets
		Geuvadis https://www.ebi.ac.uk/Tools/geuvadis-das/
		eQTLGen 2019-12-23 https://www.eqtlgen.org/phase1.html
	Other gene annotation sources:
		HGNC, downloaded on October 21, 2018
		Uniprot, Release 2019_01
		IntAct, downloaded on November 30, 2018
		GWAS catalog, r2018-11-26
		egenetics and GNF/Atlas expression data, downloaded from BioMart on Oct. 1, 2013
		BioGRID, version 3.5.167
		Haploinsufficiency probability data, from doi:10.1371/journal.pgen.1001154
		Recessive probability data, from DOI:10.1126/science.1215040
		Residual Variation Intolerance Score (RVIS), v3  http://genic-intolerance.org/
		Genome-wide haploinsufficiency score (GHIS), from doi: 10.1093/nar/gkv474
		ExAC Functional Gene Constraint, from release0.3.1
		ExAC CNV gene score, from release0.3.1
		GO, downloaded on December 6, 2018
		ConsensusPathDB, Release 33
		Essential genes, from doi:10.1371/journal.pgen.1003484, doi: 10.1126/science.aac7041, doi: 10.1016/j.cell.2015.11.015, doi: 10.1126/science.aac7557, doi:10.1371/journal.pcbi.1002886
		Mouse genes, from Mouse Genome Informatics (MGI), 6.13 
		Zebrafish genes, from The Zebrafish Information Network (ZFIN), downloaded on December 7, 2018
		KEGG pathway, from http://www.openbioinformatics.org/gengen/tutorial_calculate_gsea.html
		BioCarta pathway, from http://www.openbioinformatics.org/gengen/tutorial_calculate_gsea.html
		GDI, from doi: 10.1073/pnas.1518646112
		LoFtool, from DOI:10.1093/bioinformatics/btv602
		SORVA, from doi: 10.1101/103218
		HIPred, from doi:10.1093/bioinformatics/btx028
		HPO, data release 20200608, https://hpo.jax.org/app/download/annotation
		

Files:
	dbNSFP4.9a_variant.chr<#>.gz      - gzipped dbNSFP variant database files by chromosomes
	dbNSFP4.9_gene.gz                 - gzipped dbNSFP gene database file
	dbNSFP4.9_gene.complete.gz        - gzipped dbNSFP gene database file with complete interaction columns
	dbNSFP4.9a.readme.txt             - this file
	search_dbNSFP49a.jar              - companion GUI Java program for searching dbNSFP4.9a
	search_dbNSFP49a.class            - companion command-line Java program for searching dbNSFP4.9a
	LICENSE.txt                       - the license for using the source code
	search_dbNSFP49a.readme.pdf       - README file for search_dbNSFP49a.class
	tryhg19.in                        - an example input file with hg19 genome positions
	tryhg18.in                        - an example input file with hg18 genome positions
	tryhg38.in                        - an example input file with hg38 genome positions
	try.vcf                           - an example of vcf input file
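
As a rough illustration of reading one of the per-chromosome variant files listed above, the sketch below streams rows as dictionaries keyed by column name. It assumes the files are tab-delimited with a single header line beginning with "#" (an assumption, not stated in this section). The names iter_variants and open_variant_file are hypothetical helpers and are not part of the search_dbNSFP49a program.

```python
import gzip
import io


def iter_variants(handle):
    """Yield one dict per data row from an open text handle.

    Assumes a tab-delimited layout whose first non-empty line is a
    header beginning with '#'. This layout is an assumption of the sketch.
    """
    header = None
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if header is None:
            # Drop the leading '#' from the first header name.
            fields[0] = fields[0].lstrip("#")
            header = fields
            continue
        yield dict(zip(header, fields))


def open_variant_file(path):
    """Open a gzipped variant file as UTF-8 text."""
    return gzip.open(path, "rt", encoding="utf-8")


# Small in-memory example with made-up values, so no download is needed.
sample = io.StringIO("#chr\tpos(1-based)\tref\talt\n1\t69091\tA\tC\n")
rows = list(iter_variants(sample))
```

With a real file, the same generator could be driven by open_variant_file("dbNSFP4.9a_variant.chr1.gz"), reading one row at a time rather than loading the whole chromosome into memory.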


Description:
	dbNSFP is an integrated database of functional annotations from multiple 
	sources for the comprehensive collection of human non-synonymous single-nucleotide variants (nsSNVs). 
	Its current version includes a total of 84,013,490 nsSNVs and ssSNVs (splice site
	SNVs). It compiles prediction scores from 32 prediction algorithms (SIFT, SIFT4G, 
	Polyphen2-HDIV, Polyphen2-HVAR, LRT, MutationTaster2, MutationAssessor, FATHMM, MetaSVM, 
	MetaLR, CADD, VEST4, PROVEAN, FATHMM-MKL coding, FATHMM-XF coding, fitCons, LINSIGHT, 
	DANN, GenoCanyon, Eigen, Eigen-PC, M-CAP, REVEL, MutPred, MVP, MPC, PrimateAI, DEOGEN2, BayesDel, 
	ClinPred, LIST-S2, ALoFT), 9 conservation scores (bStatistic, phyloP100way_vertebrate, 
	phyloP30way_mammal, phyloP17way_primate, phastCons100way_vertebrate, phastCons30way_mammal, 
	phastCons17way_primate, GERP++ and SiPhy) and other function annotations. 
	Since version 2.0, dbNSFP is separated into two parts, dbNSFP_variant and 
	dbNSFP_gene. As their names indicate, the former focuses on variant annotations 
	(including prediction scores and conservation scores), and the latter focuses on 
	gene annotations.
	Since version 2.6, dbscSNV is added as an attached database, which includes all 
	potential human SNVs within splicing consensus regions (−3 to +8 at the 5’ splice site 
	and −12 to +2 at the 3’ splice site), i.e. scSNVs, and predictions for their potential 
	of altering splicing. 
	Since version 3, two branches of dbNSFP are provided: "a" branch is suitable for academic use, 
	which includes all the resources, and "c" branch is suitable for commercial use, which does not 
	include Polyphen2, VEST, REVEL, CADD, LINSIGHT, GenoCanyon.
	

Columns of dbNSFP_variant:
1	chr: chromosome number
2	pos(1-based): physical position on the chromosome as to hg38 (1-based coordinate).
		For mitochondrial SNV, this position refers to the rCRS (GenBank: NC_012920). 
3	ref: reference nucleotide allele (as on the + strand)
4	alt: alternative nucleotide allele (as on the + strand)
5	aaref: reference amino acid
		"." if the variant is a splicing site SNP (2bp on each end of an intron)
6	aaalt: alternative amino acid
		"." if the variant is a splicing site SNP (2bp on each end of an intron)
7	rs_dbSNP: rs number from dbSNP
8	hg19_chr: chromosome as to hg19, "." means missing
9	hg19_pos(1-based): physical position on the chromosome as to hg19 (1-based coordinate).
		For mitochondrial SNV, this position refers to a YRI sequence (GenBank: AF347015)
10	hg18_chr: chromosome as to hg18, "." means missing
11	hg18_pos(1-based): physical position on the chromosome as to hg18 (1-based coordinate)
		For mitochondrial SNV, this position refers to a YRI sequence (GenBank: AF347015)
12	aapos: amino acid position as to the protein.
		"-1" if the variant is a splicing site SNP (2bp on each end of an intron). 
		Multiple entries separated by ";", corresponding to Ensembl_proteinid
13	genename: gene name; if the nsSNV can be assigned to multiple genes, gene names are
		separated by ";"
14	Ensembl_geneid: Ensembl gene id
15	Ensembl_transcriptid: Ensembl transcript ids (Multiple entries separated by ";")
16	Ensembl_proteinid: Ensembl protein ids
		Multiple entries separated by ";",  corresponding to Ensembl_transcriptids
17	Uniprot_acc: Uniprot accession number matching the Ensembl_proteinid
		Multiple entries separated by ";".
18	Uniprot_entry: Uniprot entry ID matching the Ensembl_proteinid
		Multiple entries separated by ";".
19	HGVSc_ANNOVAR: HGVS coding variant presentation from ANNOVAR
		Multiple entries separated by ";", corresponds to Ensembl_transcriptid
20	HGVSp_ANNOVAR: HGVS protein variant presentation from ANNOVAR
		Multiple entries separated by ";", corresponds to Ensembl_proteinid
21	HGVSc_snpEff: HGVS coding variant presentation from snpEff
		Multiple entries separated by ";", corresponds to Ensembl_transcriptid
22	HGVSp_snpEff: HGVS protein variant presentation from snpEff
		Multiple entries separated by ";", corresponds to Ensembl_proteinid
23	HGVSc_VEP: HGVS coding variant presentation from VEP
		Multiple entries separated by ";", corresponds to Ensembl_transcriptid
24	HGVSp_VEP: HGVS protein variant presentation from VEP
		Multiple entries separated by ";", corresponds to Ensembl_proteinid
25	APPRIS: APPRIS annotation for the transcripts matching Ensembl_transcriptid
		Multiple entries separated by ";". Potential values: principal1, principal2, 
		principal3, principal4, principal5, alternative1, alternative2. 
		See https://useast.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
26	GENCODE_basic: Whether the transcript belongs to GENCODE_basic (5' and 3' complete
		transcripts). Multiple entries separated by ";", matching Ensembl_transcriptid.
		See https://useast.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
27	TSL: Transcript Support Level.
		Multiple entries separated by ";", matching Ensembl_transcriptid.
		Potential values: 1 to 5, NA. 
		See https://useast.ensembl.org/info/genome/genebuild/transcript_quality_tags.html
28	VEP_canonical: canonical transcript used in Ensembl.
		Multiple entries separated by ";", matching Ensembl_transcriptid.
		See https://useast.ensembl.org/Help/Glossary?id=521
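
Several of the columns above hold multiple ";"-separated entries that correspond position by position to Ensembl_transcriptid or Ensembl_proteinid. The sketch below shows one way to regroup such columns into one record per transcript. The helper name split_per_transcript and the identifiers in the example row are made up for illustration. The sketch raises an error when the columns disagree in length rather than guessing how to align them.

```python
def split_per_transcript(row, columns, sep=";"):
    """Return one dict per entry, pairing values by their position."""
    parts = [row[col].split(sep) for col in columns]
    lengths = {len(p) for p in parts}
    if len(lengths) != 1:
        raise ValueError("columns do not have the same number of entries")
    return [dict(zip(columns, values)) for values in zip(*parts)]


# Made-up identifiers, used only to show the pairing.
row = {
    "Ensembl_transcriptid": "ENST_A;ENST_B",
    "Ensembl_proteinid": "ENSP_A;ENSP_B",
    "aapos": "12;40",
}
records = split_per_transcript(
    row, ["Ensembl_transcriptid", "Ensembl_proteinid", "aapos"]
)
```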
29	cds_strand: coding sequence (CDS) strand (+ or -)
30	refcodon: reference codon
31	codonpos: position on the codon (1, 2 or 3)
32	codon_degeneracy: degenerate type (0, 2 or 3)
33	Ancestral_allele: ancestral allele based on 8 primates EPO.
		Ancestral alleles by Ensembl 84. The following comes from its original README file:
		ACTG - high-confidence call, ancestral state supported by the other two sequences
		actg - low-confidence call, ancestral state supported by one sequence only
		N    - failure, the ancestral state is not supported by any other sequence
		-    - the extant species contains an insertion at this position
		.    - no coverage in the alignment
34	AltaiNeandertal: genotype of a deep sequenced Altai Neanderthal
35	Denisova: genotype of a deep sequenced Denisova
36	VindijiaNeandertal: genotype of a deep sequenced Vindija Neandertal
37	ChagyrskayaNeandertal: genotype of a deep sequenced Chagyrskaya Neandertal
38	SIFT_score: SIFT score (SIFTori). Scores range from 0 to 1. The smaller the score the
		more likely the SNP has damaging effect. 
		Multiple scores separated by ";", corresponding to Ensembl_proteinid.
39	SIFT_converted_rankscore: SIFTori scores were first converted to SIFTnew=1-SIFTori,
		then ranked among all SIFTnew scores in dbNSFP. The rankscore is the ratio of 
		the rank of the SIFTnew score over the total number of SIFTnew scores in dbNSFP. 
		If there are multiple scores, only the most damaging (largest) rankscore is presented.
		The rankscores range from 0.00964 to 0.91255.
40	SIFT_pred: If SIFTori is smaller than 0.05 (rankscore>0.39575) the corresponding nsSNV is
		predicted as "D(amaging)"; otherwise it is predicted as "T(olerated)". 
		Multiple predictions separated by ";"
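
The SIFT conversion and rankscore described above can be sketched as follows. The conversion SIFTnew=1-SIFTori and the 0.05 prediction cutoff come from the text. How tied scores are ranked is not stated here, so the sketch assumes that tied scores share the highest rank among them. The function names are hypothetical.

```python
import bisect


def sift_new(sift_ori):
    """Flip the score so that larger means more damaging."""
    return 1.0 - sift_ori


def sift_pred(sift_ori):
    """'D' if SIFTori is smaller than 0.05, otherwise 'T'."""
    return "D" if sift_ori < 0.05 else "T"


def rankscores(scores):
    """Rank of each score divided by the total number of scores.

    Assumption: tied scores all receive the highest rank among the ties.
    """
    ordered = sorted(scores)
    total = len(ordered)
    return [bisect.bisect_right(ordered, s) / total for s in scores]


converted = [sift_new(s) for s in (0.0, 0.5, 1.0)]
ranks = rankscores(converted)
```

The same rank-over-total idea applies to the other rankscore columns, each ranked within its own set of scores.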
41	SIFT4G_score: SIFT 4G score (SIFT4G). Scores range from 0 to 1. The smaller the score the
		more likely the SNP has damaging effect. 
		Multiple scores separated by ",", corresponding to Ensembl_transcriptid
42	SIFT4G_converted_rankscore: SIFT4G scores were first converted to SIFT4Gnew=1-SIFT4G,
		then ranked among all SIFT4Gnew scores in dbNSFP. The rankscore is the ratio of 
		the rank of the SIFT4Gnew score over the total number of SIFT4Gnew scores in dbNSFP. 
		If there are multiple scores, only the most damaging (largest) rankscore is presented.
43	SIFT4G_pred: If SIFT4G is < 0.05 the corresponding nsSNV is
		predicted as "D(amaging)"; otherwise it is predicted as "T(olerated)". 
		Multiple predictions separated by ",", corresponding to Ensembl_transcriptid
44	Polyphen2_HDIV_score: Polyphen2 score based on HumDiv, i.e. hdiv_prob.
		The score ranges from 0 to 1. 
		Multiple entries separated by ";", corresponding to Uniprot_acc.
45	Polyphen2_HDIV_rankscore: Polyphen2 HDIV scores were first ranked among all HDIV scores
		in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of 
		the scores in dbNSFP. If there are multiple scores, only the most damaging (largest) 
		rankscore is presented. The scores range from 0.03061 to 0.91137.
46	Polyphen2_HDIV_pred: Polyphen2 prediction based on HumDiv, "D" ("probably damaging",
		HDIV score in [0.957,1] or rankscore in [0.55859,0.91137]), "P" ("possibly damaging", 
		HDIV score in [0.454,0.956] or rankscore in [0.37043,0.55681]) and "B" ("benign", 
		HDIV score in [0,0.452] or rankscore in [0.03061,0.36974]). Score cutoff for binary 
		classification is 0.5 for HDIV score or 0.38028 for rankscore, i.e. the prediction is 
		"neutral" if the HDIV score is smaller than 0.5 (rankscore is smaller than 0.38028), 
		and "deleterious" if the HDIV score is larger than 0.5 (rankscore is larger than 
		0.38028). Multiple entries are separated by ";", corresponding to Uniprot_acc.
47	Polyphen2_HVAR_score: Polyphen2 score based on HumVar, i.e. hvar_prob.
		The score ranges from 0 to 1. 
		Multiple entries separated by ";", corresponding to Uniprot_acc.
48	Polyphen2_HVAR_rankscore: Polyphen2 HVAR scores were first ranked among all HVAR scores
		in dbNSFP. The rankscore is the ratio of the rank of the score over the total number of 
		the scores in dbNSFP. If there are multiple scores, only the most damaging (largest) 
		rankscore is presented. The scores range from 0.01493 to 0.97581.
49	Polyphen2_HVAR_pred: Polyphen2 prediction based on HumVar, "D" ("probably damaging",
		HVAR score in [0.909,1] or rankscore in [0.65694,0.97581]), "P" ("possibly damaging", 
		HVAR in [0.447,0.908] or rankscore in [0.47121,0.65622]) and "B" ("benign", HVAR 
		score in [0,0.446] or rankscore in [0.01493,0.47076]). Score cutoff for binary 
		classification is 0.5 for HVAR score or 0.48762 for rankscore, i.e. the prediction 
		is "neutral" if the HVAR score is smaller than 0.5 (rankscore is smaller than 
		0.48762), and "deleterious" if the HVAR score is larger than 0.5 (rankscore is larger 
		than 0.48762). Multiple entries are separated by ";", corresponding to Uniprot_acc.
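
The Polyphen2 score intervals listed above can be written as a small lookup. The sketch below applies the intervals exactly as given, so a score that falls between two listed intervals (for example an HDIV score of 0.453) gets no label. Returning None in that case, and for a score exactly equal to the binary cutoff, is a choice of this sketch. The names are hypothetical.

```python
# (label, lowest score, highest score), copied from the descriptions above.
HDIV_BINS = [("D", 0.957, 1.0), ("P", 0.454, 0.956), ("B", 0.0, 0.452)]
HVAR_BINS = [("D", 0.909, 1.0), ("P", 0.447, 0.908), ("B", 0.0, 0.446)]


def polyphen2_class(score, bins):
    """Return 'D', 'P' or 'B', or None if no listed interval contains the score."""
    for label, low, high in bins:
        if low <= score <= high:
            return label
    return None


def polyphen2_binary(score, cutoff=0.5):
    """Binary call: 'deleterious' above the cutoff, 'neutral' below it."""
    if score > cutoff:
        return "deleterious"
    if score < cutoff:
        return "neutral"
    return None  # a score exactly at the cutoff is not covered by the text
```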
50	LRT_score: The original LRT two-sided p-value (LRTori), ranges from 0 to 1.
51	LRT_converted_rankscore: LRTori scores were first converted as LRTnew=1-LRTori*0.5 if
		Omega<1, or LRTnew=LRTori*0.5 if Omega>=1. Then LRTnew scores were ranked among all 
		LRTnew scores in dbNSFP. The rankscore is the ratio of the rank over the total number 
		of the scores in dbNSFP. The scores range from 0.00162 to 0.8433.
52	LRT_pred: LRT prediction, D(eleterious), N(eutral) or U(nknown), which is not solely
		determined by the score. 
53	LRT_Omega: estimated nonsynonymous-to-synonymous-rate ratio (Omega, reported by LRT)
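
The LRT conversion depends on LRT_Omega as well as the p-value. A minimal sketch of the rule given for LRT_converted_rankscore, with a hypothetical function name:

```python
def lrt_new(lrt_ori, omega):
    """Convert the two-sided LRT p-value before ranking.

    LRTnew = 1 - LRTori*0.5 if Omega < 1, otherwise LRTori*0.5.
    """
    if omega < 1:
        return 1.0 - lrt_ori * 0.5
    return lrt_ori * 0.5
```

For example, a p-value of 0.5 becomes 0.75 when Omega is 0.2 and 0.25 when Omega is 1.0, so variants at sites under purifying selection (Omega<1) always rank above those at sites with Omega>=1.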
54	MutationTaster_score: MutationTaster p-value (MTori), ranges from 0 to 1. 
		Multiple scores are separated by ";". Information on corresponding transcript(s) can 
		be found by querying http://www.mutationtaster.org/ChrPos.html
55	MutationTaster_converted_rankscore: The MTori scores were first converted. If the prediction
		is "A" or "D" MTnew=MTori; if the prediction is "N" or "P", MTnew=1-MTori. Then MTnew 
		scores were ranked among all MTnew scores in dbNSFP. If there are multiple scores of a 
		SNV, only the largest MTnew was used in ranking. The rankscore is the ratio of the
		rank of the score over the total number of MTnew scores in dbNSFP. The scores range
		from 0.08979 to 0.81001.
56	MutationTaster_pred: MutationTaster prediction, "A" ("disease_causing_automatic"),
		"D" ("disease_causing"), "N" ("polymorphism") or "P" ("polymorphism_automatic"). The 
		score cutoff between "D" and "N" is 0.5 for MTnew and 0.31733 for the rankscore.
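
The MutationTaster conversion described above can be sketched as below. The function names are hypothetical. The rule for multiple scores follows the text: only the largest MTnew of a SNV is used in ranking.

```python
def mt_new(mt_ori, pred):
    """MTnew = MTori for 'A' or 'D'; MTnew = 1 - MTori for 'N' or 'P'."""
    if pred in ("A", "D"):
        return mt_ori
    if pred in ("N", "P"):
        return 1.0 - mt_ori
    raise ValueError("unexpected MutationTaster prediction: %r" % (pred,))


def mt_new_for_ranking(scores, preds):
    """Largest MTnew among the paired scores and predictions of one SNV."""
    return max(mt_new(s, p) for s, p in zip(scores, preds))
```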
57	MutationTaster_model: MutationTaster prediction models.
58	MutationTaster_AAE: MutationTaster predicted amino acid change.
59	MutationAssessor_score: MutationAssessor functional impact combined score (MAori). The
		score ranges from -5.17 to 6.49 in dbNSFP. 
		Multiple entries are separated by ";", corresponding to Uniprot_entry.
60	MutationAssessor_rankscore: MAori scores were ranked among all MAori scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of MAori 
		scores in dbNSFP. The scores range from 0 to 1.
61	MutationAssessor_pred: MutationAssessor's functional impact of a variant -
		predicted functional, i.e. high ("H") or medium ("M"), or predicted non-functional,
		i.e. low ("L") or neutral ("N"). The MAori score cutoffs between "H" and "M", 
		"M" and "L", and "L" and "N", are 3.5, 1.935 and 0.8, respectively. The rankscore cutoffs 
		between "H" and "M", "M" and "L", and "L" and "N", are 0.9307, 0.52043 and 0.19675, 
		respectively.
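
The MutationAssessor score cutoffs above map onto the four categories roughly as sketched here. The text does not say which category a score exactly equal to a cutoff belongs to. The sketch assumes such a score falls in the lower category. The names are hypothetical.

```python
def mutationassessor_pred(ma_ori):
    """Map an MAori score to 'H', 'M', 'L' or 'N' using the listed cutoffs.

    Assumption: a score exactly equal to a cutoff is placed in the lower category.
    """
    if ma_ori > 3.5:
        return "H"
    if ma_ori > 1.935:
        return "M"
    if ma_ori > 0.8:
        return "L"
    return "N"


def is_predicted_functional(pred):
    """High and medium are the 'predicted functional' categories."""
    return pred in ("H", "M")
```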
62	FATHMM_score: FATHMM default score (weighted for human inherited-disease mutations with
		Disease Ontology) (FATHMMori). Scores range from -16.13 to 10.64. The smaller the score 
		the more likely the SNP has damaging effect.
		Multiple scores separated by ";", corresponding to Ensembl_proteinid.
63	FATHMM_converted_rankscore: FATHMMori scores were first converted to
		FATHMMnew=1-(FATHMMori+16.13)/26.77, then ranked among all FATHMMnew scores in dbNSFP. 
		The rankscore is the ratio of the rank of the score over the total number of FATHMMnew 
		scores in dbNSFP. If there are multiple scores, only the most damaging (largest) 
		rankscore is presented. The scores range from 0 to 1.
64	FATHMM_pred: If a FATHMMori score is <=-1.5 (or rankscore >=0.81332) the corresponding nsSNV
		is predicted as "D(AMAGING)"; otherwise it is predicted as "T(OLERATED)".
		Multiple predictions separated by ";", corresponding to Ensembl_proteinid.
65	PROVEAN_score: PROVEAN score (PROVEANori). Scores range from -14 to 14. The smaller the score
		the more likely the SNP has damaging effect. 
		Multiple scores separated by ";", corresponding to Ensembl_proteinid.
66	PROVEAN_converted_rankscore: PROVEANori scores were first converted to PROVEANnew=1-(PROVEANori+14)/28,
		then ranked among all PROVEANnew scores in dbNSFP. The rankscore is the ratio of 
		the rank of the PROVEANnew score over the total number of PROVEANnew scores in dbNSFP. 
		If there are multiple scores, only the most damaging (largest) rankscore is presented.
		The scores range from 0 to 1.
67	PROVEAN_pred: If PROVEANori <= -2.5 (rankscore>=0.54382) the corresponding nsSNV is
		predicted as "D(amaging)"; otherwise it is predicted as "N(eutral)". 
		Multiple predictions separated by ";", corresponding to Ensembl_proteinid.
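
The FATHMM and PROVEAN conversions above follow the same pattern: the score is rescaled to the 0 to 1 range using the stated score range and then flipped, so that larger values mean more damaging. A minimal sketch with hypothetical function names:

```python
def invert_minmax(score, low, high):
    """Return 1 - (score - low) / (high - low)."""
    return 1.0 - (score - low) / (high - low)


def fathmm_new(fathmm_ori):
    # Equivalent to 1-(FATHMMori+16.13)/26.77, since 10.64 - (-16.13) = 26.77.
    return invert_minmax(fathmm_ori, -16.13, 10.64)


def provean_new(provean_ori):
    # Equivalent to 1-(PROVEANori+14)/28.
    return invert_minmax(provean_ori, -14.0, 14.0)


def fathmm_pred(fathmm_ori):
    return "D" if fathmm_ori <= -1.5 else "T"


def provean_pred(provean_ori):
    return "D" if provean_ori <= -2.5 else "N"
```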
68	VEST4_score: VEST 4.0 score. Score ranges from 0 to 1. The larger the score the more likely
		the mutation may cause functional change. 
		Multiple scores separated by ";", corresponding to Ensembl_transcriptid.
		Please note this score is free for non-commercial use. For more details please refer to 
		http://wiki.chasmsoftware.org/index.php/SoftwareLicense. Commercial users should contact 
		the Johns Hopkins Technology Transfer office.
69	VEST4_rankscore: VEST4 scores were ranked among all VEST4 scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of VEST4 
		scores in dbNSFP. In case there are multiple scores for the same variant, the largest 
		score (most damaging) is presented. The scores range from 0 to 1. 
		Please note VEST score is free for non-commercial use. For more details please refer to 
		http://wiki.chasmsoftware.org/index.php/SoftwareLicense. Commercial users should contact 
		the Johns Hopkins Technology Transfer office.
70	MetaSVM_score: Our support vector machine (SVM) based ensemble prediction score, which
		incorporated 10 scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, 
		Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in 
		the 1000 genomes populations. Larger value means the SNV is more likely to be damaging. 
		Scores range from -2 to 3 in dbNSFP.
71	MetaSVM_rankscore: MetaSVM scores were ranked among all MetaSVM scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of MetaSVM 
		scores in dbNSFP. The scores range from 0 to 1.
72	MetaSVM_pred: Prediction of our SVM based ensemble prediction score, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0. The rankscore cutoff between
		"D" and "T" is 0.82257.
73	MetaLR_score: Our logistic regression (LR) based ensemble prediction score, which
		incorporated 10 scores (SIFT, PolyPhen-2 HDIV, PolyPhen-2 HVAR, GERP++, MutationTaster, 
		Mutation Assessor, FATHMM, LRT, SiPhy, PhyloP) and the maximum frequency observed in 
		the 1000 genomes populations. Larger value means the SNV is more likely to be damaging. 
		Scores range from 0 to 1.
74	MetaLR_rankscore: MetaLR scores were ranked among all MetaLR scores in dbNSFP. The rankscore
		is the ratio of the rank of the score over the total number of MetaLR scores in dbNSFP. 
		The scores range from 0 to 1.
75	MetaLR_pred: Prediction of our MetaLR based ensemble prediction score, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0.5. The rankscore cutoff between 
		"D" and "T" is 0.81101.
76	Reliability_index: Number of observed component scores (except the maximum frequency in
		the 1000 genomes populations) for MetaSVM and MetaLR. Ranges from 1 to 10. As MetaSVM 
		and MetaLR scores are calculated based on imputed data, the fewer missing component 
		scores, the higher the reliability of the scores and predictions. 
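
The counting behind Reliability_index can be illustrated as follows. The component names are taken from the MetaSVM and MetaLR descriptions and are used here as plain labels, not as dbNSFP column names. Treating "." as the missing-value marker is an assumption of this sketch, and the function name is hypothetical.

```python
# The 10 component scores named in the MetaSVM/MetaLR descriptions.
COMPONENTS = [
    "SIFT", "PolyPhen-2 HDIV", "PolyPhen-2 HVAR", "GERP++", "MutationTaster",
    "Mutation Assessor", "FATHMM", "LRT", "SiPhy", "PhyloP",
]


def reliability_index(component_scores, missing="."):
    """Count the components that have an observed (non-missing) score."""
    return sum(
        1 for name in COMPONENTS
        if component_scores.get(name, missing) != missing
    )


# Made-up example: two of the ten components are missing.
example = {name: "0.5" for name in COMPONENTS}
example["LRT"] = "."
del example["SiPhy"]
observed = reliability_index(example)
```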
77	MetaRNN_score: Our recurrent neural network (RNN) based ensemble prediction score, which
		incorporated 16 scores (SIFT, Polyphen2_HDIV, Polyphen2_HVAR, MutationAssessor, PROVEAN, 
		VEST4, M-CAP, REVEL, MutPred, MVP, PrimateAI, DEOGEN2, CADD, fathmm-XF, Eigen and GenoCanyon), 
		8 conservation scores (GERP, phyloP100way_vertebrate, phyloP30way_mammalian, 
		phyloP17way_primate, phastCons100way_vertebrate, phastCons30way_mammalian, 
		phastCons17way_primate and SiPhy), and allele frequency information from the 1000 Genomes 
		Project (1000GP), ExAC, and gnomAD. Larger value means the SNV is more likely to be damaging. 
		Scores range from 0 to 1.
78	MetaRNN_rankscore: MetaRNN scores were ranked among all MetaRNN scores in dbNSFP. The rankscore
		is the ratio of the rank of the score over the total number of MetaRNN scores in dbNSFP. 
		The scores range from 0 to 1.
79	MetaRNN_pred: Prediction of our MetaRNN based ensemble prediction score, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0.5. The rankscore cutoff between 
		"D" and "T" is 0.6149.
80	M-CAP_score: M-CAP is a hybrid ensemble score (details in DOI: 10.1038/ng.3703). Scores range 
		from 0 to 1. The larger the score the more likely the SNP has damaging effect. 
81	M-CAP_rankscore: M-CAP scores were ranked among all M-CAP scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of M-CAP scores in dbNSFP.
82	M-CAP_pred: Prediction of M-CAP score based on the authors' recommendation, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0.025.
83	REVEL_score: REVEL is an ensemble score based on 13 individual scores for predicting the
		pathogenicity of missense variants. Scores range from 0 to 1. The larger the score the more 
		likely the SNP has damaging effect. "REVEL scores are freely available for non-commercial use.  
		For other uses, please contact Weiva Sieh" (weiva.sieh@mssm.edu)
		Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
84	REVEL_rankscore: REVEL scores were ranked among all REVEL scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of REVEL scores in dbNSFP.
85	MutPred_score: General MutPred score. Scores range from 0 to 1. The larger the score the more
		likely the SNP has damaging effect.
86	MutPred_rankscore: MutPred scores were ranked among all MutPred scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of MutPred scores in dbNSFP.
87	MutPred_protID: UniProt accession or Ensembl transcript ID used for MutPred_score calculation.
88	MutPred_AAchange: Amino acid change used for MutPred_score calculation.
89	MutPred_Top5features: Top 5 features (molecular mechanisms of disease) as predicted by MutPred with
		p values. MutPred_score > 0.5 and p < 0.05 are referred to as actionable hypotheses.
		MutPred_score > 0.75 and p < 0.05 are referred to as confident hypotheses.
		MutPred_score > 0.75 and p < 0.01 are referred to as very confident hypotheses.
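
The three MutPred hypothesis levels above overlap, since any "very confident" hypothesis also meets the "confident" and "actionable" criteria. The sketch below returns the strictest level that applies. That choice, and the function name, are assumptions of the sketch.

```python
def mutpred_hypothesis(score, p_value):
    """Return the strictest hypothesis level met, or None if none applies."""
    if score > 0.75 and p_value < 0.01:
        return "very confident"
    if score > 0.75 and p_value < 0.05:
        return "confident"
    if score > 0.5 and p_value < 0.05:
        return "actionable"
    return None
```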
90	MVP_score: A pathogenicity prediction score for missense variants using a deep learning approach.
		The range of MVP score is from 0 to 1. The larger the score, the more likely the variant is 
		pathogenic. The authors suggest thresholds of 0.7 and 0.75 for separating damaging vs tolerant 
		variants in constrained genes (ExAC pLI >=0.5) and non-constrained genes (ExAC pLI<0.5), respectively. 
		For details see http://dx.doi.org/10.1101/259390
		Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
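
The gene-dependent MVP thresholds suggested by the authors can be sketched as below. Whether a score exactly equal to the threshold counts as damaging is not stated. The sketch assumes it does. The function name is hypothetical, and the ExAC pLI value has to be supplied by the caller.

```python
def mvp_is_damaging(mvp_score, exac_pli):
    """Apply 0.7 for constrained genes (pLI >= 0.5) and 0.75 otherwise.

    Assumption: a score equal to the threshold is treated as damaging.
    """
    threshold = 0.7 if exac_pli >= 0.5 else 0.75
    return mvp_score >= threshold
```

For example, an MVP score of 0.72 would be called damaging in a gene with pLI 0.9 but tolerant in a gene with pLI 0.1.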
91	MVP_rankscore: MVP scores were ranked among all MVP scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of MVP scores in dbNSFP. 
92	gMVP_score: A pathogenicity prediction score for missense variants using a graph attention neural network model.
		The range of gMVP score is from 0 to 1. The larger the score, the more likely the variant is 
		pathogenic. For details see https://www.nature.com/articles/s42256-022-00561-w
		Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
93	gMVP_rankscore: gMVP scores were ranked among all gMVP scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of gMVP scores in dbNSFP. 
94	MPC_score: A deleteriousness prediction score for missense variants based on regional missense
		constraint. The range of MPC score is 0 to 5. The larger the score, the more likely the variant is 
		pathogenic. For details see http://dx.doi.org/10.1101/148353
		Multiple entries are separated by ";", corresponding to Ensembl_transcriptid.
95	MPC_rankscore: MPC scores were ranked among all MPC scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of MPC scores in dbNSFP. 
96	PrimateAI_score: A pathogenicity prediction score for missense variants based on common variants of
		non-human primate species using a deep neural network. The range of PrimateAI score is 0 to 1. 
		The larger the score, the more likely the variant is pathogenic. The authors suggest a threshold
		of 0.803 for separating damaging vs tolerant variants. 
		Details see https://doi.org/10.1038/s41588-018-0167-z
97	PrimateAI_rankscore: PrimateAI scores were ranked among all PrimateAI scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of PrimateAI scores in dbNSFP. 
98	PrimateAI_pred: Prediction of PrimateAI score based on the authors' recommendation, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0.803.
99	DEOGEN2_score: A deleteriousness prediction score "which incorporates heterogeneous information about
		the molecular effects of the variants, the domains involved, the relevance of the gene and the 
		interactions in which it participates". It ranges from 0 to 1. The larger the score, the more 
		likely the variant is deleterious. The authors suggest a threshold of 0.5 for separating damaging 
		vs tolerant variants.
100	DEOGEN2_rankscore: DEOGEN2 scores were ranked among all DEOGEN2 scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of DEOGEN2 scores in dbNSFP. 
101	DEOGEN2_pred: Prediction of DEOGEN2 score based on the authors' recommendation, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0.5.
102	BayesDel_addAF_score: A deleteriousness prediction meta-score for SNVs and indels with inclusion of MaxAF.
		See https://doi.org/10.1002/humu.23158 for details. The range of the score in dbNSFP is from -1.11707 to 
		0.750927. The higher the score, the more likely the variant is pathogenic. The author suggested cutoff 
		between deleterious ("D") and tolerated ("T") is 0.0692655.
103	BayesDel_addAF_rankscore: BayesDel_addAF scores were ranked among all BayesDel_addAF scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of BayesDel_addAF scores in dbNSFP. 
104	BayesDel_addAF_pred: Prediction of BayesDel_addAF score based on the authors' recommendation, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is 0.0692655.
105	BayesDel_noAF_score: A deleteriousness prediction meta-score for SNVs and indels without inclusion of MaxAF.
		See https://doi.org/10.1002/humu.23158 for details. The range of the score in dbNSFP is from -1.31914 to 0.840878.
		The higher the score, the more likely the variant is pathogenic. The author-suggested cutoff between
		deleterious ("D") and tolerated ("T") is -0.0570105.
106	BayesDel_noAF_rankscore: BayesDel_noAF scores were ranked among all BayesDel_noAF scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of BayesDel_noAF scores in dbNSFP.
107	BayesDel_noAF_pred: Prediction of BayesDel_noAF score based on the authors' recommendation, "T(olerated)" or
		"D(amaging)". The score cutoff between "D" and "T" is -0.0570105.
108	ClinPred_score: A deleteriousness prediction meta-score for nonsynonymous SNVs. See https://doi.org/10.1016/j.ajhg.2018.08.005
		for details. The range of the score in dbNSFP is from 0 to 1.
		The higher the score, the more likely the variant is pathogenic. The author-suggested cutoff between
		deleterious ("D") and tolerated ("T") is 0.5.
109	ClinPred_rankscore: ClinPred scores were ranked among all ClinPred scores in dbNSFP. The rankscore is the ratio
		of the rank of the score over the total number of ClinPred scores in dbNSFP.
110	ClinPred_pred: Prediction of ClinPred score based on the authors' recommendation, "T(olerated)" or "D(amaging)".
		The score cutoff between "D" and "T" is 0.5.
111	LIST-S2_score: A deleteriousness prediction score for nonsynonymous SNVs. See https://doi.org/10.1093/nar/gkaa288
		for details. The range of the score in dbNSFP is from 0 to 1.
		The higher the score, the more likely the variant is pathogenic. The author-suggested cutoff between
		deleterious ("D") and tolerated ("T") is 0.85.
112	LIST-S2_rankscore: LIST-S2 scores were ranked among all LIST-S2 scores in dbNSFP. The rankscore is the ratio
		of the rank of the score over the total number of LIST-S2 scores in dbNSFP.
113	LIST-S2_pred: Prediction of LIST-S2 score based on the authors' recommendation, "T(olerated)" or "D(amaging)".
		The score cutoff between "D" and "T" is 0.85.
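		Example: the "_pred" columns above follow one pattern, a raw score compared against a fixed cutoff.
		The sketch below is illustrative only and is not code shipped with dbNSFP. The helper name classify
		is hypothetical. It assumes that "." marks a missing value and that a score exactly equal to the
		cutoff is called "D"; the descriptions above do not state how the boundary is handled.

```python
# Cutoffs copied from the column descriptions above.
CUTOFFS = {
    "PrimateAI": 0.803,
    "DEOGEN2": 0.5,
    "BayesDel_addAF": 0.0692655,
    "BayesDel_noAF": -0.0570105,
    "ClinPred": 0.5,
    "LIST-S2": 0.85,
}


def classify(tool, score_field):
    """Return "D", "T", or "." for one raw score field (hypothetical helper)."""
    if score_field == ".":  # assumption: "." marks a missing value
        return "."
    score = float(score_field)
    # Assumption: a score exactly at the cutoff is treated as "D".
    return "D" if score >= CUTOFFS[tool] else "T"
```

		For instance, classify("ClinPred", "0.73") gives "D" and classify("BayesDel_noAF", "-0.2") gives "T".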
114	VARITY_R_score: VARITY_R scores are pathogenicity prediction scores for rare human missense variants.
		The range of VARITY_R score is from 0 to 1. The larger the score, the more likely the variant is 
		pathogenic. Details see doi: https://doi.org/10.1016/j.ajhg.2021.08.012
115	VARITY_R_rankscore: VARITY_R scores were ranked among all VARITY_R scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of VARITY_R scores in dbNSFP.
116	VARITY_ER_score: VARITY_ER scores are pathogenicity prediction scores for extremely rare human missense variants.
		The range of VARITY_ER score is from 0 to 1. The larger the score, the more likely the variant is 
		pathogenic. Details see doi: https://doi.org/10.1016/j.ajhg.2021.08.012
117	VARITY_ER_rankscore: VARITY_ER scores were ranked among all VARITY_ER scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of VARITY_ER scores in dbNSFP.
118	VARITY_R_LOO_score: "Same as VARITY_R except the prediction on the variants used for training was made using
		Leave-One-Variant out." Details see doi: https://doi.org/10.1016/j.ajhg.2021.08.012
119	VARITY_R_LOO_rankscore: VARITY_R_LOO scores were ranked among all VARITY_R_LOO scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of VARITY_R_LOO scores in dbNSFP.
120	VARITY_ER_LOO_score: "Same as VARITY_ER except the prediction on the variants used for training was made using
		Leave-One-Variant out." Details see https://doi.org/10.1016/j.ajhg.2021.08.012
121	VARITY_ER_LOO_rankscore: VARITY_ER_LOO scores were ranked among all VARITY_ER_LOO scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of VARITY_ER_LOO scores in dbNSFP.
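		Example: every "_rankscore" column is described as the rank of a score divided by the total number
		of scores of that type. A minimal sketch of that ratio follows. The function name rankscores is
		hypothetical, and giving tied scores their average rank is an assumption; the descriptions here do
		not say how ties are ranked.

```python
def rankscores(scores):
    """Rank each score among all scores (ascending) and divide by the count."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    result = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        # 1-based average rank of the tie group (assumption about tie handling).
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank / n
        i = j + 1
    return result
```

		With this definition the largest score always receives a rankscore of 1, which makes rankscores
		comparable across tools whose raw scores use different scales.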
122	ESM1b_score: ESM1b scores are log-likelihood ratio (LLR) scores for predicting the pathogenic effects of coding
		variants based on a 650-million-parameter protein language model, ESM1b. The range of ESM1b score in dbNSFP 
		is from -24.538 to 6.937. The smaller the score, the more likely the variant is pathogenic. 
		Details see doi: https://doi.org/10.1038/s41588-023-01465-0
123	ESM1b_rankscore: ESM1b scores were first negated (i.e., -ESM1b_score), then ranked among all -ESM1b_score values
		in dbNSFP. The rankscore is the ratio of the rank of the -ESM1b_score over the total number of ESM1b scores in dbNSFP.
124	ESM1b_pred: The authors do not recommend a threshold for separating deleterious (D) variants versus tolerated (T) variants.
		This prediction is based on the threshold of -7.5 described in their paper that yields a true-positive rate of 81% 
		and a true-negative rate of 82% in their ClinVar and HGMD test datasets.
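		Example: ESM1b runs in the opposite direction from most scores in this file (smaller means more
		likely pathogenic), which is why it is negated before ranking. The sketch below illustrates both
		the negation and the -7.5 threshold. The function names are hypothetical, and calling only scores
		strictly below the threshold "D" is an assumption.

```python
ESM1B_THRESHOLD = -7.5  # threshold quoted in the ESM1b_pred description


def esm1b_pred(score):
    """Return "D" or "T" for one ESM1b log-likelihood ratio score."""
    # Assumption: only scores strictly below the threshold are called "D".
    return "D" if score < ESM1B_THRESHOLD else "T"


def negate_for_ranking(scores):
    """Negate scores so that more pathogenic variants get larger values."""
    return [-s for s in scores]
```

		After negation, a larger value means more likely pathogenic, so the resulting rankscore points in
		the same direction as the other rankscores.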
125	EVE_score: EVE is an unsupervised model designed to predict the clinical relevance of human single amino acid variants
		by examining the sequences of various organisms throughout evolutionary history. The EVE score ranges from 0 to 1.
		The larger the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1038/s41586-021-04043-8.
126	EVE_rankscore: EVE scores were ranked among all EVE scores in dbNSFP. The rankscore is the ratio of the rank of the
		score over the total number of EVE scores in dbNSFP.
127	EVE_Class10_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 10% as uncertain in a
		Gaussian mixture model.
128	EVE_Class20_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 20% as uncertain in a
		Gaussian mixture model.
129	EVE_Class25_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 25% as uncertain in a
		Gaussian mixture model.
130	EVE_Class30_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 30% as uncertain in a
		Gaussian mixture model.
131	EVE_Class40_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 40% as uncertain in a
		Gaussian mixture model.
132	EVE_Class50_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 50% as uncertain in a
		Gaussian mixture model.
133	EVE_Class60_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 60% as uncertain in a
		Gaussian mixture model.
134	EVE_Class70_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 70% as uncertain in a
		Gaussian mixture model.
135	EVE_Class75_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 75% as uncertain in a
		Gaussian mixture model.
136	EVE_Class80_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 80% as uncertain in a
		Gaussian mixture model.
137	EVE_Class90_pred: The EVE classification (B)enign, (U)ncertain, or (P)athogenic when setting 90% as uncertain in a
		Gaussian mixture model.
138	AlphaMissense_score: AlphaMissense is an unsupervised model for predicting the pathogenicity of human missense variants
		by incorporating structural context of an AlphaFold-derived system. The AlphaMissense score ranges from 0 to 1. 
		The larger the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1126/science.adg7492.
		License information: Copyright (2023) DeepMind Technologies Limited. All materials are licensed under the Creative 
		Commons Attribution 4.0 International License (CC-BY) (the “License”).  You may obtain a copy of the License at: 
		https://creativecommons.org/licenses/by/4.0/legalcode.
139	AlphaMissense_rankscore: AlphaMissense scores were ranked among all AlphaMissense scores in dbNSFP. The rankscore
		is the ratio of the rank of the AlphaMissense_score over the total number of scores in dbNSFP.
140	AlphaMissense_pred: The AlphaMissense classification of likely (B)enign, (A)mbiguous, or likely (P)athogenic with
		90% expected precision estimated from ClinVar for likely benign and likely pathogenic classes.
141	PHACTboost_score: "PHACTboost is a gradient boosting tree based classifier that combines PHACT scores with
		information from multiple sequence alignment, phylogenetic trees, and ancestral reconstruction." The range of the score
		is from 0 to 1; the larger the score, the more likely the variant is pathogenic. Details see
		https://doi.org/10.1093/molbev/msae136. The authors recommend using 0.62 as the cutoff for binary prediction
		(personal communication).
142	PHACTboost_rankscore: PHACTboost scores were ranked among all PHACTboost scores in dbNSFP. The rankscore
		is the ratio of the rank of the PHACTboost_score over the total number of scores in dbNSFP.
143	MutFormer_score: "MutFormer is an application of the BERT (Bidirectional Encoder Representations from Transformers)
		NLP (Natural Language Processing) model with an added adaptive vocabulary to protein context, for the purpose of 
		predicting the effect of missense mutations on protein function."  The range of the score is from 0 to 1; the larger
		the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1016/j.xinn.2023.100487. The
		authors recommend using 0.8838 as the cutoff for binary prediction (personal communication).
144	MutFormer_rankscore: MutFormer scores were ranked among all MutFormer scores in dbNSFP. The rankscore
		is the ratio of the rank of the MutFormer_score over the total number of scores in dbNSFP.
145	MutScore_score: MutScore is an ensemble score which integrates multiple unsupervised scores for DNA substitutions
		with additional positional clustering information. The range of the score is from 0 to 1; the larger
		the score, the more likely the variant is pathogenic. Details see https://doi.org/10.1016/j.ajhg.2022.01.006.
		The authors recommend using 0.5 as the cutoff for binary prediction (personal communication).
146	MutScore_rankscore: MutScore scores were ranked among all MutScore scores in dbNSFP. The rankscore
		is the ratio of the rank of the MutScore_score over the total number of scores in dbNSFP.
147	Aloft_Fraction_transcripts_affected: the fraction of the transcripts of the gene affected
		i.e. No. of transcripts affected by the SNP/Total no. of protein_coding transcripts for the gene
		multiple values separated by ";", corresponding to Ensembl_proteinid.
148	Aloft_prob_Tolerant: Probability of the SNP being classified as benign by ALoFT
		multiple values separated by ";", corresponding to Ensembl_proteinid.
149	Aloft_prob_Recessive: Probability of the SNP being classified as recessive disease-causing by ALoFT
		multiple values separated by ";", corresponding to Ensembl_proteinid.
150	Aloft_prob_Dominant:  Probability of the SNP being classified as dominant disease-causing by ALoFT
		multiple values separated by ";", corresponding to Ensembl_proteinid.
151	Aloft_pred: final classification predicted by ALoFT;
		values can be Tolerant, Recessive or Dominant
		multiple values separated by ";", corresponding to Ensembl_proteinid.
152	Aloft_Confidence: Confidence level of Aloft_pred;
		values can be "High Confidence" (p < 0.05) or "Low Confidence" (p > 0.05)
		multiple values separated by ";", corresponding to Ensembl_proteinid.
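		Example: the Aloft columns hold one value per protein, joined by ";". A minimal parsing sketch
		follows. The function names are hypothetical. It assumes the values line up by position with the
		";"-separated entries of Ensembl_proteinid and that "." marks a missing value.

```python
def split_multi(field, convert=str):
    """Split a ";"-separated field; "." entries become None (assumption)."""
    return [None if v == "." else convert(v) for v in field.split(";")]


def aloft_by_protein(protein_ids, values, convert=float):
    """Pair each Ensembl protein ID with its Aloft value, by position."""
    ids = split_multi(protein_ids)
    vals = split_multi(values, convert)
    if len(ids) != len(vals):
        raise ValueError("field lengths differ; cannot pair by position")
    return dict(zip(ids, vals))
```

		The same helper can pair Aloft_pred or Aloft_Confidence with protein IDs by passing convert=str.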
153	CADD_raw: CADD raw score for functional prediction of a SNP. Please refer to Kircher et al.
		(2014) Nature Genetics 46(3):310-5 for details. The larger the score the more likely
		the SNP has damaging effect. Scores range from -28.377575 to 25.511592 in dbNSFP. 
		Please note the following copyright statement for CADD: 
		"CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of 
		Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are 
		freely available for all academic, non-commercial applications. For commercial 
		licensing information contact Jennifer McCullar (mccullaj@uw.edu)."
154	CADD_raw_rankscore: CADD raw scores were ranked among all CADD raw scores in dbNSFP. The
		rankscore is the ratio of the rank of the score over the total number of CADD 
		raw scores in dbNSFP. Please note the following copyright statement for CADD: "CADD 
		scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington 
		and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely 
		available for all academic, non-commercial applications. For commercial licensing 
		information contact Jennifer McCullar (mccullaj@uw.edu)."
155	CADD_phred: CADD phred-like score. This is a phred-like rank score based on whole genome
		CADD raw scores. Please refer to Kircher et al. (2014) Nature Genetics 46(3):310-5 
		for details. The larger the score the more likely the SNP has damaging effect. 
		Please note the following copyright statement for CADD: "CADD scores 
		(http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and 
		Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely 
		available for all academic, non-commercial applications. For commercial licensing 
		information contact Jennifer McCullar (mccullaj@uw.edu)."
156	CADD_raw_hg19: CADD raw score for functional prediction of a SNP using the hg19 model. 
		Please refer to Kircher et al. (2014) Nature Genetics 46(3):310-5 for details. The 
		larger the score the more likely the SNP has damaging effect. Scores range from 
		-9.777566 to 30.05914 in dbNSFP. Please note the following copyright statement for CADD: 
		"CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of 
		Washington and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are 
		freely available for all academic, non-commercial applications. For commercial 
		licensing information contact Jennifer McCullar (mccullaj@uw.edu)."
157	CADD_raw_rankscore_hg19: CADD raw scores were ranked among all CADD_raw_hg19 scores in dbNSFP. The
		rankscore is the ratio of the rank of the score over the total number of CADD 
		raw scores in dbNSFP. Please note the following copyright statement for CADD: "CADD 
		scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington 
		and Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely 
		available for all academic, non-commercial applications. For commercial licensing 
		information contact Jennifer McCullar (mccullaj@uw.edu)."
158	CADD_phred_hg19: CADD phred-like score using the hg19 model. This is a phred-like rank score
		based on whole genome CADD raw scores. Please refer to Kircher et al. (2014) Nature Genetics 
		46(3):310-5 for details. The larger the score the more likely the SNP has damaging effect. 
		Please note the following copyright statement for CADD: "CADD scores 
		(http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and 
		Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely 
		available for all academic, non-commercial applications. For commercial licensing 
		information contact Jennifer McCullar (mccullaj@uw.edu)."
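		Example: a "phred-like rank score" is conventionally computed as -10*log10(rank/total), where rank 1
		is the most deleterious raw score among all scored variants. Under that convention a value of 10
		corresponds to the top 10% of raw scores and 20 to the top 1%. The sketch below only illustrates
		this arithmetic, assuming that convention applies here; the function name phred_like is
		hypothetical, and the values in dbNSFP come from CADD itself, not from this code.

```python
import math


def phred_like(rank, total):
    """Scaled score for the variant with the given rank (1 = most deleterious)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return -10.0 * math.log10(rank / total)
```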
159	DANN_score: DANN is a functional prediction score retrained based on the training data
		of CADD using a deep neural network. Scores range from 0 to 1. A larger number indicates
		a higher probability of being damaging. More information about this score can be found in
		doi: 10.1093/bioinformatics/btu703.
160	DANN_rankscore: DANN scores were ranked among all DANN scores in dbNSFP. The rankscore is
		the ratio of the rank of the score over the total number of DANN scores in dbNSFP.
161	fathmm-MKL_coding_score: fathmm-MKL p-values. Scores range from 0 to 1. SNVs with scores >0.5
		are predicted to be deleterious, and those <0.5 are predicted to be neutral or benign.
		Scores close to 0 or 1 have the highest confidence. Coding scores are trained using 10
		groups of features. More details of the score can be found in
		doi: 10.1093/bioinformatics/btv009.
162	fathmm-MKL_coding_rankscore: fathmm-MKL coding scores were ranked among all fathmm-MKL coding
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of fathmm-MKL coding scores in dbNSFP.
163	fathmm-MKL_coding_pred: If a fathmm-MKL_coding_score is >0.5 (or rankscore >0.28317) 
		the corresponding nsSNV is predicted as "D(AMAGING)"; otherwise it is predicted as "N(EUTRAL)".
164	fathmm-MKL_coding_group: the groups of features (labeled A-J) used to obtain the score. More
		details can be found in doi: 10.1093/bioinformatics/btv009.
165	fathmm-XF_coding_score: fathmm-XF p-values. Scores range from 0 to 1. SNVs with scores >0.5
		are predicted to be deleterious, and those <0.5 are predicted to be neutral or benign.
		Scores close to 0 or 1 have the highest confidence. Coding scores are trained using 10
		groups of features. More details of the score can be found in
		doi: 10.1093/bioinformatics/btx536.
166	fathmm-XF_coding_rankscore: fathmm-XF coding scores were ranked among all fathmm-XF coding
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of fathmm-XF coding scores in dbNSFP.
167	fathmm-XF_coding_pred: If a fathmm-XF_coding_score is >0.5, the corresponding nsSNV is predicted
		as "D(AMAGING)"; otherwise it is predicted as "N(EUTRAL)".
168	Eigen-raw_coding: Eigen score for coding SNVs. A functional prediction score based on conservation,
		allele frequencies, and deleteriousness prediction using an unsupervised learning method 
		(doi: 10.1038/ng.3477). 
169	Eigen-raw_coding_rankscore: Eigen-raw scores were ranked among all Eigen-raw scores in dbNSFP. The rankscore
		is the ratio of the rank of the score over the total number of Eigen-raw scores in dbNSFP.
170	Eigen-phred_coding: Eigen score in phred scale.
171	Eigen-PC-raw_coding: Eigen PC score for genome-wide SNVs. A functional prediction score based on
		conservation, allele frequencies, deleteriousness prediction (for missense SNVs) and
		epigenomic signals (for synonymous and non-coding SNVs) using an unsupervised learning 
		method (doi: 10.1038/ng.3477). 
172	Eigen-PC-raw_coding_rankscore: Eigen-PC-raw scores were ranked among all Eigen-PC-raw scores in
		dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of Eigen-PC-raw scores in dbNSFP.
173	Eigen-PC-phred_coding: Eigen PC score in phred scale.
174	GenoCanyon_score: A functional prediction score based on conservation and biochemical
		annotations using an unsupervised statistical learning. (doi:10.1038/srep10576)
175	GenoCanyon_rankscore: GenoCanyon_score scores were ranked among all GenoCanyon_score
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number
		of GenoCanyon_score scores in dbNSFP.
176	integrated_fitCons_score: fitCons score predicts the fraction of genomic positions belonging to
		a specific function class (defined by epigenomic "fingerprint") that are under selective
		pressure. Scores range from 0 to 1, with a larger score indicating that a higher proportion of
		nucleic sites of the functional class the genomic position belongs to are under selective
		pressure, and that the position is therefore more likely to be functionally important. Integrated (i6) scores are
		integrated across three cell types (GM12878, H1-hESC and HUVEC). More details can be found
		in doi:10.1038/ng.3196.
177	integrated_fitCons_rankscore: integrated fitCons scores were ranked among all integrated fitCons
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of integrated fitCons scores in dbNSFP.
178	integrated_confidence_value: 0 - highly significant scores (approx. p<.003); 1 - significant scores
		(approx. p<.05); 2 - informative scores (approx. p<.25); 3 - other scores (approx. p>=.25).
179	GM12878_fitCons_score: fitCons score predicts the fraction of genomic positions belonging to
		a specific function class (defined by epigenomic "fingerprint") that are under selective
		pressure. Scores range from 0 to 1, with a larger score indicating that a higher proportion of
		nucleic sites of the functional class the genomic position belongs to are under selective
		pressure, and that the position is therefore more likely to be functionally important. GM12878 fitCons scores are
		based on cell type GM12878. More details can be found in doi:10.1038/ng.3196.
180	GM12878_fitCons_rankscore: GM12878 fitCons scores were ranked among all GM12878 fitCons
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of GM12878 fitCons scores in dbNSFP.
181	GM12878_confidence_value: 0 - highly significant scores (approx. p<.003); 1 - significant scores
		(approx. p<.05); 2 - informative scores (approx. p<.25); 3 - other scores (approx. p>=.25).
182	H1-hESC_fitCons_score: fitCons score predicts the fraction of genomic positions belonging to
		a specific function class (defined by epigenomic "fingerprint") that are under selective
		pressure. Scores range from 0 to 1, with a larger score indicating that a higher proportion of
		nucleic sites of the functional class the genomic position belongs to are under selective
		pressure, and that the position is therefore more likely to be functionally important. H1-hESC fitCons scores are
		based on cell type H1-hESC. More details can be found in doi:10.1038/ng.3196.
183	H1-hESC_fitCons_rankscore: H1-hESC fitCons scores were ranked among all H1-hESC fitCons
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of H1-hESC fitCons scores in dbNSFP.
184	H1-hESC_confidence_value: 0 - highly significant scores (approx. p<.003); 1 - significant scores
		(approx. p<.05); 2 - informative scores (approx. p<.25); 3 - other scores (approx. p>=.25).
185	HUVEC_fitCons_score: fitCons score predicts the fraction of genomic positions belonging to
		a specific function class (defined by epigenomic "fingerprint") that are under selective
		pressure. Scores range from 0 to 1, with a larger score indicating that a higher proportion of
		nucleic sites of the functional class the genomic position belongs to are under selective
		pressure, and that the position is therefore more likely to be functionally important. HUVEC fitCons scores are
		based on cell type HUVEC. More details can be found in doi:10.1038/ng.3196.
186	HUVEC_fitCons_rankscore: HUVEC fitCons scores were ranked among all HUVEC fitCons
		scores in dbNSFP. The rankscore is the ratio of the rank of the score over the total number 
		of HUVEC fitCons scores in dbNSFP.
187	HUVEC_confidence_value: 0 - highly significant scores (approx. p<.003); 1 - significant scores
		(approx. p<.05); 2 - informative scores (approx. p<.25); 3 - other scores (approx. p>=.25).
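		Example: the four fitCons "_confidence_value" columns share the same 0-3 coding. A small lookup
		sketch for turning the codes into labels follows. The names FITCONS_CONFIDENCE and
		describe_confidence are hypothetical, and treating "." as a missing value is an assumption.

```python
# Labels copied from the confidence_value descriptions above.
FITCONS_CONFIDENCE = {
    0: "highly significant (approx. p<.003)",
    1: "significant (approx. p<.05)",
    2: "informative (approx. p<.25)",
    3: "other (approx. p>=.25)",
}


def describe_confidence(field):
    """Translate a confidence_value field into its label, or None if missing."""
    if field == ".":  # assumption: "." marks a missing value
        return None
    return FITCONS_CONFIDENCE[int(field)]
```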
188	LINSIGHT: "The LINSIGHT score measures the probability of negative selection on noncoding sites"
		Details refer to doi:10.1038/ng.3810. 
189	LINSIGHT_rankscore: LINSIGHT scores were ranked among all LINSIGHT scores in dbNSFP. The rankscore
		is the ratio of the rank of the score over the total number of LINSIGHT scores in dbNSFP.
190	GERP++_NR: GERP++ neutral rate
191	GERP++_RS: GERP++ RS score, the larger the score, the more conserved the site. Scores range from
		-12.3 to 6.17.
192	GERP++_RS_rankscore: GERP++ RS scores were ranked among all GERP++ RS scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of GERP++ RS 
		scores in dbNSFP.
193	GERP_91_mammals: GERP conservation score calculated based on multiple sequence alignments of 91 mammals.
194	GERP_91_mammals_rankscore: GERP (91 mammals) scores were ranked among all GERP (91 mammals) scores in dbNSFP.
		The rankscore is the ratio of the rank of the score over the total number of GERP_91_mammals
		scores in dbNSFP.
195	phyloP100way_vertebrate: phyloP (phylogenetic p-values) conservation score based on the
		multiple alignments of 100 vertebrate genomes (including human). The larger the score, 
		the more conserved the site. Scores range from -20.0 to 10.003 in dbNSFP.
196	phyloP100way_vertebrate_rankscore: phyloP100way_vertebrate scores were ranked among all
		phyloP100way_vertebrate scores in dbNSFP. The rankscore is the ratio of the rank of the 
		score over the total number of phyloP100way_vertebrate scores in dbNSFP.
197	phyloP470way_mammalian: phyloP (phylogenetic p-values) conservation score based on the
		multiple alignments of 470 mammalian genomes (including human). The larger the score, 
		the more conserved the site. Scores range from -20 to 11.936 in dbNSFP.
198	phyloP470way_mammalian_rankscore: phyloP470way_mammalian scores were ranked among all
		phyloP470way_mammalian scores in dbNSFP. The rankscore is the ratio of the rank of the 
		score over the total number of phyloP470way_mammalian scores in dbNSFP.
199	phyloP17way_primate: a conservation score based on the 17way alignment primate set.
		The higher the score, the more conserved the site. Scores range from -13.362 to 0.756 in dbNSFP.
200	phyloP17way_primate_rankscore: the rank of the phyloP17way_primate score among
		all phyloP17way_primate scores in dbNSFP.
201	phastCons100way_vertebrate: phastCons conservation score based on the multiple alignments
		of 100 vertebrate genomes (including human). The larger the score, the more conserved 
		the site. Scores range from 0 to 1. 
202	phastCons100way_vertebrate_rankscore: phastCons100way_vertebrate scores were ranked among
		all phastCons100way_vertebrate scores in dbNSFP. The rankscore is the ratio of the rank 
		of the score over the total number of phastCons100way_vertebrate scores in dbNSFP.
203	phastCons470way_mammalian: phastCons conservation score based on the multiple alignments
		of 470 mammalian genomes (including human). The larger the score, the more conserved 
		the site. Scores range from 0 to 1. 
204	phastCons470way_mammalian_rankscore: phastCons470way_mammalian scores were ranked among
		all phastCons470way_mammalian scores in dbNSFP. The rankscore is the ratio of the rank 
		of the score over the total number of phastCons470way_mammalian scores in dbNSFP.
205	phastCons17way_primate: a conservation score based on the 17way alignment primate set.
		The larger the score, the more conserved the site. Scores range from 0 to 1.
206	phastCons17way_primate_rankscore: the rank of the phastCons17way_primate score among
		all phastCons17way_primate scores in dbNSFP.
207	SiPhy_29way_pi: The estimated stationary distribution of A, C, G and T at the site,
		using the SiPhy algorithm based on 29 mammalian genomes.
208	SiPhy_29way_logOdds: SiPhy score based on 29 mammalian genomes. The larger the score,
		the more conserved the site. Scores range from 0 to 37.9718 in dbNSFP.
209	SiPhy_29way_logOdds_rankscore: SiPhy_29way_logOdds scores were ranked among all
		SiPhy_29way_logOdds scores in dbNSFP. The rankscore is the ratio of the rank 
		of the score over the total number of SiPhy_29way_logOdds scores in dbNSFP.
210	bStatistic: Background selection (B) value estimates from https://doi.org/10.1371/journal.pgen.1000471.
		Ranges from 0 to 1000. It estimates the expected fraction (*1000) of neutral diversity present
		at a site. Values close to 0 represent near-complete removal of diversity as a result of
		background selection, and values near 1000 indicate the absence of background selection.
		Data from CADD v1.4.
211	bStatistic_converted_rankscore: bStatistic scores were first converted to -bStatistic, then 
		ranked among all -bStatistic scores in dbNSFP. The rankscore is the ratio of the rank of 
		-bStatistic over the total number of -bStatistic scores in dbNSFP.
212	1000Gp3_AC: Alternative allele counts in the whole 1000 genomes phase 3 (1000Gp3) data.
213	1000Gp3_AF: Alternative allele frequency in the whole 1000Gp3 data.
214	1000Gp3_AFR_AC: Alternative allele counts in the 1000Gp3 African descendent samples.
215	1000Gp3_AFR_AF: Alternative allele frequency in the 1000Gp3 African descendent samples.
216	1000Gp3_EUR_AC: Alternative allele counts in the 1000Gp3 European descendent samples.
217	1000Gp3_EUR_AF: Alternative allele frequency in the 1000Gp3 European descendent samples.
218	1000Gp3_AMR_AC: Alternative allele counts in the 1000Gp3 American descendent samples.
219	1000Gp3_AMR_AF: Alternative allele frequency in the 1000Gp3 American descendent samples.
220	1000Gp3_EAS_AC: Alternative allele counts in the 1000Gp3 East Asian descendent samples.
221	1000Gp3_EAS_AF: Alternative allele frequency in the 1000Gp3 East Asian descendent samples.
222	1000Gp3_SAS_AC: Alternative allele counts in the 1000Gp3 South Asian descendent samples.
223	1000Gp3_SAS_AF: Alternative allele frequency in the 1000Gp3 South Asian descendent samples.
224	TWINSUK_AC: Alternative allele count in called genotypes in UK10K TWINSUK cohort.
225	TWINSUK_AF: Alternative allele frequency in called genotypes in UK10K TWINSUK cohort.
226	ALSPAC_AC: Alternative allele count in called genotypes in UK10K ALSPAC cohort.
227	ALSPAC_AF: Alternative allele frequency in called genotypes in UK10K ALSPAC cohort.
228	UK10K_AC: Alternative allele count in combined genotypes in UK10K cohort (TWINSUK+ALSPAC).
229	UK10K_AF: Alternative allele frequency in combined genotypes in UK10K cohort (TWINSUK+ALSPAC).
230	ESP6500_AA_AC: Alternative allele count in the African American samples of the
		NHLBI GO Exome Sequencing Project (ESP6500 data set).
231	ESP6500_AA_AF: Alternative allele frequency in the African American samples of the
		NHLBI GO Exome Sequencing Project (ESP6500 data set).
232	ESP6500_EA_AC: Alternative allele count in the European American samples of the
		NHLBI GO Exome Sequencing Project (ESP6500 data set).
233	ESP6500_EA_AF: Alternative allele frequency in the European American samples of the
		NHLBI GO Exome Sequencing Project (ESP6500 data set).
234	ExAC_AC: Allele count in total ExAC samples (60,706 samples)
235	ExAC_AF: Allele frequency in total ExAC samples
236	ExAC_Adj_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in total ExAC samples
237	ExAC_Adj_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in total ExAC samples
238	ExAC_AFR_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in African & African American
		ExAC samples
239	ExAC_AFR_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in African & African American
		ExAC samples
240	ExAC_AMR_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in American ExAC samples
241	ExAC_AMR_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in American ExAC samples
242	ExAC_EAS_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in East Asian ExAC samples
243	ExAC_EAS_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in East Asian ExAC samples
244	ExAC_FIN_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in Finnish ExAC samples
245	ExAC_FIN_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in Finnish ExAC samples
246	ExAC_NFE_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in Non-Finnish European ExAC
		samples
247	ExAC_NFE_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in Non-Finnish European ExAC
		samples
248	ExAC_SAS_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in South Asian ExAC samples
249	ExAC_SAS_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in South Asian ExAC samples
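		Example: the "Adjusted" ExAC columns count alleles only from genotype calls with DP >= 10 and
		GQ >= 20. The sketch below illustrates that filter on a toy input. The function name and the
		(alt_allele_count, DP, GQ) tuple layout are hypothetical, and it assumes diploid calls, so each
		passing sample contributes two alleles to the denominator. The values in dbNSFP are taken from
		ExAC as released, not recomputed this way.

```python
def adjusted_ac_af(calls, min_dp=10, min_gq=20):
    """Return (AC, AF) over calls passing the depth and genotype-quality filter.

    calls: iterable of (alt_allele_count, dp, gq), one tuple per sample
    (hypothetical layout for illustration).
    """
    ac = 0
    an = 0
    for alt_count, dp, gq in calls:
        if dp >= min_dp and gq >= min_gq:
            ac += alt_count
            an += 2  # assumption: diploid site, two alleles per passing sample
    af = ac / an if an else None
    return ac, af
```

		For example, with calls (1, 30, 99), (2, 8, 40) and (0, 15, 20), the second call fails the depth
		filter, leaving one alternative allele among four, so the result is (1, 0.25).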
250	ExAC_nonTCGA_AC: Allele count in total ExAC_nonTCGA samples (53,105 samples)
251	ExAC_nonTCGA_AF: Allele frequency in total ExAC_nonTCGA samples
252	ExAC_nonTCGA_Adj_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in total ExAC_nonTCGA samples
253	ExAC_nonTCGA_Adj_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in total ExAC_nonTCGA samples
254	ExAC_nonTCGA_AFR_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in African & African American
		ExAC_nonTCGA samples
255	ExAC_nonTCGA_AFR_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in African & African American
		ExAC_nonTCGA samples
256	ExAC_nonTCGA_AMR_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in American ExAC_nonTCGA samples
257	ExAC_nonTCGA_AMR_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in American ExAC_nonTCGA samples
258	ExAC_nonTCGA_EAS_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in East Asian ExAC_nonTCGA samples
259	ExAC_nonTCGA_EAS_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in East Asian ExAC_nonTCGA samples
260	ExAC_nonTCGA_FIN_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in Finnish ExAC_nonTCGA samples
261	ExAC_nonTCGA_FIN_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in Finnish ExAC_nonTCGA samples
262	ExAC_nonTCGA_NFE_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in Non-Finnish European ExAC_nonTCGA
		samples
263	ExAC_nonTCGA_NFE_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in Non-Finnish European ExAC_nonTCGA
		samples
264	ExAC_nonTCGA_SAS_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in South Asian ExAC_nonTCGA samples
265	ExAC_nonTCGA_SAS_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in South Asian ExAC_nonTCGA samples
266	ExAC_nonpsych_AC: Allele count in total ExAC_nonpsych samples (45,376 samples)
267	ExAC_nonpsych_AF: Allele frequency in total ExAC_nonpsych samples
268	ExAC_nonpsych_Adj_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in total ExAC_nonpsych samples
269	ExAC_nonpsych_Adj_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in total ExAC_nonpsych samples
270	ExAC_nonpsych_AFR_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in African & African American
		ExAC_nonpsych samples
271	ExAC_nonpsych_AFR_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in African & African American
		ExAC_nonpsych samples
272	ExAC_nonpsych_AMR_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in American ExAC_nonpsych samples
273	ExAC_nonpsych_AMR_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in American ExAC_nonpsych samples
274	ExAC_nonpsych_EAS_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in East Asian ExAC_nonpsych samples
275	ExAC_nonpsych_EAS_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in East Asian ExAC_nonpsych samples
276	ExAC_nonpsych_FIN_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in Finnish ExAC_nonpsych samples
277	ExAC_nonpsych_FIN_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in Finnish ExAC_nonpsych samples
278	ExAC_nonpsych_NFE_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in Non-Finnish European ExAC_nonpsych
		samples
279	ExAC_nonpsych_NFE_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in Non-Finnish European ExAC_nonpsych
		samples
280	ExAC_nonpsych_SAS_AC: Adjusted Alt allele counts (DP >= 10 & GQ >= 20) in South Asian ExAC_nonpsych samples
281	ExAC_nonpsych_SAS_AF: Adjusted Alt allele frequency (DP >= 10 & GQ >= 20) in South Asian ExAC_nonpsych samples
282	gnomAD_exomes_flag: information from gnomAD exome data indicating whether the variant falls within a low-complexity
		(lcr), segmental duplication (segdup), or decoy region. The flag can be one of "." for high-quality PASS or not
		reported/polymorphic in gnomAD exomes, "lcr" for within lcr, "segdup" for within segdup, or "decoy" for
		within a decoy region.
283	gnomAD_exomes_AC: Alternative allele count in the whole gnomAD exome samples v4.0.0
284	gnomAD_exomes_AN: Total allele count in the whole gnomAD exome samples v4.0.0
285	gnomAD_exomes_AF: Alternative allele frequency in the whole gnomAD exome samples v4.0.0
286	gnomAD_exomes_nhomalt: Count of individuals with homozygous alternative allele in the whole gnomAD exome samples v4.0.0
287	gnomAD_exomes_POPMAX_AC: Allele count in the population with the maximum AF
288	gnomAD_exomes_POPMAX_AN: Total number of alleles in the population with the maximum AF
289	gnomAD_exomes_POPMAX_AF: Maximum allele frequency across populations (excluding samples of Ashkenazi, Finnish, and indeterminate ancestry)
290	gnomAD_exomes_POPMAX_nhomalt: Count of homozygous individuals in the population with the maximum allele frequency
291	gnomAD_exomes_AFR_AC: Alternative allele count in the African/African American gnomAD exome samples v4.0.0
292	gnomAD_exomes_AFR_AN: Total allele count in the African/African American gnomAD exome samples v4.0.0
293	gnomAD_exomes_AFR_AF: Alternative allele frequency in the African/African American gnomAD exome samples v4.0.0
294	gnomAD_exomes_AFR_nhomalt: Count of individuals with homozygous alternative allele in the African/African American gnomAD exome samples v4.0.0
295	gnomAD_exomes_AMR_AC: Alternative allele count in the Latino gnomAD exome samples v4.0.0
296	gnomAD_exomes_AMR_AN: Total allele count in the Latino gnomAD exome samples v4.0.0
297	gnomAD_exomes_AMR_AF: Alternative allele frequency in the Latino gnomAD exome samples v4.0.0
298	gnomAD_exomes_AMR_nhomalt: Count of individuals with homozygous alternative allele in the Latino gnomAD exome samples v4.0.0
299	gnomAD_exomes_ASJ_AC: Alternative allele count in the Ashkenazi Jewish gnomAD exome samples v4.0.0
300	gnomAD_exomes_ASJ_AN: Total allele count in the Ashkenazi Jewish gnomAD exome samples v4.0.0
301	gnomAD_exomes_ASJ_AF: Alternative allele frequency in the Ashkenazi Jewish gnomAD exome samples v4.0.0
302	gnomAD_exomes_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the Ashkenazi Jewish gnomAD exome samples v4.0.0
303	gnomAD_exomes_EAS_AC: Alternative allele count in the East Asian gnomAD exome samples v4.0.0
304	gnomAD_exomes_EAS_AN: Total allele count in the East Asian gnomAD exome samples v4.0.0
305	gnomAD_exomes_EAS_AF: Alternative allele frequency in the East Asian gnomAD exome samples v4.0.0
306	gnomAD_exomes_EAS_nhomalt: Count of individuals with homozygous alternative allele in the East Asian gnomAD exome samples v4.0.0
307	gnomAD_exomes_FIN_AC: Alternative allele count in the Finnish gnomAD exome samples v4.0.0
308	gnomAD_exomes_FIN_AN: Total allele count in the Finnish gnomAD exome samples v4.0.0
309	gnomAD_exomes_FIN_AF: Alternative allele frequency in the Finnish gnomAD exome samples v4.0.0
310	gnomAD_exomes_FIN_nhomalt: Count of individuals with homozygous alternative allele in the Finnish gnomAD exome samples v4.0.0
311	gnomAD_exomes_MID_AC: Alternative allele count in the Middle Eastern gnomAD exome samples v4.0.0
312	gnomAD_exomes_MID_AN: Total allele count in the Middle Eastern gnomAD exome samples v4.0.0
313	gnomAD_exomes_MID_AF: Alternative allele frequency in the Middle Eastern gnomAD exome samples v4.0.0
314	gnomAD_exomes_MID_nhomalt: Count of individuals with homozygous alternative allele in the Middle Eastern gnomAD exome samples v4.0.0
315	gnomAD_exomes_NFE_AC: Alternative allele count in the Non-Finnish European gnomAD exome samples v4.0.0
316	gnomAD_exomes_NFE_AN: Total allele count in the Non-Finnish European gnomAD exome samples v4.0.0
317	gnomAD_exomes_NFE_AF: Alternative allele frequency in the Non-Finnish European gnomAD exome samples v4.0.0
318	gnomAD_exomes_NFE_nhomalt: Count of individuals with homozygous alternative allele in the Non-Finnish European gnomAD exome samples v4.0.0
319	gnomAD_exomes_SAS_AC: Alternative allele count in the South Asian gnomAD exome samples v4.0.0
320	gnomAD_exomes_SAS_AN: Total allele count in the South Asian gnomAD exome samples v4.0.0
321	gnomAD_exomes_SAS_AF: Alternative allele frequency in the South Asian gnomAD exome samples v4.0.0
322	gnomAD_exomes_SAS_nhomalt: Count of individuals with homozygous alternative allele in the South Asian gnomAD exome samples v4.0.0
323	gnomAD_exomes_non_ukb_AC: Alternative allele count in the non-UKBiobank subset of whole gnomAD exome samples v4.0.0
324	gnomAD_exomes_non_ukb_AN: Total allele count in the non-UKBiobank subset of whole gnomAD exome samples v4.0.0
325	gnomAD_exomes_non_ukb_AF: Alternative allele frequency in the non-UKBiobank subset of whole gnomAD exome samples v4.0.0
326	gnomAD_exomes_non_ukb_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of whole gnomAD exome samples v4.0.0
327	gnomAD_exomes_non_ukb_AFR_AC: Alternative allele count in the non-UKBiobank subset of African/African American gnomAD exome samples v4.0.0
328	gnomAD_exomes_non_ukb_AFR_AN: Total allele count in the non-UKBiobank subset of African/African American gnomAD exome samples v4.0.0
329	gnomAD_exomes_non_ukb_AFR_AF: Alternative allele frequency in the non-UKBiobank subset of African/African American gnomAD exome samples v4.0.0
330	gnomAD_exomes_non_ukb_AFR_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of African/African American gnomAD exome samples v4.0.0
331	gnomAD_exomes_non_ukb_AMR_AC: Alternative allele count in the non-UKBiobank subset of Latino gnomAD exome samples v4.0.0
332	gnomAD_exomes_non_ukb_AMR_AN: Total allele count in the non-UKBiobank subset of Latino gnomAD exome samples v4.0.0
333	gnomAD_exomes_non_ukb_AMR_AF: Alternative allele frequency in the non-UKBiobank subset of Latino gnomAD exome samples v4.0.0
334	gnomAD_exomes_non_ukb_AMR_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of Latino gnomAD exome samples v4.0.0
335	gnomAD_exomes_non_ukb_ASJ_AC: Alternative allele count in the non-UKBiobank subset of Ashkenazi Jewish gnomAD exome samples v4.0.0
336	gnomAD_exomes_non_ukb_ASJ_AN: Total allele count in the non-UKBiobank subset of Ashkenazi Jewish gnomAD exome samples v4.0.0
337	gnomAD_exomes_non_ukb_ASJ_AF: Alternative allele frequency in the non-UKBiobank subset of Ashkenazi Jewish gnomAD exome samples v4.0.0
338	gnomAD_exomes_non_ukb_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of Ashkenazi Jewish gnomAD exome samples v4.0.0
339	gnomAD_exomes_non_ukb_EAS_AC: Alternative allele count in the non-UKBiobank subset of East Asian gnomAD exome samples v4.0.0
340	gnomAD_exomes_non_ukb_EAS_AN: Total allele count in the non-UKBiobank subset of East Asian gnomAD exome samples v4.0.0
341	gnomAD_exomes_non_ukb_EAS_AF: Alternative allele frequency in the non-UKBiobank subset of East Asian gnomAD exome samples v4.0.0
342	gnomAD_exomes_non_ukb_EAS_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of East Asian gnomAD exome samples v4.0.0
343	gnomAD_exomes_non_ukb_FIN_AC: Alternative allele count in the non-UKBiobank subset of Finnish gnomAD exome samples v4.0.0
344	gnomAD_exomes_non_ukb_FIN_AN: Total allele count in the non-UKBiobank subset of Finnish gnomAD exome samples v4.0.0
345	gnomAD_exomes_non_ukb_FIN_AF: Alternative allele frequency in the non-UKBiobank subset of Finnish gnomAD exome samples v4.0.0
346	gnomAD_exomes_non_ukb_FIN_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of Finnish gnomAD exome samples v4.0.0
347	gnomAD_exomes_non_ukb_MID_AC: Alternative allele count in the non-UKBiobank subset of Middle Eastern gnomAD exome samples v4.0.0
348	gnomAD_exomes_non_ukb_MID_AN: Total allele count in the non-UKBiobank subset of Middle Eastern gnomAD exome samples v4.0.0
349	gnomAD_exomes_non_ukb_MID_AF: Alternative allele frequency in the non-UKBiobank subset of Middle Eastern gnomAD exome samples v4.0.0
350	gnomAD_exomes_non_ukb_MID_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of Middle Eastern gnomAD exome samples v4.0.0
351	gnomAD_exomes_non_ukb_NFE_AC: Alternative allele count in the non-UKBiobank subset of Non-Finnish European gnomAD exome samples v4.0.0
352	gnomAD_exomes_non_ukb_NFE_AN: Total allele count in the non-UKBiobank subset of Non-Finnish European gnomAD exome samples v4.0.0
353	gnomAD_exomes_non_ukb_NFE_AF: Alternative allele frequency in the non-UKBiobank subset of Non-Finnish European gnomAD exome samples v4.0.0
354	gnomAD_exomes_non_ukb_NFE_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of Non-Finnish European gnomAD exome samples v4.0.0
355	gnomAD_exomes_non_ukb_SAS_AC: Alternative allele count in the non-UKBiobank subset of South Asian gnomAD exome samples v4.0.0
356	gnomAD_exomes_non_ukb_SAS_AN: Total allele count in the non-UKBiobank subset of South Asian gnomAD exome samples v4.0.0
357	gnomAD_exomes_non_ukb_SAS_AF: Alternative allele frequency in the non-UKBiobank subset of South Asian gnomAD exome samples v4.0.0
358	gnomAD_exomes_non_ukb_SAS_nhomalt: Count of individuals with homozygous alternative allele in the non-UKBiobank subset of South Asian gnomAD exome samples v4.0.0
359	gnomAD_genomes_flag: information from gnomAD genome data indicating whether the variant falls within a low-complexity
		(lcr), segmental duplication (segdup), or decoy region. The flag can be one of "." for high-quality PASS or not
		reported/polymorphic in gnomAD genomes, "lcr" for within lcr, "segdup" for within segdup, or "decoy" for
		within a decoy region.
360	gnomAD_genomes_AC: Alternative allele count in the whole gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
361	gnomAD_genomes_AN: Total allele count in the whole gnomAD genome samples v4.0.0
362	gnomAD_genomes_AF: Alternative allele frequency in the whole gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
363	gnomAD_genomes_nhomalt: Count of individuals with homozygous alternative allele in the whole gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
364	gnomAD_genomes_POPMAX_AC: Allele count in the population with the maximum AF
365	gnomAD_genomes_POPMAX_AN: Total number of alleles in the population with the maximum AF
366	gnomAD_genomes_POPMAX_AF: Maximum allele frequency across populations (excluding samples of Ashkenazi, Finnish, and indeterminate ancestry)
367	gnomAD_genomes_POPMAX_nhomalt: Count of homozygous individuals in the population with the maximum allele frequency
368	gnomAD_genomes_AFR_AC: Alternative allele count in the African/African American gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
369	gnomAD_genomes_AFR_AN: Total allele count in the African/African American gnomAD genome samples v4.0.0
370	gnomAD_genomes_AFR_AF: Alternative allele frequency in the African/African American gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
371	gnomAD_genomes_AFR_nhomalt: Count of individuals with homozygous alternative allele in the African/African American gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
372	gnomAD_genomes_AMI_AC: Alternative allele count in the Amish gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
373	gnomAD_genomes_AMI_AN: Total allele count in the Amish gnomAD genome samples v4.0.0
374	gnomAD_genomes_AMI_AF: Alternative allele frequency in the Amish gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
375	gnomAD_genomes_AMI_nhomalt: Count of individuals with homozygous alternative allele in the Amish gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
376	gnomAD_genomes_AMR_AC: Alternative allele count in the Latino gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
377	gnomAD_genomes_AMR_AN: Total allele count in the Latino gnomAD genome samples v4.0.0
378	gnomAD_genomes_AMR_AF: Alternative allele frequency in the Latino gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
379	gnomAD_genomes_AMR_nhomalt: Count of individuals with homozygous alternative allele in the Latino gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
380	gnomAD_genomes_ASJ_AC: Alternative allele count in the Ashkenazi Jewish gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
381	gnomAD_genomes_ASJ_AN: Total allele count in the Ashkenazi Jewish gnomAD genome samples v4.0.0
382	gnomAD_genomes_ASJ_AF: Alternative allele frequency in the Ashkenazi Jewish gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
383	gnomAD_genomes_ASJ_nhomalt: Count of individuals with homozygous alternative allele in the Ashkenazi Jewish gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
384	gnomAD_genomes_EAS_AC: Alternative allele count in the East Asian gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
385	gnomAD_genomes_EAS_AN: Total allele count in the East Asian gnomAD genome samples v4.0.0
386	gnomAD_genomes_EAS_AF: Alternative allele frequency in the East Asian gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
387	gnomAD_genomes_EAS_nhomalt: Count of individuals with homozygous alternative allele in the East Asian gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
388	gnomAD_genomes_FIN_AC: Alternative allele count in the Finnish gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
389	gnomAD_genomes_FIN_AN: Total allele count in the Finnish gnomAD genome samples v4.0.0
390	gnomAD_genomes_FIN_AF: Alternative allele frequency in the Finnish gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
391	gnomAD_genomes_FIN_nhomalt: Count of individuals with homozygous alternative allele in the Finnish gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
392	gnomAD_genomes_MID_AC: Alternative allele count in the Middle Eastern gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
393	gnomAD_genomes_MID_AN: Total allele count in the Middle Eastern gnomAD genome samples v4.0.0
394	gnomAD_genomes_MID_AF: Alternative allele frequency in the Middle Eastern gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
395	gnomAD_genomes_MID_nhomalt: Count of individuals with homozygous alternative allele in the Middle Eastern gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
396	gnomAD_genomes_NFE_AC: Alternative allele count in the Non-Finnish European gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
397	gnomAD_genomes_NFE_AN: Total allele count in the Non-Finnish European gnomAD genome samples v4.0.0
398	gnomAD_genomes_NFE_AF: Alternative allele frequency in the Non-Finnish European gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
399	gnomAD_genomes_NFE_nhomalt: Count of individuals with homozygous alternative allele in the Non-Finnish European gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
400	gnomAD_genomes_SAS_AC: Alternative allele count in the South Asian gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95") and AC_het ("Allele count restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
401	gnomAD_genomes_SAS_AN: Total allele count in the South Asian gnomAD genome samples v4.0.0
402	gnomAD_genomes_SAS_AF: Alternative allele frequency in the South Asian gnomAD genome samples v4.0.0
		For mtDNA, this is sum of AF_hom ("Allele frequency restricted to variants with a heteroplasmy level >= 0.95") and AF_het ("Allele frequency restricted to variants with a heteroplasmy level >= 0.10 and < 0.95")
403	gnomAD_genomes_SAS_nhomalt: Count of individuals with homozygous alternative allele in the South Asian gnomAD genome samples v4.0.0
		For mtDNA, this is AC_hom ("Allele count restricted to variants with a heteroplasmy level >= 0.95")
404	ALFA_European_AC: Alternative allele count of the European samples in the Allele Frequency Aggregator
405	ALFA_European_AN: Total allele count of the European samples in the Allele Frequency Aggregator
406	ALFA_European_AF: Alternative allele frequency of the European samples in the Allele Frequency Aggregator
407	ALFA_African_Others_AC: Alternative allele count of the individuals with African ancestry in the Allele Frequency Aggregator
408	ALFA_African_Others_AN: Total allele count of the individuals with African ancestry in the Allele Frequency Aggregator
409	ALFA_African_Others_AF: Alternative allele frequency of the individuals with African ancestry in the Allele Frequency Aggregator
410	ALFA_East_Asian_AC: Alternative allele count of the East Asian samples in the Allele Frequency Aggregator
411	ALFA_East_Asian_AN: Total allele count of the East Asian samples in the Allele Frequency Aggregator
412	ALFA_East_Asian_AF: Alternative allele frequency of the East Asian samples in the Allele Frequency Aggregator
413	ALFA_African_American_AC: Alternative allele count of the African American samples in the Allele Frequency Aggregator
414	ALFA_African_American_AN: Total allele count of the African American samples in the Allele Frequency Aggregator
415	ALFA_African_American_AF: Alternative allele frequency of the African American samples in the Allele Frequency Aggregator
416	ALFA_Latin_American_1_AC: Alternative allele count of the Latin American individuals with Afro-Caribbean ancestry in the Allele Frequency Aggregator
417	ALFA_Latin_American_1_AN: Total allele count of the Latin American individuals with Afro-Caribbean ancestry in the Allele Frequency Aggregator
418	ALFA_Latin_American_1_AF: Alternative allele frequency of the Latin American individuals with Afro-Caribbean ancestry in the Allele Frequency Aggregator
419	ALFA_Latin_American_2_AC: Alternative allele count of the Latin American individuals with mostly European and Native American ancestry in the Allele Frequency Aggregator
420	ALFA_Latin_American_2_AN: Total allele count of the Latin American individuals with mostly European and Native American ancestry in the Allele Frequency Aggregator
421	ALFA_Latin_American_2_AF: Alternative allele frequency of the Latin American individuals with mostly European and Native American ancestry in the Allele Frequency Aggregator
422	ALFA_Other_Asian_AC: Alternative allele count of the Asian individuals excluding South or East Asian in the Allele Frequency Aggregator
423	ALFA_Other_Asian_AN: Total allele count of the Asian individuals excluding South or East Asian in the Allele Frequency Aggregator
424	ALFA_Other_Asian_AF: Alternative allele frequency of the Asian individuals excluding South or East Asian in the Allele Frequency Aggregator
425	ALFA_South_Asian_AC: Alternative allele count of the South Asian samples in the Allele Frequency Aggregator
426	ALFA_South_Asian_AN: Total allele count of the South Asian samples in the Allele Frequency Aggregator
427	ALFA_South_Asian_AF: Alternative allele frequency of the South Asian samples in the Allele Frequency Aggregator
428	ALFA_Other_AC: Alternative allele count of the samples whose self-reported population is inconsistent with the GRAF-assigned population in the Allele Frequency Aggregator
429	ALFA_Other_AN: Total allele count of the samples whose self-reported population is inconsistent with the GRAF-assigned population in the Allele Frequency Aggregator
430	ALFA_Other_AF: Alternative allele frequency of the samples whose self-reported population is inconsistent with the GRAF-assigned population in the Allele Frequency Aggregator
431	ALFA_African_AC: Alternative allele count of all African samples (African_Others and African_American) in the Allele Frequency Aggregator
432	ALFA_African_AN: Total allele count of all African samples (African_Others and African_American) in the Allele Frequency Aggregator
433	ALFA_African_AF: Alternative allele frequency of all African samples (African_Others and African_American) in the Allele Frequency Aggregator
434	ALFA_Asian_AC: Alternative allele count of all Asian individuals (East_Asian and Other_Asian, excluding South_Asian) in the Allele Frequency Aggregator
435	ALFA_Asian_AN: Total allele count of all Asian individuals (East_Asian and Other_Asian, excluding South_Asian) in the Allele Frequency Aggregator
436	ALFA_Asian_AF: Alternative allele frequency of all Asian individuals (East_Asian and Other_Asian, excluding South_Asian) in the Allele Frequency Aggregator
437	ALFA_Total_AC: Alternative allele count of the total samples in the Allele Frequency Aggregator
438	ALFA_Total_AN: Total allele count of the total samples in the Allele Frequency Aggregator
439	ALFA_Total_AF: Alternative allele frequency of the total samples in the Allele Frequency Aggregator
440	clinvar_id: clinvar variation ID
441	clinvar_clnsig: clinical significance by clinvar
		Possible values: Benign, Likely_benign, Likely_pathogenic, Pathogenic, drug_response, 
		histocompatibility. A negative score means the score is for the ref allele
442	clinvar_trait: the trait/disease the clinvar_clnsig refers to
443	clinvar_review: ClinVar Review Status summary
		Possible values (separated here by ";" because some values contain commas): no assertion criteria provided;
		criteria provided, single submitter; criteria provided, multiple submitters, no conflicts;
		reviewed by expert panel; practice guideline
444	clinvar_hgvs: variant in HGVS format
445	clinvar_var_source: source of the variant
446	clinvar_MedGen_id: MedGen ID of the trait/disease the clinvar_trait refers to
447	clinvar_OMIM_id: OMIM ID of the trait/disease the clinvar_trait refers to
448	clinvar_Orphanet_id: Orphanet ID of the trait/disease the clinvar_trait refers to
449	Interpro_domain: domain or conserved site in which the variant is located. Domain
		annotations come from the Interpro database. The number in the brackets following
		a specific domain is the count of times Interpro assigns the variant position to
		that domain, typically coming from different predicting databases. Multiple entries
		are separated by ";".
450	GTEx_V8_eQTL_gene: target gene of the (significant) eQTL SNP
451	GTEx_V8_eQTL_tissue: tissue type of the expression data with which the eQTL/gene pair is detected
452	GTEx_V8_sQTL_gene: target gene of the (significant) sQTL SNP
453	GTEx_V8_sQTL_tissue: tissue type of the expression data with which the sQTL/gene pair is detected
454	eQTLGen_snp_id: id of the eQTL SNP
455	eQTLGen_gene_id: id of the target gene of the (significant) eQTL SNP
456	eQTLGen_gene_symbol: symbol of the target gene of the (significant) eQTL SNP
457	eQTLGen_cis_or_trans: eQTL type, cis or trans
458	Geuvadis_eQTL_target_gene: Ensembl gene ID of the gene the eQTL is associated with, from the Geuvadis project
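
	Example (illustrative only): a minimal Python sketch of how the gnomAD flag and
	POPMAX_AF columns described above might be combined into a rarity filter. The helper
	names, the dict-per-row representation, and the 1% threshold are assumptions made for
	illustration and are not part of dbNSFP. The flag is compared against the single values
	listed above; missing values are taken to be ".".

```python
from typing import Mapping, Optional

MISSING = "."  # dbNSFP designates missing data as "."

# Flag values listed in the gnomAD_*_flag column descriptions.
PROBLEM_REGION_FLAGS = {"lcr", "segdup", "decoy"}


def popmax_af(row: Mapping[str, str], source: str = "exomes") -> Optional[float]:
    """Return gnomAD_<source>_POPMAX_AF as a float, or None when it is missing."""
    raw = row.get(f"gnomAD_{source}_POPMAX_AF", MISSING)
    return None if raw == MISSING else float(raw)


def in_problem_region(row: Mapping[str, str], source: str = "exomes") -> bool:
    """True when the gnomAD flag marks a low-complexity, segdup, or decoy region."""
    return row.get(f"gnomAD_{source}_flag", MISSING) in PROBLEM_REGION_FLAGS


def is_rare(row: Mapping[str, str], threshold: float = 0.01,
            source: str = "exomes") -> bool:
    """Treat a variant as rare when POPMAX_AF is missing or below the threshold.

    The threshold is a hypothetical choice, not a dbNSFP recommendation.
    """
    af = popmax_af(row, source)
    return af is None or af < threshold


example = {"gnomAD_exomes_flag": "segdup", "gnomAD_exomes_POPMAX_AF": "0.0003"}
print(in_problem_region(example), is_rare(example))
```

	Treating a missing POPMAX_AF as rare is a design choice of this sketch: "." can also
	mean the variant was not reported in gnomAD, so callers may prefer to handle None separately.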


	
	Note 1: Missing data is designated as '.'. 
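
	Example (illustrative only): a minimal Python sketch of handling the '.' missing-data
	marker when reading the count and frequency columns. It assumes a row is available as a
	mapping from column name to the raw string value, and that AF is expected to equal AC/AN
	for column groups that provide all three (for example gnomAD_exomes or ALFA_Total); the
	function names and the tolerance are hypothetical. For mtDNA rows of the gnomAD genome
	columns, AC and AF are sums of the hom and het components as described above, so the
	check may not apply there.

```python
import math
from typing import Mapping, Optional

MISSING = "."  # Note 1: missing data is designated as '.'


def to_number(value: str) -> Optional[float]:
    """Convert a raw column value to float, mapping the '.' marker to None."""
    value = value.strip()
    return None if value == MISSING else float(value)


def af_matches_counts(row: Mapping[str, str], prefix: str,
                      rel_tol: float = 1e-3) -> Optional[bool]:
    """Check <prefix>_AF against <prefix>_AC / <prefix>_AN.

    Returns None when any of the three values is missing or AN is zero.
    The relative tolerance allows for rounding of the stored frequency
    (an assumption of this sketch).
    """
    ac = to_number(row.get(f"{prefix}_AC", MISSING))
    an = to_number(row.get(f"{prefix}_AN", MISSING))
    af = to_number(row.get(f"{prefix}_AF", MISSING))
    if ac is None or an is None or af is None or an == 0:
        return None
    return math.isclose(ac / an, af, rel_tol=rel_tol, abs_tol=1e-12)


row = {"gnomAD_exomes_AC": "3", "gnomAD_exomes_AN": "1200", "gnomAD_exomes_AF": "0.0025"}
print(af_matches_counts(row, "gnomAD_exomes"))
```

	The ExAC columns list only AC and AF, so this check returns None for them.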

Columns of dbNSFP_gene:
	Gene_name: Gene symbol from HGNC
	Ensembl_gene: Ensembl gene id (from HGNC)
	chr: Chromosome number (from HGNC)
459	Gene_old_names: Old gene symbol (from HGNC)
460	Gene_other_names: Other gene names (from HGNC)
461	Uniprot_acc(HGNC/Uniprot): Uniprot acc number (from HGNC and Uniprot)
462	Uniprot_id(HGNC/Uniprot): Uniprot id (from HGNC and Uniprot)
463	Entrez_gene_id: Entrez gene id (from HGNC)
464	CCDS_id: CCDS id (from HGNC)
465	Refseq_id: Refseq gene id (from HGNC)
466	ucsc_id: UCSC gene id (from HGNC)
467	MIM_id: MIM gene id (from HGNC)
468	OMIM_id: MIM gene id from OMIM
469	Gene_full_name: Gene full name (from HGNC)
470	Pathway(Uniprot): Pathway description from Uniprot
471	Pathway(BioCarta)_short: Short name of the Pathway(s) the gene belongs to (from BioCarta)
472	Pathway(BioCarta)_full: Full name(s) of the Pathway(s) the gene belongs to (from BioCarta)
473	Pathway(ConsensusPathDB): Pathway(s) the gene belongs to (from ConsensusPathDB)
474	Pathway(KEGG)_id: ID(s) of the Pathway(s) the gene belongs to (from KEGG)
475	Pathway(KEGG)_full: Full name(s) of the Pathway(s) the gene belongs to (from KEGG)
476	Function_description: Function description of the gene (from Uniprot)
477	Disease_description: Disease(s) caused by or associated with the gene (from Uniprot)
478	MIM_phenotype_id: MIM id(s) of the phenotype(s) caused by or associated with the gene (from Uniprot)
479	MIM_disease: MIM disease name(s) with MIM id(s) in "[]" (from Uniprot)
480	Orphanet_disorder_id: Orphanet Number of the disorder caused by or associated with the gene
481	Orphanet_disorder: Disorder name from Orphanet
482	Orphanet_association_type: the type of association between the gene and the disorder
483	Trait_association(GWAS): Trait(s) the gene is associated with (from GWAS catalog)
484	HPO_id: ID of the mapped Human Phenotype Ontology. Multiple IDs are separated by ";"
485	HPO_name: Name of the mapped Human Phenotype Ontology. Multiple names are separated by ";"
486	GO_biological_process: GO terms for biological process
487	GO_cellular_component: GO terms for cellular component
488	GO_molecular_function: GO terms for molecular function
489	Tissue_specificity(Uniprot): Tissue specificity description from Uniprot
490	Expression(egenetics): Tissues/organs the gene is expressed in (egenetics data from BioMart)
491	Expression(GNF/Atlas): Tissues/organs the gene is expressed in (GNF/Atlas data from BioMart)
492	Interactions(IntAct): The number of other genes this gene interacts with (from IntAct).
		Full information (gene name followed by Pubmed id in "[]") can be found in the ".complete"
		table
493	Interactions(BioGRID): The number of other genes this gene interacts with (from BioGRID).
		Full information (gene name followed by Pubmed id in "[]") can be found in the ".complete"
		table
494	Interactions(ConsensusPathDB): The number of other genes this gene interacts with
		(from ConsensusPathDB). Full information (gene name followed by interaction confidence in "[]") can be 
		found in the ".complete" table
495	P(HI): Estimated probability of haploinsufficiency of the gene
		(from doi:10.1371/journal.pgen.1001154)
496	HIPred_score: Estimated probability of haploinsufficiency of the gene
		(from doi:10.1093/bioinformatics/btx028)
497	HIPred: HIPred prediction of haploinsufficiency of the gene. Y(es) or N(o).
		(from doi:10.1093/bioinformatics/btx028)
498	GHIS: A score predicting the gene haploinsufficiency. The higher the score the more likely the gene is
		haploinsufficient. (from doi: 10.1093/nar/gkv474) 
499	P(rec): Estimated probability that gene is a recessive disease gene
		(from DOI:10.1126/science.1215040)
500	Known_rec_info: Known recessive status of the gene (from DOI:10.1126/science.1215040)
		"lof-tolerant = seen in homozygous state in at least one 1000G individual"
		"recessive = known OMIM recessive disease" 
		(original annotations from DOI:10.1126/science.1215040)
501	RVIS_EVS: Residual Variation Intolerance Score, a measure of intolerance of mutational burden,
		the higher the score the more tolerant to mutational burden the gene is. Based on EVS (ESP6500) data.
		from doi:10.1371/journal.pgen.1003709
502	RVIS_percentile_EVS: The percentile rank of the gene based on RVIS, the higher the percentile
		the more tolerant to mutational burden the gene is. Based on EVS (ESP6500) data.
503	LoF-FDR_ExAC: "A gene's corresponding FDR p-value for preferential LoF depletion among the ExAC population.
		Lower FDR corresponds with genes that are increasingly depleted of LoF variants." cited from RVIS document.
504	RVIS_ExAC: "ExAC-based RVIS; setting 'common' MAF filter at 0.05% in at least one of the six individual
		ethnic strata from ExAC." cited from RVIS document.
505	RVIS_percentile_ExAC: "Genome-Wide percentile for the new ExAC-based RVIS; setting 'common' MAF filter at 0.05%
		in at least one of the six individual ethnic strata from ExAC." cited from RVIS document.
506	ExAC_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and 
		homozygous lof variants)" based on ExAC r0.3 data
507	ExAC_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants"
		based on ExAC r0.3 data
508	ExAC_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants"
		based on ExAC r0.3 data
509	ExAC_nonTCGA_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and 
		homozygous lof variants)" based on ExAC r0.3 nonTCGA subset
510	ExAC_nonTCGA_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants"
		based on ExAC r0.3 nonTCGA subset
511	ExAC_nonTCGA_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants"
		based on ExAC r0.3 nonTCGA subset
512	ExAC_nonpsych_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and 
		homozygous lof variants)" based on ExAC r0.3 nonpsych subset
513	ExAC_nonpsych_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants"
		based on ExAC r0.3 nonpsych subset
514	ExAC_nonpsych_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants"
		based on ExAC r0.3 nonpsych subset
515	gnomAD_pLI: "the probability of being loss-of-function intolerant (intolerant of both heterozygous and 
		homozygous lof variants)" based on gnomAD 2.1 data
516	gnomAD_pRec: "the probability of being intolerant of homozygous, but not heterozygous lof variants"
		based on gnomAD 2.1 data
517	gnomAD_pNull: "the probability of being tolerant of both heterozygous and homozygous lof variants"
		based on gnomAD 2.1 data
518	ExAC_del.score: "Winsorised deletion intolerance z-score" based on ExAC r0.3.1 CNV data
519	ExAC_dup.score: "Winsorised duplication intolerance z-score" based on ExAC r0.3.1 CNV data
520	ExAC_cnv.score: "Winsorised cnv intolerance z-score" based on ExAC r0.3.1 CNV data
521	ExAC_cnv_flag: "Gene is in a known region of recurrent CNVs mediated by tandem segmental duplications and
		intolerance scores are more likely to be biased or noisy." from ExAC r0.3.1 CNV release
522	GDI: gene damage index score, "a genome-wide, gene-level metric of the mutational damage that has
		accumulated in the general population" from doi: 10.1073/pnas.1518646112. The higher the score
		the less likely the gene is to be responsible for monogenic diseases.
523	GDI-Phred: Phred-scaled GDI scores
524	Gene damage prediction (all disease-causing genes): gene damage prediction (low/medium/high) by GDI
		for all diseases
525	Gene damage prediction (all Mendelian disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for all Mendelian diseases
526	Gene damage prediction (Mendelian AD disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for Mendelian autosomal dominant diseases
527	Gene damage prediction (Mendelian AR disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for Mendelian autosomal recessive diseases
528	Gene damage prediction (all PID disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for all primary immunodeficiency diseases
529	Gene damage prediction (PID AD disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for primary immunodeficiency autosomal dominant diseases
530	Gene damage prediction (PID AR disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for primary immunodeficiency autosomal recessive diseases
531	Gene damage prediction (all cancer disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for all cancer disease
532	Gene damage prediction (cancer recessive disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for cancer recessive disease
533	Gene damage prediction (cancer dominant disease-causing genes): gene damage prediction (low/medium/high)
		by GDI for cancer dominant disease
534	LoFtool_score: a percentile score for gene intolerance to functional change. The lower the score the higher
		the gene's intolerance to functional change. For details see doi: 10.1093/bioinformatics/btv602.
535	SORVA_LOF_MAF0.005_HetOrHom: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Heterozygote or Homozygote of LOF SNVs whose MAF<0.005. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
536	SORVA_LOF_MAF0.005_HomOrCompoundHet: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Compound Heterozygote or Homozygote of LOF SNVs whose MAF<0.005. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
537	SORVA_LOF_MAF0.001_HetOrHom: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Heterozygote or Homozygote of LOF SNVs whose MAF<0.001. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
538	SORVA_LOF_MAF0.001_HomOrCompoundHet: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Compound Heterozygote or Homozygote of LOF SNVs whose MAF<0.001. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
539	SORVA_LOForMissense_MAF0.005_HetOrHom: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Heterozygote or Homozygote of LOF or missense SNVs whose MAF<0.005. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
540	SORVA_LOForMissense_MAF0.005_HomOrCompoundHet: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Compound Heterozygote or Homozygote of LOF or missense SNVs whose MAF<0.005. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
541	SORVA_LOForMissense_MAF0.001_HetOrHom: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Heterozygote or Homozygote of LOF or missense SNVs whose MAF<0.001. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
542	SORVA_LOForMissense_MAF0.001_HomOrCompoundHet: the fraction of individuals in the 1000 Genomes Project data (N=2504)
		who are either Compound Heterozygote or Homozygote of LOF or missense SNVs whose MAF<0.001. This fraction is from 
		a method for ranking genes based on mutational burden called SORVA (Significance Of Rare VAriants). 
		Please see doi: 10.1101/103218 for details.
543	Essential_gene: Essential ("E") or Non-essential phenotype-changing ("N") based on
		Mouse Genome Informatics database. from doi:10.1371/journal.pgen.1003484
544	Essential_gene_CRISPR: Essential ("E") or Non-essential phenotype-changing ("N") based on 
		large scale CRISPR experiments. from doi: 10.1126/science.aac7041
545	Essential_gene_CRISPR2: Essential ("E"), context-Specific essential ("S"), or Non-essential phenotype-changing ("N") 
		based on large scale CRISPR experiments. from http://dx.doi.org/10.1016/j.cell.2015.11.015
546	Essential_gene_gene-trap: Essential ("E"), HAP1-Specific essential ("H"), KBM7-Specific essential ("K"),
		or Non-essential phenotype-changing ("N"), based on large scale mutagenesis experiments. 
		from doi: 10.1126/science.aac7557
547	Gene_indispensability_score: A probability prediction of the gene being essential. From doi:10.1371/journal.pcbi.1002886
548	Gene_indispensability_pred: Essential ("E") or loss-of-function tolerant ("N") based on Gene_indispensability_score.
549	MGI_mouse_gene: Homolog mouse gene name from MGI
550	MGI_mouse_phenotype: Phenotype description for the homolog mouse gene from MGI
551	ZFIN_zebrafish_gene: Homolog zebrafish gene name from ZFIN
552	ZFIN_zebrafish_structure: Affected structure of the homolog zebrafish gene from ZFIN
553	ZFIN_zebrafish_phenotype_quality: Phenotype description for the homolog zebrafish gene
		from ZFIN
554	ZFIN_zebrafish_phenotype_tag: Phenotype tag for the homolog zebrafish gene from ZFIN
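
	Example (illustrative sketch, not part of the dbNSFP distribution): the columns above can be
	read from the tab-delimited gene table with a few lines of Python. The sketch assumes that
	the gene table marks missing data with '.', as the variant table does, and that multi-valued
	columns such as HPO_id use ";" as described above. The helper names (clean, split_multi,
	to_float, read_gene_rows) and the two sample rows are hypothetical.

```python
import csv
import io

MISSING = "."  # assumed missing-data marker for the gene table


def clean(value):
    """Return None for a missing field, otherwise the stripped string."""
    value = value.strip()
    return None if value in (MISSING, "") else value


def split_multi(value, sep=";"):
    """Split a multi-valued field (e.g. HPO_id) into a list; missing -> []."""
    value = clean(value)
    return [] if value is None else [v.strip() for v in value.split(sep)]


def to_float(value):
    """Convert a numeric field (e.g. gnomAD_pLI) to float; missing -> None."""
    value = clean(value)
    return None if value is None else float(value)


def read_gene_rows(handle):
    """Yield a small dict per gene from a tab-delimited handle with a header line."""
    # QUOTE_NONE: descriptions may contain literal quote characters.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {
            "gene": row["Gene_name"],
            "pLI": to_float(row["gnomAD_pLI"]),
            "hpo_ids": split_multi(row["HPO_id"]),
        }


# Made-up sample with only three of the columns, for demonstration.
sample = (
    "Gene_name\tgnomAD_pLI\tHPO_id\n"
    "GENE_A\t0.98\tHP:0000001;HP:0000002\n"
    "GENE_B\t.\t.\n"
)
rows = list(read_gene_rows(io.StringIO(sample)))
```

	For a real file, an uncompressed or gzip-opened text handle can be passed to read_gene_rows
	in place of the in-memory sample.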

Columns of dbscSNV1.1:
	chr: chromosome number
	pos: physical position on the chromosome as to hg19 (1-based coordinate)
	ref: reference nucleotide allele (as on the + strand)
	alt: alternative nucleotide allele (as on the + strand)
	hg38_chr: chromosome number as to hg38
	hg38_pos: physical position on the chromosome as to hg38 (1-based coordinate)
	RefSeq?: whether the SNV is a scSNV according to RefSeq
	Ensembl?: whether the SNV is a scSNV according to Ensembl
	RefSeq_region: functional region in which the SNV is located according to RefSeq
	RefSeq_gene: gene name according to RefSeq
	RefSeq_functional_consequence: functional consequence of the SNV according to RefSeq
	RefSeq_id_c.change_p.change: SNV in format of c.change and p.change according to RefSeq
	Ensembl_region: functional region in which the SNV is located according to Ensembl
	Ensembl_gene: gene id according to Ensembl
	Ensembl_functional_consequence: functional consequence of the SNV according to Ensembl
	Ensembl_id_c.change_p.change: SNV in format of c.change and p.change according to Ensembl
	ada_score: ensemble prediction score based on ada-boost. Ranges 0 to 1. The larger the 
		score the higher probability the scSNV will affect splicing. The suggested cutoff for
		a binary prediction (affecting splicing vs. not affecting splicing) is 0.6.
	rf_score: ensemble prediction score based on random forests. Ranges 0 to 1. The larger the 
		score the higher probability the scSNV will affect splicing. The suggested cutoff for
		a binary prediction (affecting splicing vs. not affecting splicing) is 0.6.
	
	Note 1: Missing data is designated as '.'. 
	Note 2: Multiple annotations are separated by ';'
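
	Example (illustrative sketch, not part of the dbscSNV distribution): applying the suggested
	0.6 cutoff to ada_score and rf_score while honoring Notes 1 and 2. Whether a score of exactly
	0.6 counts as affecting splicing is not stated above; the sketch assumes a strict "greater
	than" comparison. The function names and the sample row are hypothetical.

```python
SPLICE_CUTOFF = 0.6  # suggested cutoff from the column descriptions


def parse_score(value):
    """Convert a score field to float; '.' (missing data) -> None."""
    value = value.strip()
    return None if value == "." else float(value)


def affects_splicing(score, cutoff=SPLICE_CUTOFF):
    """Binary prediction from a score; None when the score is missing.

    Assumption: scores strictly above the cutoff are called splice-affecting.
    """
    if score is None:
        return None
    return score > cutoff


def classify(row):
    """Summarize one dbscSNV row given as a dict keyed by column name."""
    genes = row["RefSeq_gene"]
    return {
        # Multiple annotations are separated by ';' (Note 2).
        "genes": [] if genes == "." else genes.split(";"),
        "ada_pred": affects_splicing(parse_score(row["ada_score"])),
        "rf_pred": affects_splicing(parse_score(row["rf_score"])),
    }


# Made-up row for demonstration only.
example = {"RefSeq_gene": "GENE_A;GENE_B", "ada_score": "0.93", "rf_score": "."}
result = classify(example)
```

	Keeping the missing case as None, rather than False, avoids treating an unscored SNV as
	predicted not to affect splicing.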

Please cite:
	Liu X, Jian X, and Boerwinkle E. 2011. dbNSFP: a lightweight database of human 
		non-synonymous SNPs and their functional predictions. Human Mutation. 32:894-899.
	Liu X, Wu C, Li C and Boerwinkle E. 2016. dbNSFP v3.0: A One-Stop Database of Functional 
		Predictions and Annotations for Human Non-synonymous and Splice Site SNVs. 
		Human Mutation. 37(3):235-241.

Contact:
	Xiaoming Liu, Ph.D.
	Associate Professor,
	USF Genomics,
	College of Public Health,
	University of South Florida
	Email: xmliu.uth{at}gmail.com 
	
Changelog:
	February 23, 2011: dbNSFP and search_dbNSFP v0.9 released.

	April 4, 2011: A bug related to the prediction scores of MutationTaster is fixed. dbNSFP v1.0 
	released. A change to the chromosome search order of the search_dbNSFP. A readme file added. 
	search_dbNSFP v1.0 released.

	May 30, 2011: dbNSFP and search_dbNSFP v1.1 released. Version 1.1 added the following entries: 
	rs numbers from UniSNP (a cleaned version of dbSNP build 129), allele frequency recorded in dbSNP, 
	allele frequency reported by 1000 Genomes Project, alternative gene names, descriptive gene name, 
	database cross references (gene IDs of HGNC, MIM, Ensembl and HPRD). The unzipped database is 18Gb.

	May 31, 2011: dbNSFP_light and search_dbNSFP_light v1.0 released. dbNSFP_light v1.0 is a light 
	version of dbNSFP, which contains fewer annotation entries but an additional 9,285,316 NSs that 
	are not in CCDS version 20090327. Scores of PhyloP, SIFT, Polyphen2, LRT and MutationTaster are 
	included but missing data are not imputed. Predictions of LRT and MutationTaster are also included, 
	as well as the omega estimated by LRT. The unzipped database is 6Gb.
 
	October 24, 2011: dbNSFP_light v1.1 and search_dbNSFP_light v1.1 released. dbNSFP v1.2 and 
	search_dbNSFP v1.2 released. The new versions added GERP++ neutral rates and RS scores.

	October 25, 2011: dbNSFP v1.3 released. It added Uniprot ID, accession number and amino acid 
	position based on Polyphen-2 annotation. Users now can search amino acid change directly referring 
	to a Uniprot ID or accession number. 
 
	November 3, 2011: dbNSFP_light v1.2 released. It added Uniprot ID, accession number and amino acid 
	position based on Polyphen-2 annotation. Users now can search amino acid change directly referring 
	to a Uniprot ID or accession number.
 
	November 10, 2011: A bug was fixed in the companion search program for dbNSFP v1.3 that caused invalid 
	searches when using AA mutations with a Uniprot ID or accession number.
 
	December 16, 2011: dbNSFP_light v1.3 released. It updated SIFT scores (August, 2011 version) and 
	Polyphen-2 scores (May, 2011 version). Uniprot ID, accession number and amino acid position based 
	on the Polyphen-2 annotations have been updated too.

	April 11, 2012: dbNSFP2.0b1_variant released. This is a beta test version of the variant sub-database 
	of dbNSFP v2.0, which is rebuilt based on Gencode release 9 / Ensembl version 64.

	June 2, 2012: dbNSFP v2.0b2 released. It includes both the dbNSFP_variant and dbNSFP_gene sub-databases. 
	Slight changes have been made to the Ensembl gene and transcript ids of dbNSFP_variant in order to be 
	compatible with other database sources.

	July 2, 2012: dbNSFP v2.0b3 released. An additional 2.2 million splicing site SNPs have been added to 
	dbNSFP_variant. In the table those SNPs have missing (".") in aaref, aaalt and "-1" in aapos. There's 
	no change to the format of search input file.

	August 28, 2012: The companion java search program search_dbNSFP20b3 is updated. Added features include 
	supporting vcf file as input file and options for output contents (columns).

	October 27, 2012: dbNSFP v2.0b4 is released. A new functional prediction score MutationAssessor is added 
	(I thank Mr. Yevgeniy Antipin for his recommendation). Allele frequencies from ESP 5400 data set are 
	replaced by ESP 6500 data set.

	February 25, 2013: dbNSFP v2.0 is released. A new functional prediction score FATHMM is added.

	March 22, 2013: A bug which caused a lot of missing FATHMM scores has been fixed.  

	May 31, 2013: The source code of the companion Java search program is now available under the RECEX SHARED 
	SOURCE LICENSE.

	October 3, 2013: dbNSFP v2.1 is released. MutationTaster and FATHMM scores have been updated. Converted 
	scores of SIFT, LRT, MutationTaster, MutationAssessor and FATHMM have been added. Columns of SIFT and FATHMM 
	predictions have been added. The gene database has also been updated. Database IDs are updated. GO Slim terms, 
	pathway and protein interaction information from the ConsensusPathDB, and list of essential and non-essential 
	genes (based on phenotypes of mouse homologs) have been added.

	January 23, 2014: dbNSFP v2.2 is released. SIFT and FATHMM now have multiple scores corresponding to different 
	Ensembl ENSP ids and amino acid positions (aapos_SIFT and aapos_FATHMM). Accordingly, our companion search 
	program now supports SNP searches based on Ensembl ENSP ids and amino acid positions. A bug is fixed for a 
	small proportion of MutationTaster scores.

	January 26, 2014: dbNSFP v2.3 is released.  Two ensemble scores (RadialSVM and LR) and their predictions have 
	been added.

	February 12, 2014: A bug was fixed in dbNSFP v2.2 and v2.3, which caused missing delimiters in columns 
	aapos_SIFT, SIFT_score_converted and SIFT_pred. (I thank Mr. Yevgeniy Antipin for his reminder). 

	March 5, 2014: dbNSFP v2.4 is released. A whole genome functional prediction score called CADD was added, 
	along with five more conservation scores (phyloP46way_primate, phyloP100way_vertebrate, phastCons46way_primate, 
	phastCons46way_placental, phastCons100way_vertebrate). To facilitate comparison between scores, we added rank 
	scores for most functional prediction scores and conservation scores, replacing the "converted" scores in 
	the previous versions.

	June 1, 2014: dbNSFP v2.5 is released. A new functional score VEST 3.0 has been added. We thank Dr. Karchin for 
	kindly providing the score. A bug that had caused MutationTaster score errors since v2.1 for variants with a 
	prediction of "Polymorphism_automatic" has been fixed. We thank John McGuigan and James Ireland for reporting 
	this bug. As MutationTaster can also predict splicing changes and other functional effects, when a variant has 
	multiple predictions based on its different models, we took the most damaging score and prediction for dbNSFP. 
	
	July 26, 2014: dbNSFP v2.6 is released. rs numbers from dbSNP 141 have been added to the variant database files. 
	Mouse and zebra fish homolog genes and phenotypes have been added to the gene database file (I thank Alex Li for
	his suggestion and help). Trait_association(GWAS) was also updated. An attached database called dbscSNV is 
	available for download. It includes all potential human SNVs within splicing consensus regions (−3 to +8 at the 
	5’ splice site and −12 to +2 at the 3’ splice site), i.e. scSNVs, related functional annotations and two ensemble 
	prediction scores for predicting their potential of altering splicing. A manuscript describing those scores has 
	been submitted. search_dbNSFP26 now supports searching dbNSFP along with dbscSNV using option "-s". 
	
	September 12, 2014: dbNSFP v2.7 is released. Chromosomes and positions of human reference hg38 have been added. 
	search_dbNSFP27.class now supports querying dbNSFP using positions based on hg38 with the "-v hg38" option. 
	clinvar (freeze 20140902) annotations have been added. Allele frequencies from 2303 exomes of African Americans 
	and 3203 exomes of European Americans from the Atherosclerosis Risk in Communities Study (ARIC) cohort study 
	have been added. As the columns for gene interactions in dbNSFP_gene table contain very long strings, especially 
	for gene UBC, which may cause problems when viewing the results in Excel, now we only report the number of 
	interacting genes in those columns. Full information is retained in the dbNSFP_gene.complete table.
	
	November 21, 2014: dbNSFP v2.8 is released. COSMIC (Catalogue Of Somatic Mutations In Cancer) annotations have 
	been added. Pathway information from BioCarta and KEGG (old version) has been added to the dbNSFP2.8_gene. A bug 
	causing inconsistency between MutationTaster scores and MutationTaster_pred, which affects v2.5 to v2.7, has 
	been fixed. I thank Adam Novak for reporting this bug.
	
	February 3, 2015: dbNSFP v2.9 is released. SIFT score has been updated to ensembl66 version. PROVEAN score 
	(Protein Variation Effect Analyzer) v1.1 has been added. I thank Yongwook Choi from jcvi for providing the SIFT 
	and PROVEAN scores. CADD score has been updated to 1.3 version. Please note the following copyright statement 
	for CADD: "CADD scores (http://cadd.gs.washington.edu/) are Copyright 2013 University of Washington and 
	Hudson-Alpha Institute for Biotechnology (all rights reserved) but are freely available for all academic, 
	non-commercial applications. For commercial licensing information contact Jennifer McCullar (mccullaj@uw.edu)."
	Allele frequency v0.3 of ~60,706 unrelated individuals from The Exome Aggregation Consortium (ExAC) has been added. 
	ExAC data are released under a Fort Lauderdale Agreement. Please refer to http://exac.broadinstitute.org/terms 
	for terms of use. I also want to thank Dr. CS (Jonathan) Liu from Softgenetics for providing hosting space.
	
	April 6, 2015: dbNSFP v3.0b1 is released. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 22/
	Ensembl 79 with human reference sequence hg38. Putative genes have been included. Genes with incomplete 5' have
	been excluded (I thank Chris Gillies for reporting the issues for genes with incomplete 5' end.) Genes on 
	mitochondrial DNA have been included. Allele frequencies from the UK10K cohorts and genotypes of two Neanderthals 
	have been added. Some resources have been updated, including the MutationTaster (I thank Dr. Dominik Seelow 
	for kindly providing the scores), allele frequencies from the 1000 Genomes Project populations, ancestral alleles, 
	dbSNP, ClinVar and InterPro. The presentation of the prediction scores has been improved by adding columns for 
	the corresponding transcript/protein ids. PhyloP and PhastCons conservation scores based on hg19 have been 
	replaced by the scores based on hg38. Some resources have been dropped due to various reasons, including SLR 
	test statistic, UniSNP ids, allele frequencies from the ARIC cohorts and allele counts in COSMIC. dbNSFP_gene 
	has also been completely rebuilt using the up-to-date resources. Residual Variation Intolerance Scores (RVIS) 
	have been added. GO Slim terms have been replaced by full GO terms. Two branches of dbNSFP are now provided:
	dbNSFP3.0b1a suitable for academic use, which includes all the resources, and dbNSFP3.0b1c suitable for commercial 
	use, which does not include VEST3 and CADD. 
	
	April 12, 2015: dbNSFP v3.0b2 is released. This update fixed the issues due to inconsistent mitochondrial reference
	sequences used by different resources. I thank Dr. Lishuang Shen at MEEI for helping to solve the issues. For 
	mitochondrial SNV, the pos (i.e. hg38) refers to the rCRS (GenBank: NC_012920) and hg19_pos refers to a YRI 
	sequence (GenBank: AF347015). The ancestral allele of mitochondrial SNV now comes from the Reconstructed Sapiens 
	Reference Sequence (RSRS, doi:10.1016/j.ajhg.2012.03.002). The affected content includes ancestral alleles, 
	Neanderthal/Denisova genotypes and MutationTaster columns of the chrM file. The rankscores of MutationTaster have
	also been updated to reflect the update of its chrM scores. dbscSNV has been updated to v1.1, which adds hg38 
	positions lifted over from its hg19 positions. Using search_dbNSFP30b2a or search_dbNSFP30b2c you can search 
	dbscSNV1.1 along with dbNSFP v3.0b2 with either hg19 coordinates or hg38 coordinates.
	
	August 3, 2015: dbNSFP v3.0 is released. Three new functional prediction scores (DANN, fathmm-MKL and fitCons) and
	two conservation scores (phyloP20way_mammalian and phastCons20way_mammalian) have been added to dbNSFP v3.0a. 
	All five scores except DANN are also included in dbNSFP v3.0c. For commercial application of DANN, please contact 
	Daniel Quang (dxquang@uci.edu). CADD scores have been updated to v1.3. I thank Dr. Xueqiu Jian and Kirill Prusov for 
	suggestions on README files. dbNSFP v3.0 will be integrated into our new whole genome annotation pipeline WGSA version
	0.6. Please join our Email group for news and updates from dbNSFP. 
	Columns updated: CADD_raw (dbNSFP v3.0a only), CADD_raw_rankscore (dbNSFP v3.0a only), CADD_phred (dbNSFP v3.0a only). 
	New columns: DANN_score (dbNSFP v3.0a only), DANN_rankscore (dbNSFP v3.0a only), fathmm-MKL_coding_score, 
	fathmm-MKL_coding_rankscore, fathmm-MKL_coding_pred, fathmm-MKL_coding_group, integrated_fitCons_score, 
	integrated_fitCons_rankscore, integrated_confidence_value, GM12878_fitCons_score, GM12878_fitCons_rankscore, 
	GM12878_confidence_value, H1-hESC_fitCons_score, H1-hESC_fitCons_rankscore, H1-hESC_confidence_value, HUVEC_fitCons_score, 
	HUVEC_fitCons_rankscore, HUVEC_confidence_value.

	November 24, 2015: dbNSFP v3.1 is released. Significant eQTLs from GTEx V6 have been added. dbSNP rs has been updated to 
	build 144. Gene expression information (rpkm of RNAseq) of 53 tissues from GTEx V6 has been added to dbNSFP_gene. Three
	gene intolerance scores (RVIS based on ExAC r0.3, GDI and LoFtool) have been added to dbNSFP_gene. 
	
	March 20, 2016: dbNSFP v3.2 is released. Eigen score, Eigen PC score (doi: 10.1038/ng.3477) and GenoCanyon score 
	(doi:10.1038/srep10576) have been added. Allele frequencies of two commonly used subsets of ExAC data (nonTCGA
	and nonpsych) have been added. Mutation Assessor scores have been updated to release 3. PhyloP7way_vertebrate 
	and PhastCons7way_vertebrate conservation scores have been updated to PhyloP100way_vertebrate and PhastCons100way_vertebrate, 
	respectively. rankscores have been updated accordingly. Ancestral alleles have been updated based on Ensembl 84. 
	dbSNP has been updated to build 146. Clinvar has been updated to 20160302. InterPro has been updated to v56. 
	Gene name cross-links, IntAct, Uniprot, GWAS catalog, BioGRID, GO, ConsensusPathDB, mouse genes and zebra fish 
	genes information for the dbNSFP_gene table have been updated.
	
	November 30, 2016: dbNSFP v3.3 and v2.9.2 are released. M-CAP score (DOI: 10.1038/ng.3703) has been added. We thank Dr. Gill Bejerano
	for providing the score. Eigen and Eigen PC scores have been updated to v1.1. dbSNP has been updated to v147. clinvar has
	been updated to 20161101. 
	
	March 12, 2017: dbNSFP v3.4 and v2.9.3 are released. REVEL score (doi: 10.1016/j.ajhg.2016.08.016) and MutPred score 
	(doi: 10.1093/bioinformatics/btp528) have been added. SORVA gene ranking scores (doi: 10.1101/103218) have been added to 
	gene annotation. 
	
	August 6, 2017: dbNSFP v3.5 is released. Allele frequencies from the exomes and genomes of the Genome Aggregation Database (gnomAD)
	have been added. Interpro, dbSNP, clinvar, ancestral alleles, Altai Neanderthal genotypes, Denisova genotypes and GTEx eQTLs have been updated. 
	dbNSFP_gene has been rebuilt with updated annotations. Other changes to dbNSFP_gene include: Interactions columns now show the gene list 
	instead of the total number; GTEx gene expression annotations have been removed; LoF FDR p-value from RVIS has been added; 
	Genome-wide haploinsufficiency score (GHIS) has been added; LoF and CNV intolerance/tolerance scores based on ExAC data have been added.
	
	December 8, 2018: dbNSFP v4.0b1 is released for beta testing. The core set of nsSNVs and ssSNVs has been rebuilt based on Gencode 29/
	Ensembl 94 with human reference sequence hg38. Eight deleteriousness prediction scores (ALoFT, DEOGEN2, FATHMM-XF, MPC, MVP, 
	PrimateAI, LINSIGHT, SIFT4G) have been added. Three conservation scores (phyloP17way_primate, phastCons17way_primate, bStatistic) 
	have been added. Allele frequencies from the gnomAD consortium, eQTLs from the Geuvadis project, and genotypes of a Vindija33.19 
	Neanderthal have been added. Some resources have been updated, including VEST (We thank Dr. Karchin), CADD, M-CAP, ancestral 
	alleles, dbSNP, ClinVar, GTEx and InterPro. The presentation of the prediction scores has been further improved by adding 
	the correspondence to transcript/protein ids in a systematic way. APPRIS, GENCODE_basic, TSL and VEP_canonical have been added
	to facilitate the choice of appropriate transcripts. dbNSFP_gene has also been completely rebuilt using the up-to-date 
	resources. HIPred, gene constraint scores from the gnomAD data, essential genes predictions based on CRISPR, gene-trap and gene 
	networks have been added. Two branches of dbNSFP are provided: dbNSFP4.0b1a suitable for academic use, which includes all the resources, 
	and dbNSFP4.0b1c suitable for commercial use, which does not include Polyphen2, VEST, REVEL, CADD, LINSIGHT, and GenoCanyon. 
	Please contact Dr. Xiaoming Liu (xmliu.uth{at}gmail.com) for commercial usage of dbNSFP. 
	
	December 30, 2018: A bug causing an id mapping issue from Uniprot to Ensembl, which in turn caused increased missing rates of Polyphen2,
	MutationAssessor and DEOGEN2, has been found and fixed (We thank Dr. Daniele Raimondi). 
	
	February 20, 2019: sprot_varsplic was included in the mapping from Uniprot to Ensembl. Fixed column title inconsistency
	between the README file and data file. (We thank Kevin Xin and Julius Jacobsen for pointing out the inconsistency.) dbMTS was added as
	an attached database. search_dbNSFP added support for searching dbMTS with option '-m'. 
	
	May 3, 2019: dbNSFP v4.0 is released. HGVS c. and p. presentations from ANNOVAR, SnpEff and VEP have been added. search_dbNSFP now 
	supports search based on HGVS c. and p. presentations. Please refer to search_dbNSFP40a.readme.pdf or search_dbNSFP40c.readme.pdf for details.
	MedGen ID, OMIM ID and Orphanet ID from ClinVar have been added. 
	
	December 5, 2019: A minor bug is fixed in dbNSFP v4.0. In the previous release the content of the following columns was compressed, 
	i.e. if the annotations for all transcripts were identical, only one annotation was presented: genename, cds_strand, refcodon, codonpos, 
	codon_degeneracy, FATHMM_score, FATHMM_pred, Interpro_domain. In this release those columns are decompressed, i.e. they have the same
	number of annotations as the number of transcripts. A Java-based graphical user interface (GUI) search program (search_dbNSFP40a.jar or 
	search_dbNSFP40c.jar) has been added. Users can double-click the jar file to launch the GUI (it also supports the command line; please
	check the search_dbNSFP readme pdf for details).
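	
	The difference between the compressed and decompressed layouts can be illustrated with a minimal Python sketch. It is not part of
	the dbNSFP distribution. It assumes that per-transcript values within one field are separated by ";" and that the transcript ids
	come from a column such as Ensembl_transcriptid; the function name align_to_transcripts and the example ids are hypothetical.

```python
def align_to_transcripts(transcript_ids, annotation, sep=";"):
    """Pair each transcript id with its annotation value.

    Assumption: values for multiple transcripts are joined by `sep`.
    Handles both layouts described in the release note:
      - decompressed: one annotation per transcript
      - compressed (older files): a single annotation shared by all transcripts
    """
    ids = transcript_ids.split(sep)
    values = annotation.split(sep)

    # Compressed layout: expand the single shared value to every transcript.
    if len(values) == 1 and len(ids) > 1:
        values = values * len(ids)

    # Any other length mismatch means the fields cannot be aligned safely.
    if len(values) != len(ids):
        raise ValueError(
            f"{len(ids)} transcripts but {len(values)} annotations"
        )
    return list(zip(ids, values))


if __name__ == "__main__":
    # Decompressed field: three transcripts, three FATHMM_score entries.
    print(align_to_transcripts("ENST_A;ENST_B;ENST_C", "-1.2;.;0.5"))
    # Compressed field from an older file: one value for both transcripts.
    print(align_to_transcripts("ENST_A;ENST_B", "T"))
```

	With decompressed columns the expansion step is never triggered, so position i in an annotation column always refers to
	transcript i.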
	
	May 15, 2020: A minor bug is fixed in dbNSFP v4.0. In the previous release, the column Primate_AI_pred was not 100% correct. 
	We thank Alex Kouris for reporting this issue.
	
	June 16, 2020: dbNSFP v4.1 is released. BayesDel (https://doi.org/10.1002/humu.23158), ClinPred (https://doi.org/10.1016/j.ajhg.2018.08.005) 
	and LIST-S2 (https://doi.org/10.1093/nar/gkaa288) scores have been added. CADD has been updated to v1.6, and CADD scores based on the hg19 model have 
	been added. ClinVar, GTEx and gnomAD genomes have been updated. HPO terms have been added to dbNSFP_gene. The search_dbNSFP programs now 
	support searching SpliceAI as an attached database.
	
	Jan 27, 2021: Command-line-only versions of the search programs for v4.1a and v4.1c were added. 
	
	Feb 10, 2021: A bug has been fixed. In the previous release, the gnomAD_pLI, gnomAD_pRec and gnomAD_pNull scores in dbNSFP4.1_gene.gz and 
	dbNSFP4.1_gene.complete.gz did not always correspond to the canonical transcripts of the genes. 
	We thank Dr. Raphaël Helaers for reporting this bug.
	
	March 12, 2021: A bug has been fixed. In the previous release, some ALoFT scores/information were missing from dbNSFP. We thank Dr. Shuwei Li for reporting 
	this bug.
	
	April 6, 2021: dbNSFP v4.2 is released. MetaRNN scores have been added. Allele frequencies of gnomAD exomes have been updated to r2.1.1.
	Allele frequencies of gnomAD genomes have been updated to v3.1. dbSNP has been updated to b154. ClinVar has been updated to 20210131.
	
	February 18, 2022: dbNSFP v4.3 is released. REVEL scores have been updated with transcript ids, i.e., the scores are now transcript-specific. 
	Genotypes of Chagyrskaya Neanderthals have been added. dbSNP has been updated to b155. ClinVar has been updated to 20220122.
	
	May 6, 2023: dbNSFP v4.4 is released. gMVP and VARITY scores have been added. Allele frequencies of ALFA (Allele Frequency Aggregator) have
	been added. dbSNP has been updated to b156. ClinVar has been updated to 20230430. phyloP30way_mammalian has been replaced by phyloP470way_mammalian.
	phastCons30way_mammalian has been replaced by phastCons470way_mammalian. A bug in MutPred scores (not all SNVs causing the same AA change had scores)
	has been fixed. 
	
	November 2, 2023: dbNSFP v4.5 is released. ClinVar has been updated to 20231028. ESM1b, EVE and AlphaMissense scores have been added. AlphaMissense 
	scores are for non-commercial research use only: "AlphaMissense Database Copyright (2023) DeepMind Technologies Limited. All predictions are provided 
	for non-commercial research use only under CC BY-NC-SA license." The derived AlphaMissense_score, AlphaMissense_rankscore, and 
	AlphaMissense_pred distributed in dbNSFP are also under the CC BY-NC-SA license and are only included in the "a" branch of dbNSFP. A copy of the CC BY-NC-SA license can be found 
	at https://creativecommons.org/licenses/by-nc-sa/4.0/. 
	
	February 18, 2024: dbNSFP v4.6 is released. ClinVar has been updated to 20240215. GTEx V8 splicing QTLs (sQTLs) have been added. eQTLs from 
	eQTLGen phase I have been added. There was a bug in v4.5 causing a large proportion of ESM1b scores to be misaligned. It has been fixed. We thank 
	Dr. In-Hee Lee for reporting this bug.
	
	March 3, 2024: dbNSFP v4.7 is released. CADD has been updated to v1.7. Allele frequencies of gnomAD exomes and genomes have been updated to v4.0.0.
	A bug in v4.6 that caused eQTLGen eQTLs of some tissues to be missing has been fixed. 
	
	March 13, 2024: AlphaMissense scores are now licensed under the Creative Commons Attribution 4.0 International License (CC-BY) and have therefore been added to the 
	dbNSFP v4.7 "c" branch. 
	
	June 13, 2024: dbNSFP v4.8 is released. MutFormer and PHACTboost scores have been added. A GERP conservation score calculated based on 91 mammals has been 
	added.
	
	August 8, 2024: dbNSFP v4.9 is released. MutScore has been added. ClinVar has been updated.