samocha_enrichment_background |
enrichment/samocha_background
|
0 |
4 |
1.38 MB |
## Samocha's Enrichment Background Model
From Samocha KE, et al. A framework for the interpretation of
de novo mutation in human disease. Nat Genet. 2014 Sep;46(9):944-50.
doi: 10.1038/ng.3050. Epub 2014 Aug 3. PMID: 25086666; PMCID: PMC4222185. |
gene_score |
gene_properties/gene_scores/Iossifov_Wigler_PNAS_2015
|
0 |
6 |
7.8 MB |
Probability of gene association with autism, computed based
on the gene's vulnerability to damaging coding mutations and the load of
damaging de novo mutations in individuals with autism.
|
gene_score |
gene_properties/gene_scores/LGD
|
0 |
7 |
533.06 KB |
Gene vulnerability/intolerance score based on rare LGD variants.
|
gene_score |
gene_properties/gene_scores/RVIS
|
0 |
7 |
451.29 KB |
The Residual Variation Intolerance Score (RVIS), a gene intolerance score.
|
gene_score |
gene_properties/gene_scores/SFARI_gene_score
|
0 |
6 |
124.38 KB |
|
gene_score |
gene_properties/gene_scores/Satterstrom_Buxbaum_Cell_2020
|
0 |
6 |
613.41 KB |
TADA-derived gene autism association score.
|
gene_score |
gene_properties/gene_scores/pLI
|
0 |
7 |
893.56 KB |
|
gene_score |
gene_properties/gene_scores/pRec
|
0 |
7 |
895.39 KB |
|
gene_set |
gene_properties/gene_sets/GO
|
0 |
4 |
12.07 MB |
|
gene_set |
gene_properties/gene_sets/MSigDB_curated
|
0 |
3 |
3.12 MB |
|
gene_set |
gene_properties/gene_sets/autism
|
0 |
7 |
4.37 KB |
|
gene_set |
gene_properties/gene_sets/disease
|
0 |
3 |
66.76 KB |
|
gene_set |
gene_properties/gene_sets/domain
|
0 |
3 |
391.19 KB |
|
gene_set |
gene_properties/gene_sets/main
|
0 |
17 |
103.89 KB |
|
gene_set |
gene_properties/gene_sets/miRNA
|
0 |
4 |
1.6 MB |
|
gene_set |
gene_properties/gene_sets/miRNA_Darnell
|
0 |
3 |
95.15 KB |
|
gene_set |
gene_properties/gene_sets/relevant
|
0 |
15 |
101.75 KB |
|
gene_set |
gene_properties/gene_sets/sfari
|
0 |
11 |
11.53 KB |
|
gene_set |
gene_properties/gene_sets/spark
|
0 |
5 |
1.22 KB |
|
gene_score |
hg19/enrichment/coding_length_in_target_ref_gene_v20190211
|
0 |
5 |
106.76 KB |
Coding length in target enrichment background
using refGene gene models for HG19 from 20190211. Target regions are from
the SSC WES study.
|
gene_score |
hg19/enrichment/coding_length_ref_gene_v20190211
|
0 |
5 |
157.76 KB |
Coding length enrichment background
using refGene gene models for HG19 from 20190211.
|
gene_models |
hg19/gene_models/ccds_v201309
|
0 |
5 |
2.56 MB |
|
gene_models |
hg19/gene_models/knownGene_v201304
|
0 |
5 |
5.49 MB |
|
gene_models |
hg19/gene_models/refGeneMito_v201309
|
0 |
5 |
3.98 MB |
|
gene_models |
hg19/gene_models/refGene_v201309
|
0 |
5 |
3.97 MB |
|
gene_models |
hg19/gene_models/refGene_v20190211
|
0 |
5 |
5.47 MB |
|
genome |
hg19/genomes/GATK_ResourceBundle_5777_b37_phiX174
|
0 |
93 |
2.94 GB |
## HG19 Reference Genome
Default HG19 reference genome used by GPF.
|
np_score |
hg19/scores/CADD
|
0 |
10 |
79.37 GB |
CADD score for functional prediction of a SNP. Please refer to Kircher
et al. (2014) Nature Genetics 46(3):310-5 for details. The larger the score, the
more likely the SNP is to have a damaging effect.
|
position_score |
hg19/scores/FitCons-i6-merged
|
0 |
8 |
105.17 MB |
The fitCons score estimates the fraction of genomic positions belonging to a specific
functional class (defined by an epigenomic "fingerprint") that are under selective
pressure. Scores range from 0 to 1; a larger score indicates that a higher proportion
of nucleotide sites in the functional class to which the genomic position belongs are
under selective pressure, so the position is more likely to be functionally important.
Integrated (i6) scores are integrated across three cell types (GM12878, H1-hESC and HUVEC).
More details can be found in doi:10.1038/ng.3196.
|
position_score |
hg19/scores/FitCons2_E035
|
0 |
8 |
291.52 MB |
FitCons2 score computed for primary haematopoietic stem cells (HSCs) (E035).
|
position_score |
hg19/scores/FitCons2_E067
|
0 |
8 |
260.97 MB |
FitCons2 score computed for the Brain Angular Gyrus (E067) tissue.
|
position_score |
hg19/scores/FitCons2_E068
|
0 |
8 |
270.23 MB |
FitCons2 score computed for the Brain Anterior Caudate (E068) tissue.
|
position_score |
hg19/scores/FitCons2_E069
|
0 |
8 |
262.13 MB |
FitCons2 score computed for the Brain Cingulate Gyrus (E069) tissue.
|
position_score |
hg19/scores/FitCons2_E070
|
0 |
8 |
262.32 MB |
FitCons2 score computed for the Brain Germinal Matrix (E070) tissue.
|
position_score |
hg19/scores/FitCons2_E071
|
0 |
8 |
255.46 MB |
FitCons2 score computed for the Brain Hippocampus Middle (E071) tissue.
|
position_score |
hg19/scores/FitCons2_E072
|
0 |
8 |
257.61 MB |
FitCons2 score computed for the Brain Inferior Temporal Lobe (E072) tissue.
|
position_score |
hg19/scores/FitCons2_E073
|
0 |
8 |
266.95 MB |
FitCons2 score computed for the Brain Dorsolateral Prefrontal Cortex (E073) tissue.
|
position_score |
hg19/scores/FitCons2_E074
|
0 |
8 |
262.12 MB |
FitCons2 score computed for the Brain Substantia Nigra (E074) tissue.
|
position_score |
hg19/scores/FitCons2_E081
|
0 |
8 |
276.04 MB |
FitCons2 score computed for the Fetal Brain Male (E081) tissue.
|
position_score |
hg19/scores/FitCons2_E082
|
0 |
8 |
278.88 MB |
FitCons2 score computed for the Fetal Brain Female (E082) tissue.
|
position_score |
hg19/scores/Linsight
|
0 |
8 |
1.35 GB |
LINSIGHT improves the prediction of noncoding nucleotide sites at which
mutations are likely to have deleterious fitness consequences, and which,
therefore, are likely to be phenotypically important. LINSIGHT combines
a generalized linear model for functional genomic data with a probabilistic
model of molecular evolution.
Huang, YF., Gulko, B. & Siepel, A. Fast, scalable prediction of deleterious
noncoding variants from functional and population genomic data.
*Nat Genet* **49**, 618-624 (2017). https://doi.org/10.1038/ng.3810
|
np_score |
hg19/scores/MPC
|
0 |
9 |
2.26 GB |
A deleteriousness prediction score for missense variants based on regional
missense constraint. The range of MPC score is 0 to 5. The larger the score, the
more likely the variant is pathogenic.
Given increasing numbers of patients who are undergoing exome or genome
sequencing, it is critical to establish tools and methods to interpret the
impact of genetic variation. While the ability to predict deleteriousness
for any given variant is limited, missense variants remain a particularly
challenging class of variation to interpret, since they can have drastically
different effects depending on both the precise location and specific
amino acid substitution of the variant. In order to better evaluate
missense variation, we leveraged the exome sequencing data of 60,706
individuals from the Exome Aggregation Consortium (ExAC) dataset to
identify sub-genic regions that are depleted of missense variation.
We further used this depletion as part of a novel missense deleteriousness
metric named MPC. We applied MPC to de novo missense variants and identified
a category of de novo missense variants with the same impact on
neurodevelopmental disorders as truncating mutations in intolerant
genes, supporting the value of incorporating regional missense constraint
in variant interpretation.
For details, see doi: http://dx.doi.org/10.1101/148353.
|
position_score |
hg19/scores/phastCons46_placentals
|
0 |
8 |
10.55 GB |
phastCons46_placentals is a conservation score based on the
placental mammal subset of species.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg19/scores/phastCons46_primates
|
0 |
8 |
14.02 GB |
phastCons46_primates is a conservation score based on the
primates subset of species.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg19/scores/phastCons46_vertebrates
|
0 |
8 |
10.81 GB |
phastCons46_vertebrates is a conservation score based on multiple alignments
of 45 vertebrate genomes to the human genome.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg19/scores/phyloP46_placentals
|
0 |
8 |
14.62 GB |
phyloP (phylogenetic p-values) conservation score based on the multiple
alignments of the placental mammal species in the
phyloP46way alignment.
The higher the score, the more conserved the site.
|
position_score |
hg19/scores/phyloP46_primates
|
0 |
8 |
10.81 GB |
phyloP (phylogenetic p-values) conservation score based on the multiple
alignments of the primate species in the
phyloP46way alignment.
The higher the score, the more conserved the site.
|
position_score |
hg19/scores/phyloP46_vertebrates
|
0 |
8 |
14.72 GB |
phyloP (phylogenetic p-values) conservation score based on the multiple
alignments of 45 vertebrate genomes to the human genome. The higher the
score, the more conserved the site.
|
allele_score |
hg19/variant_frequencies/gnomAD_v2.1.1/exomes
|
0 |
32 |
959.44 MB |
gnomAD exomes v2.1.1 variants built from ~260,000 whole exome samples
published by the Broad Institute.
|
allele_score |
hg19/variant_frequencies/gnomAD_v2.1.1/genomes
|
0 |
31 |
12.27 GB |
gnomAD genomes v2.1.1 variants built from ~32,000 whole genome sequencing samples
published by the Broad Institute.
|
gene_score |
hg38/enrichment/coding_length_ref_gene_v20170601
|
0 |
5 |
154.79 KB |
Coding length enrichment background
using refGene gene models for HG38 from 20170601.
|
gene_score |
hg38/enrichment/ur_synonymous_AGRE_WG38_859
|
0 |
5 |
168.8 KB |
Ultra-rare synonymous enrichment background built from AGRE WGS CSHL.
|
gene_score |
hg38/enrichment/ur_synonymous_SFARI_SSC_WGS_2
|
0 |
5 |
180.76 KB |
Ultra-rare synonymous enrichment background built from SFARI SSC WGS NYGC.
|
gene_score |
hg38/enrichment/ur_synonymous_SFARI_SSC_WGS_CSHL
|
0 |
5 |
180.24 KB |
Ultra-rare synonymous enrichment background built from SFARI SSC WGS CSHL.
|
gene_score |
hg38/enrichment/ur_synonymous_iWES_v1_1
|
0 |
5 |
186.89 KB |
Ultra-rare synonymous enrichment background built from SPARK iWES v1.1.
|
gene_score |
hg38/enrichment/ur_synonymous_iWES_v2
|
0 |
5 |
190.01 KB |
Ultra-rare synonymous enrichment background built from SPARK iWES v2.
|
gene_score |
hg38/enrichment/ur_synonymous_iWGS_v1_1
|
0 |
5 |
187.1 KB |
Ultra-rare synonymous enrichment background built from SPARK iWGS v1.1.
|
gene_score |
hg38/enrichment/ur_synonymous_w1202s766e611_liftover
|
0 |
5 |
172.46 KB |
Ultra-rare synonymous enrichment background built from SFARI SSC WES CSHL liftover.
|
gene_models |
hg38/gene_models/refGene_v20170601
|
0 |
5 |
5.42 MB |
## refSeq gene models for HG38 from 20170601
|
gene_models |
hg38/gene_models/refSeq_v20200330
|
0 |
5 |
4.15 MB |
## refSeq gene models for HG38 from 2020-03
Default gene models used by GPF for HG38.
|
genome |
hg38/genomes/GRCh38-hg38
|
0 |
3375 |
3.04 GB |
## HG38 Reference Genome
Default HG38 reference genome used by GPF.
|
np_score |
hg38/scores/CADD_v1.4
|
0 |
18 |
79.42 GB |
CADD score for functional prediction of a SNP. Please refer to Kircher
et al. (2014) Nature Genetics 46(3):310-5 for details. The larger the score, the
more likely the SNP is to have a damaging effect.
|
np_score |
hg38/scores/CADD_v1.6
|
0 |
12 |
80.65 GB |
CADD score for functional prediction of a SNP. Please refer to Kircher
et al. (2014) Nature Genetics 46(3):310-5 for details. The larger the score, the
more likely the SNP is to have a damaging effect.
|
allele_score |
hg38/scores/clinvar_20221105
|
0 |
44 |
115.55 MB |
ClinVar resource downloaded on 2022-11-05. Chromosome names are
remapped to have `chr` prefix.
ClinVar is a freely accessible, public archive of reports of the
relationships among human variations and phenotypes, with supporting
evidence. ClinVar thus facilitates access to and communication about
the relationships asserted between human variation and observed health
status, and the history of that interpretation. ClinVar processes
submissions reporting variants found in patient samples, assertions
made regarding their clinical significance, information about the submitter,
and other supporting data. The alleles described in submissions are mapped
to reference sequences, and reported according to the HGVS standard.
ClinVar then presents the data for interactive users as well as those
wishing to use ClinVar in daily workflows and other local applications.
ClinVar works in collaboration with interested organizations to meet
the needs of the medical genetics community as efficiently and effectively
as possible.
|
position_score |
hg38/scores/phastCons100way
|
0 |
8 |
10.15 GB |
phastCons100way is a conservation score based on multiple alignments
of 99 vertebrate genomes to the human genome.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg38/scores/phastCons20way
|
0 |
8 |
13.6 GB |
phastCons20way is a conservation score based on multiple alignments
of 19 vertebrate genomes to the human genome.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg38/scores/phastCons30way
|
0 |
8 |
12.97 GB |
phastCons30way is a conservation score based on multiple alignments
of 29 genomes to the human genome.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg38/scores/phastCons7way
|
0 |
11 |
14.64 GB |
phastCons7way is a conservation score based on multiple alignments
of 6 vertebrate genomes to the human genome.
The larger the score, the more conserved the site. Scores range from 0 to 1.
|
position_score |
hg38/scores/phyloP100way
|
0 |
8 |
16.1 GB |
phyloP (phylogenetic p-values) conservation score based on the
multiple alignments of 100 vertebrate genomes (including human).
|
position_score |
hg38/scores/phyloP20way
|
0 |
8 |
14.56 GB |
phyloP (phylogenetic p-values) conservation score based on the multiple
alignments of 19 genome sequences to the human genome. The higher the
score, the more conserved the site. Scores range from -14.191 to 1.199.
|
position_score |
hg38/scores/phyloP30way
|
0 |
8 |
14.72 GB |
phyloP (phylogenetic p-values) conservation score based on the
multiple alignments of 29 genome sequences to the human genome.
The higher the score, the more conserved the site. Scores range from -20.000 to 1.312.
|
position_score |
hg38/scores/phyloP7way
|
0 |
8 |
12.54 GB |
phyloP (phylogenetic p-values) conservation score based on the multiple
alignments of 6 vertebrate genomes to the human genome. The higher the
score, the more conserved the site. Scores range from -5.220 to 1.062.
|
allele_score |
hg38/variant_frequencies/SSC_WG38_CSHL_2380
|
0 |
12 |
783.71 MB |
TODO
exported from SFARI_SSC_WGS_CSHL using
`gpf_validation_data/data_hg38/exports/SFARI_SSC_WGS_CSHL_frequency` scripts
|
allele_score |
hg38/variant_frequencies/gnomAD_v2.1.1_liftover/exomes
|
0 |
35 |
928.26 MB |
Liftover of gnomAD exomes v2.1.1 to hg38 published by the Broad Institute.
|
allele_score |
hg38/variant_frequencies/gnomAD_v2.1.1_liftover/genomes
|
0 |
35 |
11.53 GB |
## gnomAD genomes v2.1.1 liftover
The original gnomAD genomes v2.1.1 liftover was downloaded on October 19, 2020
from https://gnomad.broadinstitute.org/.
Index command: `tabix -s 1 -b 2 -e 2 -f gnomad..r2.1.1.extract.tsv.gz`
|
allele_score |
hg38/variant_frequencies/gnomAD_v3/genomes
|
0 |
18 |
17.64 GB |
gnomAD v3.0 variants built from ~150,000 samples with whole genome sequence data.
|
liftover_chain |
liftover/hg19ToHg38
|
0 |
6 |
450.19 KB |
## Liftover Chain Hg19 to Hg38
|
liftover_chain |
liftover/hg38ToHg19
|
0 |
6 |
2.4 MB |
|
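The `tabix -s 1 -b 2 -e 2` invocation quoted in the gnomAD genomes v2.1.1 liftover entry declares column 1 as the sequence name and column 2 as both the start and end position, so every row describes a single-base site. The sketch below imitates that lookup in plain Python to show what the flags imply about the file layout. It does not use tabix itself, and the column names and values in `SAMPLE_TSV` are hypothetical, not taken from the actual extract.

```python
import bisect
import csv
import io
from collections import defaultdict

# Hypothetical extract: header names and values are invented for illustration.
# Only the roles of column 1 (sequence) and column 2 (position) follow the flags.
SAMPLE_TSV = (
    "#chrom\tpos\tref\talt\taf\n"
    "chr1\t10177\tA\tAC\t0.4250\n"
    "chr1\t10352\tT\tTA\t0.4375\n"
    "chr1\t10352\tT\tC\t0.0010\n"
    "chr2\t10180\tT\tC\t0.0002\n"
)


def build_index(text, seq_col=1, pos_col=2):
    """Group rows by sequence name, sorted by position (1-based column numbers)."""
    index = defaultdict(list)
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        # Skip empty lines and '#' comment/header lines.
        if not row or row[0].startswith("#"):
            continue
        index[row[seq_col - 1]].append((int(row[pos_col - 1]), row))
    for rows in index.values():
        rows.sort(key=lambda item: item[0])  # stable: ties keep file order
    return index


def fetch(index, chrom, begin, end):
    """Return all rows on `chrom` whose position lies in [begin, end], inclusive."""
    rows = index.get(chrom, [])
    positions = [pos for pos, _ in rows]
    lo = bisect.bisect_left(positions, begin)
    hi = bisect.bisect_right(positions, end)
    return [row for _, row in rows[lo:hi]]


index = build_index(SAMPLE_TSV)
hits = fetch(index, "chr1", 10352, 10352)
for row in hits:
    print("\t".join(row))
```

Because start and end share a column, a query for a single position can return several rows, one per alternate allele at that site, as the two `chr1:10352` rows do here.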