Resource

Id: pipeline/GPF-SFARI_annotation
Type: annotation_pipeline
Version: 0
Summary:
Description:
Labels:

Pipeline Documentation

preamble

Summary: GPF-SFARI Production Annotation Pipeline
Description: This is the pipeline used in the GPF-SFARI instance.
Input reference genome: hg38/genomes/GRCh38-hg38
Pipeline path:

Annotators

phyloP100way
Type: phyloP100way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP100way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 100 species
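
The `position_aggregator: mean` setting above can be read as follows: when a variant covers more than one reference position, the per-position scores are combined into one value by averaging. The sketch below is a minimal illustration under stated assumptions. The `scores` mapping stands in for the real score resource, the function name is hypothetical, and skipping positions without a score is an assumption rather than documented behavior.

```python
from statistics import mean
from typing import Mapping, Optional


def aggregate_position_score(
    scores: Mapping[int, float], start: int, end: int
) -> Optional[float]:
    """Average the scores found at positions start..end (inclusive).

    Positions without a score are skipped (an assumption of this sketch);
    None is returned when no position in the interval has a score.
    """
    values = [scores[pos] for pos in range(start, end + 1) if pos in scores]
    return mean(values) if values else None


# Hypothetical per-position values, made up for illustration.
toy_scores = {100: 1.0, 101: 3.0, 102: 5.0}

# A variant spanning positions 100-102 gets the mean of the three scores.
deletion_score = aggregate_position_score(toy_scores, 100, 102)

# A single-position variant simply gets the score at that position.
snv_score = aggregate_position_score(toy_scores, 101, 101)
```
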
phyloP30way
Type: phyloP30way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP30way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 30 species
phyloP20way
Type: phyloP20way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP20way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 20 species
phyloP7way
Type: phyloP7way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phyloP7way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 7 species
phastCons100way
Type: phastCons100way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons100way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 100 species
phastCons30way
Type: phastCons30way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons30way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 30 species
phastCons20way
Type: phastCons20way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons20way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 20 species
phastCons7way
Type: phastCons7way

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

HISTOGRAM
source: phastCons7way
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

Resource
Type: position_score
Summary:
Conservation score based on the multiple alignment of 7 species
cadd_raw
Type: cadd_raw

CADD raw score for functional prediction of a SNP. The larger the score, the more likely the SNP has a damaging effect.

position_aggregator: mean [default]

nucleotide_aggregator: max [default]

HISTOGRAM
source: cadd_raw
cadd_phred
Type: cadd_phred

CADD phred-like score. This is a phred-like rank score based on whole-genome CADD raw scores. The larger the score, the more likely the SNP has a damaging effect.

position_aggregator: mean [default]

nucleotide_aggregator: max [default]

HISTOGRAM
source: cadd_phred
Annotator type: np_score

Annotator to use with genomic scores depending on genomic position and nucleotide change like CADD, MPC, etc.

Resource
Type: np_score
Summary:
CADD (Combined Annotation Dependent Depletion score) predicts the potential impact of a SNP
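
The "phred-like rank score" wording for `cadd_phred` can be made concrete. In the usual CADD convention, the scaled score is -10 * log10(rank / total), where `rank` is the position of a variant's raw score among all scored variants (1 is the most deleterious). The sketch below only illustrates that transformation. The function name and the toy ranks are hypothetical, and the resource itself stores precomputed values.

```python
import math


def phred_like(rank: int, total: int) -> float:
    """Convert a rank among `total` scored variants into a phred-like score.

    rank == 1 is the variant with the highest raw score.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return -10.0 * math.log10(rank / total)


# With 1000 scored variants, the top 10% boundary is rank 100
# and the top 1% boundary is rank 10.
top_10_percent = phred_like(100, 1000)
top_1_percent = phred_like(10, 1000)
```

Under this convention a score near 10 places a variant in the top 10% of raw scores and a score near 20 in the top 1%.
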
hg19_annotatable
Type: annotatable (Internal)

Lifted over allele.

source: liftover_annotatable
Annotator type: liftover_annotator

Annotator to lift over a variant from one reference genome to another.

Resource
Type: liftover_chain
Summary:
Liftover Chain Hg38 to Hg19
Resource
Type: genome
Summary:
HG38 reference genome
Resource
Type: genome
Summary:
HG19 reference genome
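
The liftover step can be pictured as a lookup in a list of aligned blocks. A position inside a block maps to the target genome by a fixed offset, and a position that falls between blocks has no counterpart. The sketch below is a deliberately simplified model. The `Block` type, the function name, and the coordinates are hypothetical, and strand, chromosome names, and multiple chains are ignored.

```python
from typing import NamedTuple, Optional, Sequence


class Block(NamedTuple):
    """One gap-free aligned block (hypothetical, simplified)."""

    source_start: int  # first position of the block in the source genome
    target_start: int  # corresponding first position in the target genome
    size: int          # number of aligned positions


def lift_position(blocks: Sequence[Block], pos: int) -> Optional[int]:
    """Map a source position to the target genome, or None if unmapped."""
    for block in blocks:
        offset = pos - block.source_start
        if 0 <= offset < block.size:
            return block.target_start + offset
    return None


# Two made-up blocks with an unaligned gap between them.
toy_chain = [Block(1000, 5000, 100), Block(1200, 5100, 50)]

mapped = lift_position(toy_chain, 1010)    # inside the first block
unmapped = lift_position(toy_chain, 1150)  # falls in the gap
```

An unmapped position is the reason a downstream score that relies on the lifted-over allele may be missing for some variants.
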
fitcons_i6_merged
Type: fitcons_i6_merged

Probability that a point mutation at a given position in the genome will influence fitness.

position_aggregator: mean [default]

HISTOGRAM
source: fc_i6_score
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
fitCons (fitness consequences) score estimates selective pressure on genomic positions.
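
The `input_annotatable: hg19_annotatable` setting means this annotator reads the lifted-over allele instead of the original hg38 one. A minimal sketch of that data flow follows, assuming a shared context dictionary that annotators read from and write to. All names here (`toy_liftover`, `make_position_score`, the fixed offset, the score values) are hypothetical and do not reflect the actual GPF implementation.

```python
from typing import Callable, Dict, Mapping, Tuple

Location = Tuple[str, int]  # simplified annotatable: (chromosome, position)
Context = Dict[str, object]


def toy_liftover(context: Context) -> None:
    """Stand-in for the liftover annotator: publishes `hg19_annotatable`.

    The fixed offset is invented for this example.
    """
    chrom, pos = context["annotatable"]
    context["hg19_annotatable"] = (chrom, pos - 10)


def make_position_score(
    attribute: str,
    scores: Mapping[Location, float],
    input_annotatable: str = "annotatable",
) -> Callable[[Context], None]:
    """Build an annotator that reads its location from `input_annotatable`."""

    def annotate(context: Context) -> None:
        location = context.get(input_annotatable)
        # A missing location or a missing score both yield None.
        context[attribute] = scores.get(location)

    return annotate


# Scores are stored in hg19 coordinates, so the annotator must use the
# lifted-over location rather than the original one.
hg19_scores = {("chr1", 100): 0.25}
pipeline = [
    toy_liftover,
    make_position_score(
        "fitcons_i6_merged", hg19_scores, input_annotatable="hg19_annotatable"
    ),
]

context: Context = {"annotatable": ("chr1", 110)}
for annotator in pipeline:
    annotator(context)
```

The order matters in this model: the annotator that produces `hg19_annotatable` has to run before any annotator that consumes it.
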
linsight
Type: linsight

The LINSIGHT score measures the probability of negative selection on noncoding sites.

position_aggregator: mean [default]

HISTOGRAM
source: Linsight
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
LINSIGHT score quantifies the likelihood of negative selection on noncoding sites.
fitcons2_e067
Type: fitcons2_e067

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E067
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Angular Gyrus tissue (E067).
fitcons2_e068
Type: fitcons2_e068

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E068
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Anterior Caudate tissue (E068).
fitcons2_e069
Type: fitcons2_e069

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E069
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Cingulate Gyrus tissue (E069).
fitcons2_e070
Type: fitcons2_e070

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E070
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Germinal Matrix tissue (E070).
fitcons2_e071
Type: fitcons2_e071

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E071
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Hippocampus Middle tissue (E071).
fitcons2_e072
Type: fitcons2_e072

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E072
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Inferior Temporal Lobe tissue (E072).
fitcons2_e073
Type: fitcons2_e073

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E073
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
FitCons2 score computed for the Brain Dorsolateral Prefrontal Cortex (E073) tissue.
fitcons2_e074
Type: fitcons2_e074

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E074
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Brain Substantia Nigra tissue (E074).
fitcons2_e081
Type: fitcons2_e081

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E081
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Fetal Brain Male tissue (E081).
fitcons2_e082
Type: fitcons2_e082

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

HISTOGRAM
source: FitCons2_E082
Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: position_score
Summary:
Cell-type specific FitCons scores for Fetal Brain Female tissue (E082).
mpc
Type: mpc

Missense badness, PolyPhen-2, and Constraint. A deleteriousness prediction score for missense variants.

position_aggregator: mean [default]

nucleotide_aggregator: max [default]

HISTOGRAM
source: MPC
Annotator type: np_score

Annotator to use with genomic scores depending on genomic position and nucleotide change like CADD, MPC, etc.

More info

  • input_annotatable: hg19_annotatable
Resource
Type: np_score
Summary:
MPC (Missense badness, PolyPhen-2, and Constraint) is a composite score that predicts the impact of missense variants.
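
An `np_score` attribute carries two aggregators because the score depends on both position and nucleotide change. One plausible reading of `nucleotide_aggregator: max` with `position_aggregator: mean` is a two-step reduction for variants that are not a single substitution: first take the maximum over the nucleotide changes at each position, then average those maxima across positions. The sketch below illustrates that reading only. The nested `scores` mapping, the function name, and the values are hypothetical.

```python
from statistics import mean
from typing import Mapping, Optional


def aggregate_np_score(
    scores: Mapping[int, Mapping[str, float]], start: int, end: int
) -> Optional[float]:
    """Reduce per-position, per-nucleotide scores over start..end (inclusive).

    Step 1 (nucleotide aggregator): max over the alternative nucleotides.
    Step 2 (position aggregator): mean over the positions that have scores.
    """
    per_position = []
    for pos in range(start, end + 1):
        by_nucleotide = scores.get(pos)
        if by_nucleotide:
            per_position.append(max(by_nucleotide.values()))
    return mean(per_position) if per_position else None


# Made-up scores: position -> {alternative nucleotide -> score}.
toy_np_scores = {
    10: {"A": 0.5, "C": 1.5, "G": 1.0},
    11: {"A": 2.5, "T": 0.5},
}

# Maxima are 1.5 and 2.5, so the aggregated value is their mean.
region_score = aggregate_np_score(toy_np_scores, 10, 11)
```

For a single-nucleotide substitution no aggregation is needed, since the score for that exact position and nucleotide change can be looked up directly.
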
normalized_allele
Type: annotatable (Internal)

Normalized allele.

source: normalized_allele
Annotator type: normalize_allele_annotator
No description
Resource
Type: genome
Summary:
HG38 reference genome
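
The page gives no description for `normalize_allele_annotator`. As an illustration only, the sketch below shows the widely used normalization procedure (trim shared bases and left-align against the reference), which may differ from what this annotator actually does. The function name and the toy reference sequence are hypothetical.

```python
from typing import Tuple


def normalize_allele(
    reference: str, pos: int, ref: str, alt: str
) -> Tuple[int, str, str]:
    """Return a left-aligned, parsimonious (pos, ref, alt).

    `reference` is the chromosome sequence and `pos` is 1-based.
    """
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to normalize")

    changed = True
    while changed:
        changed = False
        # Drop a shared trailing base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if (not ref or not alt) and pos > 1:
            pos -= 1
            base = reference[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


# Toy reference with a CA repeat at positions 4-11.
toy_reference = "GGGCACACACAGGG"

# A deletion of one repeat unit written at position 8 is shifted to the
# leftmost equivalent representation.
normalized_deletion = normalize_allele(toy_reference, 8, "CAC", "C")

# A substitution padded with matching bases is reduced to a single base.
normalized_snv = normalize_allele(toy_reference, 4, "CAC", "CAT")
```

Normalization matters for allele-based lookups because the same variant can be written in several equivalent ways, and only identical representations match.
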
ssc_freq
Type: ssc_freq

SFARI SSC WGS CSHL allele frequency in %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: allele_frequency
Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

Resource
Type: allele_score
Summary:
Exported from SFARI_SSC_WGS_CSHL using `gpf_validation_data/data_hg38/exports/SFARI_SSC_WGS_CSHL_frequency`.
exome_gnomad_v2_1_1_af_percent
Type: exome_gnomad_v2_1_1_af_percent

Alternative allele frequency in the whole gnomAD exome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AF_percent
exome_gnomad_v2_1_1_ac
Type: exome_gnomad_v2_1_1_ac

Alternative allele count in the whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AC
exome_gnomad_v2_1_1_af
Type: exome_gnomad_v2_1_1_af

Alternative allele frequency in the whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AF
exome_gnomad_v2_1_1_an
Type: exome_gnomad_v2_1_1_an

Total allele count in the whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AN
exome_gnomad_v2_1_1_controls_ac
Type: exome_gnomad_v2_1_1_controls_ac

Alternative allele count in the controls subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: controls_AC
exome_gnomad_v2_1_1_controls_an
Type: exome_gnomad_v2_1_1_controls_an

Total allele count in the controls subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: controls_AN
exome_gnomad_v2_1_1_non_neuro_ac
Type: exome_gnomad_v2_1_1_non_neuro_ac

Alternative allele count in the non-neuro subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: non_neuro_AC
exome_gnomad_v2_1_1_non_neuro_an
Type: exome_gnomad_v2_1_1_non_neuro_an

Total allele count in the non-neuro subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: non_neuro_AN
exome_gnomad_v2_1_1_controls_af_percent
Type: exome_gnomad_v2_1_1_controls_af_percent

Alternative allele frequency in the controls subset of whole gnomAD exome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: controls_AF_percent
exome_gnomad_v2_1_1_non_neuro_af_percent
Type: exome_gnomad_v2_1_1_non_neuro_af_percent

Alternative allele frequency in the non-neuro subset of whole gnomAD exome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: non_neuro_AF_percent
Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
Liftover of gnomAD exomes v2.1.1 to hg38.
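
The count and frequency attributes above are related by simple arithmetic: the allele frequency is the alternative allele count (AC) divided by the total allele count (AN), and the `_percent` variants express the same value multiplied by 100. The sketch below shows that relationship. The function names and numbers are hypothetical, and the resource is expected to provide precomputed values rather than compute them this way.

```python
from typing import Optional


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """Return AC / AN, or None when no alleles were called (AN == 0)."""
    if an <= 0:
        return None
    return ac / an


def as_percent(frequency: float) -> float:
    """Express a frequency in [0, 1] as a percentage."""
    return 100.0 * frequency


# Made-up counts: 5 alternative alleles among 1000 called alleles.
af = allele_frequency(5, 1000)
af_percent = as_percent(af) if af is not None else None
```

Keeping the percent form alongside the raw frequency is convenient for thresholds such as "rarer than 1%", which then compare directly against 1.0.
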
genome_gnomad_v2_1_1_af_percent
Type: genome_gnomad_v2_1_1_af_percent

Alternative allele frequency in the whole gnomAD genome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AF_percent
genome_gnomad_v2_1_1_ac
Type: genome_gnomad_v2_1_1_ac

Alternative allele count in the whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AC
genome_gnomad_v2_1_1_af
Type: genome_gnomad_v2_1_1_af

Alternative allele frequency in the whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AF
genome_gnomad_v2_1_1_an
Type: genome_gnomad_v2_1_1_an

Total allele count in the whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AN
genome_gnomad_v2_1_1_controls_ac
Type: genome_gnomad_v2_1_1_controls_ac

Alternative allele count in the controls subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: controls_AC
genome_gnomad_v2_1_1_controls_an
Type: genome_gnomad_v2_1_1_controls_an

Total allele count in the controls subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: controls_AN
genome_gnomad_v2_1_1_non_neuro_ac
Type: genome_gnomad_v2_1_1_non_neuro_ac

Alternative allele count in the non-neuro subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: non_neuro_AC
genome_gnomad_v2_1_1_non_neuro_an
Type: genome_gnomad_v2_1_1_non_neuro_an

Total allele count in the non-neuro subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: non_neuro_AN
genome_gnomad_v2_1_1_controls_af_percent
Type: genome_gnomad_v2_1_1_controls_af_percent

Alternative allele frequency in the controls subset of whole gnomAD genome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: controls_AF_percent
genome_gnomad_v2_1_1_non_neuro_af_percent
Type: genome_gnomad_v2_1_1_non_neuro_af_percent

Alternative allele frequency in the non-neuro subset of whole gnomAD genome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: non_neuro_AF_percent
Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
Liftover of gnomAD genomes v2.1.1 to hg38.
genome_gnomad_v3_af_percent
Type: genome_gnomad_v3_af_percent

Alternative allele frequency, as a percentage, in all gnomAD v3.0 genome samples.

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AF_percent
genome_gnomad_v3_ac
Type: genome_gnomad_v3_ac

Number of alternative alleles in all gnomAD v3.0 genome samples.

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AC
genome_gnomad_v3_an
Type: genome_gnomad_v3_an

Total number of alleles in all gnomAD v3.0 genome samples.

position_aggregator: mean [default]

allele_aggregator: max [default]

HISTOGRAM
source: AN
Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

  • input_annotatable: normalized_allele
Resource
Type: allele_score
Summary:
gnomAD v3.0 variants built from ~150,000 samples with whole genome sequence data.

Files

Filename                    Size      md5
GPF-SFARI_annotation.yaml   2.67 KB   c3f510c0a0fc48eaa289edee293463d9
genomic_resource.yaml       232.0 B   ef3a0a26a02c613fe081d1c681d5ce73
statistics/
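
The md5 column in the listing above can be used to check that a downloaded copy of a file is intact. A minimal sketch using Python's standard `hashlib` follows. The helper names are hypothetical, and the local file path is whatever location the file was saved to.

```python
import hashlib
from pathlib import Path
from typing import Union


def md5_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: Union[str, Path], expected_md5: str) -> bool:
    """Compare a file's md5 digest with the value from the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_listing("GPF-SFARI_annotation.yaml", "c3f510c0a0fc48eaa289edee293463d9")` would be expected to return `True` for an unmodified copy of the pipeline file.
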