Summary	GPF-SFARI Production Annotation Pipeline
Description	This is the pipeline used in the GPF-SFARI instance.
Input reference genome	hg38/genomes/GRCh38-hg38

worst_effect

Type:

Worst effect across all transcripts.

source: worst_effect

worst_effect_genes

Type:

Comma-separated list of genes with the worst effect.

source: worst_effect_genes

gene_effects

Type:

<gene_1>:<effect_1>|... A gene can be repeated.

source: gene_effects

effect_details

Type:

Effect details for each affected transcript. Format: <transcript_1>:<gene_1>:<effect_1>:<details_1>|...

source: effect_details

gene_list

Type: (Internal)

List of all genes

source: gene_list

Annotator type: effect_annotator

Annotator to identify the effect of the variant on protein coding.

More info

Resource

Id: hg38/genomes/GRCh38-hg38

Type: genome

Summary:

HG38 reference genome

Resource

Id: hg38/gene_models/refGene_v20170601

Type: gene_models

Summary:

refSeq gene models for HG38 from 20170601
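The gene_effects and effect_details formats described above can be split back into records with a few lines of Python. This is a minimal sketch, not part of GPF: the function names, the TranscriptEffect type, and the sample gene, transcript, and details values in the usage note are hypothetical, and it assumes that "|" separates records and ":" separates fields exactly as the format strings state.

```python
from typing import List, NamedTuple, Tuple


class TranscriptEffect(NamedTuple):
    """One record of the effect_details attribute (hypothetical type)."""
    transcript: str
    gene: str
    effect: str
    details: str


def parse_gene_effects(value: str) -> List[Tuple[str, str]]:
    """Split '<gene_1>:<effect_1>|...' into (gene, effect) pairs.

    A gene may appear more than once, so a list is returned, not a dict.
    """
    if not value:
        return []
    pairs = []
    for item in value.split("|"):
        gene, effect = item.split(":", 1)
        pairs.append((gene, effect))
    return pairs


def parse_effect_details(value: str) -> List[TranscriptEffect]:
    """Split '<transcript_1>:<gene_1>:<effect_1>:<details_1>|...' into records."""
    if not value:
        return []
    records = []
    for item in value.split("|"):
        # maxsplit=3 keeps any ':' inside the details field intact.
        transcript, gene, effect, details = item.split(":", 3)
        records.append(TranscriptEffect(transcript, gene, effect, details))
    return records
```

For example, parse_gene_effects("GENE1:missense|GENE1:intron") yields two pairs for the same gene, which is why the result is a list of pairs.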

phylop100way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP100way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phyloP100way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 100 species
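The "position_aggregator: mean" setting suggests that when a variant covers more than one position, the per-position scores are combined into a single value by averaging. The sketch below illustrates that idea only. The function name is hypothetical, and skipping positions without a score is an assumption, not confirmed behaviour of the annotator.

```python
from statistics import fmean
from typing import Optional, Sequence


def aggregate_mean(scores: Sequence[Optional[float]]) -> Optional[float]:
    """Average the per-position scores covered by a variant.

    Assumption: positions with no score (None) are ignored, and the
    result is None when no position has a score.
    """
    present = [s for s in scores if s is not None]
    if not present:
        return None
    return fmean(present)
```

Under these assumptions, a three-position deletion with scores 1.0, missing, and 3.0 would be annotated with 2.0.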

phylop30way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP30way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phyloP30way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 30 species

phylop20way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP20way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phyloP20way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 20 species

phylop7way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phyloP7way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phyloP7way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 7 species

phastcons100way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons100way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phastCons100way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 100 species

phastcons30way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons30way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phastCons30way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 30 species

phastcons20way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons20way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phastCons20way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 20 species

phastcons7way

Type:

The score is a number that reflects the conservation at a position.

position_aggregator: mean [default]

source: phastCons7way

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

Resource

Id: hg38/scores/phastCons7way

Type: position_score

Summary:

Conservation score based on the multiple alignment of 7 species

cadd_raw

Type:

CADD raw score for functional prediction of a SNP. The larger the score, the more likely the SNP has a damaging effect.

position_aggregator: mean [default]

allele_aggregator: max [default]

source: cadd_raw

cadd_phred

Type:

CADD phred-like score. This is a phred-like rank score based on whole genome CADD raw scores. The larger the score, the more likely the SNP has a damaging effect.

position_aggregator: mean [default]

allele_aggregator: max [default]

source: cadd_phred

Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

Resource

Id: hg38/scores/CADD_v1.4

Type: allele_score

Summary:

CADD (Combined Annotation Dependent Depletion score) predicts the potential impact of a SNP
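Allele scores list two aggregators, "allele_aggregator: max" and "position_aggregator: mean". One plausible reading is that the scores of all alleles at a position are first reduced with max, and the per-position results are then averaged. The sketch below shows that two-step reduction. The function name, the record layout, and the order of the two steps are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# (position, alternative allele, score); a hypothetical record layout.
ScoreRecord = Tuple[int, str, float]


def aggregate_allele_scores(records: Iterable[ScoreRecord]) -> Optional[float]:
    """Reduce allele-level scores to one value for a multi-position variant.

    Assumed order: max over the alleles at each position, then mean
    over the positions.
    """
    by_position: Dict[int, List[float]] = defaultdict(list)
    for position, _alt, score in records:
        by_position[position].append(score)
    if not by_position:
        return None
    per_position = [max(scores) for scores in by_position.values()]
    return sum(per_position) / len(per_position)
```

With scores 1.0 and 3.0 at one position and 5.0 at the next, the per-position maxima are 3.0 and 5.0, and the aggregated value is 4.0.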

hg19_annotatable

Type: (Internal)

The lifted over annotatable

source: liftover_annotatable

Annotator type: liftover_annotator

Annotator to lift over a variant from one reference genome to another.

More info

Resource

Id: liftover/hg38ToHg19

Type: liftover_chain

Summary:

Liftover Chain Hg38 to Hg19

Resource

Id: hg38/genomes/GRCh38-hg38

Type: genome

Summary:

HG38 reference genome

Resource

Id: hg19/genomes/GATK_ResourceBundle_5777_b37_phiX174

Type: genome

Summary:

HG19 reference genome

fitcons_i6_merged

Type:

Probability that a point mutation at each position in a genome will influence fitness.

position_aggregator: mean [default]

source: fc_i6_score

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons-i6-merged

Type: position_score

Summary:

fitCons (fitness consequences) score estimates selective pressure on genomic positions.
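The "input_annotatable: hg19_annotatable" setting indicates that this annotator does not read the original hg38 variant but the internal attribute produced by the liftover annotator. The sketch below models that data flow with plain dictionaries. Everything in it is a stand-in: a real liftover chain maps intervals, not single keys, and returning None when the liftover fails is an assumption.

```python
from typing import Dict, Optional, Tuple

Annotatable = Tuple[str, int]  # (chromosome, position); hypothetical layout


def annotate_with_liftover(
    variant: Annotatable,
    chain: Dict[Annotatable, Annotatable],
    hg19_scores: Dict[Annotatable, float],
) -> Dict[str, Optional[object]]:
    """Lift an hg38 variant to hg19, then look up an hg19-based score."""
    context: Dict[str, Optional[object]] = {}
    # Step 1: the liftover annotator stores its result as an internal attribute.
    lifted = chain.get(variant)
    context["hg19_annotatable"] = lifted
    # Step 2: the score annotator reads that attribute, not the original variant.
    # Assumption: no lifted coordinate means no score.
    context["fitcons_i6_merged"] = (
        None if lifted is None else hg19_scores.get(lifted)
    )
    return context
```

The point of the design is that hg19-only resources can be used in an hg38 pipeline without converting the resources themselves.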

linsight

Type:

The LINSIGHT score measures the probability of negative selection on noncoding sites.

position_aggregator: mean [default]

source: LINSIGHT

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/Linsight

Type: position_score

Summary:

The likelihood of negative selection on noncoding sites

fitcons2_e067

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E067

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E067

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Angular Gyrus tissue (E067).

fitcons2_e068

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E068

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E068

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Anterior Caudate tissue (E068).

fitcons2_e069

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E069

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E069

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Cingulate Gyrus tissue (E069).

fitcons2_e070

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E070

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E070

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Germinal Matrix tissue (E070).

fitcons2_e071

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E071

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E071

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Hippocampus Middle tissue (E071).

fitcons2_e072

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E072

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E072

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Inferior Temporal Lobe tissue (E072).

fitcons2_e073

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E073

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E073

Type: position_score

Summary:

FitCons2 score computed for the Brain Dorsolateral Prefrontal Cortex (E073) tissue.

fitcons2_e074

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E074

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E074

Type: position_score

Summary:

Cell-type specific FitCons scores for Brain Substantia Nigra tissue (E074).

fitcons2_e081

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E081

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E081

Type: position_score

Summary:

Cell-type specific FitCons scores for Fetal Brain Male tissue (E081).

fitcons2_e082

Type:

The score is the probability that a mutation has a fitness consequence.

position_aggregator: mean [default]

source: FitCons2_E082

Annotator type: position_score

Annotator to use with genomic scores depending on genomic position like phastCons, phyloP, FitCons2, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/FitCons2_E082

Type: position_score

Summary:

Cell-type specific FitCons scores for Fetal Brain Female tissue (E082).

mpc

Type:

Missense badness, PolyPhen-2, and Constraint. A deleteriousness prediction score for missense variants.

position_aggregator: mean [default]

allele_aggregator: max [default]

source: MPC

Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

input_annotatable: hg19_annotatable

Resource

Id: hg19/scores/MPC

Type: allele_score

Summary:

MPC (Missense badness, PolyPhen-2, and Constraint) is a composite score that predicts the impact of missense variants.

normalized_allele

Type: (Internal)

Normalized allele.

source: normalized_allele

Annotator type: normalize_allele_annotator

No description

Resource

Id: hg38/genomes/GRCh38-hg38

Type: genome

Summary:

HG38 reference genome

ssc_freq

Type:

SFARI SSC WGS CSHL allele frequency in %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: allele_frequency

Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

Resource

Id: hg38/variant_frequencies/SSC_WG38_CSHL_2380

Type: allele_score

Summary:

Exported from SFARI_SSC_WGS_CSHL using `gpf_validation_data/data_hg38/exports/SFARI_SSC_WGS_CSHL_frequency`.

exome_gnomad_v2_1_1_af_percent

Type:

Alternative allele frequency in the whole gnomAD exome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AF_percent

exome_gnomad_v2_1_1_ac

Type:

Alternative allele count in the whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AC

exome_gnomad_v2_1_1_af

Type:

Alternative allele frequency in the whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AF

exome_gnomad_v2_1_1_an

Type:

Total allele count in the whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AN

exome_gnomad_v2_1_1_controls_ac

Type:

Alternative allele count in the controls subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: controls_AC

exome_gnomad_v2_1_1_controls_an

Type:

Total allele count in the controls subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: controls_AN

exome_gnomad_v2_1_1_non_neuro_ac

Type:

Alternative allele count in the non-neuro subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: non_neuro_AC

exome_gnomad_v2_1_1_non_neuro_an

Type:

Total allele count in the non-neuro subset of whole gnomAD exome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: non_neuro_AN

exome_gnomad_v2_1_1_controls_af_percent

Type:

Alternative allele frequency in the controls subset of whole gnomAD exome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: controls_AF_percent

exome_gnomad_v2_1_1_non_neuro_af_percent

Type:

Alternative allele frequency in the non-neuro subset of whole gnomAD exome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: non_neuro_AF_percent

Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

input_annotatable: normalized_allele

Resource

Id: hg38/variant_frequencies/gnomAD_v2.1.1_liftover/exomes

Type: allele_score

Summary:

Liftover of gnomAD exomes v2.1.1 to hg38.
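The count and frequency attributes above are related in the usual way: the frequency is the alternative allele count divided by the total allele count, and the "as %" attributes express the same value on a 0 to 100 scale. The sketch below spells out that arithmetic. The function names are hypothetical, and the resource may well store precomputed values rather than derive them like this.

```python
from typing import Optional


def allele_frequency(ac: int, an: int) -> Optional[float]:
    """Alternative allele frequency from AC (alt count) and AN (total count)."""
    if an <= 0:
        # No called alleles at this site, so the frequency is undefined.
        return None
    return ac / an


def as_percent(af: Optional[float]) -> Optional[float]:
    """Express a frequency in percent (assumed meaning of the AF_percent source)."""
    if af is None:
        return None
    return 100.0 * af
```

For instance, one alternative allele among four called alleles gives a frequency of 0.25, or 25.0 when expressed as a percentage.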

genome_gnomad_v2_1_1_af_percent

Type:

Alternative allele frequency in the whole gnomAD genome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AF_percent

genome_gnomad_v2_1_1_ac

Type:

Alternative allele count in the whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AC

genome_gnomad_v2_1_1_af

Type:

Alternative allele frequency in the whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AF

genome_gnomad_v2_1_1_an

Type:

Total allele count in the whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AN

genome_gnomad_v2_1_1_controls_ac

Type:

Alternative allele count in the controls subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: controls_AC

genome_gnomad_v2_1_1_controls_an

Type:

Total allele count in the controls subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: controls_AN

genome_gnomad_v2_1_1_non_neuro_ac

Type:

Alternative allele count in the non-neuro subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: non_neuro_AC

genome_gnomad_v2_1_1_non_neuro_an

Type:

Total allele count in the non-neuro subset of whole gnomAD genome samples v2.1.1

position_aggregator: mean [default]

allele_aggregator: max [default]

source: non_neuro_AN

genome_gnomad_v2_1_1_controls_af_percent

Type:

Alternative allele frequency in the controls subset of whole gnomAD genome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: controls_AF_percent

genome_gnomad_v2_1_1_non_neuro_af_percent

Type:

Alternative allele frequency in the non-neuro subset of whole gnomAD genome samples v2.1.1 as %

position_aggregator: mean [default]

allele_aggregator: max [default]

source: non_neuro_AF_percent

Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

input_annotatable: normalized_allele

Resource

Id: hg38/variant_frequencies/gnomAD_v2.1.1_liftover/genomes

Type: allele_score

Summary:

Liftover of gnomAD genomes v2.1.1 to hg38.

genome_gnomad_v3_af_percent

Type:

Alternative allele frequency, as a percent, in all gnomAD v3.0 genome samples.

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AF_percent

genome_gnomad_v3_ac

Type:

Number of alternative alleles in all gnomAD v3.0 genome samples.

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AC

genome_gnomad_v3_an

Type:

Total number of alleles in all gnomAD v3.0 genome samples.

position_aggregator: mean [default]

allele_aggregator: max [default]

source: AN

Annotator type: allele_score

Annotator to use with scores that depend on allele like variant frequencies, etc.

More info

input_annotatable: normalized_allele

Resource

Id: hg38/variant_frequencies/gnomAD_v3/genomes

Type: allele_score

Summary:

gnomAD v3.0 variants built from ~150,000 samples with whole genome sequence data.

Id:	pipeline/GPF-SFARI_annotation
Type:	annotation_pipeline
Version:	0
Summary:
Description:
Labels:

Filename	Size	md5
GPF-SFARI_annotation.yaml	2.79 KB	1c01dbef40729b322875b66413c93493
genomic_resource.yaml	232.0 B	ef3a0a26a02c613fe081d1c681d5ce73
statistics/
