Genome Annotation
Overview
Teaching: 10 min
Exercises: 30 min
Questions
What genes are contained in my genome?
Objectives
Assign gene calls to the assembled sequence and annotate them against known databases
Importing and editing the annotation workflow
Import the genome annotation workflow:
- In a new tab, open this link: https://usegalaxy.eu/u/matinnu/w/03annotation
- On the top-right corner, click the “+” button (import workflow)
- Click “start using this workflow”
- Click on the name of the newly imported workflow (imported: 03_Annotation) and click “Edit”
- Explore the workflow and make changes if necessary
When you are content with the workflow, click “Save”. Next, we will run the analysis on the newly assembled genome (sample_02):
- Click the “play” button on the top-right corner (Run Workflow)
- If the workflow does not show in detailed view, click “Expand to full workflow form”
- Set “Send results to a new history” to “yes”.
- Change the history name to “03_Annotation_2021_sample02”.
- On the “1: input dataset”, select “medaka consensus” as the input.
- Click “Run Workflow” on the top panel.
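The same run can also be scripted instead of clicked through. The following is a minimal sketch, assuming the third-party BioBlend library is installed and that you have a Galaxy API key. The helper names `build_inputs` and `run_annotation`, the dataset ID, and the assumption that the “1: input dataset” step corresponds to step index "0" in the API are illustrative and not part of this lesson; check them against your own workflow before use.

```python
from typing import Dict

# Names taken from the steps above.
WORKFLOW_NAME = "imported: 03_Annotation"
HISTORY_NAME = "03_Annotation_2021_sample02"


def build_inputs(dataset_id: str, step_index: str = "0") -> Dict[str, Dict[str, str]]:
    """Map a workflow input step to a history dataset.

    Assumption: the step shown as "1: input dataset" in the editor is
    step index "0" in the API. Verify this for your workflow.
    """
    return {step_index: {"id": dataset_id, "src": "hda"}}


def run_annotation(galaxy_url: str, api_key: str, dataset_id: str) -> dict:
    """Invoke the imported workflow and send results to a new history."""
    # Imported here so the rest of the sketch works without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url=galaxy_url, key=api_key)
    matches = gi.workflows.get_workflows(name=WORKFLOW_NAME)
    if not matches:
        raise LookupError(f"No workflow named {WORKFLOW_NAME!r}; import it first.")
    return gi.workflows.invoke_workflow(
        matches[0]["id"],
        inputs=build_inputs(dataset_id),
        history_name=HISTORY_NAME,  # creates a new history with this name
    )


# Hypothetical dataset ID for the medaka consensus, shown for illustration only.
example_inputs = build_inputs("abc123")
print(example_inputs)
```

Calling `run_annotation("https://usegalaxy.eu", "<your API key>", "<dataset id>")` would then start the workflow; the graphical steps above remain the reference procedure for this lesson.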
Prokka: rapid prokaryotic genome annotation
Prokka is a software tool to annotate bacterial, archaeal and viral genomes quickly and produce standards-compliant output files.
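One of the output files Prokka produces is a GFF3 annotation file, which lists one feature per tab-separated line and may end with a `##FASTA` section holding the sequences. As a rough sketch of how such a file can be summarised, the function below counts features by type. The function name `count_feature_types` and the example records are invented for illustration and are not real Prokka output.

```python
from collections import Counter


def count_feature_types(gff_text: str) -> Counter:
    """Count GFF3 features by the type given in the third column."""
    counts: Counter = Counter()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # the sequence section follows; no more features
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) == 9:  # a valid GFF3 feature line has nine columns
            counts[fields[2]] += 1
    return counts


# Invented example records, for illustration only.
records = [
    ("contig_1", "example", "CDS", "1", "300", ".", "+", "0", "ID=gene_00001"),
    ("contig_1", "example", "CDS", "450", "900", ".", "-", "0", "ID=gene_00002"),
    ("contig_1", "example", "tRNA", "1000", "1075", ".", "+", ".", "ID=gene_00003"),
    ("contig_1", "example", "rRNA", "2000", "3500", ".", "+", ".", "ID=gene_00004"),
]
example_gff = "\n".join(
    ["##gff-version 3"]
    + ["\t".join(record) for record in records]
    + ["##FASTA", ">contig_1", "ATGAAACCCGGGTAA"]
)

feature_counts = count_feature_types(example_gff)
print(dict(feature_counts))
```

On a real annotation you would read the file contents with `open(path).read()` and pass them to the same function.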
Discussion 01 - Prokka
- What kind of output do you get from Prokka?
- Which outputs can be used for subsequent analysis?
Solution
TBD
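BUSCO: assessing genome completeness
BUSCO (Benchmarking Universal Single-Copy Orthologs) is a quality-control tool that assesses how complete a genome assembly or annotation is, based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs.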
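BUSCO reports its result as a one-line summary of complete (C), single-copy (S), duplicated (D), fragmented (F) and missing (M) orthologs. The sketch below parses a line in that notation and flags two common concerns. It assumes the summary follows the `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` layout, which may differ between BUSCO versions. The function names and the thresholds (90% complete, 5% duplicated) are arbitrary illustrative choices, not recommended cut-offs, and the example line is invented.

```python
import re

# Assumed layout of the one-line BUSCO summary; adjust if your version differs.
SUMMARY_PATTERN = re.compile(
    r"C:(?P<complete>[\d.]+)%\[S:(?P<single>[\d.]+)%,D:(?P<duplicated>[\d.]+)%\],"
    r"F:(?P<fragmented>[\d.]+)%,M:(?P<missing>[\d.]+)%,n:(?P<total>\d+)"
)


def parse_busco_summary(line: str) -> dict:
    """Extract the percentages and ortholog count from a summary line."""
    match = SUMMARY_PATTERN.search(line)
    if match is None:
        raise ValueError(f"Not a recognised BUSCO summary line: {line!r}")
    scores = {k: float(v) for k, v in match.groupdict().items() if k != "total"}
    scores["total"] = int(match["total"])
    return scores


def flag_concerns(scores: dict, min_complete: float = 90.0,
                  max_duplicated: float = 5.0) -> list:
    """Return plain-language warnings; the thresholds are illustrative only."""
    concerns = []
    if scores["complete"] < min_complete:
        concerns.append("low completeness: the genome may be incomplete")
    if scores["duplicated"] > max_duplicated:
        concerns.append("high duplication: possible contamination or misassembly")
    return concerns


# Invented example line, for illustration only.
example_line = "C:98.4%[S:97.6%,D:0.8%],F:0.8%,M:0.8%,n:124"
scores = parse_busco_summary(example_line)
print(scores)
print(flag_concerns(scores))
```

A high fraction of fragmented or missing orthologs points towards an incomplete assembly, while an unusually high duplicated fraction is one hint, not proof, of contamination or assembly problems.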
Discussion 02 - Busco
- What kind of output do you get from Busco?
- How do you know if your genome is contaminated or incomplete?
- What other tools can we use for quality checking our genome completeness and contamination?
Solution
TBD
Key Points
Genome annotation starts by identifying genes and other functional elements (rRNA, tRNA, etc.) within the nucleotide sequence. The predicted genes are then compared against databases of interest to predict the functions they encode.
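To make the first step concrete, here is a deliberately naive sketch of gene identification: it scans the three forward reading frames for stretches that begin with ATG and end at a stop codon. This is a teaching simplification and not the method used by Prokka or its underlying gene finders, which also consider the reverse strand, alternative start codons, and statistical models. The function name `find_orfs`, the `min_codons` parameter, and the example sequence are invented for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 3) -> list:
    """Return (start, end) positions of forward-strand open reading frames.

    Positions are 0-based, end-exclusive, and include the stop codon.
    min_codons counts the codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs


# Invented example sequence, for illustration only.
example_seq = "CCATGAAACCCGGGTAACC"
orfs = find_orfs(example_seq)
print(orfs)
for start, end in orfs:
    print(example_seq[start:end])
```

Each reported region would then be translated and compared against reference databases in the second step described above.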