Defining the problem
An Integrated Quality Control System - Melbourne Genomics Health Australia
- Variation in the genome can lead to disease
- Determining how accurately variation in a gene has been detected is important
  - Coverage
  - Quality of reads
  - Level of variance
#HeHackSheHack
- Problem owner - Natalie Thorne
- Problem solvers
  - Warwick Stinson
  - Kirill Tsyganov
  - Andrew Pattison
  - Adele Barugahare
  - Damien Zammit
  - Juin Siew
  - Tom Adamson
  - Kristy Horan
Plan of action
What is the problem?
- Genome is large
- Genome is complex
Needed a database
Needed to parse data from three file types into the database
- FastQC output
  - Quality control tests for the sequencing reads
- Coverage
  - Number of times each base was read during sequencing
- Variance
  - Number of base differences from the wild type
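A minimal sketch of the parsing step, in Python with SQLite. This is an illustration, not the team's actual code: the table names, column names, and helper functions are hypothetical. It assumes the FastQC input is a `summary.txt` file (tab-separated status, module, filename) and that coverage arrives as tab-separated chromosome, position, depth lines, the layout `samtools depth` produces. The slides do not say which coverage tool or database engine was used.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one table per input file type.
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS fastqc (sample TEXT, module TEXT, status TEXT);
        CREATE TABLE IF NOT EXISTS coverage (sample TEXT, chrom TEXT, pos INTEGER, depth INTEGER);
        """
    )


def load_fastqc_summary(conn, text):
    """Load FastQC summary.txt content: STATUS<TAB>module<TAB>filename per line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        status, module, filename = line.split("\t")
        rows.append((filename, module, status))
    conn.executemany("INSERT INTO fastqc VALUES (?, ?, ?)", rows)
    return len(rows)


def load_depth(conn, sample, text):
    """Load per-base coverage: chrom<TAB>pos<TAB>depth per line (assumed layout)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        chrom, pos, depth = line.split("\t")
        rows.append((sample, chrom, int(pos), int(depth)))
    conn.executemany("INSERT INTO coverage VALUES (?, ?, ?, ?)", rows)
    return len(rows)


# Toy inputs standing in for real files.
conn = sqlite3.connect(":memory:")
create_schema(conn)
n_qc = load_fastqc_summary(
    conn,
    "PASS\tBasic Statistics\tsample1.fastq\n"
    "WARN\tPer base sequence content\tsample1.fastq\n",
)
n_cov = load_depth(conn, "sample1", "chr1\t100\t35\nchr1\t101\t36\n")
```

A third loader for the variance file would follow the same pattern once its format is known.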
Needed to query the database for the desired data
- What data was needed?
  - Quality scores from FastQC
  - Coverage counts
  - List of variations in each gene relative to the wild type
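The kind of query this implies might look like the following sketch. It is self-contained and uses a hypothetical SQLite schema with toy rows. The table names, column names, and the depth threshold are assumptions for illustration, not the project's real schema.

```python
import sqlite3

# Build a small in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fastqc (sample TEXT, module TEXT, status TEXT);
    CREATE TABLE coverage (sample TEXT, chrom TEXT, pos INTEGER, depth INTEGER);
    INSERT INTO fastqc VALUES
        ('s1', 'Basic Statistics', 'PASS'),
        ('s1', 'Per base sequence quality', 'FAIL'),
        ('s2', 'Basic Statistics', 'PASS');
    INSERT INTO coverage VALUES
        ('s1', 'chr1', 100, 30),
        ('s1', 'chr1', 101, 10),
        ('s2', 'chr1', 100, 50);
    """
)


def low_coverage_positions(conn, min_depth):
    """Return positions whose read depth is below min_depth."""
    return conn.execute(
        "SELECT sample, chrom, pos, depth FROM coverage "
        "WHERE depth < ? ORDER BY sample, chrom, pos",
        (min_depth,),
    ).fetchall()


def failed_modules(conn):
    """Return (sample, module) pairs for FastQC modules flagged FAIL."""
    return conn.execute(
        "SELECT sample, module FROM fastqc "
        "WHERE status = 'FAIL' ORDER BY sample, module"
    ).fetchall()


low = low_coverage_positions(conn, 20)
failed = failed_modules(conn)
```

Parameterised queries (the `?` placeholder) keep the threshold adjustable from a front end without building SQL strings by hand.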
Needed to visualize data
- Chose Shiny to create a web app in R
How did it work?
Plot coverage at genome positions
http://146.118.98.44:3838/andrew/app/
Quality control comparison
http://146.118.98.44:3838/adele/QC_app/
All of our code can be found on GitHub
Future plans / considerations