PhD thesis: PRDM9 Diversity, Recombination Landscapes and Childhood Leukaemia by Ihthisham Ali
Appendix V contains the data pipelines and scripts used for remapping the Illumina HiSeq 2000 dataset to known PRDM9 ZnF arrays, calculating read depth and generating variant-calling VCF files, estimating haplotypes and imputing FIGNL1 coding variants in the British ALL cohort, de novo assembly of read data, and mapping of MinION read data.
A. ALL study phasing and imputation
B. Illumina HiSeq 2000 dataset - Read depth (DP) and variant calling pipeline
C. Illumina HiSeq 2000 dataset - data treatment
D. VelvetOptimiser best k-mer determination log (example)
E. Alignment of contigs generated by Velvet de novo assembly for the PRDM9 A/A carrier against the PRDM9 A ZnF array