inphared_data.tar.gz (4.22 GB)


Posted on 2021-04-13, 07:44. Authored by Andrew Millard, Ryan Cook.

INPHARED (INfrastructure for a PHAge REference Database) is a Perl script which downloads and filters phage genomes from GenBank to provide the most complete phage genome database possible.

Useful information, including viral taxonomy and bacterial host data, is extracted from the GenBank files and provided in a summary table. Genes are called on the genomes using Prokka, and this output is used to gather metrics, which are summarised in the output files, and to produce input files for vConTACT2.

The data provided covers all genomes up to January 2021. It can be downloaded so that users do not have to repeat the process of consistent gene calling on existing genomes.

The folder GenomesDB contains subfolders, each of which contains a subfolder named after the accession number of a phage.

Within each folder are re-called genes in the following format



The complete genome (*fna) and the GenBank file without any annotation (*gbf) are also included.
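
The folder layout described above can be walked with a short script. The following is a minimal Python sketch, not part of INPHARED itself. It assumes the archive has already been extracted, that the genome and GenBank files carry dotted extensions (`.fna`, `.gbf`), and that the folder directly holding a `.fna` file is the one named after the accession. The function name `index_genomes` is hypothetical.

```python
from pathlib import Path


def index_genomes(root):
    """Map each accession folder name to its *.fna and *.gbf files.

    Assumption: the folder that directly holds a *.fna file is named after
    the phage accession. The search is recursive, so the exact nesting
    depth below `root` does not matter.
    """
    index = {}
    for fna in sorted(Path(root).rglob("*.fna")):
        folder = fna.parent
        entry = index.setdefault(folder.name, {"fna": [], "gbf": []})
        entry["fna"].append(fna)
        # Unannotated GenBank files stored alongside the genome sequence.
        entry["gbf"] = sorted(folder.glob("*.gbf"))
    return index


if __name__ == "__main__":
    # "GenomesDB" is the folder name given in the description; adjust the
    # path to wherever the archive was extracted.
    for accession, files in index_genomes("GenomesDB").items():
        print(accession, len(files["fna"]), len(files["gbf"]))
```

Searching recursively rather than hard-coding a depth keeps the sketch usable even if the nesting differs from what is assumed here.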



The MRC Consortium for Medical Microbial Bioinformatics

Medical Research Council


CLIMB-BIG-DATA: A Cloud Infrastructure for Big-Data Microbial Bioinformatics

Medical Research Council


