University of Leicester
taxMyPhage: Automated taxonomy of dsDNA phage genomes at the genus and species level

journal contribution
posted on 2025-02-13, 10:52 authored by Andrew Millard, Remi Denise, Maria Lestido, Moi Taiga Thomas, Deven Webster, Dann Turner, Thomas Sicheritz-Pontén

Background: Bacteriophages are classified into genera and species based on genomic similarity, a process regulated by the International Committee on Taxonomy of Viruses (ICTV). With the rapid increase in phage genomic data, there is a growing need for automated classification systems that can handle large-scale genome analyses and place phages into new or existing genera and species.

Materials and Methods: We developed taxMyPhage, a tool for the rapid automated classification of dsDNA bacteriophage genomes. The system first uses a MASH database built from ICTV-classified phage genomes to identify closely related phages, then uses BLASTn to calculate intergenomic similarity, conforming to ICTV guidelines for genus and species classification. taxMyPhage is available as a git repository at https://github.com/amillard/tax_myPHAGE, a conda package, a pip-installable tool, and a web service at https://phagecompass.ku.dk.

Results: taxMyPhage enables rapid classification of bacteriophages to the genus and species level. Benchmarking on 705 genomes pending ICTV classification showed 96.7% accuracy at the genus level and 97.9% accuracy at the species level. The system also detected inconsistencies in current ICTV classifications, identifying cases where genera did not adhere to ICTV’s 70% average nucleotide identity (ANI) threshold for genus classification or 95% ANI for species. The command line version classified 705 genomes within 48 h, demonstrating its scalability for large datasets.
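
The threshold rule described above (70% for genus, 95% for species) can be illustrated with a minimal Python sketch. This is not the taxMyPhage implementation: the function name `assign_taxonomy`, the input dictionaries, and the choice to treat both thresholds as inclusive are assumptions made for illustration, and the similarity values are taken as already computed by some upstream step.

```python
from typing import Dict, Optional, Tuple

# Thresholds quoted in the abstract (percent similarity).
GENUS_THRESHOLD = 70.0
SPECIES_THRESHOLD = 95.0


def assign_taxonomy(
    similarities: Dict[str, float],
    taxonomy: Dict[str, Tuple[str, str]],
) -> Tuple[Optional[str], Optional[str]]:
    """Hypothetical helper: pick (genus, species) for one query genome.

    similarities maps a reference genome ID to its percent similarity
    with the query; taxonomy maps the same IDs to (genus, species).
    None in either slot means "no existing taxon at that rank".
    Treating the thresholds as inclusive is an assumption.
    """
    if not similarities:
        return None, None

    # Use the single most similar reference genome.
    best_ref = max(similarities, key=similarities.get)
    best_similarity = similarities[best_ref]
    genus, species = taxonomy[best_ref]

    if best_similarity >= SPECIES_THRESHOLD:
        return genus, species   # existing genus and species
    if best_similarity >= GENUS_THRESHOLD:
        return genus, None      # existing genus, candidate new species
    return None, None           # candidate new genus and species


# Illustrative, made-up reference data.
reference_taxonomy = {
    "ref_A": ("GenusOne", "GenusOne alpha"),
    "ref_B": ("GenusTwo", "GenusTwo beta"),
}
result = assign_taxonomy({"ref_A": 82.5, "ref_B": 41.0}, reference_taxonomy)
```

In this made-up example the best match falls between the two thresholds, so the query would be placed in an existing genus but flagged as a candidate new species.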

Conclusions: taxMyPhage significantly enhances the speed and accuracy of bacteriophage genome classification at the genus and species levels, making it compatible with current sequencing outputs. The tool facilitates the integration of bacteriophage classification into standard workflows, thereby accelerating research and ensuring consistent taxonomy.


Funding

The MRC Consortium for Medical Microbial Bioinformatics

Medical Research Council

CLIMB-BIG-DATA: A Cloud Infrastructure for Big-Data Microbial Bioinformatics

Medical Research Council

Norwegian Seafood Research Fund (FHF901707) and Leo Foundation (LF-OC-23-001423)

History

Author affiliation

College of Life Sciences Genetics, Genome Biology & Cancer Sciences

Version

  • AM (Accepted Manuscript)

Published in

PHAGE: Therapy, Applications, and Research

Publisher

Mary Ann Liebert

issn

2641-6530

eissn

2641-6549

Copyright date

2025

Available date

2025-03-20

Language

en

Deposited by

Dr Andrew Millard

Deposit date

2025-02-02

Rights Retention Statement

  • Yes
