Interest in using long reads from third-generation sequencing (TGS) technologies is rising in metagenomics, because long reads overcome the limitations of short reads in resolving structural and functional aspects of genomes. As these reads are long enough to carry species-specific signals, they can be grouped into bins that correspond to different taxonomic groups.
Previous studies have shown that binning long reads based on nucleotide composition and abundance information significantly improves the quality of downstream metagenome assemblies. This project aims to develop a method that refines the binning results produced by existing long-read binning tools while taking the underlying microbial kingdoms into account. In addition, we use the connectivity information between reads to correct potentially misclassified reads, and a machine-learning-based label propagation technique to ensure completeness in binning.
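To illustrate the idea of propagating bin labels over read connectivity, here is a minimal sketch in plain Python. It is not GraphK-LR's implementation: it assumes a simple majority-vote variant of label propagation, and the function name `propagate_labels`, the read IDs, and the bin names are all hypothetical.

```python
from collections import Counter


def propagate_labels(edges, labels, max_iters=100):
    """Give unlabelled reads the majority bin of their labelled neighbours.

    edges:  iterable of (read_a, read_b) pairs, e.g. read overlaps (assumed input).
    labels: dict mapping already-binned reads to a bin name (assumed input).
    """
    # Build an undirected adjacency list from the read pairs.
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    assigned = dict(labels)  # copy so the caller's seed labels are not mutated
    for _ in range(max_iters):
        updates = {}
        for read in sorted(neighbours):
            if read in assigned:
                continue
            # Count the bins of neighbours that already have a label.
            votes = Counter(
                assigned[n] for n in neighbours[read] if n in assigned
            )
            if votes:
                # Deterministic tie-break: highest count, then smallest bin name.
                updates[read] = min(
                    votes.items(), key=lambda kv: (-kv[1], kv[0])
                )[0]
        if not updates:
            break  # nothing left to label, or the rest is unreachable
        assigned.update(updates)  # apply all updates of this round together
    return assigned


# Hypothetical toy graph: two connected groups of reads, one seed label each.
edges = [("r1", "r2"), ("r2", "r3"), ("r3", "r7"), ("r4", "r5"), ("r5", "r6")]
seed_labels = {"r1": "bin_A", "r4": "bin_B"}
result = propagate_labels(edges, seed_labels)
```

In this toy example every read ends up in the bin of the seed it is connected to. Reads with no path to any labelled read would stay unlabelled, which is one reason a real pipeline may combine graph connectivity with other features.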
🔗 The long-read refiner tool we developed, GraphK-LR, is available at: https://github.com/NethmiRanasinghe/GraphK-LR
- Aththanayaka A.M.S. (E/18/030)
- Ranasinghe R.A.N.S. (E/18/282)
- Ranasinghe R.D.J.M. (E/18/283)
- Dr. Damayanthi Herath
- Dr. Vijni Mallawaarachchi