nf-core proteinfold: a community-driven open source pipeline for deep learning based protein structure prediction methods

K.4.601 | Day 1 | 16:05 - 16:20 | Speakers: Jose Espinosa-Carrasco


Notes

Abstract

The release of AlphaFold2 paved the way for a new generation of prediction tools for studying unknown proteomes. These tools enable highly accurate protein structure predictions by leveraging advances in deep learning. However, their implementation can pose technical challenges for users, who must navigate a complex landscape of dependencies and large reference databases. Providing the community with a standardized workflow framework to run these tools could ease adoption.

Thanks to its adherence to nf-core guidelines, the nf-core/proteinfold pipeline simplifies the application of state-of-the-art protein structure modeling techniques by taking advantage of Nextflow’s optimized execution capabilities on both cloud providers and HPC infrastructures. The pipeline integrates several popular methods, namely AlphaFold 2 and 3, Boltz 1 and 2, ColabFold, ESMFold, HelixFold, RoseTTAFold All-Atom, and RoseTTAFold2NA. Following structure prediction, nf-core/proteinfold generates an interactive report that allows users to explore and compare predicted models together with standardized confidence metrics, harmonized across methods for consistent interpretation. The workflow also integrates Foldseek-based structural search, enabling the identification of known protein structures similar to the predicted models.
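
To make the idea of harmonized confidence metrics concrete, here is a minimal Python sketch. It is not the pipeline's implementation. It assumes a predictor writes per-residue confidence (pLDDT) into the B-factor column of its PDB output, as AlphaFold2 does, and it assumes that a file whose values never exceed 1.0 is using a 0-1 scale that should be rescaled to 0-100. The function names are hypothetical.

```python
from statistics import mean


def plddt_from_pdb(pdb_text: str) -> list[float]:
    """Return one confidence value per residue, read from C-alpha atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB is fixed-width: atom name in columns 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def harmonize(scores: list[float]) -> list[float]:
    """Rescale to 0-100 when values look like fractions.

    Assumption (not taken from the pipeline): max <= 1.0 means a 0-1 scale.
    """
    if scores and max(scores) <= 1.0:
        return [s * 100.0 for s in scores]
    return list(scores)


def mean_confidence(pdb_text: str) -> float:
    """Mean per-residue confidence on a common 0-100 scale."""
    return mean(harmonize(plddt_from_pdb(pdb_text)))
```

Calling `mean_confidence` on the text of each predicted model would give one comparable number per method. A real report would likely also need method-specific handling, for example for mmCIF output or for metrics other than pLDDT.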

The pipeline is developed through an international collaboration that includes Australian BioCommons, the Centre for Genomic Regulation, Pompeu Fabra University, and the European Bioinformatics Institute. It already serves as a central resource for structure prediction at several of these organisations and elsewhere. This broad adoption demonstrates how nf-core/proteinfold, through its open-source and community-driven development model, is lowering the barrier to using deep-learning-based approaches for protein structure prediction in everyday research.

Interestingly, nf-core/proteinfold represents a new generation of Nextflow workflows designed to place multiple alternative methods for the same task within one coherent framework. This design makes it possible to compare the different methods directly, providing a basis for developing combined approaches that may mature into meta-methods.

More info

nf-core project

nf-core/proteinfold pipeline

nf-core/proteinfold GitHub repository

Join nf-core

My bluesky
