Frontiers in Microbiology, Volume 14, 2023, Article number 1250806

A toolbox of machine learning software to support microbiome analysis (Review) (Open Access)

  • Marcos-Zambrano, L.J.,
  • López-Molina, V.M.,
  • Bakir-Gungor, B.,
  • Frohme, M.,
  • Karaduzovic-Hadziabdic, K.,
  • Klammsteiner, T.,
  • Ibrahimi, E.,
  • Lahti, L.,
  • Loncar-Turukalo, T.,
  • Dhamo, X.,
  • Simeon, A.,
  • Nechyporenko, A.,
  • Pio, G.,
  • Przymus, P.,
  • Sampri, A.,
  • Trajkovik, V.,
  • Lacruz-Pleguezuelos, B.,
  • Aasmets, O.,
  • Araujo, R.,
  • Anagnostopoulos, I.,
  • Aydemir, Ö.,
  • Berland, M.,
  • Calle, M.L.,
  • Ceci, M.,
  • Duman, H.,
  • Gündoğdu, A.,
  • Havulinna, A.S.,
  • Kaka Bra, K.H.N.,
  • Kalluci, E.,
  • Karav, S.,
  • Lode, D.,
  • Lopes, M.B.,
  • May, P.,
  • Nap, B.,
  • Nedyalkova, M.,
  • Paciência, I.,
  • Pasic, L.,
  • Pujolassos, M.,
  • Shigdel, R.,
  • Susín, A.,
  • Thiele, I.,
  • Truică, C.-O.,
  • Wilmes, P.,
  • Yilmaz, E.,
  • Yousef, M.,
  • Claesson, M.J.,
  • Truu, J.,
  • Carrillo de Santa Pau, E.
  • a Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain
  • b Department of Computer Engineering, Abdullah Gül University, Kayseri, Turkey
  • c Division Molecular Biotechnology and Functional Genomics, Technical University of Applied Sciences Wildau, Wildau, Germany
  • d Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
  • e Department of Microbiology and Department of Ecology, University of Innsbruck, Innsbruck, Austria
  • f Department of Biology, University of Tirana, Tirana, Albania
  • g Department of Computing, University of Turku, Turku, Finland
  • h Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
  • i Department of Applied Mathematics, Faculty of Natural Sciences, University of Tirana, Tirana, Albania
  • j BioSense Institute, University of Novi Sad, Novi Sad, Serbia
  • k Department of Systems Engineering, Kharkiv National University of Radioelectronics, Kharkiv, Ukraine
  • l Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
  • m National Interuniversity Consortium for Informatics, Rome, Italy
  • n Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland
  • o Victor Phillip Dahdaleh Heart and Lung Research Institute, University of Cambridge, Cambridge, United Kingdom
  • p Faculty of Computer Science and Engineering, Skopje, North Macedonia
  • q Institute of Genomics, Estonian Genome Centre, University of Tartu, Tartu, Estonia
  • r Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
  • s Nephrology and Infectious Diseases R & D Group, i3S - Instituto de Investigação e Inovação em Saúde, INEB - Instituto de Engenharia Biomédica, Universidade do Porto, Porto, Portugal
  • t Department of Informatics, University of Piraeus, Piraeus, Greece
  • u Computer Science and Biomedical Informatics Department, University of Thessaly, Lamia, Greece
  • v Department of Electrical and Electronics Engineering, Karadeniz Technical University, Trabzon, Turkey
  • w INRAE, MetaGenoPolis, Université Paris-Saclay, Jouy-en-Josas, France
  • x Faculty of Sciences, Technology and Engineering, University of Vic, Central University of Catalonia, Barcelona, Vic, Spain
  • y IRIS-CC, Fundació Institut de Recerca i Innovació en Ciències de la Vida i la Salut a la Catalunya Central, Barcelona, Vic, Spain
  • z Department of Molecular Biology and Genetics, Çanakkale Onsekiz Mart University, Çanakkale, Turkey
  • aa Department of Microbiology and Clinical Microbiology, Faculty of Medicine, Erciyes University, Kayseri, Turkey
  • ab Metagenomics Laboratory, Genome and Stem Cell Center (GenKök), Erciyes University, Kayseri, Turkey
  • ac Finnish Institute for Health and Welfare - THL, Helsinki, Finland
  • ad Institute for Molecular Medicine Finland, FIMM-HiLIFE, Helsinki, Finland
  • ae Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
  • af Department of Mathematics, Center for Mathematics and Applications (NOVA Math), NOVA School of Science and Technology, Caparica, Portugal
  • ag UNIDEMI, Department of Mechanical and Industrial Engineering, NOVA School of Science and Technology, Caparica, Portugal
  • ah Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
  • ai School of Medicine, University of Galway, Galway, Ireland
  • aj Department of Inorganic Chemistry, Faculty of Chemistry and Pharmacy, University of Sofia, Sofia, Bulgaria
  • ak Center for Environmental and Respiratory Health Research (CERH), Research Unit of Population Health, University of Oulu, Oulu, Finland
  • al Biocenter Oulu, University of Oulu, Oulu, Finland
  • am Sarajevo Medical School, University Sarajevo School of Science and Technology, Sarajevo, Bosnia and Herzegovina
  • an Department of Clinical Science, University of Bergen, Bergen, Norway
  • ao Mathematical Department, UPC-Barcelona Tech, Barcelona, Spain
  • ap APC Microbiome Ireland, University College Cork, Cork, Ireland
  • aq Computer Science and Engineering Department, Faculty of Automatic Control and Computers, National University of Science and Technology Politehnica, Bucharest, Romania
  • ar Systems Ecology Group, Luxembourg Centre for Systems Biomedicine, Esch-sur-Alzette, Luxembourg
  • as Department of Life Sciences and Medicine, Faculty of Science, Technology and Medicine, University of Luxembourg, Belvaux, Luxembourg
  • at Department of Computer Technologies, Karadeniz Technical University, Trabzon, Turkey
  • au Department of Information Systems, Zefat Academic College, Zefat, Israel
  • av Galilee Digital Health Research Center (GDH), Zefat Academic College, Zefat, Israel
  • aw School of Microbiology, University College Cork, Cork, Ireland

Abstract

The human microbiome has become an area of intense research due to its potential impact on human health. However, the analysis and interpretation of these data have proven challenging due to their complexity and high dimensionality. Machine learning (ML) algorithms can process vast amounts of data to uncover informative patterns and relationships, even with limited prior knowledge. Consequently, there has been rapid growth in the development of software specifically designed for the analysis and interpretation of microbiome data using ML techniques. These software tools incorporate a wide range of ML algorithms for clustering, classification, regression, or feature selection to identify microbial patterns and relationships within the data and generate predictive models. This rapid development, with a constant need for new developments and integration of new features, requires efforts to compile, catalog, and classify these tools in order to create infrastructures and services with easy, transparent, and trustworthy standards. Here we review the state of the art for ML tools applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on ML-based software and framework resources currently available for the analysis of microbiome data in humans. The aim is to help microbiologists and biomedical scientists go deeper into specialized resources that integrate ML techniques, and to facilitate future benchmarking to create standards for the analysis of microbiome data. The software resources are organized based on the type of analysis they were developed for and the ML techniques they implement. A description of each software tool with examples of usage is provided, including comments about pitfalls and gaps in the usage of ML-based software in relation to microbiome data that need to be considered by developers and users.
This review represents an extensive compilation to date, offering valuable insights and guidance for researchers interested in leveraging ML approaches for microbiome analysis.

Copyright © 2023 Marcos-Zambrano, López-Molina, Bakir-Gungor, Frohme, Karaduzovic-Hadziabdic, Klammsteiner, Ibrahimi, Lahti, Loncar Turukalo, Dhamo, Simeon, Nechyporenko, Pio, Przymus, Sampri, Trajkovik, Lacruz-Pleguezuelos, Aasmets, Araujo, Anagnostopoulos, Aydemir, Berland, Calle, Ceci, Duman, Gündoğdu, Havulinna, Kaka Bra, Kalluci, Karav, Lode, Lopes, May, Nap, Nedyalkova, Paciência, Pasic, Pujolassos, Shigdel, Susín, Thiele, Truică, Wilmes, Yilmaz, Yousef, Claesson, Truu, Carrillo de Santa Pau.

Author keywords

data integration; feature analysis; feature generation; machine learning; microbial gene prediction; microbial metabolic modeling; microbiome; software

Indexed keywords

EMTREE medical terms: artificial intelligence; artificial neural network; disease classification; feature selection; gene sequence; hierarchical clustering; human; learning algorithm; metagenomics; microbial metabolic modeling; microbiologist; microbiome; model; natural language processing; nonhuman; operational taxonomic unit; phylogeny; predictive model; principal component analysis; Review; shotgun sequencing; support vector machine; taxonomy; time series analysis

Funding details

Funding sponsor | Funding number | Acronym
(not listed) | ANR-11-DPBS-0001 |
Ministerio de Asuntos Económicos y Transformación Digital, Gobierno de España | PID2019-104830RB-I00 | MINECO
(not listed) | IJC2019-042188-I |
European Cooperation in Science and Technology | CA18131 | COST
  • 1

    This study was supported by COST Action CA18131 “Statistical and machine learning techniques in human microbiome studies.” LM-Z is supported by Spanish State Research Agency Juan de la Cierva Grant IJC2019-042188-I (LM-Z). MB is supported by Metagenopolis grant ANR-11-DPBS-0001. MLC was partially supported by the Spanish Ministry of Economy, Industry and Competitiveness, Reference PID2019-104830RB-I00.

  • 2

    This article is based upon work from COST Action ML4Microbiome “Statistical and machine learning techniques in human microbiome studies,” CA18131, supported by COST (European Cooperation in Science and Technology), www.cost.eu.

  • ISSN: 1664302X
  • Source Type: Journal
  • Original language: English
  • DOI: 10.3389/fmicb.2023.1250806
  • Document Type: Review
  • Publisher: Frontiers Media SA

  Marcos-Zambrano, L.J.; Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain;
  Carrillo de Santa Pau, E.; Computational Biology Group, Precision Nutrition and Cancer Research Program, IMDEA Food Institute, Madrid, Spain;
© Copyright 2023 Elsevier B.V., All rights reserved.
