Science of Computer Programming, Volume 238, December 2024, Article number 103168

Prescriptive procedure for manual code smell annotation (Article)

  • Department of Computing and Control Engineering, Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia

Abstract

Code smells are structures in code that present potential software maintainability issues. Manually constructing high-quality datasets to train ML models for code smell detection is challenging. Inconsistent annotations, small size, an unrealistic smell-to-non-smell ratio, and poor smell coverage degrade dataset quality. These issues arise mainly from the time-consuming nature of manual annotation and from annotator disagreements caused by ambiguous and vague smell definitions. To address the challenges of building high-quality datasets suitable for training ML models for smell detection, we designed a prescriptive procedure for manual code smell annotation. The proposed procedure extends our previous work and aims to support the annotation of any smell defined by Fowler. We validated the procedure by having three annotators annotate smells while following it. The main contribution of this paper is a prescriptive annotation procedure that benefits the following stakeholders: annotators building high-quality smell datasets that can be used to train ML models, ML researchers building ML models for smell detection, and software engineers employing ML models to enhance software maintainability. Secondary contributions are a code smell dataset containing the Data Class, Feature Envy, and Refused Bequest smells, and the DataSet Explorer tool, which supports annotators during the annotation procedure. © 2024 Elsevier B.V.

Author keywords

Code smell; Maintainability; Software quality; Manual annotation

Indexed keywords

Engineering controlled terms: Computer software selection and evaluation; Odors
Engineering uncontrolled terms: Code smell; Data class; High quality; Maintainability; Manual annotation; Manual codes; Software maintainability; Software quality
Engineering main heading: Maintainability

Funding details

Funding sponsor Funding number Acronym
Science Fund of the Republic of Serbia 6521051
Science Fund of the Republic of Serbia
451-03-47/2023-01/200156, 451-03-65/2024-03/200156
01-3394/1
  • 1

    This research is supported by the Science Fund of the Republic of Serbia, Grant No 6521051, AI-Clean CaDET and the Ministry of Science, Technological Development and Innovation through project no. 451-03-47/2023-01/200156 “Innovative scientific and artistic research from the FTS (activity) domain.” Our funders had no involvement in the study design, collection, analysis, and interpretation of the data, writing of the report, or the decision to submit the article for publication.

  • 2

    This research is supported by the Science Fund of the Republic of Serbia, Grant No 6521051, AI-Clean CaDET and the Ministry of Science, Technological Development and Innovation through contract no. 451-03-65/2024-03/200156, and the Faculty of Technical Sciences, University of Novi Sad through project “Scientific and Artistic Research Work of Researchers in Teaching and Associate Positions at the Faculty of Technical Sciences, University of Novi Sad” (No. 01-3394/1). Our funders had no involvement in the study design, collection, analysis, and interpretation of the data, writing of the report, or the decision to submit the article for publication.

  • ISSN: 01676423
  • CODEN: SCPGD
  • Source Type: Journal
  • Original language: English
  • DOI: 10.1016/j.scico.2024.103168
  • Document Type: Article
  • Publisher: Elsevier B.V.

  Prokić, S.; Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6, Novi Sad, Serbia;
© Copyright 2024 Elsevier B.V., All rights reserved.

