

Code readability models are typically based on the code's structural and textual features and treat readability as an objective property. However, readability is inherently subjective and depends on the knowledge and experience of the reader analyzing the code. This paper assesses the readability of Python code statements commonly used in undergraduate programming courses. Our readability model is based on tracking the reader's eye movement during the while-read phase. It uses machine learning (ML) techniques and relies on a novel set of observational features that capture how readers read the code. We ran an experiment that tracked the eye movements of 90 undergraduate students while they assessed the readability of 48 Python code snippets. We trained an ML model that predicts readability from the collected observational data and the code snippet's structural and textual features. In our experiments, the XGBoost classifier trained exclusively on observational features achieved the best results (0.85 F-measure). Using correlation analysis, we identified the Python statements that most affect readability for undergraduate students and proposed implications for teaching Python programming. In line with findings for the Java language, we found that constructs related to the code's size and complexity hurt readability. Numerous comments also hindered readability, potentially because they are associated with less readable code. Some Python-specific statements (list comprehension, lambda function, and dictionary comprehension) harmed readability even though they were part of the curriculum. Tracking students' gaze revealed additional factors, most notably the nonlinearity introduced by if, for, while, try, and function call statements. © 2023 Wiley Periodicals LLC.
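To make the idea of observational features and the reported F-measure more concrete, the following is a minimal sketch only. The abstract does not list the study's actual features or its XGBoost configuration, so the `Fixation` record, the feature names, and the sample data below are hypothetical and chosen purely for illustration. The sketch derives a few reading-order features from a gaze trace (for example, backward jumps as a rough proxy for nonlinear reading) and computes the F-measure as the harmonic mean of precision and recall.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    """One gaze fixation (hypothetical record layout, not from the study)."""
    line: int         # source line the reader looked at
    duration_ms: int  # how long the gaze rested on that line


def observational_features(trace):
    """Summarize a gaze trace into a few illustrative features."""
    steps = list(zip(trace, trace[1:]))
    return {
        "total_fixation_ms": sum(f.duration_ms for f in trace),
        "fixation_count": len(trace),
        # Regression: the gaze jumps back to an earlier line.
        "regressions": sum(1 for prev, cur in steps if cur.line < prev.line),
        # Forward skip: the gaze jumps ahead by more than one line.
        "forward_skips": sum(1 for prev, cur in steps if cur.line - prev.line > 1),
    }


def f_measure(y_true, y_pred, positive=1):
    """F-measure (F1): harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up trace: the reader jumps from line 2 to line 5 and then back to line 2.
trace = [Fixation(1, 200), Fixation(2, 300), Fixation(5, 150),
         Fixation(2, 400), Fixation(3, 100)]
feats = observational_features(trace)
print(feats)

# Made-up labels (1 = readable) and predictions, only to exercise f_measure.
score = f_measure([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(round(score, 3))
```

In a pipeline like the one the abstract describes, feature dictionaries of this kind would be assembled into a table with one row per reader and snippet and passed to a classifier; that step is omitted here because its details are not given.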
| Index category | Terms |
|---|---|
| Engineering controlled terms | Codes (symbols); Eye movements; Eye tracking; High level languages; Machine learning; Students |
| Engineering uncontrolled terms | Code readability; Empirical studies; Eye-tracking; Machine-learning; Programming course; Python code; Python programming; Structural feature; Textual features; Undergraduate students |
| Engineering main heading | Python |
| Funding sponsor | Funding number | Acronym |
|---|---|---|
| Ministry of Science, Technological Development and Innovation | 451‐03‐47/2023‐01/200156 | |
| Science Fund of the Republic of Serbia | 6521051 | |
This research was supported by the Science Fund of the Republic of Serbia, Grant No. 6521051, AI‐Clean CaDET, and by the Ministry of Science, Technological Development and Innovation through project no. 451‐03‐47/2023‐01/200156, “Innovative scientific and artistic research from the FTS (activity) domain.”
Savić, G.; University of Novi Sad, Faculty of Technical Sciences, Trg Dositeja Obradovića 6, Novi Sad, Serbia