Neurocomputing, Volume 548, 1 September 2023, Article number 126376

Exploring the association between time series features and forecasting by temporal aggregation using machine learning (Article) (Open Access)

  • (a) Cardiff Business School, 3 Colum Drive, CF10 3EU Cardiff, United Kingdom
  • (b) Institute for Artificial Intelligence Research and Development of Serbia, Fruskogorska 1, Novi Sad, 21000, Serbia

Abstract

When a forecast of the total value over several future time periods is required, forecasters can choose between two temporal aggregation (TA) approaches: i) aggregate forecast (AF), in which forecasts are produced at the original frequency and then summed over the horizon, or ii) aggregate data (AD), in which the data are first aggregated using non-overlapping temporal aggregation and then forecast. The usual recommendation is to aggregate the data to the frequency relevant to the decision that the forecast will support and then produce the forecast. However, this is not always the best choice, and we argue that each of AF and AD may outperform the other in different situations. Moreover, there is a lack of evidence on which indicators determine the superiority of each approach. We design and execute an empirical experiment to first explore the performance of these approaches using the monthly time series of the M4 competition dataset. We then recast the problem as a supervised classification task by constructing a database in which the features of each time series are the predictors and the better-performing approach, labelled AF or AD, is the response. We build machine learning algorithms on this database to investigate the association between time series features and the performance of AF and AD. Our findings suggest that neither AF nor AD is consistently the more accurate approach for every individual series. AF is shown to be significantly better than AD for the monthly M4 time series, especially for longer horizons. We build several machine learning approaches that use a set of extracted time series features as input to predict whether AD or AF should be used. We find that Random Forest (RF) is the most accurate approach in classifying the outcome, as assessed both by statistical measures, such as misclassification error, F-statistics, and area under the curve, and by a utility measure.
The RF approach reveals that curvature, nonlinearity, seas_pacf, unitroot_pp, mean, ARCH.LM, coefficient of variation, stability, linearity, and max_level_shift are among the most important features driving the predictions of the model. Our findings indicate that strength of trend, ARCH.LM, hurst, lag-1 autocorrelation, unitroot_pp, and seas_pacf may favour the AF approach, while lumpiness, entropy, nonlinearity, curvature, and strength of seasonality may increase the chance of AD performing better. We conclude the study by summarising the findings and presenting an agenda for further research. © 2023 The Author(s)
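
The two approaches compared in the abstract, and the AF/AD labelling used as the classification response, can be illustrated with a minimal sketch. It is not the authors' implementation: simple exponential smoothing with a fixed smoothing parameter stands in for whatever exponential smoothing models the study fitted, absolute error stands in for the study's accuracy measure, and all function names and the default `alpha` are hypothetical.

```python
def aggregate_non_overlapping(series, m):
    """Sum consecutive, non-overlapping blocks of m observations.

    Assumption of this sketch: when len(series) is not a multiple of m,
    the oldest observations are dropped so the most recent data are kept.
    """
    start = len(series) % m
    return [sum(series[i:i + m]) for i in range(start, len(series), m)]


def ses_level(series, alpha=0.3):
    """Final level of simple exponential smoothing.

    SES produces a flat forecast equal to this level for every horizon.
    The fixed alpha is an illustrative assumption, not a fitted value.
    """
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def forecast_af(series, m, alpha=0.3):
    """AF: forecast m periods ahead at the original frequency, then sum."""
    return m * ses_level(series, alpha)


def forecast_ad(series, m, alpha=0.3):
    """AD: aggregate the data into blocks of m, then forecast one step ahead."""
    return ses_level(aggregate_non_overlapping(series, m), alpha)


def label_series(history, future_total, m, alpha=0.3):
    """Label a series with the approach that has the smaller absolute error.

    Ties are assigned to AF (an arbitrary choice made for this sketch).
    """
    err_af = abs(forecast_af(history, m, alpha) - future_total)
    err_ad = abs(forecast_ad(history, m, alpha) - future_total)
    return "AF" if err_af <= err_ad else "AD"
```

In a setup like the one described, each series would contribute one row to the classification database: its extracted features as predictors and the label returned by something like `label_series` as the response.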

Author keywords

Classification; Exponential Smoothing; Forecasting; M4 competition; Machine Learning; Random Forest; Temporal Aggregation; Time Series Features

Indexed keywords

Engineering controlled terms: Aggregates; Classification (of information); Forestry; Learning algorithms; Machine learning; Time series
Engineering uncontrolled terms: Aggregate datum; Exponential smoothing; M4 competition; Machine-learning; Performance; Random forests; Temporal aggregation; Time series features; Time series forecasting; Times series
Engineering main heading: Forecasting
EMTREE medical terms: area under the curve; article; autocorrelation; competition; controlled study; entropy; forecasting; learning; machine learning; nonlinear system; outcome assessment; prediction; random forest; seasonal variation; time series analysis

Funding details

Funding sponsor: Horizon 2020
  • 1

    Dr. Dejan Mircetic obtained his PhD in time series forecasting and supply chain analytics at the University of Novi Sad. He has published more than 40 scientific papers, most of them inspired by real problems in contemporary business and industry. He has also served as a supply chain analyst for several logistics companies, where he led teams designing machine learning solutions for optimizing supply chains. He currently serves as a consulting machine learning expert for a USAID humanitarian project, with a strong wish to enhance the healthcare system of Côte d’Ivoire and to improve the everyday life of people in Africa. At the Institute for Artificial Intelligence Research and Development of Serbia, he leads the Intelligent Supply, Inventory and Operations Planning (ISIOP) project, part of the AIPlan4EU project funded under the Horizon 2020 research and innovation programme.

  • ISSN: 09252312
  • CODEN: NRCGE
  • Source Type: Journal
  • Original language: English
  • DOI: 10.1016/j.neucom.2023.126376
  • Document Type: Article
  • Publisher: Elsevier B.V.

  Rostami-Tabar, B.; Cardiff Business School, 3 Colum Drive, CF10 3EU Cardiff, United Kingdom
© Copyright 2023 Elsevier B.V., All rights reserved.
