Data-driven statistical and machine learning modelling of real building stock energy use
DOI:
https://doi.org/10.34641/clima.2022.379
Keywords:
Data-driven modelling, Linear regression, Gradient boosting trees, Building stock energy use
Abstract
One of today’s major challenges is to become climate neutral by 2050. A large potential for energy reduction lies in the building sector, which accounts for 40% of Europe’s total primary energy use. Building-Stock Energy Models are vital instruments for comparing energy reduction strategies. Yet the regulatory energy performance calculation currently used by EU policy makers poorly predicts the real energy use of residential buildings and largely overestimates the potential energy savings. Meanwhile, data-driven black-box models are gaining considerable traction in a wide range of applications. This paper evaluates whether data-driven linear regression and gradient boosting machine models predict the real total building energy use at large scale better than the current regulatory white-box building energy calculation method. Both the linear regression models and the gradient boosting regression trees outperform the regulatory method, with the gradient boosting regression trees performing slightly worse than multiple linear regression. Yet the linear regression models leave a large part of the variance unexplained, and the gradient boosting trees also leave room for improvement. At individual building level, both the linear regression models and the gradient boosting regression trees perform too poorly for inference. At stock level, however, both types of models seem promising and can be a useful tool for informing large housing owners (e.g., financial institutions, governments, housing companies) or for policy making.
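The contrast between building-level and stock-level performance can be illustrated with a minimal sketch. It does not use the paper's dataset, predictors, or models: the building stock is synthetic, floor area is the only (assumed) predictor, occupant-driven variation is modelled as multiplicative log-normal noise, and a single-variable least-squares fit stands in for the multiple linear regression. The names `fit_simple_ols` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
import random
import statistics


def fit_simple_ols(x, y):
    """Closed-form least squares for y = intercept + slope * x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def evaluate(actual, predicted):
    """Return (building-level MAPE, relative error of the stock total)."""
    mape = statistics.fmean(abs(p - a) / a for a, p in zip(actual, predicted))
    stock_error = abs(sum(predicted) - sum(actual)) / sum(actual)
    return mape, stock_error


rng = random.Random(0)

# Synthetic stock (assumption): floor area in m2, energy use in kWh/year.
floor_area = [rng.uniform(60, 250) for _ in range(2000)]
# Assumed 100 kWh/m2/year baseline with large building-to-building scatter.
energy_use = [100 * a * rng.lognormvariate(0, 0.4) for a in floor_area]

# Hold out the last 500 buildings for evaluation.
train_x, test_x = floor_area[:1500], floor_area[1500:]
train_y, test_y = energy_use[:1500], energy_use[1500:]

intercept, slope = fit_simple_ols(train_x, train_y)
predicted = [intercept + slope * a for a in test_x]

mape, stock_error = evaluate(test_y, predicted)
print(f"building-level MAPE: {mape:.1%}")
print(f"stock-level error of total: {stock_error:.1%}")
```

Under these assumptions the per-building errors are large because the scatter is not explained by the predictor, while over- and under-predictions largely cancel when summed over the stock. This mirrors the qualitative pattern described in the abstract but says nothing about the magnitudes the paper reports.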