Effective Outlier Detection in Smart Home Energy Consumption Using Integrated Change Point Detection and Unsupervised Learning
1  School of Computer Science and Engineering, VIT-AP University, Amaravati 522241, Andhra Pradesh, India
2  Department of Computer Science and Engineering, Siddhartha Academy of Higher Education, Kanuru 520007, Andhra Pradesh, India
3  School of Electronics Engineering, VIT-AP University, Amaravati 522241, Andhra Pradesh, India
Academic Editor: Simeone Chianese

Abstract:

As smart homes proliferate globally, smart meter energy consumption data have become vital for data-driven decisions, making data quality crucial for reliable analytics. However, smart meter data often contain anomalies such as outliers, missing values, and redundant entries, caused by communication delays, transmission errors, and device malfunctions. These anomalies can significantly compromise the accuracy of applications including billing, contingency analysis, and energy forecasting. Among them, outliers are particularly detrimental, as they can distort statistical analysis, mislead machine learning models, and undermine overall system reliability. To detect outliers in smart home energy consumption effectively, this paper focuses on hourly usage patterns and first explores three standalone methods: (i) a clustering-based technique using Density-Based Spatial Clustering of Applications with Noise (DBSCAN), (ii) a statistical forecasting model using Auto-Regressive Integrated Moving Average (ARIMA), and (iii) a time series segmentation method using Change Point Detection (CPD). While each method has its strengths, each is limited when used alone in handling the complexity and variability of real-world data. To address this limitation, this paper proposes three hybrid models: ARIMA+DBSCAN, CPD+DBSCAN, and ARIMA+CPD+DBSCAN. These models combine temporal forecasting, detection of structural shifts, and density-based clustering to identify both sudden and subtle deviations in energy consumption behavior. Simulations on a public smart home dataset from Kaggle show that the ARIMA+CPD+DBSCAN model outperforms the others, achieving a precision of 0.96, a recall of 0.89, an F1-score of 0.90, and an accuracy of 0.98, which demonstrates the advantage of integrating statistical and clustering-based methods for robust outlier detection in smart home energy data.

Keywords: Change Point Detection; Data Anomalies; Energy Consumption; Machine Learning; Outliers; Smart Homes
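
The CPD+DBSCAN combination named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation: the ARIMA stage is omitted, a least-squares single split stands in for whichever CPD algorithm the authors use, and the function names, the `eps` and `min_samples` values, and the toy hourly readings are all hypothetical. The idea shown is only that segmenting the series first lets the density test judge each reading against its own consumption regime.

```python
from statistics import mean

def best_split(values, min_size=3):
    """Stand-in CPD (assumption): the single index k that minimises the
    summed squared error of values[:k] and values[k:] around their means."""
    def sse(seg):
        m = mean(seg)
        return sum((v - m) ** 2 for v in seg)
    candidates = range(min_size, len(values) - min_size + 1)
    return min(candidates, key=lambda k: sse(values[:k]) + sse(values[k:]))

def dbscan_noise(values, eps, min_samples):
    """Indices that 1-D DBSCAN would label as noise: points that are not
    core points and have no core point within eps."""
    n = len(values)
    neighbours = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                  for i in range(n)]
    # A core point has at least min_samples neighbours (itself included).
    core = {i for i in range(n) if len(neighbours[i]) >= min_samples}
    return [i for i in range(n)
            if i not in core and not any(j in core for j in neighbours[i])]

def detect_outliers(values, eps=0.3, min_samples=3):
    """Split at the change point, then run the density test per segment."""
    k = best_split(values)
    flagged = []
    for start, seg in ((0, values[:k]), (k, values[k:])):
        flagged += [start + i for i in dbscan_noise(seg, eps, min_samples)]
    return k, flagged

# Hypothetical hourly readings (kWh): a low regime with a spike at index 4,
# then a high regime with a dip at index 12.
hourly_kwh = [1.0, 1.1, 0.9, 1.0, 4.0, 1.05, 0.95, 1.0,
              3.0, 3.1, 2.9, 3.0, 0.2, 3.05, 2.95, 3.0]
change_point, outliers = detect_outliers(hourly_kwh)
print(change_point, outliers)  # 8 [4, 12]
```

On this toy series the split falls at index 8, and the spike and the dip are each flagged as noise within their own segment. A real pipeline would need multiple change points, tuned DBSCAN parameters, and the forecasting stage the abstract describes.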