Energy Consumption Prediction in Data Centres Using Deep RNN Optimised by Ninja Optimisation Algorithm
A forecasting framework using Bidirectional and Stacked DRNN with NiOA-based hyperparameter optimisation.
Introduction
Data centres significantly contribute to global carbon emissions due to continuous computational workload and energy demand. Accurate prediction of energy increments enables proactive carbon optimisation and intelligent resource management.
This project proposes a reproducible deep learning framework for multi-horizon energy increment forecasting in data centres: a Bidirectional Deep Recurrent Neural Network (DRNN) whose hyperparameters are optimised by the Ninja Optimisation Algorithm (NiOA), preserving time-series integrity and full experimental reproducibility.
Literature Review
Problem Statement
Conventional carbon footprint estimation methods are inaccurate and fail to capture the dynamic, time-evolving power consumption patterns of data centres. Despite the availability of advanced deep learning architectures and optimisation frameworks, accurate real-time prediction of carbon emissions remains challenging because of high-dimensional sensor data, complex temporal dynamics, and the difficulty of hyperparameter tuning; manual tuning in particular is inefficient and suboptimal. A real-time, scalable, and accurate predictive system is therefore essential for optimising energy use and minimising carbon emissions in data centres.
Objectives
- Design a Bidirectional DRNN for cumulative energy prediction.
- Optimise hyperparameters using NiOA.
- Implement multi-horizon forecasting (1 sec → 15 min).
- Ensure strict chronological splitting.
- Maintain full reproducibility and benchmarking compatibility.
Proposed Methodology
Dataset
- Source: IEEE DataPort → Data Server Energy Consumption Dataset (August–December 2021)
- Duration: ~5 months continuous monitoring
- Sampling Resolution: 1-second frequency
- Features: Voltage (V), Current (A), Power (W), Energy (kWh), CPU usage, Temperature
Data Preprocessing
- Chronological timestamp sorting
- Missing value imputation (forward fill)
- Outlier removal (Z-score; threshold = 3)
- Train-only StandardScaler
- No target scaling
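A minimal sketch of the preprocessing steps above, assuming a pandas DataFrame indexed by timestamp; the column name `power` and the helper names are illustrative, and the train-only scaler is written by hand (in the project it could equally be scikit-learn's StandardScaler fitted on the training slice only):

```python
import numpy as np
import pandas as pd

def preprocess(df: pd.DataFrame, feature_cols, z_thresh: float = 3.0) -> pd.DataFrame:
    """Chronological sort, forward-fill imputation, and z-score outlier removal."""
    df = df.sort_index()                       # chronological timestamp sorting
    df = df.ffill()                            # missing value imputation (forward fill)
    z = (df[feature_cols] - df[feature_cols].mean()) / df[feature_cols].std()
    return df[(z.abs() < z_thresh).all(axis=1)]  # drop rows with any |z| >= 3

def fit_train_scaler(train: pd.Series):
    # Train-only scaling: statistics come from the training split alone and are
    # reused unchanged on validation/test, avoiding leakage. The target column
    # is deliberately left unscaled.
    mu, sigma = train.mean(), train.std()
    return lambda x: (x - mu) / sigma
```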
Sequence Generation
Sliding-window approach with sequence length = 10 (for the 1-second pipeline) and 120 (for longer horizons) for contextual learning.
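The sliding-window construction can be sketched as follows; the function name is illustrative, and each window of `seq_len` past timesteps is labelled with the target at the next timestep:

```python
import numpy as np

def make_sequences(features: np.ndarray, target: np.ndarray, seq_len: int):
    """Build (samples, seq_len, n_features) windows with aligned targets."""
    # Window i covers timesteps i .. i+seq_len-1; its label is target[i+seq_len].
    X = np.stack([features[i:i + seq_len] for i in range(len(features) - seq_len)])
    y = target[seq_len:]
    return X, y
```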
Splitting Strategy
Train range: 2021-08-05 12:44:50 → 2021-09-12 04:11:57 (70%)
Validation range: 2021-09-12 04:11:58 → 2021-11-26 02:11:39 (15%)
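A minimal sketch of the strict chronological split, assuming a time-sorted DataFrame; the 70/15 fractions match the ranges above, with the remainder held out after the validation range:

```python
import pandas as pd

def chrono_split(df: pd.DataFrame, train_frac: float = 0.70, val_frac: float = 0.15):
    # Strict chronological splitting: no shuffling, so the validation and
    # held-out ranges always lie strictly after the training range in time.
    n = len(df)
    i, j = int(n * train_frac), int(n * (train_frac + val_frac))
    return df.iloc[:i], df.iloc[i:j], df.iloc[j:]
```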
Hyperparameter Optimisation
Ninja Optimisation Algorithm (NiOA) with 6 agents and 6 iterations.
A meta-heuristic approach inspired by ninja hunting strategies, balancing exploration and exploitation for optimal hyperparameter selection. Rapid convergence within the first 3–4 iterations.
Search space:
LSTM Layers: 2-3
Units per Layer: 64-128
Dropout Rate: 0.3-0.6
Optimiser: AdamW
Learning Rate: 5e-5 to 5e-4
Batch Size: 32
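Since the NiOA update rules are not detailed here, the sketch below stands in for them with a generic population-based search over the stated space (6 agents × 6 iterations); the search-space bounds and fixed choices (AdamW, batch size 32) come from the list above, while the function names are illustrative:

```python
import random

SEARCH_SPACE = {
    "lstm_layers": (2, 3),
    "units": (64, 128),
    "dropout": (0.3, 0.6),
    "learning_rate": (5e-5, 5e-4),
}

def sample_agent(rng=random):
    # One candidate hyperparameter set ("agent"); optimiser and batch size are fixed.
    return {
        "lstm_layers": rng.randint(*SEARCH_SPACE["lstm_layers"]),
        "units": rng.randint(*SEARCH_SPACE["units"]),
        "dropout": rng.uniform(*SEARCH_SPACE["dropout"]),
        "learning_rate": rng.uniform(*SEARCH_SPACE["learning_rate"]),
        "optimizer": "AdamW",
        "batch_size": 32,
    }

def optimise(fitness, n_agents=6, n_iters=6):
    # Generic population-based skeleton: evaluate n_agents candidates per
    # iteration and keep the best (lowest validation loss) found overall.
    # NiOA's exploration/exploitation moves would replace the plain resampling.
    best, best_score = None, float("inf")
    for _ in range(n_iters):
        for _ in range(n_agents):
            cand = sample_agent()
            score = fitness(cand)
            if score < best_score:
                best, best_score = cand, score
    return best
```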
Target
ΔEₖ(t) = E(t+k) - E(t)
Enables multi-horizon forecasting (k = 60, 300, 900 seconds) for
comprehensive temporal insights. It also improves signal strength
and reduces noise compared to 1-second ΔE.
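The cumulative-increment target above reduces to a simple shifted difference over the cumulative energy series (here assumed to be a 1 Hz NumPy array, so `k` is in seconds):

```python
import numpy as np

def delta_energy(energy: np.ndarray, k: int) -> np.ndarray:
    # ΔE_k(t) = E(t + k) - E(t): cumulative energy gained over the next
    # k seconds; larger k (60, 300, 900) aggregates away 1-second noise.
    return energy[k:] - energy[:-k]
```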
Model Architecture
Bidirectional LSTM → Stacked LSTM → Dropout → Batch Normalisation → Global Pooling → Dense layers.
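A Keras sketch of that stack, assuming TensorFlow 2.11+ (for `AdamW`); the dense head width, loss, and default arguments are illustrative, while the layer order and the 2-layer/64-unit sizing follow the architecture and best hyperparameters reported below:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(seq_len=120, n_features=17, units=64, dropout=0.42, lr=4.9e-4):
    model = models.Sequential([
        layers.Input(shape=(seq_len, n_features)),
        layers.Bidirectional(layers.LSTM(units, return_sequences=True)),
        layers.LSTM(units, return_sequences=True),   # stacked LSTM
        layers.Dropout(dropout),
        layers.BatchNormalization(),
        layers.GlobalAveragePooling1D(),             # global pooling over time
        layers.Dense(32, activation="relu"),
        layers.Dense(1),                             # ΔE_k regression head
    ])
    model.compile(optimizer=tf.keras.optimizers.AdamW(learning_rate=lr),
                  loss="mae")
    return model
```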
Pipeline Evolution
Version 1.0: 1 second Δenergy prediction → High noise, negative R².
Version 2.0: 1 minute Δenergy prediction → Improved R², but scaling mismatch and leakage issues.
Version 3.0: Multi-horizon cumulative reformulation (k=60, 300, 900; a work in progress).
Sequence length increased from 10 → 120 for contextual learning.
Results
Sequence Shapes
Train sequence shape = (2016429, 120, 17)
Validation sequence shape = (431997, 120, 17)
Best Hyperparameters
LSTM layers = 2
Units = 64
Dropout ≈ 0.418
Optimizer = AdamW
Learning Rate ≈ 4.91e-4
Batch Size= 32
k = 1 second
MAE ≈ 1.224
RMSE ≈ 1.73
R² ≈ 0
High noise, weak signal strength.
k = 60 seconds
MAE ≈ 1.04
RMSE ≈ 2.08
R² ≈ 0
High noise, weak-moderate signal strength.
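The MAE, RMSE, and R² figures above follow their standard definitions; a minimal sketch of how they can be computed from held-out predictions (the function name is illustrative):

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray):
    """Return (MAE, RMSE, R²) for a set of predictions."""
    err = y_pred - y_true
    mae = np.mean(np.abs(err))                        # mean absolute error
    rmse = np.sqrt(np.mean(err ** 2))                 # root mean squared error
    ss_res = np.sum(err ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                        # R² ≈ 0 means no better than the mean
    return mae, rmse, r2
```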
Outcome & Achievements
- Implemented a modular, scalable forecasting framework.
- Initial 1-Second Δenergy Model successfully implemented.
- 1-minute Δenergy Model successfully implemented.
- Reformulated the target for multi-horizon forecasting.
Future Work
- Complete the multi-horizon evaluation (k = 300, 900 seconds); near-normal residuals and stronger R² are expected at these horizons.
- Benchmarking against classical ML models.
- Statistical significance testing.
References
- Hochreiter, S. & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8).
- Recent energy-forecasting research literature
- Ninja Optimisation Algorithm: a metaheuristic approach
Academic Credits
Project Guide
Dr Shishir Singh Chauhan
Student
Anwesha Singh
23FE10CSE00299
Thank You
Questions & Discussion