Introduction
The rapid evolution of digital economies has diversified investment avenues beyond traditional savings, with assets such as cryptocurrencies gaining prominence. Bitcoin, whose price series is highly nonlinear and non-stationary, presents both opportunities and risks for investors. Accurate price prediction models help investors benefit from volatility while limiting losses. Advances in deep learning, particularly Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNN), have significantly improved predictive analytics in financial markets.
Prior research by Guo Sihan [1] employed improved recurrent neural networks for Bitcoin price forecasting, while Zhang Ning et al. [2] demonstrated that hybrid models combining LSTM with other architectures outperform standalone LSTM models. This study builds on these insights by comparing individual LSTM and CNN models with a novel CNN-LSTM hybrid to evaluate prediction accuracy.
Long Short-Term Memory (LSTM) Networks
LSTM, a specialized Recurrent Neural Network (RNN), addresses the difficulty plain RNNs have in retaining long-range dependencies through gated mechanisms (input, forget, and output gates) that regulate information flow. Key computational steps include:
Forget Gate:
Determines which historical information to retain or discard using a sigmoid-activated layer: $$ f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) $$
Input Gate:
Updates the cell state with new information via sigmoid and tanh layers: $$ i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) $$
$$ \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) $$
$$ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t $$
Output Gate:
Generates the hidden state (short-term memory) output: $$ o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) $$
$$ h_t = o_t \odot \tanh(C_t) $$
Here σ is the sigmoid function and ⊙ denotes element-wise multiplication.
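The gate equations above can be traced in a few lines of plain Python. This is a minimal sketch for illustration, not the model used in the study: the names `lstm_step`, `init_params`, and `affine`, the parameter dictionary layout, and the small random initialization are all assumptions, and the cell-state update uses the standard LSTM form.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W.v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: returns the new hidden state h_t and cell state C_t."""
    z = list(h_prev) + list(x_t)  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    c_hat = [math.tanh(a) for a in affine(params["W_C"], z, params["b_C"])]  # candidate state
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    # Standard cell-state update: C_t = f_t * C_{t-1} + i_t * C~_t (element-wise)
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def init_params(hidden, n_in, seed=0):
    """Hypothetical helper: small random weights and zero biases for the four gates."""
    rng = random.Random(seed)
    params = {}
    for gate in ("f", "i", "C", "o"):
        params["W_" + gate] = [
            [rng.uniform(-0.1, 0.1) for _ in range(hidden + n_in)]
            for _ in range(hidden)
        ]
        params["b_" + gate] = [0.0] * hidden
    return params
```

To process a 10-day window of six indicators, one would start from zero states and call `lstm_step` once per day, feeding each returned `h_t` and `c_t` into the next call.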
LSTM Model Empirical Analysis
Data Selection
- Dataset: 1,826 daily Bitcoin price points (2016–2021) from NASDAQ.
- Features: RSI14, DIFF, DEA, MACD, Up20, Down20 (technical indicators).
- Preprocessing: Sliding window method (10-day window) for time-series generation.
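The sliding window step can be sketched as follows. The function name `make_windows` is hypothetical, and the sketch assumes that each 10-day block of feature rows is paired with the price of the following day; the source does not state the exact target definition.

```python
def make_windows(rows, targets, window=10):
    """Pair each `window`-day block of feature rows with the next day's target."""
    X, y = [], []
    for start in range(len(rows) - window):
        X.append(rows[start:start + window])   # days start .. start+window-1
        y.append(targets[start + window])      # the day right after the block
    return X, y
```

Under that assumption, a series of 1,826 daily points with a 10-day window yields 1,816 training samples.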
Model Architecture
- Layers: 3 LSTM, 3 dense, 5 dropout.
- Metric: Mean Absolute Percentage Error (MAPE = 10.14%).
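For reference, MAPE can be computed with a short helper like the one below. The name `mape` is an assumption for illustration; the sketch relies on actual prices being nonzero, which holds for Bitcoin prices.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

For example, predictions of 110 and 180 against actual prices of 100 and 200 are each off by 10%, so the MAPE is 10%.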
Convolutional Neural Network (CNN)
CNN extracts spatial features through convolutional and pooling layers:
Convolutional Layer:
$$ X^l = f(W^l \otimes X^{l-1} + b^l) $$
Pooling Layer:
Reduces dimensionality while preserving approximate invariance to small shifts in the input.
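A one-dimensional version of the two layer types can be sketched in plain Python. The names `conv1d` and `max_pool1d` are hypothetical, ReLU stands in for the activation f, and non-overlapping max pooling is an assumption, since the source does not specify these choices.

```python
def conv1d(x, kernel, bias=0.0):
    """Valid 1-D cross-correlation followed by ReLU (standing in for f)."""
    k = len(kernel)
    return [
        max(0.0, sum(kernel[j] * x[i + j] for j in range(k)) + bias)
        for i in range(len(x) - k + 1)
    ]


def max_pool1d(x, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]
```

Applied to a price series, a kernel such as `[-1, 0, 1]` responds to rising stretches, and pooling then keeps the strongest response in each pair of positions.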
CNN Model Implementation
- Parameters: 3 convolutional, 2 pooling, 3 dense layers.
- Enhancements: Dilated convolutions and residual learning to prevent overfitting.
- Performance: MAPE = 9.29%, lower than the LSTM model's 10.14%, indicating better capture of short-term price dynamics.
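The two enhancements named above can be illustrated with a small sketch. The function names are hypothetical, and the causal zero padding (each output depends only on current and past inputs) is an assumption chosen so the output length matches the input; the source does not describe how the study configured these layers.

```python
def dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution; left zero padding keeps the length unchanged."""
    k = len(kernel)
    pad = (k - 1) * dilation
    padded = [0.0] * pad + list(x)
    # kernel[-1] weights the current value, kernel[0] the value `pad` steps back
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(x))
    ]


def residual_block(x, kernel, dilation=1):
    """y = x + ReLU(dilated_conv(x)): the skip connection adds the input back."""
    conv = dilated_conv1d(x, kernel, dilation)
    return [xi + max(0.0, ci) for xi, ci in zip(x, conv)]
```

Increasing `dilation` lets a short kernel cover a longer span of days without adding weights, while the skip connection lets the block fall back to passing its input through unchanged.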
CNN-LSTM Hybrid Model
Integration Strategy
- Weighted Fusion: Combines CNN (α = 0.1) and LSTM (β = 0.9) outputs.
Prediction Formula:
$$ \hat{P}_t = \alpha \cdot \hat{P}_{\text{CNN}} + \beta \cdot \hat{P}_{\text{LSTM}} $$
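The fusion formula translates directly into code. This sketch assumes both models have already produced forecasts for the same dates; the function name `fuse` is hypothetical, and the default weights are the α and β values quoted above.

```python
def fuse(p_cnn, p_lstm, alpha=0.1, beta=0.9):
    """Weighted fusion of CNN and LSTM forecast series, element by element."""
    if len(p_cnn) != len(p_lstm):
        raise ValueError("forecast series must have equal length")
    return [alpha * c + beta * l for c, l in zip(p_cnn, p_lstm)]
```

With a CNN forecast of 100 and an LSTM forecast of 200, the fused prediction is 0.1 × 100 + 0.9 × 200 = 190. Because the weights sum to 1, the fused value always lies between the two individual forecasts.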
Results
- MAPE: 4.74% (vs. 8.20% LSTM, 7.09% CNN).
Advantages:
- Mitigates the lag of LSTM predictions behind actual price movements.
- Reduces the CNN's vertical (price-level) error.
FAQs
Q1: Why combine CNN and LSTM for price prediction?
A1: CNN captures spatial patterns (e.g., technical indicators), while LSTM models temporal trends, yielding a more robust hybrid.
Q2: How does dropout improve model performance?
A2: Dropout layers reduce overfitting by randomly deactivating neurons during training, enhancing generalization.
Q3: What metrics evaluate prediction accuracy?
A3: MAPE measures the average absolute deviation between predicted and actual prices, expressed as a percentage of the actual price; lower values indicate higher accuracy.
Q4: Can this model predict other cryptocurrencies?
A4: Yes, but retraining with asset-specific data is recommended due to varying volatility patterns.
Conclusion
The CNN-LSTM hybrid model combines LSTM's sequential modeling with CNN's feature extraction and achieves the lowest error of the three models, with a MAPE of 4.74% (reported as 47% lower than the standalone models). Future work could explore ensemble techniques that integrate additional indicators or architectures.