Enhancing Stock Price Prediction Using Stacked Long Short-Term Memory

ABSTRACT


INTRODUCTION
Predicting stock prices is important in financial markets because of its wide-ranging applications. Accurate stock price predictions can help investors, traders, and financial institutions make informed decisions regarding portfolio management, risk assessment, and strategic investment planning [1]. Stock price prediction is also crucial in market surveillance and regulatory compliance, enabling authorities to monitor and prevent market manipulation and insider trading [2]. Furthermore, accurate forecasts of stock prices contribute to the stability and efficiency of financial markets by facilitating the identification of mispriced assets, improving market liquidity, and enhancing overall market transparency [3]. In this context, stock price prediction models serve as valuable tools for market participants and financial institutions, helping them navigate the complexities of the market and optimize their investment strategies.
This research aims to develop an effective stock price prediction model that can help investors and financial practitioners make informed decisions in dynamic and volatile financial markets [4]. Accurate predictions of stock prices are of great significance because they provide valuable insights into market trends, risk assessment, and investment strategies [2]. This study chose Stacked Long Short-Term Memory (LSTM) as the predictive model due to its proven ability to capture and learn complex temporal dependencies in sequential data [5]. LSTM networks, with their memory cells and gates, excel at handling long-term dependencies, making them well-suited for modeling and predicting time series data such as stock prices [6]. Furthermore, stacking multiple LSTM layers allows more abstract, higher-level features to be extracted from the data, potentially enhancing the model's predictive capabilities [7]. By utilizing Stacked LSTM, this research aims to leverage the strengths of this architecture to improve the accuracy and reliability of stock price predictions, thereby providing valuable insights for investment decision-making.
Several studies have highlighted the potential of LSTM for stock price prediction. Costa et al. [8] introduced a methodology that leverages the iRace and NSGA-II algorithms to define and optimize an LSTM architecture for predicting stock market prices. The proposed method achieves Mean Absolute Percentage Error (MAPE) values of 1.279%, 1.564%, and 2.047% for BVSP, IBM, and AAPL stocks, respectively. Similarly, Bairagi et al. [9] focused on utilizing LSTM neural networks to forecast the prices of selected stocks and develop a user-friendly stock analysis dashboard. Meanwhile, Wu et al. [10] proposed a framework that combines a Convolutional Neural Network (CNN) and LSTM to enhance stock price prediction accuracy. The suggested algorithm demonstrates superior performance compared to previous methods. These studies show the effectiveness and potential of LSTM-based approaches for stock price prediction, showcasing their ability to achieve accurate forecasts and provide valuable tools for investors and analysts.
The literature demonstrates the advantages of LSTM in stock prediction. Chang et al. [11] conducted research highlighting the enhanced prediction accuracy and efficiency achieved through LSTM-based models. Similarly, Yadav et al. [12] focused on optimizing LSTM for time series prediction, specifically in the Indian stock market, revealing that meticulous hyperparameter selection can improve performance. In another study, Li et al. [13] proposed DP-LSTM, a deep neural network that integrates financial news and differential privacy mechanisms to enhance prediction accuracy. Furthermore, Qian and Chen [14] compared LSTM with ARIMA and determined that LSTM exhibited superior prediction accuracy while being less sensitive to stability response. Collectively, these papers provide substantial evidence supporting LSTM as a promising approach for stock prediction, emphasizing its potential for accurate and efficient forecasting.
The literature provides insights into the limitations of LSTM for stock prediction, highlighting the potential effectiveness of combining LSTM with other techniques. Zou and Qu [15] discovered that Attention-LSTM outperformed other models regarding prediction error and return in their trading strategy; however, stacked LSTM did not exhibit improved predictive power over LSTM. Ding and Qing [16] proposed an associated deep recurrent neural network model based on LSTM, which demonstrated superior accuracy in predicting multiple values simultaneously. Liu et al. [17] utilized LSTM to filter, extract feature values, and analyze stock data, establishing a prediction model for stock transactions. Furthermore, Althelaya et al. [18] evaluated and compared LSTM deep learning architectures for short- and long-term financial time series prediction, finding that LSTM supports time steps of arbitrary sizes without encountering the vanishing gradient problem. Nonetheless, it is worth noting that stock prediction remains one of the most challenging real-world applications for time-series prediction. Collectively, these papers indicate that while LSTM can be effective for stock prediction, it may benefit from incorporating additional techniques to enhance its accuracy.
The literature suggests that stacked LSTM is a promising deep learning model for stock prediction. Jaswanth and Kaushik [19] and Zhang et al. [20] examined stacked LSTM for stock prediction and demonstrated enhanced predictive power and generalization capabilities. Uddin et al. [21] proposed an integrated solution that combines an extended LSTM model with a multivariate feature correlation approach, showing promising potential in stock prediction. Banyal et al. [22] similarly employed stacked LSTM for predicting the Indian stock market and reported superior performance compared to contemporary approaches. Collectively, these papers highlight the viability of stacked LSTM as an effective option for stock prediction, showcasing its potential to improve forecasting accuracy and overall predictive capabilities.
The dataset utilized in this research is sourced from Yahoo Finance and comprises stock price data from the top 10 stocks listed on the Indonesia Stock Exchange (IDX). The selected stocks represent prominent companies traded on the IDX, reflecting the dynamics of the Indonesian stock market. The dataset covers the period from July 6, 2015 to October 14, 2021, encompassing just over six years of stock price information. This duration allows for an analysis of stock price patterns, trends, and volatility over an extended period, facilitating the evaluation and validation of the proposed stock price prediction model. By utilizing this dataset, the study aims to gain insights into the effectiveness of the Stacked LSTM model for predicting stock prices in the context of the Indonesia Stock Exchange.
This research employs the Stacked LSTM model to facilitate stock price predictions. The LSTM architecture is designed to capture and learn complex temporal dependencies in sequential data, making it well-suited for modeling stock price movements. The trained model is then used to predict future stock prices based on input features such as historical prices, trading volumes, and other relevant financial indicators. The predicted stock prices are compared with the actual stock prices using several evaluation metrics, including Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and R-squared (R2). The 50 days of actual and predicted stock prices are plotted on a graph, allowing a side-by-side comparison that represents the model's performance visually. This visualization enables a qualitative assessment of the model's ability to capture the underlying trends and patterns in the stock price movements and provides insights into the accuracy and reliability of the predictions made by the Stacked LSTM model.

RESEARCH METHOD

Stacked Long Short-Term Memory
The Stacked LSTM architecture is an extension of the single-layer LSTM that aims to capture more complex temporal dependencies and improve the modeling capabilities for stock price prediction [23]. Unlike a single-layer LSTM, which consists of one layer of LSTM cells, the stacked LSTM incorporates multiple layers of LSTM cells stacked on top of each other.
In a stacked LSTM, each layer except the last one produces a sequence of hidden states that serves as input to the next layer [24]. The output of the last layer is typically used for prediction or further downstream tasks. This layer-by-layer structure allows the stacked LSTM to capture hierarchies of information and learn representations at different levels of abstraction.
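The computation within each LSTM layer can be described in Equations 1-6.

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{1}$$
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{2}$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \tag{3}$$
$$\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c) \tag{4}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{5}$$
$$h_t = o_t \odot \tanh(c_t) \tag{6}$$

Here, $x_t$ represents the input at time step $t$, $h_{t-1}$ denotes the hidden state of the previous time step, $\sigma$ represents the sigmoid activation function, $\tanh$ denotes the hyperbolic tangent function, and $\odot$ represents element-wise multiplication. $f_t$, $i_t$, and $o_t$ are the forget gate, input gate, and output gate vectors, respectively. $\tilde{c}_t$ represents the candidate cell state, $c_t$ is the cell state, and $h_t$ denotes the hidden state at time step $t$. $W$ and $b$ denote the weight matrix and bias vector of the corresponding gate.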
By stacking multiple LSTM layers, the stacked LSTM architecture can capture more complex patterns and long-term dependencies in the input sequence, potentially improving the accuracy and performance of stock price prediction models.
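The layer-by-layer flow described in this section can be sketched in NumPy. This is a minimal illustration of the standard LSTM formulation with randomly initialized weights, not the trained model from this study; the function names (`init_layer`, `lstm_layer`, `stacked_lstm`), the hidden size of 8, and the weight scale are assumptions chosen only to keep the example small.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layer(input_dim, hidden_dim, rng):
    """Random weights for one LSTM layer (illustrative values only).
    The four gate blocks (forget, input, output, candidate) are packed row-wise."""
    W = rng.normal(0.0, 0.1, size=(4 * hidden_dim, hidden_dim + input_dim))
    b = np.zeros(4 * hidden_dim)
    return W, b

def lstm_layer(inputs, W, b):
    """Run one LSTM layer over a (timesteps, input_dim) sequence.
    Returns the hidden state at every time step: (timesteps, hidden_dim)."""
    hidden_dim = b.shape[0] // 4
    h = np.zeros(hidden_dim)  # hidden state h_{t-1}
    c = np.zeros(hidden_dim)  # cell state c_{t-1}
    outputs = []
    for x_t in inputs:
        z = W @ np.concatenate([h, x_t]) + b
        f = sigmoid(z[:hidden_dim])                      # forget gate
        i = sigmoid(z[hidden_dim:2 * hidden_dim])        # input gate
        o = sigmoid(z[2 * hidden_dim:3 * hidden_dim])    # output gate
        c_tilde = np.tanh(z[3 * hidden_dim:])            # candidate cell state
        c = f * c + i * c_tilde                          # new cell state
        h = o * np.tanh(c)                               # new hidden state
        outputs.append(h)
    return np.stack(outputs)

def stacked_lstm(inputs, layers):
    """Feed each layer's full hidden-state sequence into the next layer
    and return the last hidden state of the top layer."""
    seq = inputs
    for W, b in layers:
        seq = lstm_layer(seq, W, b)
    return seq[-1]

rng = np.random.default_rng(0)
layers = [init_layer(1, 8, rng), init_layer(8, 8, rng)]  # two stacked layers
window = rng.random((50, 1))  # 50 time steps, 1 feature
top_state = stacked_lstm(window, layers)
```

Every layer except the last hands its whole sequence of hidden states to the layer above, while only the final hidden state of the top layer is passed on for prediction.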

Dataset
The dataset utilized in this research is sourced from Yahoo Finance and encompasses the top 10 stocks listed on the IDX from July 6, 2015 to October 14, 2021. These stocks span various sectors and industries, giving a broad representation of the Indonesian stock market. The stocks, their ticker symbols, and their industry classifications are shown in Table 1. The dataset spans just over six years, allowing for the exploration of long-term trends and patterns in the stock prices of these companies. By utilizing this dataset, the research aims to develop and evaluate a stock price prediction model using the Stacked LSTM approach, enabling insights into the performance and accuracy of the model for predicting the stock prices of these selected companies listed on the Indonesia Stock Exchange.

Data Preprocessing
Data preprocessing is crucial in preparing the dataset for effective analysis and modeling. In stock price prediction, several preprocessing steps are commonly applied to ensure the data's quality and suitability for further analysis. The following steps outline the data preprocessing techniques utilized in this research:
1. Volume Filtering: A volume filter ensures that only meaningful data is considered. This step excludes data points where the volume is zero or negative, as such values are typically regarded as invalid or erroneous.
2. Handling Missing Values: Missing values can disrupt the analysis process and affect the model's performance. Thus, any remaining missing values in the dataset are addressed by dropping the corresponding data points or employing imputation methods.
3. Normalization: Normalization is employed to scale the numerical features within a specific range, facilitating fair comparisons between features. This research uses the Min-Max scaler to normalize the stock price data. The Min-Max normalization equation is defined as follows:

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{7}$$

where $x'$ is the normalized value, $x$ is the original value, $\min(x)$ is the minimum value of the feature, and $\max(x)$ is the maximum value of the feature.
4. Selecting Close Price: For this analysis, only the close price of the stocks is considered. The close price represents the price at which a stock ends trading on a particular day. The analysis can concentrate on the stock's performance during trading hours by focusing solely on the close price.
The dataset is refined and prepared for subsequent analysis and model development by applying these preprocessing steps.These steps ensure that the data is in a suitable format and facilitates the extraction of meaningful patterns and trends for stock price prediction.
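The four preprocessing steps can be sketched in plain Python as follows. The row layout (dictionaries with `close` and `volume` keys) and the function name `preprocess` are hypothetical; the source does not describe the data structures used. Missing values are dropped here rather than imputed, and the close price is selected before scaling because it is the only feature that is normalized. The source does not state whether the scaler is fitted on the full series or only on the training portion; this sketch scales whatever series it is given.

```python
import math

def preprocess(rows):
    """Return the Min-Max normalized close prices of the valid rows.

    rows: list of dicts with 'close' and 'volume' keys (hypothetical layout).
    """
    # Step 1: volume filtering, keep only rows with a positive volume.
    kept = [r for r in rows if r["volume"] is not None and r["volume"] > 0]

    # Step 2: handle missing values by dropping rows without a close price.
    kept = [
        r for r in kept
        if r["close"] is not None and not math.isnan(r["close"])
    ]

    # Step 4: select the close price as the single feature.
    closes = [float(r["close"]) for r in kept]

    # Step 3: Min-Max normalization, x' = (x - min) / (max - min).
    lo, hi = min(closes), max(closes)
    span = hi - lo
    # A constant series has zero span; map it to 0.0 to avoid dividing by zero.
    return [(c - lo) / span if span else 0.0 for c in closes]

sample = [
    {"close": 100.0, "volume": 10},
    {"close": 150.0, "volume": 0},    # removed by the volume filter
    {"close": None, "volume": 5},     # removed as a missing value
    {"close": 200.0, "volume": 7},
    {"close": 150.0, "volume": 3},
]
scaled = preprocess(sample)
```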

Data Splitting
The train-test split methodology is a fundamental practice in machine learning and predictive modeling for evaluating the performance of a model on unseen data. In stock price prediction, it is crucial to balance training and evaluation data appropriately to ensure the model's effectiveness and generalizability.
This research uses 1269 trading days from July 6, 2015 to October 14, 2021 for analysis. The dataset is divided into three subsets: training, validation, and test.
The training set consists of 1169 trading days, the large majority of the data. This large training set allows the model to learn from many historical price patterns and trends, which helps it capture the underlying relationships in the price series.
The test set comprises 50 trading days, a much smaller portion of the dataset. The purpose of the test set is to assess the model's performance on unseen data, simulating real-world scenarios. Evaluating the model on a separate test set provides insights into its ability to generalize and make accurate predictions under new, unseen market conditions.
Additionally, a validation set of 50 trading days is held out. The validation set is typically used to fine-tune the model's hyperparameters and optimize its performance. By evaluating the model's performance on the validation set, adjustments can be made to enhance its predictive capabilities.
Overall, this split, with 1169 trading days for training, 50 trading days for testing, and 50 trading days for validation, provides a reasonable data allocation for model training, evaluation, and fine-tuning. It enables the model to learn from a substantial historical context while reserving separate data to assess its performance on unseen market conditions and optimize its parameters.
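A chronological split with these sizes might look like the sketch below. The ordering of the subsets (training first, then validation, then the most recent 50 days as the test set) is an assumption; the source gives only the subset sizes. The function name `chronological_split` is hypothetical.

```python
def chronological_split(series, n_val=50, n_test=50):
    """Split a time-ordered sequence into train, validation, and test parts
    without shuffling, so later observations never leak into earlier subsets."""
    n_train = len(series) - n_val - n_test
    if n_train <= 0:
        raise ValueError("series is too short for the requested split")
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

# Stand-in for 1269 trading days of (already preprocessed) close prices.
prices = list(range(1269))
train, val, test = chronological_split(prices)
```

Keeping the split in time order matters for price series: a random shuffle would let the model train on days that come after the ones it is evaluated on.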

Model Training Process
The model training process utilizes a stacked LSTM architecture to predict the next day's stock price based on the previous 50 days' data. The model's hyperparameters, architecture, optimization algorithm, and loss function are outlined as follows:
1. Hyperparameters:
a. n_steps: The number of previous days considered as input to predict the next day's stock price. In this case, n_steps is set to 50, indicating that the model utilizes the past 50 days' data as input.
b. n_features: The number of features or variables used to predict the stock price. Here, n_features is set to 1, indicating that only the stock price is considered as the input feature.
The model is fitted to the data and trained for 100 epochs using the Adam optimizer and the MSE loss function. By specifying the model architecture, hyperparameters, optimization algorithm, and loss function, the model is trained to learn the patterns and relationships in the historical stock price data, making predictions for future stock prices based on the input sequence of the previous 50 days.
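To make the roles of n_steps and n_features concrete, the following NumPy sketch turns a one-dimensional price series into supervised samples: each input is a window of 50 consecutive values and each target is the value on the following day. The helper name `make_windows` is hypothetical, and the toy series stands in for the normalized close prices.

```python
import numpy as np

def make_windows(series, n_steps=50):
    """Build (samples, n_steps, 1) inputs and next-day targets from a 1-D series."""
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for start in range(len(series) - n_steps):
        X.append(series[start:start + n_steps])  # previous n_steps days
        y.append(series[start + n_steps])        # the day to predict
    # Trailing axis of size 1 corresponds to n_features = 1 (close price only).
    X = np.array(X).reshape(-1, n_steps, 1)
    return X, np.array(y)

# Toy series of 60 values: yields 10 windows of length 50.
X, y = make_windows(np.arange(60.0))
```

A series of length N yields N - n_steps samples, so the first 50 days of any subset serve only as input context and never as targets.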

Evaluation Metrics
Several evaluation metrics are commonly used to evaluate the performance of the model in predicting stock prices, including Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and R-squared (R2). These metrics provide insights into the model predictions' accuracy, precision, and overall goodness of fit.

Root Mean Square Error (RMSE)
The RMSE measures the average deviation between the predicted and actual values. It is calculated by taking the square root of the average of the squared differences between the predicted and actual values. The formula for RMSE is given by Equation 8.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{8}$$

where $n$ represents the total number of predictions, $y_i$ denotes the actual value, and $\hat{y}_i$ represents the predicted value.

Mean Absolute Error (MAE)
The MAE measures the average absolute difference between the predicted and actual values. It is calculated by averaging the absolute differences between the predicted and actual values. The formula for MAE is given by Equation 9.

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{9}$$

where $n$ represents the total number of predictions, $y_i$ denotes the actual value, and $\hat{y}_i$ represents the predicted value.

Mean Absolute Percentage Error (MAPE)
The MAPE measures the average percentage difference between the predicted and actual values, providing a relative measure of the prediction error. It is calculated by averaging the absolute percentage differences between the predicted and actual values. The formula for MAPE is given by Equation 10.

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \tag{10}$$

where $n$ represents the total number of predictions, $y_i$ denotes the actual value, and $\hat{y}_i$ represents the predicted value.

R-squared (R2)
R-squared is a statistical measure that indicates the proportion of the variance in the dependent variable (actual values) that can be explained by the independent variable (predicted values). It represents the goodness-of-fit of the model. The formula for R2 is given by Equation 11.

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{11}$$

where $n$ represents the total number of predictions, $y_i$ denotes the actual value, $\hat{y}_i$ represents the predicted value, and $\bar{y}$ represents the mean of the actual values.
By calculating and analyzing these evaluation metrics, we can assess the accuracy, precision, and goodness-of-fit of the model's predictions and determine its effectiveness in predicting stock prices.
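The four metrics can be computed together as in the NumPy sketch below. The function name `regression_metrics` and the toy values are illustrative only. MAPE is returned as a fraction (multiply by 100 for a percentage); whether the values reported in this study are fractions or percentages is not stated, so that scaling is an assumption of the sketch.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (as a fraction), and R-squared for two sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    # Undefined when an actual value is zero; such points must be handled upstream.
    mape = np.mean(np.abs(err / y_true))
    ss_res = np.sum(err ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": float(rmse), "mae": float(mae),
            "mape": float(mape), "r2": float(r2)}

# Toy example with four actual and four predicted values.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Note that MAPE becomes unstable when actual values are close to zero, which can happen for Min-Max normalized prices near the bottom of their range.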

RESULTS AND ANALYSIS

Performance Metrics
This study utilized the Stacked LSTM model to predict the stock prices for the 50 trading days of the test set. The model's performance was evaluated using RMSE, MAE, MAPE, and R2, as shown in Table 2. These metrics provide insights into the accuracy and reliability of the predictions. The model produced low RMSE and MAE values, indicating small errors between the actual and predicted values. The MAPE values showed a reasonable percentage of error in the predictions, while the R2 values indicated that the model explains a high proportion of the variance. Figures 1-10 plot the actual and predicted prices over these 50 days for each stock, comparing the predicted (red line) and observed (blue line) prices and highlighting any significant trends or deviations. The figures provide a visual confirmation of the model's ability to capture the general patterns and movements in the stock prices, further supporting the effectiveness of the Stacked LSTM model in predicting future stock price trends.

Performance Analysis
The RMSE values range from 0.004 to 0.013, indicating relatively low errors in the predicted stock prices. Lower RMSE values suggest better accuracy and closer alignment between the predicted and actual prices.
The MAE values range from 0.004 to 0.012, further confirming the model's ability to make accurate predictions. The MAE measures the average magnitude of the errors, and lower values indicate a closer fit to the actual prices.
The MAPE values range from 0.009 to 0.074, representing the average percentage difference between the predicted and actual prices. Lower MAPE values indicate a smaller deviation and better precision in the predictions.
The R2 values range from 0.907 to 0.991, indicating the proportion of the variance in the actual prices that the model's predictions can explain. Higher R2 values signify a stronger correlation between the predicted and actual prices.
Overall, the Stacked LSTM model demonstrates good performance across the evaluated metrics, with low errors (RMSE and MAE), small percentage deviations (MAPE), and high explanatory power (R2). This performance suggests that the model effectively captures the stock price patterns and makes accurate predictions for the selected stocks in the given dataset.

Strengths and Limitations
The Stacked LSTM model has several strengths in capturing stock price patterns and making accurate predictions. One strength is its ability to capture long-term dependencies and complex temporal relationships in stock market data. The stacked architecture, consisting of multiple LSTM layers, allows the model to learn and represent intricate patterns that may exist in the historical stock price data. This makes the model suitable for capturing trends, seasonality, and other non-linear patterns commonly observed in stock markets.
Another strength is the model's capability to handle sequential data. The LSTM architecture is designed to process and analyze sequential information, which makes it well-suited for time series data such as stock prices. The model can learn from the past and predict future prices by considering a sequence of historical prices as input. The Stacked LSTM model can also be trained on large datasets, which helps it generalize to unseen data. By utilizing a substantial amount of historical stock price data, the model can learn robust representations and generalize its predictions to different market conditions. This is particularly advantageous given the dynamic and volatile nature of the stock market.
However, the Stacked LSTM model also has limitations that need to be considered. One limitation is the assumption that historical patterns will repeat in the future. While the model can capture and learn from historical data, it may not fully account for unforeseen events or sudden changes in market dynamics that can significantly impact stock prices. External factors such as economic news, political events, or global market trends are not explicitly incorporated into the model, which may limit its ability to capture the full range of influences on stock prices. Another limitation is the sensitivity to hyperparameter tuning. The performance of the Stacked LSTM model relies heavily on selecting appropriate hyperparameters, such as the number of LSTM layers, the number of units in each layer, and the sequence length. Finding the optimal set of hyperparameters can be challenging and time-consuming, requiring extensive experimentation and validation. Furthermore, the model's predictions are inherently uncertain because of the volatility and unpredictability of financial markets. While the Stacked LSTM model can provide valuable insights and predictions, it is crucial to interpret its outputs cautiously and treat them as probabilistic estimates rather than deterministic forecasts.
In conclusion, the Stacked LSTM model offers strengths in capturing stock price patterns and making accurate predictions by leveraging its ability to capture long-term dependencies and process sequential data. However, it is important to recognize its limitations in handling unforeseen events, the need for careful hyperparameter tuning, and the inherent uncertainty in financial market predictions. Integrating external factors and continuously monitoring and adapting the model's performance can help mitigate these limitations and enhance its effectiveness.

2. Model Architecture: The model architecture consists of three stacked LSTM layers.
a. The first LSTM layer is defined with 200 units, utilizes the ReLU activation function, and returns sequences to be used by subsequent LSTM layers.
b. The second LSTM layer also has 200 units, employs the ReLU activation function, and returns sequences.
c. The third LSTM layer has 200 units and uses the ReLU activation function.
d. A dense layer with one unit is added as the final output layer.
3. Optimization Algorithm: The Adam optimizer is employed for training the model. Adam is a popular optimization algorithm known for its adaptive learning rate and efficient convergence.
4. Loss Function: The mean squared error (MSE) loss function measures the discrepancy between the predicted and actual stock prices. It calculates the average squared difference between the predicted and actual values.
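As a rough sense of the model's size, the number of trainable parameters implied by the three-layer architecture described above can be worked out in a few lines. The sketch assumes the standard LSTM formulation with four gate blocks and one bias vector per block, and a single input feature; the source does not report a parameter count, so the total is a derived estimate rather than a reported figure. The helper names are hypothetical.

```python
def lstm_params(input_dim, units):
    """Trainable parameters of one LSTM layer: four gate blocks, each with
    weights over the concatenated [input, hidden] vector plus a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """Trainable parameters of a fully connected layer (weights plus biases)."""
    return input_dim * units + units

n_features, units = 1, 200
layer_sizes = [
    lstm_params(n_features, units),  # first LSTM layer sees the raw feature
    lstm_params(units, units),       # second layer sees 200 hidden states
    lstm_params(units, units),       # third layer sees 200 hidden states
    dense_params(units, 1),          # single-unit output layer
]
total_params = sum(layer_sizes)
```

Under these assumptions the layers hold 161,600, 320,800, 320,800, and 201 parameters, about 0.8 million in total, which is large relative to the roughly 1,100 training windows available per stock and is consistent with the hyperparameter sensitivity noted in the limitations.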