Stock Price Forecasting with Machine Learning: LSTM Edition

The Quantamental
Oct 28, 2024

In today's fast-paced financial world, predicting stock prices is a crucial skill for investors and traders alike. With advancements in technology, machine learning (ML) has emerged as a powerful tool for making these predictions. Among the various ML models available, Long Short-Term Memory (LSTM) networks stand out for their ability to analyze time series data effectively. This article will guide you through the fundamentals of LSTM and provide a step-by-step approach to implementing it for stock price forecasting, even if you're starting from scratch.

What is LSTM?

LSTM is a type of Recurrent Neural Network (RNN), a family of models designed to handle sequential data, that is, data ordered in time. Traditional RNNs struggle with long sequences because of the vanishing gradient problem: during training, the error signal shrinks as it is propagated back through many time steps, so the network receives almost no feedback about how early inputs affect later outputs and fails to learn long-term dependencies. LSTMs mitigate this problem with an architecture built around memory cells and three types of gates:

Forget Gate: Decides what information should be discarded from the memory.
Input Gate: Determines what new information should be added to the memory.
Output Gate: Controls what information from the memory should be outputted.

This structure allows LSTMs to remember important information over long periods, making them particularly effective for tasks like stock price prediction.
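
To make the gate mechanics concrete, here is a minimal NumPy sketch of a single LSTM time step. It is an illustration of the standard LSTM equations, not the code Keras runs internally; the helper names (sigmoid, lstm_step), the random weights, and the three input values are all made up for the example.

```python
import numpy as np

def sigmoid(z):
    """Squash values into (0, 1) so they can act as gates."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step (illustrative helper, not a library function).

    W, U, and b stack the parameters of the forget gate, input gate,
    candidate values, and output gate, in that order.
    """
    n = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b      # pre-activations, shape (4n,)
    f = sigmoid(z[0:n])               # forget gate: what to discard
    i = sigmoid(z[n:2 * n])           # input gate: what to add
    g = np.tanh(z[2 * n:3 * n])       # candidate values to add
    o = sigmoid(z[3 * n:4 * n])       # output gate: what to expose
    c_t = f * c_prev + i * g          # updated memory cell
    h_t = o * np.tanh(c_t)            # updated hidden state (the output)
    return h_t, c_t

# Random weights and made-up inputs, purely for demonstration
rng = np.random.default_rng(0)
n_hidden, n_input = 4, 1
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_input))
U = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden))
b = np.zeros(4 * n_hidden)

h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for price in [0.10, 0.12, 0.11]:      # three made-up scaled prices
    h, c = lstm_step(np.array([price]), h, c, W, U, b)

print(h.shape, c.shape)
```

The memory cell c is updated additively (old memory times the forget gate, plus new candidate values times the input gate), which is what lets information persist across many time steps.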

Why Use LSTM for Stock Price Forecasting?

Stock prices are influenced by numerous factors over time, including market trends, economic indicators, and company performance. LSTMs are well suited to capturing these temporal patterns because they learn from sequences of past data rather than from isolated observations. This makes them a popular choice for financial forecasting.

Step-by-Step Implementation of LSTM for Stock Price Forecasting

Let's dive into how you can implement an LSTM model to forecast stock prices using Python. We'll use historical stock data, which you can easily obtain from various financial data sources like Yahoo Finance.

Step 1: Install Required Libraries

Before you can run any code, you need to install the necessary libraries. In Python, libraries are collections of pre-written code that help you perform specific tasks without having to write everything from scratch.

pip install numpy pandas matplotlib yfinance tensorflow scikit-learn

numpy: A library for numerical operations.
pandas: A library for data manipulation and analysis, especially with tabular data.
matplotlib: A library for creating visualizations (graphs and charts).
yfinance: A library to fetch financial data from Yahoo Finance.
tensorflow: A powerful library for machine learning and deep learning, which includes Keras for building neural networks.
scikit-learn: A library of machine learning utilities; we use it for data scaling (MinMaxScaler) and for error metrics.

Step 2: Collect Historical Stock Data

For this example, we’ll use the stock price data of Apple Inc. (AAPL). You can download historical data using the yfinance library:

import yfinance as yf

# Download historical stock data
ticker = 'AAPL'
data = yf.download(ticker, start='2018-01-01', end='2023-01-01')
data.to_csv('AAPL_stock_data.csv')

import yfinance as yf: This line imports the yfinance library and gives it a shorter name (yf) for convenience.
ticker = 'AAPL': This variable stores the stock symbol for Apple Inc. (AAPL).
yf.download(...): This function fetches historical stock data for the specified ticker between the start and end dates.
data.to_csv('AAPL_stock_data.csv'): This saves the downloaded data into a CSV file named AAPL_stock_data.csv so you can access it later.

Step 3: Data Preprocessing

Before we can use the data to train our model, we need to preprocess it. This involves normalizing the prices and creating sequences of data.

import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Load data
data = pd.read_csv('AAPL_stock_data.csv')
prices = data['Close'].values.reshape(-1, 1)

import pandas as pd and import numpy as np: Importing libraries that help with data manipulation and numerical operations.
data = pd.read_csv('AAPL_stock_data.csv'): Loads the CSV file we created earlier into a DataFrame (a table-like structure).
prices = data['Close'].values.reshape(-1, 1): Extracts the closing prices from the DataFrame and reshapes them into a 2D array (required for normalization).

Normalization

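Normalization rescales the prices to the range 0 to 1. Neural networks train more reliably on inputs of a small, consistent scale, because large raw values (such as prices in the hundreds of dollars) can produce large gradients and slow or unstable convergence.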

# Normalize the data
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_prices = scaler.fit_transform(prices)

scaler = MinMaxScaler(feature_range=(0, 1)): Creates a scaler object that will normalize values to a range between 0 and 1.
scaled_prices = scaler.fit_transform(prices): Fits the scaler to our price data and transforms it, effectively normalizing it.
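
As a quick worked example of the arithmetic behind min-max scaling, the sketch below applies the formula (x - min) / (max - min) by hand with NumPy and then reverses it. The three prices are made up for illustration.

```python
import numpy as np

# Three made-up closing prices, shaped as a single column
prices = np.array([[100.0], [150.0], [200.0]])

p_min, p_max = prices.min(), prices.max()

# Forward transform: map [min, max] onto [0, 1]
scaled = (prices - p_min) / (p_max - p_min)

# Inverse transform: map [0, 1] back to the original price scale
restored = scaled * (p_max - p_min) + p_min

print(scaled.ravel())    # values 0, 0.5 and 1
print(restored.ravel())  # the original prices again
```

The inverse step is the same operation used later to turn the model's scaled output back into a dollar amount.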

Creating Sequences

LSTMs require input in sequences. We will create sequences of past prices to predict future prices.

def create_dataset(data, time_step=1):
    x, y = [], []
    for i in range(len(data) - time_step):
        x.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(x), np.array(y)

time_step = 60  # Use past 60 days to predict the next day
X, y = create_dataset(scaled_prices, time_step)
X = X.reshape(X.shape[0], X.shape[1], 1)  # Reshape for LSTM input

def create_dataset(data, time_step=1): Defines a function that creates sequences from our normalized price data.
Inside this function: We initialize two empty lists, x (for input) and y (for output). The loop iterates through the dataset to create sequences of length time_step (60 days in this case). For each sequence in x, we store the corresponding next price in y.

After calling this function:

X will contain sequences of past prices.
y will contain future prices corresponding to those sequences.

Finally:

We reshape X to have three dimensions because LSTM expects input in this format: [samples, time steps, features].
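
To see what the windowing produces, here is a small sketch on a toy series of ten numbers with a window of three. The helper name make_windows is hypothetical and exists only for this illustration; it keeps every window that still has a following value to use as the target.

```python
import numpy as np

def make_windows(series, window):
    """Slide a fixed-length window over a column of values.

    Each input is `window` consecutive values; the target is the
    value that immediately follows them.
    """
    x, y = [], []
    for i in range(len(series) - window):
        x.append(series[i:i + window, 0])
        y.append(series[i + window, 0])
    return np.array(x), np.array(y)

toy = np.arange(10, dtype=float).reshape(-1, 1)   # 0.0, 1.0, ..., 9.0
X_toy, y_toy = make_windows(toy, window=3)

# Add the trailing "features" axis: [samples, time steps, features]
X_toy = X_toy.reshape(X_toy.shape[0], X_toy.shape[1], 1)

print(X_toy.shape)        # (7, 3, 1)
print(X_toy[0].ravel())   # first window holds 0, 1, 2
print(y_toy[0])           # 3.0, the value right after the first window
```

With real data the idea is the same, only with 60 time steps per window instead of 3.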

Step 4: Building the LSTM Model

Now we’ll build our LSTM model using Keras.

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout

# Build the LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(X.shape[1], 1)))
model.add(Dropout(0.2))  # Regularization to prevent overfitting
model.add(LSTM(units=50))
model.add(Dropout(0.2))
model.add(Dense(units=1))  # Output layer

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

Explanation:

from keras.models import Sequential: Imports a model type that allows us to build layers sequentially.
from keras.layers import LSTM, Dense, Dropout: Imports necessary layers for our model: LSTM: The core layer that processes sequences. Dense: A standard layer used for output. Dropout: A technique used to prevent overfitting by randomly ignoring some neurons during training.

Building the model:

We start with an empty sequential model (model = Sequential()).
We add an LSTM layer with: units=50: The number of memory units in this layer. return_sequences=True: Indicates that this layer will return sequences (needed when stacking LSTMs).

We then add dropout layers (20% dropout) after each LSTM layer to help prevent overfitting.

Finally:

The last layer (Dense(units=1)) has one unit because we want a single output (the predicted price).

Compiling:

We compile our model with an optimizer (adam) and a loss function (mean_squared_error) which measures how well our predictions match actual values.
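
To get a feel for the size of this network, the sketch below counts its trainable parameters by hand. It assumes the standard LSTM layout (four weight blocks per layer, each with input weights, recurrent weights, and a bias) and Keras defaults such as bias terms being enabled; the helper names are made up for the example. The totals should agree with what model.summary() reports, but treat that as something to verify rather than a guarantee.

```python
def lstm_params(units, input_dim):
    """Trainable parameters in one LSTM layer (illustrative helper).

    Four blocks (forget, input, candidate, output), each with
    input weights, recurrent weights, and a bias vector.
    """
    return 4 * (units * input_dim + units * units + units)

def dense_params(units, input_dim):
    """Trainable parameters in a fully connected layer: weights plus biases."""
    return units * input_dim + units

first_lstm = lstm_params(units=50, input_dim=1)     # input: 1 feature per time step
second_lstm = lstm_params(units=50, input_dim=50)   # input: 50 outputs of the first LSTM
output_layer = dense_params(units=1, input_dim=50)  # single predicted value

# Dropout layers have no trainable parameters
total = first_lstm + second_lstm + output_layer
print(first_lstm, second_lstm, output_layer, total)  # 10400 20200 51 30651
```

Roughly 30,000 parameters is small by deep learning standards, but it is still large relative to about five years of daily prices, which is one reason the dropout layers matter.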

Step 5: Training the Model

Now we’ll train our model on the dataset:

model.fit(X, y, epochs=100, batch_size=32)

Explanation:

model.fit(X, y): This trains our model using input sequences X and their corresponding outputs y.
epochs=100: The number of times we go through the entire dataset during training. More epochs can lead to better learning but may also cause overfitting.
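batch_size=32: The number of samples processed before each update of the model weights. Smaller batches produce noisier updates and more update steps per epoch, so an epoch usually takes longer; larger batches give smoother gradient estimates but use more memory.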
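
Because more epochs can lead to overfitting, it helps to hold back the most recent part of the data and watch how the model does on it. The sketch below shows one possible chronological split; the helper name chronological_split, the 80/20 ratio, and the demo arrays are assumptions made for illustration. Time series should be split in order rather than shuffled, so the model is never validated on days that precede its training data.

```python
import numpy as np

def chronological_split(X, y, train_fraction=0.8):
    """Split sequences in time order: earliest part for training,
    most recent part for validation (illustrative helper)."""
    split = int(len(X) * train_fraction)
    return X[:split], y[:split], X[split:], y[split:]

# Stand-in arrays with the same layout as the real X and y
X_demo = np.arange(100, dtype=float).reshape(100, 1, 1)
y_demo = np.arange(100, dtype=float)

X_train, y_train, X_val, y_val = chronological_split(X_demo, y_demo)
print(X_train.shape, X_val.shape)   # (80, 1, 1) (20, 1, 1)
```

One way to use the split would be to train on X_train and y_train and pass validation_data=(X_val, y_val) to model.fit, so that Keras reports a validation loss after each epoch. A validation loss that rises while the training loss keeps falling is a typical sign of overfitting.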

Step 6: Making Predictions

After training is complete, we can use our model to make predictions:

# Prepare test data (the last 'time_step' days of historical data)
test_data = scaled_prices[-time_step:]
test_data = test_data.reshape((1, time_step, 1))

# Predict future price
predicted_price = model.predict(test_data)
predicted_price = scaler.inverse_transform(predicted_price)  # Rescale back to original prices

print(f'Predicted Price for Tomorrow: {predicted_price[0][0]}')

Explanation:

First:

We prepare test data by taking the last time_step days from our scaled prices.

Then:

We reshape it into three dimensions suitable for LSTM input.

We use model.predict(test_data) to get predictions based on our trained model.

Finally:

We rescale these predicted values back to their original scale using scaler.inverse_transform(predicted_price) so they are understandable (in actual dollar amounts).

The last line prints out tomorrow's predicted stock price.

Step 7: Evaluating Model Performance

To evaluate how well our model performed:

from sklearn.metrics import mean_squared_error

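# Assuming you have actual prices for comparison in 'actual_prices'
actual_prices = ... # Load actual future prices here
# Flatten the model output from Step 6 so it lines up with actual_prices
predicted_prices = predicted_price.flatten()
rmse = np.sqrt(mean_squared_error(actual_prices, predicted_prices))
print(f'Root Mean Squared Error: {rmse}')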

Explanation:

Here’s what’s happening:

We import a metric function called mean_squared_error from sklearn.metrics.

You would load actual future prices into actual_prices (this part needs your actual future values).

Then:

We calculate RMSE (Root Mean Squared Error), which tells us how far off our predictions are from actual values. Lower RMSE indicates better performance.

Finally:

The result is printed out so you can see how well your model did.
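
An RMSE value is easier to interpret next to a reference point. The sketch below computes RMSE by hand with NumPy and compares a set of predictions against a naive baseline that simply repeats the previous day's close. All prices here are made up for illustration, and the baseline comparison is a suggested sanity check rather than part of the steps above.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

# Made-up values for four consecutive days
previous_close = np.array([100.0, 101.0, 103.0, 102.0])  # close on the day before
actual = np.array([101.0, 103.0, 102.0, 105.0])          # what actually happened
model_pred = np.array([100.0, 104.0, 103.0, 104.0])      # hypothetical model output

model_rmse = rmse(actual, model_pred)      # every error is 1 dollar, so RMSE is 1.0
naive_rmse = rmse(actual, previous_close)  # "tomorrow equals today" baseline

print(f'Model RMSE: {model_rmse:.3f}')
print(f'Naive RMSE: {naive_rmse:.3f}')
```

RMSE is expressed in the same units as the prices, so a value of 1.0 means the predictions are off by about one dollar in a root-mean-square sense. If the model's RMSE is not clearly below the naive baseline's, the model has learned little beyond "tomorrow looks like today".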

Conclusion

By following this guide, you've learned how to implement an LSTM model for stock price forecasting from scratch. You now understand how LSTMs work and why they are suitable for analyzing sequential data like stock prices. With practice and further exploration of hyperparameter tuning and advanced techniques, you can enhance your predictive capabilities and become proficient in using machine learning for financial analysis.

As you continue your journey in machine learning and finance, remember that real-world applications often require experimentation and refinement of models based on specific datasets and market conditions. Happy forecasting!