Long Short-Term Memory (LSTM) is a recurrent neural network architecture used in deep learning. It can process not only single data points (such as images) but also entire sequences of data (such as text, speech, video or time series).
In this article, we will use an LSTM network in Python, built with the Keras deep learning library, to work through a demonstration time-series problem: predicting a stock price.
What is LSTM?
The Long Short-Term Memory network, or LSTM network, is a recurrent neural network that is trained using backpropagation through time and is designed to mitigate the vanishing gradient problem.
As such, it can be used to create large recurrent networks that in turn can be used to address difficult sequence problems in machine learning and achieve state-of-the-art results.
Instead of neurons, LSTM networks have memory blocks that are connected through layers.
A block has components that make it smarter than a classical neuron, including a memory for recent sequences. A block contains gates that manage its state and output. A block operates on an input sequence, and each gate within a block uses sigmoid activation units to control whether it is triggered, making the change of state and the addition of information flowing through the block conditional.
There are three types of gates within a block:
Forget Gate: conditionally decides what information to throw away from the block.
Input Gate: conditionally decides which values from the input are used to update the memory state.
Output Gate: conditionally decides what to output based on input and the memory of the block.
Each block is like a mini state machine, where the gates have weights that are learned during training.
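To make the three gates concrete, here is a minimal NumPy sketch of one time step of a standard LSTM block. It is an illustration only, not what Keras runs internally: the function name `lstm_step`, the weight layout (forget, input, candidate, output stacked in that order) and the random weights are all assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One time step of a standard LSTM block (illustrative layout).

    W: (4*hidden, input), U: (4*hidden, hidden), b: (4*hidden,)
    Rows are stacked in the order: forget, input, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:hidden])                  # forget gate: what to drop from memory
    i = sigmoid(z[hidden:2 * hidden])         # input gate: which new values to store
    g = np.tanh(z[2 * hidden:3 * hidden])     # candidate values for the memory
    o = sigmoid(z[3 * hidden:4 * hidden])     # output gate: what to expose
    c_t = f * c_prev + i * g                  # updated memory (cell state)
    h_t = o * np.tanh(c_t)                    # block output (hidden state)
    return h_t, c_t

# Random weights stand in for values that training would normally learn.
rng = np.random.default_rng(0)
input_size, hidden_size = 1, 4
W = rng.normal(scale=0.1, size=(4 * hidden_size, input_size))
U = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x in [0.1, 0.2, 0.3]:                     # a tiny input sequence
    h, c = lstm_step(np.array([x]), h, c, W, U, b)
```

Because every gate is a sigmoid with values between 0 and 1, each one acts as a soft switch on the information passing through the block.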
We can see how we may achieve sophisticated learning and memory from a layer of LSTMs, and it is not hard to imagine how higher-order abstractions may be layered with multiple such layers.
How to Use It?
Now, let's put the theory into practice. Let's use an LSTM with the Keras library in Python to predict the stock price of a company, e.g. Zoom Video Communications, Inc.
Let's import some basic libraries in a Jupyter notebook.
We need to read the stock price data from Yahoo into a dataframe using the Yahoo data reader.
Let's plot the stock price chart to visualize the data.
Since we are only interested in predicting the daily close price of the stock, we will ignore the other columns and keep only the 'Close' price column.
To keep things simple, we will hold the price data in a NumPy array.
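As a rough sketch of this step, the snippet below keeps only the 'Close' column and converts it to a NumPy array. The small `df` built here is a hypothetical stand-in for the downloaded price table, and its numbers are illustrative rather than real quotes.

```python
import pandas as pd

# Hypothetical rows standing in for the downloaded price table;
# the values are made up for illustration, not real quotes.
df = pd.DataFrame(
    {
        "Open": [100.0, 101.5, 102.0, 103.2],
        "Close": [101.0, 102.2, 103.0, 104.1],
        "Volume": [1000, 1200, 900, 1100],
    },
    index=pd.date_range("2020-01-01", periods=4, freq="D"),
)

close = df[["Close"]]       # double brackets keep a one-column DataFrame
prices = close.to_numpy()   # 2-D array of shape (n_days, 1)
```

Keeping the array two-dimensional, with one column, is convenient because scikit-learn scalers expect input of shape (samples, features).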
We also need to scale the data to a common range (0 to 1 by default) using MinMaxScaler.
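A minimal sketch of the scaling step with scikit-learn's `MinMaxScaler` follows. The `prices` array holds illustrative values, and `feature_range=(0, 1)` is the library default written out explicitly.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative close prices with shape (n_days, 1); not real data.
prices = np.array([[100.0], [110.0], [105.0], [120.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)         # (x - min) / (max - min), per column
restored = scaler.inverse_transform(scaled)   # back to the original price units
```

Keep the fitted `scaler` around: `inverse_transform` is what turns the model's scaled predictions back into prices later. One possible refinement is to fit the scaler on the training portion only, so that the test range does not influence the scaling.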
Now we can split the data into train and test sets. First, let's create the x and y arrays for the train data set.
We need to convert the train data set into NumPy arrays and reshape it into the three-dimensional shape (samples, time steps, features) that the LSTM layer expects.
We will do the same for the test data set.
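The split and reshape steps above can be sketched as follows. The helper name `make_windows`, the 10-step lookback window and the 80/20 split are assumptions for this example (the article does not state the values used), and `scaled` is a synthetic stand-in for the scaled close prices.

```python
import numpy as np

def make_windows(series, lookback):
    """Turn a 1-D series into (x, y) pairs for one-step-ahead prediction.

    x has shape (n, lookback, 1); y has shape (n,), where each y value
    is the point that immediately follows its window.
    """
    x, y = [], []
    for i in range(lookback, len(series)):
        x.append(series[i - lookback:i])   # the previous `lookback` values
        y.append(series[i])                # the value to predict
    x = np.array(x).reshape(-1, lookback, 1)
    return x, np.array(y)

scaled = np.linspace(0.0, 1.0, 100)   # synthetic stand-in for scaled prices
lookback = 10                         # assumed window length
train_len = int(len(scaled) * 0.8)    # assumed 80/20 split

train = scaled[:train_len]
# Keep the last `lookback` training points so the first test window is complete.
test = scaled[train_len - lookback:]

x_train, y_train = make_windows(train, lookback)
x_test, y_test = make_windows(test, lookback)
```

With these settings `x_train` has shape (70, 10, 1) and `x_test` has shape (20, 10, 1), and the first test target is the first point after the training range.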
Now comes the interesting part: we import the LSTM layer from Keras and create the structure of the recurrent neural network model.
Then, we compile the model that we just built.
After compiling the model, we fit it to the train data set so that it can learn from the data and find the best weights.
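One way the build, compile and fit steps could look in Keras is sketched below. The layer sizes, optimizer, loss, epoch count and batch size are assumptions chosen for illustration, since the article does not state the author's settings, and a synthetic sine wave stands in for the scaled training data.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

lookback = 10  # assumed window length

# Synthetic stand-in for the scaled training series (values in [0, 1]).
t = np.arange(200)
series = (np.sin(t / 10.0) + 1.0) / 2.0
x = np.array([series[i - lookback:i] for i in range(lookback, len(series))])
x = x.reshape(-1, lookback, 1)   # (samples, time steps, features)
y = series[lookback:]

# Build: two stacked LSTM layers followed by a single-value output.
model = Sequential([
    Input(shape=(lookback, 1)),
    LSTM(32, return_sequences=True),  # pass the full sequence to the next LSTM
    LSTM(32),                         # return only the last output
    Dense(1),                         # predicted next (scaled) price
])

# Compile: choose the optimizer and the loss to minimize.
model.compile(optimizer="adam", loss="mean_squared_error")

# Fit: a very short run here; real training would use more epochs.
history = model.fit(x, y, epochs=2, batch_size=16, verbose=0)
```

The first LSTM layer needs `return_sequences=True` so that the second one receives a sequence rather than a single vector.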
Now, we can use the trained model to predict the test data. Then, we calculate the error to see how good the prediction was.
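The article does not say which error measure is used, so the sketch below assumes root mean squared error (RMSE), a common choice for this kind of forecast. The `actual` and `predicted` arrays are illustrative values in price units. In practice `predicted` would come from `model.predict(x_test)`, passed through the scaler's `inverse_transform` so that both arrays are in the same units.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two arrays of the same length."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Illustrative values in price units, not real results.
actual = np.array([[100.0], [102.0], [104.0]])
predicted = np.array([[101.0], [101.0], [106.0]])

error = rmse(actual, predicted)   # sqrt((1 + 1 + 4) / 3) = sqrt(2)
```

RMSE is expressed in the same units as the price, which makes it easy to judge against the typical price level of the stock.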
Finally, we can plot the train data, the real test data and the predicted test data on one graph to see how well the model predicted.
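A possible Matplotlib sketch of this chart is shown below. The helper name `plot_forecast` and the sample numbers are hypothetical and only show the layout: the training prices first, then the actual and predicted test prices over the same range.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_forecast(train, actual, predicted):
    """Plot train, actual test and predicted test prices on one axis."""
    train = np.ravel(train)
    actual = np.ravel(actual)
    predicted = np.ravel(predicted)
    # The test period starts where the training period ends.
    test_x = np.arange(len(train), len(train) + len(actual))

    fig, ax = plt.subplots(figsize=(12, 6))
    ax.plot(np.arange(len(train)), train, label="Train")
    ax.plot(test_x, actual, label="Actual (test)")
    ax.plot(test_x, predicted, label="Predicted (test)")
    ax.set_xlabel("Trading day")
    ax.set_ylabel("Close price")
    ax.legend()
    return ax

# Illustrative numbers only.
ax = plot_forecast([100, 101, 103, 102], [104, 105], [103.5, 105.5])
```

In a Jupyter notebook the figure is displayed when the cell finishes; in a plain script, call `plt.show()` afterwards.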
We see that even a very simple LSTM model can follow the stock's time-series price data quite well.
Conclusion
LSTM is a fairly simple RNN architecture to use, but it can be very powerful for modeling and forecasting time series data. With more features and more layers built into the model, it can be a very effective prediction method.
Happy learning!