Slightly off topic, I just want to mention my favorite, and underestimated, R function, which is embed().
Good job, Sir! You have established the Liquid Brain Academy. Everybody can read the tutorial examples and repeat them, but you can explain all the details to an audience. Great lecture. A good next step could be to explain the sunspots dataset LSTM example.
That was the best explanation of RNN inputs I've ever seen. Any plan for an LSTM showcase?
I've been working on it for some time now, but haven't had the time to compile it all into a video.
Good, insightful video, found after long days of searching on RNN time series, thanks!
How do I interpret the results? The final graph does not give a temperature reading; it shows the predicted values vs. the actual values. My question is: how do I interpret that end result?
Thanks for your great video. May I ask why the 'delay' differs between training (delay = 44) and testing (delay = 0)?
Can we use an RNN with ARIMA to forecast COVID?
Great content! However, why didn't you use a logarithmic transformation to standardize the data?
I usually use log to transform data with heavy skewness (such as gene expression), so I can get it all roughly on the same scale. For this example, most of the values are already on the same scale, so I don't think it is needed. However, do feel free to add that in and check whether it works better with your NN.
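As a rough illustration of the two options (placeholder data, not from the video; log1p is used so zeros don't break the transform):

# min-max scaling puts values on a 0-1 range
x <- c(5, 120, 13, 7000, 42)                 # placeholder, heavily skewed values
x_minmax <- (x - min(x)) / (max(x) - min(x))

# log transform compresses heavy right skew (e.g. gene expression counts)
x_log <- log1p(x)                            # log(1 + x), safe when x contains zeros

# the two can also be combined: log first, then min-max back to 0-1
x_both <- (x_log - min(x_log)) / (max(x_log) - min(x_log))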
Hi, thank you so much for all your work. I would like to ask: how do I transform the scaled values back to their original range? Please help me, thank you so much and have a nice day.
Thank you for a great video! I was wondering: when you explain the lookback, you say we look 240 observations back and predict 1. Once 1 observation is predicted and we are predicting the following one, is the observation that has just been predicted now included in the 240 observations we look back at? Is it possible to add the actually observed values up to time (t-1) each time we predict time t, so we are looking back at the real 240 observations at each prediction step? Thank you.
Thanks for your explanation. How can we get the feature importance of an RNN/LSTM model in R?
An RNN would be a little more complicated to understand, but for an MLP you can search for a video I made titled "Visualise a Neural Network", which grabs the weights from each epoch and superimposes them onto the network diagram. For an RNN I believe you would just need a much bigger graph or some other way of presenting it.
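For anyone who wants to poke at the weights directly, a minimal sketch (assuming a trained Keras model object called model; the per-epoch version in that video presumably uses a callback):

library(keras)

# pull the weight arrays out of a trained model; each layer contributes
# its kernel and bias matrices as plain R arrays
w <- get_weights(model)
str(w)                                        # inspect the dimensions layer by layer

first_kernel <- w[[1]]                        # e.g. input-to-hidden weights of layer 1
heatmap(first_kernel, Rowv = NA, Colv = NA)   # crude look at weight magnitudes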
I love the video. Please make more videos like this on this topic.
Thank you! Great video.
Nice class! Thank you very much
How can I make further predictions? For example, 1 year ahead.
Good job. Can you do a video on GANs for regression in R? Thanks.
Hello, how does the model know that "temperature" is the response variable when it is located in the 2nd column? I do not see where you are targeting it.
In the data generation function, I have chosen the temperature column to be the output, so when data is generated it will grab that column as the target 😃
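A rough sketch of what such a generator could look like (this is an illustration, not the exact function from the video; column 2 as the temperature/target column and a plain numeric matrix data are assumptions):

# hypothetical generator: returns list(samples, targets) batches, where the
# target is the chosen column 'delay' steps after each lookback window
make_generator <- function(data, target_col = 2, lookback = 240,
                           delay = 0, step = 1, batch_size = 32) {
  i <- lookback + 1
  function() {
    # start over from the beginning when the remaining rows run out
    if (i + batch_size - 1 > nrow(data) - delay) i <<- lookback + 1
    rows <- i:(i + batch_size - 1)
    i <<- i + batch_size
    samples <- array(0, dim = c(batch_size, lookback / step, ncol(data)))
    targets <- numeric(batch_size)
    for (j in seq_along(rows)) {
      idx <- seq(rows[j] - lookback, rows[j] - 1, by = step)
      samples[j, , ] <- data[idx, ]
      targets[j] <- data[rows[j] + delay, target_col]   # target column grabbed here
    }
    list(samples, targets)
  }
}

# usage sketch: gen <- make_generator(as.matrix(scaled_data)); batch <- gen()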
@@LiquidBrain I now see where the "targets" object is being created. Thanks!
Is each observation unique?
What if I have data in which observations are repeated? For example, customer purchasing data with dates, where the same customers can make multiple purchases and hence customers are repeated.
How can we work with repeated observations, as in multiple purchases at different times?
Thank you
As in the weather data, the inputs are actually not unique. A cyclic pattern is actually preferred in an RNN, since it makes the output much easier to predict. You can just tag in the date data to make sure the order of the data is preserved.
Dear Liquid Brain, great video. Thank you. Please advise how to change the code of the last model (after data optimisation) if I want to have more than 2 variables. I spent a lot of time on it but it doesn't work ;(
Hello, please help... I did everything as in your code, but I use wind speed data as the variable to predict, and I can't figure out why R gets stuck every single time at lines 241-246 of your R code.
Is it OK for you to put the code here? It's much easier to troubleshoot.
@@LiquidBrain thanks a lot for the reply!
Before I insert my code... I found one difference: I have the wind speed in xts format (time series time steps) - is that the reason why it doesn't work?
There's one thing that doesn't work with keras: > library(keras),
Attaching package: 'keras', The following object is masked _by_ '.GlobalEnv': normalize. I googled it and it says there's something else with the same name, but I can't find how to fix it. Maybe you have some suggestions?
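That message just means an object named normalize already exists in the R workspace (probably defined earlier in the script), so it hides keras::normalize; it is harmless unless you actually meant to call the keras version. Two usual ways out, as a sketch (my_matrix is a placeholder name):

# option 1: drop the conflicting object so the keras version is visible again
rm(normalize, envir = .GlobalEnv)

# option 2: keep your own function, but call the package version explicitly
scaled <- keras::normalize(my_matrix)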
Hi, can I ask why I get an error during the miniconda installation? What should I do?
What's your error code?
@@LiquidBrain there is no error code, just this message :
Error: Valid installation of TensorFlow not found.
Python environments searched for 'tensorflow' package:
C:\Users\asus\miniconda3\python.exe
@@LiquidBrain I did solve it. It seems to have been a version problem. I used the code below to install the packages again:
#devtools::install_github("rstudio/tensorflow")
#devtools::install_github("rstudio/keras")
#Execute the below
#tensorflow::install_tensorflow()
#tensorflow::tf_config()
Have you figured out how to get around the fit_generator deprecation in the latest Keras?
I haven't encountered the problem, since I have been exporting my data into a list first before feeding it into model.fit(). I will try to explore the option and maybe make another video about this. Thanks for the heads-up.
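A rough sketch of that list-first workaround, assuming a generator like make_generator() above that returns list(samples, targets) batches (train_gen, model and the abind package are assumptions, not from the video):

library(keras)
library(abind)                      # for stacking 3D arrays along the sample axis

n_batches <- 100
batches <- lapply(seq_len(n_batches), function(i) train_gen())

x_all <- do.call(abind, c(lapply(batches, `[[`, 1), list(along = 1)))
y_all <- unlist(lapply(batches, `[[`, 2))

# plain fit() on in-memory arrays instead of the deprecated fit_generator()
model %>% fit(x_all, y_all, epochs = 20, batch_size = 32)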
Excellent explanation, thanks for sharing your knowledge.
Is there a possibility of applying this method to forecast catches with other variables? Something like: Catch ~ SST + CHL + NINO3.4, etc.
I mean, I would just export the target data using a specific function and train on that target. But I believe there can be ways to integrate them into the network with some data transformation before the network layers.
Hello. Can you please tell me how long RStudio takes to run the code? Mine is taking an eternity to run even the simple keras_model_sequential() one. Any possible solution, please?
You can try to reduce the size of the network, or of the input. I am running mine on a GPU, so it was quite fast, a few minutes max I believe.
An excellent explanation. Great job.
Hi, so sorry, I'm new here and my English is probably bad... but anyway, my question is how to get back the original values of the prediction data, because they are between 0 and 1. Thanks.
For that you will need to check how the data is normalised at the beginning and map the scaled data back onto the original range. (You will have to keep the min/max used for the min-max normalization and the mean/std dev used for the scaling.)
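A minimal sketch of that inverse step, assuming plain min-max scaling to 0-1 was used on the target (y_raw and pred_scaled are placeholder names):

# forward: scale the target to 0-1 and keep the constants
y_min <- min(y_raw)
y_max <- max(y_raw)
y_scaled <- (y_raw - y_min) / (y_max - y_min)

# inverse: map network predictions (0-1) back to the original units
pred_original <- pred_scaled * (y_max - y_min) + y_min

# if scale() (mean / standard deviation) was used instead:
# pred_original <- pred_scaled * sd(y_raw) + mean(y_raw)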
Don't we have a risk of data leakage when scaling and normalizing the data before the train/test split? I was expecting the generators to use the train data and the test data separately, rather than the whole dataset from the script.
Great comprehensive content!
Can you please make a video on how to build a hybrid model with an LSTM and other forecasting models (preferably a regression model like ARIMA)?
Not very experienced in ARIMA, but sure will check it out
@@LiquidBrain many, many thanks. Can you please direct me to some content on how to develop a hybrid LSTM?
I am not too sure about those, since network structures are usually super project-specific. I found one good repo on GitHub that you might be able to check out:
github.com/perseus784/Vehicle_Collision_Prediction_Using_CNN-LSTMs
Great video. I have a question: How can I incorporate 'lookback' and 'step' elements like you used in the video to time series classification with Support Vector Machines?
The lookback and step actually just affect the generator functions. You can just change the machine learning model in the model definition to an SVM instead of the LSTM shown in the video 😀
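A rough sketch of swapping in an SVM, using embed() (mentioned at the top of the thread) to build the lookback windows instead of the Keras generator; series is a placeholder numeric vector and the e1071 package is an assumption:

library(e1071)

lookback <- 24
# embed() turns a series into a matrix of lagged windows:
# column 1 is the most recent value in each row, column lookback+1 the oldest
windows <- embed(series, lookback + 1)
y <- windows[, 1]                 # value to predict
x <- windows[, -1]                # the preceding 'lookback' values as features

fit   <- svm(x, y)                # eps-regression, since y is numeric
preds <- predict(fit, x)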
Is there any way to get forecast values with this approach, like we get in time series forecasting?
Yep, you can use the predict() function to feed the data into the model and get the output.
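A minimal sketch of that call, assuming a trained Keras model and new inputs shaped the same way as the training data, i.e. (samples, lookback, features); the numbers here are placeholders:

library(keras)

lookback   <- 240
n_features <- 3
# new data must match the input dimensions the network was trained on
new_x <- array(runif(10 * lookback * n_features),
               dim = c(10, lookback, n_features))

forecast <- predict(model, new_x)   # one prediction per sample row
forecast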
I don't understand the batch_size_plot calculation...
Does this code work for electricity consumption forecasting as well? If not, how should we change it, or do we just need a data transformation? Thanks for this amazing tutorial.
Hard to say. This code is more for demonstration purposes and would not work that well in the real world. I would suggest a much longer prediction period with a longer lookback, but models are also quite data-specific.
Great video! Any suggestions on how to predict new data (i.e. beyond the test data)?
You can always use the predict() function to load in the new data, and you should be able to get a prediction from any data that has the same input dimensions.
@@LiquidBrain Sure, makes sense, and I assumed as much. I think another way to phrase it / be more specific is: how do you get the lags when going out into the future, where data theoretically isn't available? Do you use predictions as lag values?
For that I think you will need a model that can take the input and predict an output with the same dimensions, i.e. 5 input nodes -> neurons -> 5 output nodes. After training you can essentially feed the output back in as input and continuously generate new data from a single starting input. I have actually tried this previously, but it didn't work well for the dataset I was using. Perhaps it will work in some other scenario or with a much larger model.
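A rough sketch of that feed-the-output-back-in loop for the simple univariate case (model and scaled_series are placeholders; this illustrates the idea above, it is not code from the video):

# iterative forecasting: each prediction is appended to the history and
# becomes part of the next lookback window
lookback <- 240
horizon  <- 30                            # how many steps to roll forward
history  <- scaled_series                 # numeric vector, already scaled to 0-1
preds    <- numeric(horizon)

for (i in seq_len(horizon)) {
  x <- tail(history, lookback)
  x <- array(x, dim = c(1, lookback, 1))  # (samples, timesteps, features)
  p <- predict(model, x)[1, 1]            # one-step-ahead prediction
  preds[i] <- p
  history  <- c(history, p)               # feed the prediction back in
}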
@@LiquidBrain that’s helpful thanks so much for the detailed response
thanks for the video... very useful. I would like to have a little cheat sheet about the different definitions such as epoch, lookback, etc... I am 100% new and self-learning.
I have a request for you... can you please do a video on APAlyzer?
Great explanation of the RNN tutorial. This is the only multivariate RNN complete tutorial for R on the internet. How did you make the plot and chart show in the coding window? Thanks.
Most of the plots are done in ggplot2. When working in Rmd format (you can choose it from New -> R Notebook on the File menu), the output of each code chunk will appear below it.
@@LiquidBrain thanks for the quick reply. I don't get it. What do you mean by Rmd format? Where should I click exactly to make the plot appear right under each code block?
@@lulumink0 The run button in the top right corner of each chunk. You can also try plot(cars)
to draw a simple scatter plot using the built-in dataset. Rmd format is an IPython-style coding interface instead of the normal script sheet, so you can test the code in chunks instead of using the line-by-line execution you have in a plain R script.
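For the avoidance of doubt, a minimal R Notebook is just a text file saved as something like notes.Rmd (a hypothetical name) containing chunks like the sketch below; pressing the green run arrow on a chunk prints its output, plots included, directly underneath it:

---
title: "RNN notes"
output: html_notebook
---

```{r}
plot(cars)   # built-in dataset; the scatter plot appears right under this chunk
```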
@@LiquidBrain gotcha, thanks again. One last question, and this might not be easy to answer. What if you have, say, 50 explanatory variables in the dataset? Would you feed all of that data into the model the same way as in this video, or do you need to use some kind of PCA to get the weights right first?
@@lulumink0 PCA would be a great way to do data reduction prior to the model training. You can also try t-SNE, or just run a corrplot against the variable you want to predict. I think Computerphile did a great breakdown of the data reduction process; you can search for their video titled "Data Analysis 5: Data Reduction - Computerphile" :)
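A minimal sketch of that PCA step, assuming a numeric predictor matrix x_raw with the ~50 columns (all names here are placeholders):

# reduce many predictors to a handful of principal components
pca <- prcomp(x_raw, center = TRUE, scale. = TRUE)
summary(pca)                      # how much variance each component explains

x_reduced <- pca$x[, 1:5]         # keep the first 5 components (a judgment call)

# new data has to go through the same rotation before prediction
new_reduced <- predict(pca, newdata = x_new)[, 1:5]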
When I run the feedforward model it asks me to download Python. How can I troubleshoot this?
Try to download Python
@@LiquidBrain good suggestion. hahaha
Great Job!
Hi Brandon... I'm a civil engineering grad, hence I'm new in this field... Due to COVID, I chose a project that can be done without lab access, so I wanted to incorporate data science into my domain... I have been working on my college project on LSTMs for forecasting water demand... I got to learn so much from your video, sir... I want to tell you that, after modifying the code, when I tried to run it I got an error while integrating miniconda with my RStudio... It says "Can't install miniconda to a path containing spaces"... Could you please help me with this... I tried to fix it but I have failed so far... Please help!
Can you do a getwd(), and check whether one of the folders in that path has a space in its name? Try to remove all spaces from the folders RStudio works with, and see if the installation works.
@@LiquidBrain yes... Actually, the user folder does have a space; it is named "Asus PC"... And I'm unable to change that now 😣
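One possible workaround, as a sketch and not something from the video: reticulate lets you choose where miniconda is installed, so it can be pointed at a path without spaces even when the user folder ("Asus PC") has one:

library(reticulate)

# install miniconda somewhere without spaces instead of the default
# location under "C:/Users/Asus PC/..."
install_miniconda(path = "C:/miniconda3")

# tell reticulate (and hence keras) to look there from now on
Sys.setenv(RETICULATE_MINICONDA_PATH = "C:/miniconda3")
# keras::install_keras() can then set up TensorFlow inside that environment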
I have daily data, and I want to predict a target variable 1 day ahead, i.e. one day into the future. What should my lookback be in this case?
Hi, it is generally not recommended to predict only one step into the future, since you might run into the nearest-neighbour pitfall, where the algorithm just takes the previous value and projects it forward. I would suggest increasing your output size and predicting a longer period into the future. The size of the lookback really depends on the data and on whether you can find a repeatable pattern in the data history; hard to say exactly.
@@LiquidBrain Yes, I see. Actually, I am working on a flood prediction system. I found a research paper from China where the authors applied an LSTM model to predict discharge (flow rate) at the Hoa Binh station of the Da river basin, 1 day, 2 days and 3 days ahead.
I am using R to apply an LSTM to a river in India. The data has 6400 rows and 4 variables (including one target). I am taking rows 1 to 5844 for training and the rest for testing.
My xtrain is a 2D matrix (5844, 3) and ytrain is a 1D vector (5844). I am not able to understand what my input shape should be. Can you please tell me to what dimensions I should reshape xtrain before I feed it to the LSTM layer? I found that it should be [batch_size, time_steps, num_features]. So in my case, would it be correct to keep it as [5844, 1, 3], or should it be [1, 5844, 3] or [5844*3, 1, 1]? Please guide me a little. Thank you.
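There is no reply in the thread, but as a general sketch of the Keras convention (samples, timesteps, features), not a verdict on this particular dataset: a (rows, features) matrix used with one timestep per sample can be reshaped with array_reshape(), and using more timesteps means building lagged windows first, as in the generator sketch above.

library(keras)

x <- matrix(rnorm(5844 * 3), nrow = 5844, ncol = 3)     # stand-in for xtrain
# Keras expects (samples, timesteps, features); with a single timestep:
x_lstm <- array_reshape(x, dim = c(nrow(x), 1, ncol(x)))

model <- keras_model_sequential() %>%
  layer_lstm(units = 32, input_shape = c(1, 3)) %>%     # (timesteps, features)
  layer_dense(units = 1)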
Great work...