At the start of each trading day, only the Open price is known. High, Low and Volume are not known yet, so they cannot be used as features to predict that day's Close price.
Exactly, so what is the solution?
@@kibs_neville Use the Open price to predict.
You introduced lookahead bias into your model training by using High, Low and Volume, which are unknown at the candle's open. What you could do is shift the Close column to build your y variable, so that you try to predict the next candle's close price.
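For anyone wondering what that shift looks like, here is a minimal sketch, not the video's code. It assumes a pandas DataFrame with Open/High/Low/Close/Volume columns; the toy numbers and the `NextClose` column name are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; the column names assume an OHLCV layout.
df = pd.DataFrame({
    "Open":   [10.0, 11.0, 12.0, 13.0],
    "High":   [10.5, 11.5, 12.5, 13.5],
    "Low":    [9.5, 10.5, 11.5, 12.5],
    "Close":  [10.2, 11.2, 12.2, 13.2],
    "Volume": [100, 110, 120, 130],
})

# The target is the NEXT candle's close, so every feature in a row
# is already known at the moment the prediction is made.
df["NextClose"] = df["Close"].shift(-1)

# The last row has no next candle, so it has no target: drop it.
df = df.dropna(subset=["NextClose"])

X = df[["Open", "High", "Low", "Close", "Volume"]]
y = df["NextClose"]
```

With this layout the current candle's High, Low and Volume are fair game as features, because they are complete by the time the next candle closes.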
No train/test split. This is the equivalent of giving the model the answer sheet to the test, so you don't get an accurate picture of model performance.
Please include a link to the data. Thanks a lot.
The y value should come from a different row than the current open, low, high and volume information. Should we use other data, rather than open, low, high and volume, to predict the future stock price?
You can use indicator values (EMA crosses, MACD, RSI, etc.) as features instead of the OHLV values.
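A rough sketch of what indicator features could look like, using only pandas. The function name `add_indicator_features`, the column names and the short spans (3 and 5) are my own assumptions for a tiny example, not anything from the video; real setups often use longer spans.

```python
import pandas as pd

def add_indicator_features(df, fast=3, slow=5):
    """Add two EMAs of Close, their difference, and a cross-up flag.

    Hypothetical helper: each value uses only closes up to and
    including its own row, so nothing from later rows leaks in.
    """
    out = df.copy()
    out["ema_fast"] = out["Close"].ewm(span=fast, adjust=False).mean()
    out["ema_slow"] = out["Close"].ewm(span=slow, adjust=False).mean()
    # Fast minus slow EMA (the same idea MACD is built on).
    out["ema_diff"] = out["ema_fast"] - out["ema_slow"]
    # 1 while the fast EMA sits above the slow EMA, else 0.
    out["ema_cross_up"] = (out["ema_diff"] > 0).astype(int)
    return out

# Toy, steadily rising closes.
prices = pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})
features = add_indicator_features(prices)
```

These features still need a target taken from a later row (for example the next close), otherwise the lookahead problem is the same.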
Do you not need to split the dataset?
He did, but without using the split method. He manually assigned X as every row except the last, excluding the Close column, and Y as the Close value of every row except the last; the last row is the test set.
He then trained the model on that, ran it on the last row's X values (the columns excluding Close) and predicted Y (the Close value for the last row). It's not the best approach, as the split is very unbalanced. One needs more data to make it more accurate.
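If you want more than one row in the test set, one option is a time-ordered split: train on the earlier rows and test on the most recent ones, without shuffling. A minimal sketch with pandas; `time_ordered_split` and the 20% test fraction are assumptions of mine, not what the video does.

```python
import pandas as pd

def time_ordered_split(X, y, test_fraction=0.2):
    """Hold out the most recent rows as the test set (no shuffling).

    Hypothetical helper: assumes X and y are already sorted oldest to newest.
    """
    n_test = max(1, int(round(len(X) * test_fraction)))
    split = len(X) - n_test
    return X.iloc[:split], X.iloc[split:], y.iloc[:split], y.iloc[split:]

# Toy data standing in for real features and targets.
X = pd.DataFrame({"Open": range(10)})
y = pd.Series(range(10), name="NextClose")

X_train, X_test, y_train, y_test = time_ordered_split(X, y)
```

scikit-learn's `train_test_split` with `shuffle=False` gives the same kind of split, if you prefer the library function.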
Nice work, thanks for sharing.
Thanks for watching!
How would you make a graph based on this? Thank you
Bogus exercise. The features are already part of the future data, so making predictions with them makes no sense.
Exactly. The y value should come from a different row than the current open, low, high and volume information. Should we use other data, rather than open, low, high and volume, to predict the future stock price?
Why would the model predict only 263 when the last couple of days are already above 270? Those values are included in the data, yet the prediction is 263 and not 270-280.
Excellent. Could you make a video on Portfolio Optimization using Black Litterman Model?
You are using the High and Low of the hour, but you will only know this information once the hour is finished. These two features don't make sense. Thanks anyway for the video.
Thank you!!
Man I'm working on a trading bot. How much for your help?
I am working on a similar project on Colab, but I cannot import sklearn ensemble RandomForestEnsemble. Please help me.
Merci (:
Argh. I get a FileNotFoundError at the line -- df = pd.read_csv('stock_data.csv')
You would need to include the file path to your stock_data.csv file. pd.read_csv('/path/to/file'). That error means that your notebook can't find the CSV file.
@@ElectricSH33P Ah, thank you (I'm very new to this.)