Happy new year Ms. Coffee Bean!!! ☕
Happy New Year, Tim! 🍷
I signed up with your sponsor. They should appreciate you more.
Great to see you again
Great that you are watching again. 😶🌫️
@@AICoffeeBreak Yup, otherwise I'd miss out on something of great value. 😍😍
Happy New Year, Ms. Coffee Bean! :D
Happy New Year to you too. ☺️
Happy New year, Ms Coffee Bean ✨
Happy new year to you too! :)
I often feel like I need a ccup of of prrdductce workk.
Me too! ☕☕☕ Appy ne wearry!
But can you train a better version of VeLO using VeLO itself? Have they tried that yet? What happens when you train VeLO again on the same training data, but using VeLO as the optimizer?
Thanks for the video - it really helped me understand this with clarity in a short amount of time.
Question:
Do you think it will perform well on large genomic datasets or Large Language Models (LLMs)?
It's hard to say without someone having tested this. The paper is quite clear that VeLO performs well on the kinds of architectures (transformers ✅), objectives (language modelling ✅), and tasks (genomics ❌) it has seen in training.
Since the VeLO neural net must first be pre-trained, shouldn't that be counted in the total training time when comparing benchmarks?
Yes and no.
No, because it has to be trained just once.
Yes, but then the hyperparameter-tuning time of standard optimizers should go into their time measurements as well.
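To make the "yes and no" concrete, here is a toy back-of-the-envelope sketch with entirely made-up numbers (none of them are from the paper): VeLO's meta-training cost is paid once and amortized over every later training run, while a standard optimizer pays a hyperparameter-search cost per project.

```python
# Toy amortization sketch with hypothetical numbers (not from the VeLO paper).
# VeLO's one-off pre-training cost is shared across all later runs;
# a hand-tuned baseline pays a tuning cost for every new project.

VELO_PRETRAIN_HOURS = 4_000_000   # hypothetical one-time meta-training cost
VELO_RUN_HOURS = 10               # hypothetical single training run with VeLO

BASELINE_RUN_HOURS = 12           # hypothetical single run with a tuned Adam
TUNING_TRIALS = 20                # hyperparameter-search runs per project

def velo_total(num_projects: int) -> int:
    """Amortized cost: pre-training is paid once, then each project is one run."""
    return VELO_PRETRAIN_HOURS + num_projects * VELO_RUN_HOURS

def baseline_total(num_projects: int) -> int:
    """Each project pays for the hyperparameter search plus the final run."""
    return num_projects * (TUNING_TRIALS + 1) * BASELINE_RUN_HOURS

# With these toy numbers, VeLO only breaks even once enough projects
# share its pre-training cost:
for n in (1_000, 10_000, 100_000):
    print(n, velo_total(n) < baseline_total(n))
```

The crossover point depends entirely on the assumed numbers; the only real point is that the fair comparison is amortized cost per user, not total cost per run.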
I have a question about AI. Let's say I put "Asian police officer with pigtails" in the prompt. I have been told the image should closely resemble a real police uniform, but not any real police officer, unless that officer's image comes up by random chance. Is this true? Will the image I'm given be of a police officer who doesn't really exist?
This is one reasonable step towards not having to train NNs from scratch every time.
I have been playing with hypergradients.
Puhhh, seems way too complex to be useful, imo.
Also, the performance gain doesn't seem big enough to justify all this complexity.
How do you mean complex? Complex in usage? Or conceptually? Because the conceptual complexity is taken off you by the authors' work. You just need to use their neural net to do the weight updates instead of an optimizer. They have a JAX implementation, so yeah, not easy to use in Keras. Yet. 😅
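The "neural net instead of an optimizer" idea above can be sketched in a few lines. This is only a conceptual toy, not the actual VeLO API or its JAX implementation: the tiny `update_net` below, with hypothetical fixed weights, stands in for VeLO's meta-trained update network, which maps per-parameter features (here just the gradient and a momentum value) directly to a weight update.

```python
# Conceptual sketch (NOT the real VeLO API): a "learned optimizer" replaces
# the hand-written update rule (e.g. SGD's w -= lr * g) with a small neural
# net that maps per-parameter features to an update. The weights of
# update_net are hypothetical; in VeLO they come from meta-training.
import math

def update_net(grad: float, momentum: float) -> float:
    """Stand-in for the learned update network: features in, update out."""
    h = math.tanh(1.5 * grad + 0.5 * momentum)  # one tiny hidden unit
    return -0.1 * h  # the net outputs the weight update directly

def sgd_step(w: float, g: float, lr: float = 0.1) -> float:
    """Hand-designed rule, for comparison."""
    return w - lr * g

def learned_step(w: float, g: float, m: float) -> float:
    """Learned rule: one forward pass of the update net per parameter."""
    return w + update_net(g, m)

# Minimizing f(w) = w**2 (gradient 2w) with both rules:
w_sgd = w_learned = 2.0
m = 0.0
for _ in range(50):
    w_sgd = sgd_step(w_sgd, 2 * w_sgd)
    g = 2 * w_learned
    m = 0.9 * m + 0.1 * g  # momentum feature fed to the update net
    w_learned = learned_step(w_learned, g, m)

print(abs(w_sgd), abs(w_learned))  # both should shrink toward 0
```

Usage-wise this is the claim above: per step you do one forward pass of the update network instead of applying a hand-designed formula; all the design effort moved into meta-training that network.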
@@AICoffeeBreak Hmmm. Good question. Since the training process of VeLO is so compute-intensive, there is no way to fine-tune it to new architectures. This wouldn't be an issue if it actually generalized perfectly (which I doubt).
Then if you train something new and it doesn't work, you have an additional source of error. You might think to yourself: oh, maybe it's because VeLO isn't working with my architecture.
Maybe it has a certain affinity for certain activation functions, etc.
Also, for big networks the forward pass requires a lot of GPU memory, which means less memory I can spend on my actual data.
While I agree that in theory it's simple - just one forward pass - I feel like in practice it will be too much of an additional headache.
This is more of an opinion of mine than hard fact.
All of the above, combined with the marginal improvement, means I don't see myself using this anytime soon.
Thx for the video, the explanation was very good :)