What are some good points of RNN over transformer?
An RNN requires less memory and computation for long sequences: a transformer has to compute attention weights between every pair of tokens (which scales quadratically with sequence length), while an RNN just updates its hidden state at each step (which makes it slower, since it can't be parallelized, but a better fit for mobile devices). I think a transformer model also requires more data to train and has more parameters. On the other hand, it's hard to use transfer learning with RNNs and easy with pretrained transformers, which reduces the training cost. It all depends on the data, compute requirements, and cost, I guess.
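To make the scaling difference concrete, here's a minimal toy sketch (in NumPy, with made-up sizes `n` and `d`): self-attention materializes an n×n score matrix, while an RNN does one state update per token.

```python
import numpy as np

n, d = 8, 4  # toy sequence length and hidden size (illustrative values)
x = np.random.randn(n, d)

# Transformer self-attention: a score for every token pair -> an n x n
# matrix, so memory and compute grow quadratically with sequence length.
scores = x @ x.T
print(scores.shape)  # (8, 8)

# RNN: one hidden-state update per token -> linear in sequence length,
# but inherently sequential (each step depends on the previous one).
W = np.random.randn(d, d)
h = np.zeros(d)
for t in range(n):
    h = np.tanh(x[t] + W @ h)
print(h.shape)  # (4,)
```

The n×n matrix is also why attention parallelizes well on GPUs (all pairs at once), while the RNN loop cannot skip ahead past step t-1.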
What do you think now Bert? It's blown up hasn't it...
@@Karl_with_a_K In what sense? If you're saying it has become obsolete, then you're wrong. It's still used as a sentence encoder (e.g. in sentence-transformers), and given that it's the second-smallest model in the current LLM era, I think we can now work on integrating a pruned version of it into mobile devices, although more reparametrization is needed for it to fit on small devices.
@ambeshshekhar4043 no, I mean in popularity ...
@@Karl_with_a_K 😅 It's natural; we always love new tech. We only keep using old tech when it offers something different and better than the latest tech.
THE VISUALS ARE SO HELPFUL THANKS A LOT U ARE A SAVIOUR DUDE