Some quality content here. So the new tokens just get appended to the current tokenizer, right?
Yes
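To make "appended" concrete, here is a minimal sketch in plain Python. It is not the video's code: `add_tokens` is a hypothetical helper and the vocabulary is a toy dict, assuming ids run contiguously from 0. Tokens already present keep their ids, and unseen tokens take the next free ids at the end.

```python
def add_tokens(vocab, new_tokens):
    """Append unseen tokens to vocab (hypothetical helper, toy vocabulary).

    Assumes existing ids are contiguous from 0, so len(vocab) is the next free id.
    Returns the number of tokens actually added.
    """
    added = 0
    for tok in new_tokens:
        if tok not in vocab:          # existing tokens keep their original id
            vocab[tok] = len(vocab)   # new tokens go at the end
            added += 1
    return added


vocab = {"hello": 0, "world": 1, "!": 2}
n_added = add_tokens(vocab, ["ನಮಸ್ಕಾರ", "world", "ಕನ್ನಡ"])
print(n_added, vocab)
```

If the tokenizer feeds a model, the model's embedding table usually has to grow by the same number of rows, since the new ids have no embeddings yet.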
good shit bro
thanks
I did the same for 22 Indian languages. But when I searched for a Kannada character in the tokens as a test, nothing showed up. Also, the tokenizer separates punctuation as well. Your method of splitting is not optimal.
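One possible reason the search finds nothing, assuming the tokenizer is a GPT-2-style byte-level BPE (an assumption, the comment does not say which tokenizer was used): the vocabulary stores each token as a string of byte-mapped characters, not as the original Unicode text. The sketch below reimplements the byte-to-character table used by GPT-2-style tokenizers; `to_vocab_form` and `from_vocab_form` are hypothetical helper names for illustration.

```python
def bytes_to_unicode():
    """Byte -> printable character table in the style of GPT-2 byte-level BPE."""
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:           # non-printable bytes are shifted past 255
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def to_vocab_form(text):
    """How the text would be spelled inside a byte-level vocabulary (hypothetical helper)."""
    return "".join(BYTE_TO_CHAR[b] for b in text.encode("utf-8"))


def from_vocab_form(s):
    """Inverse of to_vocab_form: recover the original text (hypothetical helper)."""
    return bytes(CHAR_TO_BYTE[c] for c in s).decode("utf-8")


kannada_char = "ಕ"
stored = to_vocab_form(kannada_char)
print(repr(stored), from_vocab_form(stored))
```

Under that assumption, searching the vocabulary for the literal Kannada character fails even when its bytes are covered. Searching for `to_vocab_form(char)`, or decoding each vocabulary entry before comparing, is the check that would match.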