Good production. You're starting to relax more and be yourself in front of the camera, and the editing is good.
This was 100% scripted as usual, and I read it from a teleprompter.
I like that you are having the models write short stories. AI creativity is an important factor to me.
Wow - quite the machine you've got there. Congrats!
Great video! It would be even more helpful if you could add a brief summary at the end where you share your thoughts on the model. Specifically, it would be valuable to hear your opinion on how well the model performs, its strengths and weaknesses, and the scenarios where it excels. Additionally, insights on how it compares to other models of similar size and what it might be equivalent to (e.g., “Llama 3.3 is comparable to Llama 405”) would make it much easier for us to categorize and understand its potential.
Hearing this information from an experienced professional like you saves us a lot of time, and while I know this is subjective, it’s incredibly useful. If possible, giving the model a score based on different categories would make your review even more insightful.
Thank you for your great work and dedication!
When I watched again after posting I realized I hadn’t done that. It’s a great point and it was something I meant to do. Thanks for pointing it out.
I very much appreciate the dedication, and I personally like the in-depth reviews. For starters, I appreciate that you take the number of parameters and quantization into account. Not everyone does, so thanks for bringing to people's attention that these variables are essential when we try to evaluate models.
On the flip side: in this case, the questions are too vague and abstract for me to find any real business value in the video as a whole.
The point being: after watching this, I still don't have a grasp of how these models compare to each other... whereas I think there could have been massive business value in a video like this if questions with one objective, verifiable answer had simply been asked.
For example: you could have done this by looking at a handful of industries or use cases and then comparing questions relevant to each industry.
A simple and obvious example is the coding industry, where several coding problems are asked and the answers tested at the end (given the temperature, you might take 3 tries for each problem and then plot the results).
Another industry might be writing, where you for example put one or more wrong words in a paragraph, ask the models to find them, make that increasingly harder, run each test 3 times, and again plot the number of correct answers in a graph.
With some brainstorming (or 'prompting', as we call that today), better practical use cases could be found than what I can come up with while just typing this.
In the end there would then probably have been a very clear picture of when a higher parameter count or quantization level matters and when it doesn't.
In summary: make it practical by starting from real-world problems, taking into account temperature and the fact that these models are about probability, then asking objective and verifiable questions and plotting the results. That would have been massive business value for me: it's what I recently did at our company to select a proper model (an SLM), and this video could have been the one saving me hours.
Then again: thank you very much for your diligence and professionalism. I very much enjoy your content and I'm happy to see people find their way to your channel.
Cheers,
Andy
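[Editor's note] The evaluation protocol Andy describes above (several trials per question at a fixed temperature, counting verifiably correct answers, then plotting per model) could be sketched roughly like this in Python. This is a minimal illustration, not a real benchmark: `ask_model` is a hypothetical stand-in for an actual model call, and the problem set and trial count are placeholders.

```python
import random

def ask_model(model: str, question: str, temperature: float = 0.8) -> str:
    # Hypothetical stand-in for a real model call (e.g. an HTTP request to a
    # local inference server). Simulated here to keep the sketch self-contained.
    return "42" if random.random() < 0.7 else "not sure"

def score_model(model: str, problems: list[tuple[str, str]], trials: int = 3) -> dict[str, int]:
    """Ask each problem `trials` times and count verifiably correct answers."""
    scores: dict[str, int] = {}
    for question, expected in problems:
        correct = sum(
            ask_model(model, question) == expected
            for _ in range(trials)
        )
        scores[question] = correct  # value in 0..trials, ready to plot per model
    return scores

problems = [("What is 6 * 7?", "42"), ("What is 40 + 2?", "42")]
print(score_model("some-7b-model-q4", problems))
```

Running `score_model` once per model/quantization combination and charting the per-question counts would give the objective, repeatable comparison the comment asks for.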
It was great to see the responses side by side. One can see the difference immediately. What was interesting is that the smaller model generated more verbose answers.
That was quite interesting. I watched all the way through, so I liked the longer format. I'm glad your channel is doing well. It's super cool to be able to run this stuff on local computers. I'm upgrading right now, but to nothing like that monster you get to play with.
The question choice was interesting.
I wonder if the robot dreamed of electric sheep.
It was weird the way they failed to account for the extra killer in the room. Seemingly so obvious.
We are bump-stick users compared to his hardware, which solves the LED problem: people use the LED lights of their high-end LLM systems to replace their traditional light bulbs, and that's why they use more electricity after switching from traditional bulbs to LEDs...
What's the most common winning🏆 investment strategy for a new beginner?
Hi Matt, thanks for the video. It's not the first time someone has used an LLM to "solve" logical/mathematical problems, etc. I personally don't think this is the correct use of an LLM. I mainly see LLMs as a great tool to "convert" text into other or condensed text, extract information and convert it into a more machine-friendly format, categorize large volumes of text, etc. Simply put: things you could do yourself, but the machine can do much faster. And you can always do the steps manually and check the correctness of the result. Since you consider logical reasoning an LLM feature worth using (extract info and "logically" find relations, etc.): where and how would you use it in a real production system with confidence (I mean, feed the returned result into a further downstream process, for every returned result)? I personally don't see how it can provide sufficiently confident results there.
That gpu flex!
I’m just looking for a model that’s good at creative writing. Fantasy novels and D&D, basically.
17:20 You might have started it backwards: give it a Korean sentence whose meaning you knew and ask it to translate it into English.
Ahh, good idea. I'm sure there is a bunch of classic Korean literature online; I could have Google Translate render it into English and then find a native Korean speaker to tell me which is the better translation.
@@technovangelist There ought to be some books already in both Korean and English. My main idea was simpler though: just check whether the English it translated from a known Korean text made sense :P
Very bad prompting sir.
Woo, which bank did you rob for that machine..? 👀
Did he just say LG?
Yup. That’s who made it.