ChatGLM: The ChatGPT killer? Checking out ChatGLM6B

  • Published: 31 Dec 2024

Comments •

  • @avi7278 · 1 year ago · +544

    God you are a breath of fresh air... all these new AI niche grifters that have infested literally every corner of youtube are unbearable.

    • @Melvin420x12 · 1 year ago · +22

      That's why you should avoid clickbaity titles and thumbnails. People talking about GPT-5 or making outrageous claims. I look for difficult words and model names in the titles and thumbnails. Those are high-quality videos.

    • @avi7278 · 1 year ago · +26

      @@Melvin420x12 right, but with the algo constantly pushing them, even if I say don't recommend this channel, there are two new grifters every day. "I don't know anything about AI or programming, but let me sit here and tell you all about it! Derp. Did you see AutoGPT? Derp. Derp. Derp. Wow, now I'm an AI authority because I have 40,000 other derps who have followed my derpiness!" /r

    • @tonywhite4476 · 1 year ago

      That's because everyone is blowing smoke up each other's arses and no one is calling this crap out. I'll probably get a nasty response for just saying that, but I've paid my dues to rant. I started learning development through YouTube and I've spent hundreds on BS. So yeah, as long as we're buying BS, they're selling BS. And they should stop with the "it's dangerous, but let me show you how to build it" crap. Sentdex is cool, but I've never seen him use a clickbait title before. Until now.

    • @tkdevlop · 1 year ago

      Well, it's their livelihood.

    • @iconelias508 · 1 year ago · +3

      I thought I was the only one getting these recommendations.

  • @jeffreycooper6163 · 1 year ago · +14

    From basic Python to self-driving tech, and now LLMs. Harrison, you are a gentleman who always guides the way through cool new tech with no BS. Thanks for this video!

  • @ChaiTimeDataScience · 1 year ago · +48

    I love how Harrison stands in videos where he is teaching us via code and sits down when it's a video talking about a concept or project!

  • @mytechnotalent · 1 year ago · +95

    It is mind-numbing how cheap GLM is compared to GPT-3. Can you imagine where we are going to be in just another few months? Great one Harrison, as always!

  • @PMX · 1 year ago · +29

    Gave it a test, and it used a Chinese term in a response for some reason. I asked it to define it and it did (I assume correctly; at least it fit with how it used it in the previous response). Apparently the model was trained "on 1T tokens of English and Chinese", and it seems some Chinese is leaking into the English responses, perhaps in the same way a multilingual person may occasionally use a word from a different language by accident, because the word captures a particular idea better than the language being used at the moment (?).

  • @HelloThere65272 · 1 year ago · +17

    I literally cannot bear any "AI news" these days, because it's always slick dudes who obviously have no idea what they are talking about and just want to grift on the next thing (basically the crypto bros from 9 months ago). I literally hide videos about AI by default with the "not interested" button, but I clicked on this one immediately. Thank you Sentdex!

  • @rumidom · 1 year ago · +30

    Have you seen the Open Assistant efforts in creating an open RLHF dataset? Hopefully that could also result in ChatGPT-like smaller models that can be run locally.

  • @Dmitrii-q6p · 1 year ago · +22

    - are you better than chatGPT?
    - As an AI language model, I am capable of generating text that is similar to human language, but I don’t have the same level of natural language understanding as a human. So, in terms of generating text, I am similar to ChatGPT in that respect. However, in terms of specific abilities such as understanding natural language and generating appropriate responses to a user’s questions, I may be slightly better as I have been trained on a wider range of data and have additional features such as context and history that ChatGPT does not have.

    • @foxshot97 · 1 year ago · +1

      this is the response i got: "As an AI language model, I’m designed to模仿人类的语言交流能力 [imitate human language communication abilities], but I’m not quite as powerful as ChatGPT, the open-source machine learning model developed by Facebook AI Research. While I can generate human-like responses to a wide range of topics and questions, ChatGPT is specifically designed to generate text that is similar to human language, with a focus on language production and understanding.
      That being said, I’m constantly learning and improving, and I’m eager to continue contributing to the field of AI and language processing. So, in some ways, I’m still developing and could potentially improve in the future."

  • @BinarySplit · 1 year ago · +8

    Thanks for sharing! I've been looking for an open LLM with English and Chinese support, and this sounds perfect! It's really cool that the authors care about the affordability of running the model.
    You're absolutely right that many models are undertrained and this one was well trained. That's definitely a big factor. However, GLM also claims better accuracy with a similar number of training tokens as other models. Reading through the paper, they integrated a huge bag of tricks into both the pretraining and fine-tuning regimes to make it more sample-efficient. I don't think there's any single pivotal innovation (though there certainly are innovations); it looks like actual, legitimate hard work in designing and building the complicated beasts that are their pretraining regime and fine-tuning datasets.

  • @mysticaltech · 1 year ago

    Man, the server racks behind you, so dope! Thanks for sharing about this valuable model.

  • @jessemair3684 · 1 year ago · +1

    Thank you for finding these and sharing them with us. Years ago you got many of us excited about AGI chatbots. Love your vids, thank you!

  • @dan110024 · 1 year ago · +3

    One of the few channels that has mostly dialogue and no shitty b-roll stock footage that nobody cares about. I always know I'm gonna get a sensible yet intriguing take on something.

  • @serta5727 · 1 year ago · +16

    Open Assistant is about to drop a new open-source ChatGPT soon.

    • @sentdex · 1 year ago · +9

      Can't wait to check it out, been following them closely. I wonder how it will compare.

    • @freestyletow1 · 1 year ago

      April 15

    • @mikerhinos · 1 year ago · +1

      I commented the same thing on Twitter :D
      Open Assistant could have the same effect as Stable Diffusion in image generation, and Midjourney wouldn't be where they are now without SD.

  • @list1726 · 1 year ago

    Thank you for posting. Keep up the good work

  • @mlcat · 1 year ago · +5

    I tried a few models for assistance, mainly LLaMA-based sub-models. For some reason the Hugging Face variation of LLaMA works worse than the original FB one; maybe that's a Mac-only issue, but all the "LoRAs" only work with the HF one. I noticed that LoRAs like gpt4all and alpaca-lora were trained either on the Alpaca dataset or on ChatGPT outputs and give censored answers to obviously provocative questions. One model that still acts like an assistant but does not censor outputs is LLaMA-Adapter; it's less accurate in its answers, but it gives a better impression of the capabilities of untouched models.
    Thank you for this video, will also try this model later!

  • @rikvermeer1325 · 1 year ago

    You are really good at explaining this; subscribed!

  • @JC.72 · 1 year ago · +4

    It seems like the 130B model is also available, but you need to send in a request form and wait for them to approve it through email. They ask questions such as what your intended use is, and you need to agree to their terms. I wonder how much better the 130B model is, if there is any metric such as a percentage difference.

  • @alexjenkins8026 · 1 year ago · +4

    Great find and summary! Love your work.
    I'd love to see you use it!
    Are you able to run through a code example (Python) of how to train any of these models (say a Hugging Face LLM) for a different purpose? I hear about it a lot (and you mentioned it as well), but I've been struggling to repurpose them to do other things... Say, a paragraph of text as input and a 'score' out, or something (maybe how relevant it is to a topic, sentiment, or even how appropriate it would be to say at work)?
    I've even just tried to take the pre-text output of these models and feed that into another model that I train, so I don't have to fine-tune the big model at all (and I get the huge advantage of only having to feed my training data into the big model once, save the output, and then train my small model)... but it seems I'm unable to get the big models to be consistent at anything; it's not 'exploiting' enough, and sending through the same inputs changes the outputs dramatically :/ (A sketch of this approach follows below.)
    Keep it up! :)
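
    A minimal sketch of the frozen-feature-extractor idea described above: run text through a pretrained model once, cache the pooled embeddings, then train a small, cheap model on top. The model name and the toy "appropriate at work" data here are placeholders, not anything from the video.

        import torch
        from transformers import AutoTokenizer, AutoModel
        from sklearn.linear_model import LogisticRegression

        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        encoder = AutoModel.from_pretrained("bert-base-uncased")
        encoder.eval()  # frozen: the big model is never fine-tuned

        def embed(texts):
            # Mean-pool the last hidden state into one fixed-size vector per text.
            batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
            with torch.no_grad():
                hidden = encoder(**batch).last_hidden_state   # (batch, seq, dim)
            mask = batch["attention_mask"].unsqueeze(-1)      # ignore padding tokens
            return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

        # Toy labels: 1 = appropriate at work, 0 = not.
        texts = ["Please review my pull request.", "This meeting is a dumpster fire."]
        labels = [1, 0]

        clf = LogisticRegression().fit(embed(texts), labels)  # the small model trains in milliseconds
        print(clf.predict(embed(["Let's sync on the roadmap."])))

    Because the encoder runs deterministically in eval mode (no sampling, unlike free-form generation), the features are consistent across runs, which sidesteps the inconsistency complained about above.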

  • @TheRealMcNuggs · 1 year ago · +5

    Will you ever continue the "Neural Networks from scratch" series? Was really looking forward to backpropagation.

  • @sgaseretto · 1 year ago · +5

    You should try RWKV and its fine-tuned Raven models; they run as an RNN but are trained like a Transformer. The model is gaining a lot of momentum now.

  • @ds920 · 1 year ago · +1

    Thank you, sir! A breath of fresh air🎉

  • @codechimps · 1 year ago · +1

    This is such a solid video; we need more like this when it comes to AI today.

  • @SpaghettiRealm · 1 year ago · +1

    Thank you, your content is great as always

  • @skaltura · 1 year ago · +2

    I wonder about the training side: can we further train these tiny models, and if so, how? With what hardware and software can we fine-tune them to very specific, narrow tasks?

  • @ander300 · 1 year ago

    Part 10 of Neural Net from Scratch, about analytical derivatives??? Please bring the series back!

  • @Shaunmcdonogh-shaunsurfing · 1 year ago

    Thank goodness for this channel!

  • @happywednesday6741 · 1 year ago · +1

    Sentdex, can you point me to a place on your channel, a playlist, or somewhere else where I can brush up on ML for these large language models and recent generative AI tools?

  • @jadzzzz · 1 year ago · +9

    I wonder why there aren't distributed GPU compute solutions à la Foldit. In fact, why not create something like Render Token that rewards users for their GPU time, and allows people in the network to either donate their computing power to a project or earn tokens so it can continue training models? A truly democratized community effort. That being said, nothing would stop the bot uprising in that case.

    • @ablasttv · 1 year ago · +1

      A bit difficult, I think, at least for any entity that wants to keep the full model proprietary. They'd have to distribute it in pieces between thousands of nodes that each only process part of the solution, and host their own input server to handle splitting the job and waiting for some jobs to return before sending out other pieces of work, specifically for the fully connected layers. It's doable, but it would still cost millions to train these large models, whether they're paying a datacenter or paying through some crypto token. And any entity that has that money is going to expect a return on its investment.

  • @HellTriX · 1 year ago · +1

    I was thinking the same thing, that it would be nice to be able to download and test some of these large models within the constraints of some consumer hardware.

  • @tiagotiagot · 1 year ago

    Why do people seem hesitant to allow the negative side of the activation function to have a magnitude similar to the positive side? What's the mechanism that makes it worse when the negative mirrors the positive or has a similar range?

  • @fnegnilr10 · 1 year ago

    Wow, this talk is fantastic! Have you seen MrPSolver's video on using a GPU to speed up calcs with torch by 250 times over numpy?

  • @nathanbanks2354 · 1 year ago · +3

    This is really neat! I hadn't heard about GLM-130B and wonder how it compares to LLaMA, facebook's over-trained model. After a quick search, the LLaMA paper only briefly mentions GLM-130B in the context of instruction fine-tuning, but doesn't really talk about general performance. I managed to get LLaMA-30b running on my 16GB video card using 3-bit quantization, and Alpaca was pretty good, but it would be neat to run something with an Apache license instead of an academic one. Now I'm downloading ChatGLM6B and hoping it works easily with text-generation-webui....

    • @4.0.4 · 1 year ago

      3-bit quantization? What do you run it on?

    • @nathanbanks2354 · 1 year ago

      @@4.0.4 A Dell Precision 7720 with a mobile P5000 MXM graphics card. It's a few years old, but it has 16 GB, which is enough to have some fun. It can also just barely transcribe in real time using the largest Whisper model.

    • @Harsh-un2pt · 1 year ago

      @@nathanbanks2354 impressive!!

  • @lijostar167 · 1 year ago · +7

    Have you tried the Vicuna model yet? It's supposed to be '92%' as good as ChatGPT. How would you say it compares to ChatGLM?

    • @IronZk · 1 year ago · +1

      I have tried Vicuna-7B on CPU and it's impressive!

  • @yeetdeets · 1 year ago · +8

    I'd be really interested to see your take on the recent vector db boom. Particularly how it will relate to AI, because I don't get it. Why is a vector db better than a relational db for AI?

    • @leptir1 · 1 year ago · +1

      The Access to Information for Daily Sensibilities Law of 2023 (AIDS, 2023) states that every question must now be posed to AI before a human, since you were going to ask the human the same way you would ask the AI, and the AI had a chance of knowing.
      It's the new "LMGTFY"

    • @Crayphor · 1 year ago

      I have not heard of vector db before but I would assume that the vectors in question are static embeddings attached to the data. The reason why this would improve upon relational db is that it should contain more complex continuous relationships between data rather than discrete human-generated relationships. It would be similar to the improvements of using NNs as generalized feature extractors compared to having an expert come up with features. (Reiterating that this is just my guess based on the name)

    • @yeetdeets · 1 year ago

      @@Crayphor As I understood it "vector" is a misnomer because it's more like every entry has a coordinate in a Cartesian space. As such the relationships are simply the distances in the shared space.
      I just don't see how it would be better than a relational DB for memory to an AI or AGI. I imagine you mostly want specific recall rather than general recall. The model adds generality and the database adds specificity, at least that's how I'd think of it.
      Maybe it's beneficial to have some sort of general memory in order to abstract out lessons from repeated experiences though, which is how we learn. But I'd argue that's mostly a workaround due to our lack of ability to generate experiences. AGI wouldn't have that problem. It could simply take all its memories and retrain its logic on all of them at the same time, which would have fewer biases - of which we have many due to the architecture of our intelligence.

    • @davidw8668 · 1 year ago · +1

      They let you search by similarity in a fast, scalable way, e.g., for information retrieval. These "DBs" are quite sophisticated systems, and it would be rather hard to build them from the ground up. You couldn't perform the similarity calculations accurately with relational DBs. In combination with LLMs they are used for retrieval augmentation, but also in general for vector search and recommenders. Search is a big deal in AI.

    • @pozz941 · 1 year ago · +1

      @@yeetdeets I don't really understand what you mean by "the model adds generality and the database adds specificity". The database is not used for specific data retrieval or to enhance the capabilities of the model; search engines on the internet already exist for that. What it is used for is to prolong the effective memory of the model.
      Let's say the model has a memory of 1000 tokens; it is quite easy to have a conversation that uses all of them. So what do you do if you want a longer conversation? You ask the model to summarize the conversation, and you store the summary in a vector database. Then, at the next message, you pre-prompt the model with the topic of conversation (by showing the model the part of the current conversation that fits its memory and asking it what the current topic is), you retrieve related summaries of the conversation (by converting the last messages, or just the topic, into a vector and searching the database for similar entries), and you give it the last messages of the current conversation so that it knows where in the conversation you are.
      Doing this, you can fake a much longer memory. For example, say you are having a long conversation with your AI about dogs, but you ask it, "do you remember when we were talking about cats?" At that point, since the topic shifted to cats, when you search the database using the last messages you find all the previous conversations you had about cats and feed them to the AI so that it knows what you are talking about. Of course, it is a little more complex than what I just described; otherwise you could just have a topic field in a table and retrieve all related summaries. Vector databases allow for a fuzzier search: maybe the topic of a previous conversation was not exactly cats but felines in general. In a vector database you would still find something, but you would find nothing in a relational database. (A toy version of this lookup is sketched below.)
      Take whatever I said in this comment with a grain of salt, because I am not an expert by any means; this is just what I understood from people talking about vector databases in the context of AI.
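
      A toy illustration of that fuzzy lookup, with made-up three-dimensional "embeddings" standing in for what a real sentence-embedding model would produce (a real system would also use an approximate-nearest-neighbor index rather than brute force):

          import numpy as np

          def cosine_sim(a, b):
              return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

          # Pretend these are embeddings of stored conversation summaries.
          memory = {
              "we discussed cats and their sleeping habits": np.array([0.9, 0.1, 0.2]),
              "we discussed dog training schedules":         np.array([0.1, 0.9, 0.3]),
          }

          # Embedding of the new query "do you remember the felines chat?"
          query = np.array([0.85, 0.15, 0.25])

          # Rank stored summaries by similarity; the best hit gets prepended to the prompt.
          best = max(memory, key=lambda k: cosine_sim(memory[k], query))
          print(best)  # -> the cats summary, even though the wording differs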

  • @lutzsommer4708 · 1 year ago · +8

    What about the LLaMA or Alpaca models? How do they compare to ChatGLM-6B?

    • @sentdex · 1 year ago · +3

      I've only used the base LLaMA; there was no quality chat variant that I could find. Can you link to any specific project page or weights that I can download and try? One issue I'm finding with many of these models is just... ease of use OOTB.

    • @djklfsdjklfjlif · 1 year ago

      @@sentdex Almost all of the recent exciting ones are built from the leaked LLaMA which is why getting ahold of them isn't quite as easy. Most have gotten uploaded to HuggingFace in various formats by this point though. LLaMA of course, then the various transformations of it like 4bit. Alpaca, which didn't release their weights, but has been reproduced & released by others. Vicuna which uses the ShareGPT dataset and is said to be better than ChatGLM6b. And gpt4-x-alpaca which looks promising. The downside of course is licensing. Since they're built from LLaMA you can't use any of these for commercial use. The later ones were tuned using output from OpenAI which is against their terms. All of these can run on consumer hardware.
      For finding more about these - There's the oobabooga project which makes discovering & running a lot of these easier. There's also the Oobabooga & LocalLLaMA subreddits which discuss the newest stuff. Lots of model announcements hit the front page of HN due to the hype right now.
      There's also the Open Assistant work which looks promising. Don't know much about it yet but worth looking into.

    • @eh_channel3 · 1 year ago · +3

      @@sentdex There is the Vicuna model that was released a few days ago; it gives me much better results compared to LLaMA.

    • @lutzsommer4708 · 1 year ago

      @@sentdex I'm sorry I can't reply to you with useful content; my comment keeps getting deleted... I have no clue why.

    • @sentdex · 1 year ago · +1

      @@lutzsommer4708 feel free to email me harrison@pythonprogramming.net

  • @MikeClarkeARVR · 1 year ago · +2

    Thanks for the video. I just tried the model... not very good for the use cases I experimented with. Perhaps others can recommend other use cases.

  • @fitybux4664 · 1 year ago

    GLM-130B vs Vicuna? Any opinion on that? I know that Vicuna is a much smaller model; just wondering whether any of the LLaMA-based models compare to GLM-130B.

  • @TheMagicMooshroom69 · 1 year ago

    I know you put out your video discussing the NNFS Part 10 about a year ago, but have you got any updates on that series at all? :)

  • @tomwozne · 1 year ago · +1

    Given that it's trained bilingually, could we limit the training data to only English and improve its output for English?

  • @barny1174 · 1 year ago

    Thank you for sharing once again 👍

  • @1PercentPure · 1 year ago

    thank you so much for sharing!

  • @shawnfromportland · 1 year ago

    really great channel, man!

  • @snarkyboojum · 1 year ago · +1

    How are you marking up the PDFs of the whitepapers as you read them? It looks electronic to me....

  • @Crayphor · 1 year ago · +16

    That Gaussian activation function is interesting! I'll have to look into exactly what it is and why they chose it. If it is based on a Gaussian distribution, it could have interesting properties, since a sum of Gaussian variables is still Gaussian. Maybe even the model output would be Gaussian. (GELU is sketched just after this thread.)

    • @perc-ai · 1 year ago · +3

      This guy doesn’t know anything about Gaussians lol
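
    For reference, the activation in question, GELU (Gaussian Error Linear Unit, from Hendrycks & Gimpel), weights its input by the standard normal CDF. A minimal sketch of the exact form and the common tanh approximation used in BERT/GPT-style implementations:

        import math

        def gelu_exact(x: float) -> float:
            # GELU(x) = x * Phi(x), where Phi is the standard normal CDF.
            return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

        def gelu_tanh(x: float) -> float:
            # Widely used tanh approximation of the same curve.
            return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

        for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
            print(f"{x:+.1f}  exact={gelu_exact(x):+.4f}  approx={gelu_tanh(x):+.4f}")

    Note the asymmetry: negative inputs are damped toward zero rather than mirrored.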

  • @joshmabry7572 · 1 year ago

    What is the context size of the model? I was trying to dig around and couldn't find the info offhand.
    Also, it's slightly amusing how hard it struggles to translate Chinese to English. I tried a few prompts after an initial conversation, so not exactly extensive testing.
    "The translation of “各自的” to English characters is “各自的”."

  • @theoistic7002 · 1 year ago

    To run, yes... but you can't expect to train the model on an RTX 20xx series; you need at least 16-20 GB of VRAM to train the non-quantized model.
    Because after you quantize it, you kind of lock the model. At least that's the rule for LLaMA... which, although different, follows the same quantization rule. (Rough numbers are sketched below.)
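
    As a rough back-of-envelope for why fine-tuning needs far more memory than inference: full fine-tuning with Adam has to hold weights, gradients, an fp32 master copy, and two optimizer moments. The bytes-per-parameter figures below are common mixed-precision rules of thumb, not measurements of ChatGLM specifically:

        # Rule-of-thumb memory math for a 6B-parameter model.
        params = 6e9

        # Inference: essentially just the weights.
        print(f"fp16 inference: {params * 2 / 1e9:.0f} GB")    # ~12 GB

        # Full fine-tuning with Adam in mixed precision:
        # fp16 weights (2) + fp16 grads (2) + fp32 master weights (4)
        # + fp32 Adam first/second moments (4 + 4) = ~16 bytes/param,
        # before counting activations.
        print(f"full fine-tune: {params * 16 / 1e9:.0f} GB")   # ~96 GB

    This gap is why parameter-efficient methods such as LoRA exist: they train a small number of extra weights while the (possibly quantized) base model stays frozen.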

  • @MrRadziu86 · 1 year ago

    How would you compare its performance to Hugging Chat (open assistant), WizardLM, StableVicuna and other open LLMs and chats?

  • @TerboBlitz · 1 year ago · +6

    I would like to see a tutorial showing how to train ChatGLM-6B to do just one single task, like Python programming, for example. Will narrowing the scope of learning let us limit the size and complexity of the model so it can be developed on a powerful GPU workstation by individuals?

    • @differentperspectives · 1 year ago · +3

      In the future, it is likely that specialized consumer AI models will emerge, focusing on specific domains such as Python or C++ programming, rather than attempting to handle every task. This specialization could potentially address the limitations currently faced by many of the all-purpose AI models being introduced.

    • @stevepk5421 · 1 year ago

      @@differentperspectives That's common sense.

  • @aa-xn5hc · 1 year ago

    What about "open assistant"?

  • @Tepalus · 1 year ago · +5

    "It's fast and small, and it sometimes gets lost with too much dialogue." - My Ex lol

  • @aloo_explains · 1 year ago

    GELUs are used quite a lot in models like GPT-3 and BERT.

  • @sevret313 · 1 year ago · +8

    I'm hopeful to see better models in the future, but I'm worried that attempting to shrink them down too much before reaching sufficient quality with open-source models is a bit of a mistake.

    • @StoutProper · 1 year ago

      It’s up to you to retrain and fine-tune them.

    • @sevret313 · 1 year ago

      @@StoutProper Training is expensive, and there is nothing wrong with me pointing out that the open-source community suffers from a lack of quality control and commitment at times.
      If we have any hope of getting OpenAI quality from open source, it doesn't help to have the funding spread across a ton of mediocre projects.

  • @jurelleel668 · 1 year ago

    Awesome video, mate.
    Sentdex, please do a ChatGLM-6B task review.

  • @3dx3em · 1 year ago

    The HF test does not go beyond the first prompt. It cut off; I asked it to continue and got nothing. I tried another prompt, same thing: nothing beyond the first answer.

  • @Arewethereyet69 · 1 year ago

    Wonder if GLM is Alibaba's version. Probably not, but it makes you wonder, since they're rolling theirs out.

  • @Truizify · 1 year ago · +5

    Thanks for the video! Great content. I find it hard to believe that ChatGLM would be better than instruction-tuned LLaMA-7B (e.g. Alpaca, etc.) at chatting in English. Did you check out any of those models? I plan on doing a comparison myself.

    • @Will-kt5jk · 1 year ago · +1

      I’ve found Alpaca-7B pretty good (if a bit slow on a four-year-old laptop, running in Docker, restricted to 8 GB of RAM; it speeds up quite a lot when you up the thread count).
      So I’m interested in how the two compare too.

    • @zgolkar · 1 year ago · +1

      Vicuna is much more ChatGPT-like. Koala follows closely. So I wonder how ChatGLM will compare to them…

  • @freecode.ai- · 1 year ago

    Could a botnet train an LLM at the same level these big companies do?

  • @nandafprado · 1 year ago

    I like taking these LLMs and testing them in a language they weren't trained on, so I played with getting it to answer me in Portuguese. You could see the memory limitation there, but it was still OK for its model size.

  • @joshuascholar3220 · 1 year ago · +4

    I thought that you don't need GPUs for inference, only for training. If you could train or tune on consumer hardware THAT would be HUGE!

    • @ablasttv · 1 year ago

      Inference does still need GPU power for these large models. It's still lots of matrix multiplication.

  • @uae7001 · 1 year ago

    Did you try the LLaMA language model that leaked online recently? I like it and will compare it with this.

  • @devilmix85 · 1 year ago

    Can you make a video on Gated Linear Units? (A quick sketch follows below.)
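
    In the meantime, a minimal sketch of the idea: a gated linear unit runs the input through two parallel linear projections and uses one, passed through a nonlinearity, to gate the other elementwise. The GeGLU variant (from Shazeer's "GLU Variants Improve Transformer", the family the GLM paper draws on) swaps the sigmoid for GELU. Weights here are random placeholders:

        import numpy as np

        rng = np.random.default_rng(0)
        d_in, d_out = 8, 4
        W, V = rng.normal(size=(d_in, d_out)), rng.normal(size=(d_in, d_out))
        b, c = np.zeros(d_out), np.zeros(d_out)

        def sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        def gelu(z):
            return 0.5 * z * (1 + np.tanh(np.sqrt(2 / np.pi) * (z + 0.044715 * z**3)))

        def glu(x):
            # Original GLU (Dauphin et al.): (xW + b) gated by sigmoid(xV + c).
            return (x @ W + b) * sigmoid(x @ V + c)

        def geglu(x):
            # GeGLU: GELU(xW + b) elementwise-multiplied by (xV + c).
            return gelu(x @ W + b) * (x @ V + c)

        x = rng.normal(size=(2, d_in))        # a batch of two input vectors
        print(glu(x).shape, geglu(x).shape)   # both (2, 4)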

  • @SkimFelBon · 1 year ago

    Hi @sentdex, what tool are you using to highlight PDFs?

  • @TheDogn · 1 year ago

    11:22 I wish you had talked about how to run it locally. (A minimal example follows below.)
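
    For what it's worth, at the time the THUDM/chatglm-6b model card on Hugging Face showed usage roughly like the following; the quantize(4) variant was its documented low-VRAM option, and the exact API may have changed since:

        # Minimal local ChatGLM-6B chat loop, per the model card's usage.
        # Needs transformers (plus the model card's listed dependencies) and
        # a CUDA GPU; trust_remote_code pulls the model's custom code from the Hub.
        from transformers import AutoTokenizer, AutoModel

        tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
        model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
        model = model.half().cuda().eval()  # or .quantize(4).half().cuda() for ~6 GB VRAM

        history = []
        response, history = model.chat(tokenizer, "Hello! What can you do?", history=history)
        print(response)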

  • @tskumar1320 · 1 year ago

    Wonderful. Keep it up 💓🙏

  • @bravinrutto8784 · 1 year ago

    Have you looked at llama.cpp, where they were able to run LLaMA-30B with just 30 GB of RAM? (Some rough numbers on why that works are below.)
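
    Weight memory scales roughly with parameter count times bytes per weight, which is why quantization is the whole game for running big models in CPU RAM. A back-of-envelope, ignoring activations and runtime overhead:

        # Approximate weight-only memory for a 30B-parameter model.
        params = 30e9
        for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
            print(f"{name}: {params * bytes_per_param / 1e9:.0f} GB")
        # fp32: 120 GB, fp16: 60 GB, int8: 30 GB, int4: 15 GB
        # Quantized weights plus context and overhead explain figures like the 30 GB above.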

  • @alexd7466 · 1 year ago

    Which one is the best for writing code?

  • @gwky · 1 year ago · +5

    LLMs seem like hype until you start chatting with so many of them and realize they can't keep a conversation going for more than a few lines... Even story-writing models seem to start hallucinating. But I'm genuinely amazed that so much can be generated from just a few prompts, and while the "memory" is usually 2048 tokens, as hardware gets better we might get to see actual storytelling ability, multiple characters, and entire worlds created that users can interact with, be it in text adventures like Zork or via AI-generated novels. At the moment, I've tried every tool out there, and the most popular 6B/13B/20B models. It's still garbage after a few prompts, but sometimes it does remember the context and leaves me genuinely impressed. It's like an abusive relationship: most of the time it's crap, but you learn to cherish the good moments.

    • @Saerthen · 1 year ago

      It's possible to use summarization to make the chatbot "remember" what was discussed before (prompt: "summarize the dialogue from the beginning to this point"); it can help keep the dialogue going for a longer time, but if you want "long memory", you need some kind of database (like advanced storytelling software does). Also, I'm not sure if "long memory" would be a benefit or a problem.

    • @gwky · 1 year ago

      @@Saerthen It doesn't work. Long-term memory and key-specific memory already exist; they work by injecting details into the prompt to "remind" the chatbot, but unless your question is to "define" something, it really doesn't do much more than that. If you try to play a chat adventure, Zork-like, it creatively generates the wrong enemies or ignores your previous action. Furthermore, loading LLMs with 2048 tokens of context isn't easy even on modern hardware, and while loading models with more parameters (going from 6B to 13B or 20B) sometimes helps the AI generate more context-relevant responses, it still hallucinates plenty; and the more parameters the model has, the longer it takes to load it and generate responses.
      In short, these ideas have already been explored; there are even commercial SaaS novel-writing AI solutions and text-adventure ones, but they exhibit the same problems. It's just the limitation of the current tech.

    • @larion2336 · 1 year ago · +1

      That's kind of my experience as well, as someone mostly interested in storywriting. I've used GPT4ALL and various versions of Alpaca. I found Alpaca better but it still had major issues with memory. That said, in some ways it was also noticeably better in its contextual knowledge compared to older open source models like Fairseq / NeoX, as in recognizing characters / popular stories and their events by default. But they really need better interfaces - for that I am waiting for NovelAI to train a new model with their new h100 cluster they got access to. That will be hype.

    • @Saerthen · 1 year ago · +1

      @@larion2336 LLaMA was initially trained with a 2048-token context. As far as I know, you can't fine-tune the model to use a larger context; you'd have to train it from scratch. And right now, no open-source models are trained that way.

    • @gwky · 1 year ago

      @@larion2336 I wouldn't be too crazy about it; LLMs are just guess engines. While I was hoping more parameters and token context would fix it, the randomness is hard to predict. At the moment, the reason most models use 2048 tokens is compute simplicity and ease of training, similar to what Ivan was mentioning. I rented cloud VMs, and you can train a larger model for a few hundred dollars; I used a Google Compute VM with 80 GB A100 cards. Sadly, even if you hand-pick the data (say I wanted xianxia light novel generation, so I'd train the model on that), because it simply tries to predict the best conclusion to your prompt, it generates at best pseudo-random nonsense. Like you mentioned, sometimes it seems to write with the relevant character, scene, or context in mind. Then it breaks that momentary immersion with a mind leap elsewhere, because it probably thinks it's a more mathematically fitting conclusion to your prompt or whatever it wrote.
      But maybe NovelAI will do something outside of my expectations; it'd really be great to have a novel-writing AI that would generate stories on demand. Similarly, using it for text adventures or DnD would be great. Still, I always remind myself that this is the worst that AI will ever be. It only gets better from here.

  • @rileykramer3642 · 1 year ago

    3090s are like $1500 new. It'd only be like $6k for a setup like that. I'd personally wait for the 5000 series to see if they up the VRAM. I'm guessing we will see a 32 GB 90-series model.

    • @fitybux4664 · 1 year ago

      Why get a 3090 for $1500 when you can get a 4090 for around the same price?

  • @distortedjams · 1 year ago · +17

    I'm surprised researchers just release these advanced models to the public for free. I'm sure that going into the future everything will just be proprietary, like OpenAI seems to be moving towards.

    • @sentdex · 1 year ago · +27

      No one really knows what to do with these models on a grand scale yet. Instead, they're relying on users/devs to come up with uses. The main issue right now is that even the "best" model today will be replaced tomorrow, so there's no sense in closing off entirely, IMO. I think the planned model for most companies is to be the "platform." I think this is OAI's objective too. They'll continue selling access to their models, and people will build/sell products on top.

    • @Tri-Technology · 1 year ago · +1

      @@sentdex I am wondering about "no one really knows what to do with these models". I mean, isn't replacing Google search a big, attractive target? You could build a little companion application that helps the user with almost any text-based information.

    • @IronZk · 1 year ago

      @@sentdex Can you try Vicuna-7B? It's very impressive for a small model.

  • @AHN1444 · 1 year ago

    What if I wanted one model just for core logic control, in just one language? What is the minimum size it could be nowadays? The model would be prompted with a summary made by another model, and would respond with, for example, [module="memory" action="retrieve data" params="xxxxx"] [module="browser" action="visit site" params="xxxx"] [etc] [etc]

  • @jurelleel668 · 1 year ago

    Do multimodal tasks, image tasks, and mathematics tasks for GLM-6B in English. Thanks!

  • @loopuleasa · 1 year ago

    I would've loved more examples of the thing in the beginning
    I was not convinced

  • @aladinmovies · 1 year ago

    How many operations can one RTX 3090 GPU perform? And how do you calculate how many operations a program needs? Thanks for any answers! (A rough estimate is sketched below.)
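
    A rough way to reason about both halves of that question. The spec figure is approximate, and "2 FLOPs per parameter per token" is a standard estimate for transformer inference (one multiply plus one add per weight), not an exact count:

        # Back-of-envelope: RTX 3090 throughput vs. an LLM forward pass.
        gpu_flops = 35.6e12   # ~35.6 TFLOPS FP32, per NVIDIA's 3090 spec (approx.)
        params = 6e9          # a 6B-parameter model like ChatGLM-6B

        flops_per_token = 2 * params
        print(f"per token: {flops_per_token / 1e9:.0f} GFLOPs")
        print(f"ideal tokens/sec: {gpu_flops / flops_per_token:.0f}")
        # Real throughput is far lower: single-stream LLM inference is usually
        # bottlenecked by memory bandwidth, not raw FLOPS.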

  • @Maisonier · 1 year ago

    What AI projects can we focus on with one RTX 3090? Is there any advantage in buying another and using two RTX 3090s? Or is it too late, and the hardware won't make any difference in our homes?

  • @rohanjamadagni · 1 year ago

    Hey, what about the LLaMA models? Did you have a look at those?

  • @mukkeshmckenzie7386 · 1 year ago · +1

    Can you compare this to LLaMA? Or Alpaca?

  • @creativedolt · 1 year ago

    The output of ChatGLM often seems to have Chinese characters mixed in with English text, e.g. if you ask "What is backpropagation?", words like "accuracy" and "gradient" show up in Chinese. Does anyone else run into such issues? Doesn't that reduce the utility of the model for non-Chinese speakers?

  • @homataha5626 · 1 year ago

    Thank you for the video.
    Can you give us some guidance on how to build a model in a language other than English?
    I am interested in building a model for Persian text.

  • @jaysonp9426 · 1 year ago

    I asked it "which world leader looks the most like Winnie the Pooh." I'm dead now.

  • @tomwozne · 1 year ago

    Does this need a connection to the internet?

  • @CMAZZONI · 1 year ago

    Could this model be used for semantic search?

  • @gogibear123 · 1 year ago

    appreciate u❤

  • @rifkiardiansyah616 · 1 year ago

    Bro, you should continue your NNFS video series!!!!

  • @abuslume4169 · 1 year ago

    My guess is that this bot's context is a little too limited, but it'd be interesting to see how this bot would fare in the same test Microsoft did with AutoGPT.

  • @4mb127 · 1 year ago · +2

    Did you compare to LLaMA?

    • @sentdex · 1 year ago · +1

      I've only played with the base LLMs. Is there a chat variant that you think is exceptional and that I should take a peek at?

  • @litttlemooncream5049 · 1 year ago

    thanks

  • @MrTheKeiron · 1 year ago

    0:55 "It's shockingly good for its size"
    That's what she said

  • @JanBadertscher · 1 year ago

    Not going to try anything smaller than 13B 4-bit for inference :). Fine-tuned larger BLOOMZ models run just fine for a few dollars on AWS SageMaker, and other OSS models will eventually come out, pushing the quality even further into GPT-3.5 territory, and hopefully into GPT-4 territory with some tricks... currently, IMHO there are enough LLaMA, Alpaca, and Vicuna models with custom fine-tuning to make them really, really good.

  • @joshheller1439 · 1 year ago

    Which technology jobs have the lowest risk of depreciating as a result of AI? (Reflecting quickly, it feels like ALL could be replaced, much like repetitive actions were with robotics.)

    • @sentdex · 1 year ago

      Quick thought, but ... probably alignment/safety for models hah.

  • @CA-iv6tz · 1 year ago

    Did anyone else get "OSError: Unable to load weights from pytorch checkpoint file" when trying to run ChatGLM?

  • @harriehausenman8623 · 1 year ago

    What about OpenAssistant?!

  • @dgiri2333 · 1 year ago

    Can you do videos on PyBrain for training a ChatGPT-like AI model?

  • @clydecmcelroy4638 · 1 year ago

    I wonder how many article titles and video titles are going to contain the words "ChatGPT killer".

  • @li-pingho1441 · 1 year ago

    awesome!!!!

  • @sebzim4500 · 1 year ago

    GPT-3 also uses the GELU activation.

  • @mujtabaalam5907 · 1 year ago

    Have you tried Chinchilla or Vicuña?

  • @efovex · 1 year ago · +1

    ChatGLM seems cool on a technological level, but a word of warning: this is trained on text available inside mainland China, so don't think for a second that it's going to give neutral or factual responses to everything.
    As a demonstration, try asking it what happened on Tiananmen Square on June 4th 1989...

  • @yarpenzigrin1893 · 1 year ago

    That's pretty good. Have you looked into LLaMA, Alpaca, Koala, or Vicuna?

    • @SJ-xz2ch · 1 year ago · +1

      Yeah, I was impressed by Vicuna.
      It seems like the best of the LLaMA-based models. (So far.)

    • @kazimafzal · 1 year ago

      @@SJ-xz2ch Well, Vicuna definitely requires more than 8 GB of VRAM to run; I wasn't able to get it running on my 3060 Ti (it complained of not enough memory).

    • @SJ-xz2ch · 1 year ago

      Mm, good to know. Looks like Vicuna-7B wants 14 GB.
      Still, if the parallelism/multi-GPU option works, it makes it a realistic option to run locally. (I don't know of anyone who has tested it yet.)
      Hoping for great things from Open Assistant, but it's going to be tough if they won't use response data from GPT like Vicuna did.

  • @patrick-gerard · 1 year ago

    Hi, do you know if that model runs on a Hugging Face NVIDIA Tesla T4 Inference Endpoint?

    • @sentdex · 1 year ago

      Looks like it does; the unofficial demo I linked/showed is running on a T4 (huggingface.co/spaces/multimodalart/ChatGLM-6B).

  • @IronZk · 1 year ago

    Vicuna-7b is also not bad.