👑 FALCON LLM beats LLAMA

  • Published: Nov 5, 2024

Comments • 127

  • @slowmissouri205 · 1 year ago · +12

    I like seeing contributions like this to the community. They aren't cheap to make, so it's a big deal.

    • @michaelbarry755 · 1 year ago · +4

      Me too. This also highlighted my own ignorance: I had no idea Abu Dhabi had an interest in this sort of thing, especially open source. It tends to be US/UK/China. This really put Abu Dhabi on the map. I'm so grateful to everyone who is contributing to democratising this technology. It's an honour to witness.

  • @VIVEKKUMAR-kx1up · 1 year ago · +4

    After Andrej's webinar, sort of, I came to realize it's not only about model parameters but also about the number of tokens.
    Love your videos!! :)

  • @tarun4705 · 1 year ago · +7

    The license looks fine to me, since we can use the model right away for free in our startups, and if a startup is making more than $1M/year then paying for the model won't be a problem.

    • @zerog4879 · 1 year ago · +2

      True, but they should also publish the royalty percentage they want. Without specifying it, this can function like a blackmail situation: "if you don't want your startup to fail or lose its growth momentum, pay us 50%".

    • @karthikbalu5824 · 1 year ago · +4

      @@zerog4879 At that stage we can switch to a different, newer model.

    • @xiaojinyusaudiobookswebnov4951 · 1 year ago

      @@karthikbalu5824 😆

    • @blisphul8084 · 1 year ago

      Large companies can use it for free as long as the attributable revenue is less than $1M. This basically means building is free; deployment is not.

  • @JavArButt · 1 year ago · +2

    Thank you for going into the details of the license. Indeed, this is a very important factor when deciding on an LLM.

  • @RacingMachine · 1 year ago · +2

    Thanks a lot bro, you are the best! Please continue with the content. I will never be able to praise you enough, as I am starting my ML journey and entrepreneurship as well!

  • @Syn_Slater · 1 year ago · +7

    I've always been a fan of licenses that follow the framework of: it's free until you make substantial revenue. I don't know about the Unity license, but I know Unreal follows a similar one.

  • @rajivmehtapy · 1 year ago · +6

    QLoRA + Falcon LLM, going to work on it tomorrow.

    • @1littlecoder · 1 year ago · +7

      Please let me know how it goes. I'm definitely interested in trying QLoRA on something!

    • @nattyzaddy6555 · 1 year ago · +1

      What GPU are you using to do that?

  • @kait3n10 · 1 year ago · +10

    I wonder when The_Bloke will do his magic with GPTQ and GGML versions, can't wait to try it!

    • @1littlecoder · 1 year ago · +2

      That'd be super interesting to see

    • @IronMechanic7110 · 1 year ago · +2

      ggml-falcon can't wait for this👊🤛

    • @michaelbarry755 · 1 year ago · +2

      Soon someone will figure out a way to quantize this down to 1 bit; you'll have a 40B model on 5GB of VRAM :)

    • @nattyzaddy6555 · 1 year ago

      @@michaelbarry755 For now, what kind of GPU do you think I will need for the 40B?

    • @lightyagami6823 · 1 year ago

      Could you explain what GGML and GPTQ are? Or point me to some resources for this?

  • @projectbit2248 · 1 year ago · +3

    This is beautiful!

  • @krawlak · 1 year ago · +6

    The licensing sounds more similar to Unreal's model than to Unity's.

  • @rationalistfaith · 1 year ago · +5

    Mash’Allah beautiful

  • @gunngunn6763 · 1 year ago

    Great, your audience base is increasing day by day... happy for you.

  • @michaelbarry755 · 1 year ago · +4

    Considering they probably spent about $1M training the model, it's fair to ask for a royalty if you use it to make a shit ton of cash. However, this will soon be eclipsed by a better model released Apache-style, so in reality they probably won't get any royalties. Look at the pace of research: a Stable Diffusion alternative just blew SD out of the water with 100x faster inference (100 FPS), we can now fine-tune a 7B Vicuna model overnight on an iPhone while it charges, and infinite context length is almost here (just a few issues to iron out). This is all from the last 2-3 days' worth of papers. I estimate that by 5pm on Monday we'll each be pretraining GPT-5 on a Raspberry Pi. Imagine being the guy who wrote a $10 billion cheque to OpenAI... yikes. Less is more :)

    • @shotelco · 1 year ago

      I disagree. There is a term in business known as "opportunity cost". Opportunity costs represent the potential benefits that an individual, investor, or business *misses out on when choosing one alternative over another.* Example: I just got a job, and I need a car to get to work. I see there is a car I can buy now that will get me to work... but I wait until there is a better car that is faster. Guess what, the company fires me because I didn't show up to work, awaiting the release of the new, better car.
      Many fully open-source models can perform domain-specific tasks BETTER than GPT-5 (which isn't available yet) ever could. The key phrase is "domain specific".

    • @michaelbarry755 · 1 year ago

      @@shotelco I'm a developer so I don't really know much about business, but I know enough that when you spend $10 BILLION and then a few weeks later open source developers are releasing a similar product that cost less than a pizza, it's safe to say someone fucked up big time.

  • @Quantum_Nebula · 1 year ago · +3

    Whoa! Whoa! You breezed over just how significantly faster that inference time is! It's literally 4x faster than GPT-3 at generating responses (2:46). Jesus Christ in the flesh, that's fast.

    • @1littlecoder · 1 year ago

      My bad!

    • @michaelbarry755 · 1 year ago · +2

      The model on Hugging Face is ridiculously fast. I'm not sure what hardware it's running on, but it's the fastest inference I've seen yet. It produced a few paragraphs in less than a second.

    • @Quantum_Nebula · 1 year ago

      @@michaelbarry755 Game changing! That is fast. Couple that with Tree of Thoughts and it might be a little too powerful.

    • @1littlecoder · 1 year ago

      @@michaelbarry755 You mean the Space or the right-side inference?

    • @michaelbarry755 · 1 year ago

      @@1littlecoder It was the Space, but normally where it says "Powered by xyz" etc. there was no indication of what hardware it was on. I assume they have an option to hide the hardware from the user, but that surprised me because I wanted to know what it was running on.

  • @blisphul8084 · 1 year ago

    Keep in mind, this only mentions attributable revenue. This means if you work for a large company, you can freely experiment with it, but once it's deployed in a commercial product, you have to pay.

  • @HazemAzim · 1 year ago · +6

    Great coverage as usual, thanks. Hands-on video, please!

  • @IronMechanic7110 · 1 year ago · +5

    YESSSSSSSSSSSS

  • @karthikbalu5824 · 1 year ago · +3

    Please include how to run these models, e.g. using an RTX 3090, etc.

    • @holdthetruthhostage · 1 year ago · +2

      These programs are horrifically optimized when it comes to running locally. I mean, most of them are just text: you reply to and converse with the language model through text, yet it demands over a 16 GB GPU. It's crazy.

    • @xiaojinyusaudiobookswebnov4951 · 1 year ago · +2

      @@holdthetruthhostage In the future it is likely that GPUs will get upgrades to handle these demands, or even better, LLMs may become optimized enough to function on even our smartphones (like Google's Gecko).

    • @TBKJoshua · 1 year ago

      @@xiaojinyusaudiobookswebnov4951 I hope the latter is true at least. It doesn't seem like Nvidia is gonna rock the boat and add more vram to consumer grade GPUs just yet. Maybe they'll surprise me.

    • @karthikbalu5824 · 1 year ago

      @@holdthetruthhostage is it better than MPT-7B?

    • @holdthetruthhostage · 1 year ago

      @@karthikbalu5824 I can't say; I haven't been able to run any of them. The graphics card demand is just crazy stupid. Someone with a 3090 can't even run the LLM, and let's be honest, it's just text generation. You're telling me that's harder to run than a video game with shaders and so on? The optimization is horrific.

  • @danielsachkov · 1 year ago

    Did they release any code to fine-tune it? Is there an easy way to apply LoRA or QLoRA to it that you're aware of?

    • @michaelbarry755 · 1 year ago

      There is no "easy" way; it uses a custom GPT-like architecture. But it's only a matter of time until someone releases a QLoRA patch. I suspect the community will rally behind this new model.

  • @abduallahmustafa1029 · 1 year ago · +2

    A video on LoRA, please.

  • @joe_limon · 1 year ago · +1

    I am curious how many contracts actually get signed, versus large companies simply waiting a week for the new flavor of the week.

    • @1littlecoder · 1 year ago

      True. It would also be interesting to see how many such businesses they go after.

  • @geekyprogrammer4831 · 1 year ago

    Proud to see the country where I am residing achieve this :D

    • @michaelbarry755 · 1 year ago

      On behalf of the open source community, we thank your nation for such an incredible gift to mankind

  • @cryptostu · 1 year ago

    What's the estimate on how much this cost to produce? Also, how long does it take to train?

  • @Synthetiks · 1 year ago · +1

    Check out the size of this model: it's ~2 TB.

  • @vmarzein · 1 year ago

    THEY WAIVED THE ROYALTY, LET'S GOO

    • @1littlecoder · 1 year ago

      Here is a tutorial to get started - ruclips.net/video/21mHov4Whag/видео.html

  • @Bil17t · 1 year ago

    Can you make videos on FlashAttention and multi-query attention? Thanks!

  • @pn4960 · 1 year ago · +1

    Wow this is good 😊

  • @remsee1608 · 1 year ago

    Is this better than the uncensored Wizard Vicuna?

  • @sr-3734tqp · 1 year ago · +1

    So what if I make >$1M/year? "Commercial agreement"? No clarity on what cut they'll take. A bit risky license, IMHO.

    • @michaelbarry755 · 1 year ago

      They just gifted you a model worth millions. Sure, Apache is better, but so is gratitude. When someone gives you a gift worth millions, the correct response is: thank you.

    • @sr-3734tqp · 1 year ago

      @@michaelbarry755 Sure. But my point is, businesses can't operate without clarity. When they say $1M/year, is it revenue or profit? If it's revenue, $1M in a year is not uncommon. At that point, if Falcon's "commercial agreement" stipulates a 50% share, would you do it? How about 80%? Which of these is true? Nobody knows.

    • @michaelbarry755 · 1 year ago

      @@sr-3734tqp I suppose if you're planning on making money with it then yes, of course you want clarity: you'll want to read the licence in full and get a legal opinion. But the same is true of any licence; GPL, for example, is a savage license designed to spread like a virus. So clarity is always necessary, hence why we love MIT/Apache licences. Sure, this is a strange licence, and in my opinion it's a waste of time because no one is actually going to pay them when they can use other models for free. But most of us are primarily interested in research and personal use, in which case this is a very expensive toy that has been gifted free of charge, open source.

  • @Hypersniper05 · 1 year ago · +1

    With QLoRA it's now possible to use these big models on a 3090!

    • @michaelbarry755 · 1 year ago

      You could put a 7.5B on an iPhone lol 😂 This is just the beginning too; soon it will be on an Arduino 😂

    • @jessicamarrero7678 · 1 year ago · +1

      @Michael Barry Pretty soon my microwave will be running a 30B model haha

    • @michaelbarry755 · 1 year ago · +1

      @@jessicamarrero7678 just remember to be kind to our new robot overlords 😂 lmao

  • @nithinbhandari3075 · 1 year ago · +1

    I don't know what to say about the license.
    The license says that if you generate $100 million in revenue, you need to pay 10% of Attributable Revenue.
    Example: your revenue is $100 million.
    Attributable Revenue = $100 million - $1 million = $99 million
    Royalty Payment = 10% of $99 million = $9.9 million
    Only a few companies have revenue that high, but suppose you don't use this model heavily and only call it a few times; having to pay $9.9 million anyway would be bad.
    But note point 8.2) a): "In its written grant of permission, TII shall set the royalty rate that will apply to you as a Commercial User as a percentage of revenue ("Relevant Percentage"), where, unless otherwise specified in the grant of permission, the Relevant Percentage shall be 10%". So I believe the Relevant Percentage can also go down.
    Other than that, training a model costs a lot. This one cost around $608,256 in compute (384 A100 40GB GPUs for 2 months: 1.1 * 384 * 24 * 60; I may be wrong), excluding the salaries of the engineers working on it, the cost of collecting and parsing data, and many other factors. An A100 40GB costs around $1.10 per hour on Lambda Labs.
    As the development cost is so high, it makes sense to use Falcon (if we are using it heavily), a ChatGPT-like model, or some other pay-as-you-go model.
    Also, its token limit is too low (we need at least a 100k token limit).
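    The arithmetic in the comment above can be checked with a short script. The 10% rate and the $1M threshold are as quoted from the license in this thread; the GPU count, duration, and $1.10/GPU-hour price are the commenter's own estimates, not official figures:

    ```python
    def falcon_royalty(revenue_usd: float, threshold: float = 1_000_000,
                       rate: float = 0.10) -> float:
        """Royalty under the quoted terms: rate applied to revenue above the threshold."""
        attributable = max(revenue_usd - threshold, 0.0)
        return rate * attributable

    # $100M revenue -> $99M attributable -> $9.9M royalty, as in the comment
    print(falcon_royalty(100_000_000))

    # Commenter's compute estimate: 384 A100 40GB GPUs for ~2 months (60 days)
    # at ~$1.10 per GPU-hour (Lambda Labs pricing, per the comment).
    gpus, days, usd_per_gpu_hour = 384, 60, 1.10
    training_cost = gpus * days * 24 * usd_per_gpu_hour
    print(round(training_cost))  # matches the comment's ~$608,256 figure
    ```

    Note that under the threshold nothing is owed, which is why the comment's worry only applies once revenue clears $1M.
    
    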

  • @kumargaurav2170 · 1 year ago

    While so many new models are coming out every day, we have to accept the fact that there's still no model that gives answers exactly the way the user wants, in top-notch format, like ChatGPT or GPT-4.

    • @shotelco · 1 year ago

      GPT-4... for what? GPT-4 knows a lot about everything, and can respond well (but will always need to be checked for accuracy). And since everyone on the planet can access GPT-4, there is no way to leverage GPT-4 commercially (why should anyone pay you to access GPT-4 when they can do it themselves?). *Only* OpenAI/Micro$oft can profit. And mega-corporations have a legacy of addicting us with an initial low cost, then upping the price as they see fit, and even worse, selling our personal "portfolio of interactions" to the highest bidder.
      Perhaps some are good with this, but not me.
      For a commercial (for-profit) use case, only a low-intelligence person would tie their services to GPT-4, because the price can rise from $20/month to $2,000/month to $20,000/month... or more. Once you're married into their platform, you have zero control over how much of a cut of your business your new partner (GPT-4) decides to take.
      In a commercial use case, a real concern wants as much CONTROL as possible. Moreover, GPT-4 knows nothing about a particular vertical market that is 100% up to date, nor can it know how a particular business attracts customers and what that business's customers want. An in-place (private) LLM doesn't need to know the history of 17th-century British lawnmowing; it needs to be fed highly curated industry data and learn from dynamic new trends and potential opportunities every moment. This allows REAL VALUE with some _proprietary protection._ *By even interacting with GPT-4, it is learning from you, and you're giving away your valuable knowledge to a deca-billion-dollar company for free.*
      GPT-4 doesn't sound so good when I put it this way, does it?

    • @michaelbarry755 · 1 year ago · +1

      It's only a matter of time. We still know very little about them; it's trial and error. When we find the right mix of data/architecture/etc. we'll blow GPT-4 out of the water, but then they'll make GPT-5 and we'll play a game of cat and mouse for the rest of eternity :)

    • @michaelbarry755 · 1 year ago

      @@shotelco I didn't read your whole message, but I assume you said: open source good, closed source bad 😂. I think the world is slowly waking up. The internet runs on open source because engineers are smart; consumer devices run on proprietary software because consumers tend to be ignorant of what's happening behind the scenes, and it takes an engineer-like mindset. But the more big companies shit on their customers, the more people will go open source. Build it and they will come.

  • @Michael-Humphrey · 1 year ago

    Have you taken a look at Tree of Thoughts on GitHub?

  • @alqods80 · 1 year ago

    How does it compare to QLoRA?

  • @xXWillyxWonkaXx · 1 year ago · +1

    Wait, I'm under the impression that LLaMA and ChatGPT are available for commercial use as well? Or am I wrong?

    • @mulira · 1 year ago · +4

      LLaMA is for non-commercial use only, and OpenAI's GPTs are all closed, so you can't host them on your own systems; you can only pay for the API.

  • @henricbohm8455 · 1 year ago

    I am curious how to get this running on my own computer. What are the hardware requirements for the graphics card, RAM, CPU, and disk space? At the least, what is your setup?

  • @holdthetruthhostage · 1 year ago · +1

    It's a great license, Unreal-level, where you only pay if you make something. And as for open source: Linux is open source and it's basically used for everything. The big corporations are fear-mongering, saying the technology is better in their hands than in the hands of the people, when it's been shown that big corporate corruption has cost lives and destroyed millions.
    The thing they really fear about AI is that it gives the average person access to the capability of over a million people in terms of how much can be done with it, which makes the major corporations' advantage far less useful and less dominating than it used to be. Now, if you have an AI with the right amount of GPU resources, which can be rented online, what you can accomplish is astounding, and big corporate can't gatekeep your accomplishments anymore, especially with open source. Unlike with Linux, where they could say "you're just one person coding this with your bare hands, how much can you actually accomplish?"

  • @emmanuelkolawole6720 · 1 year ago · +1

    What is the prompt token limit, i.e. the maximum prompt size the AI model can understand at a time? That is what matters the most.

    • @1littlecoder · 1 year ago · +2

      2048 tokens. My bad, I didn't get into it.

    • @michaelbarry755 · 1 year ago · +3

      Soon to be unlimited: a paper just came out showing that, with a few tweaks to the transformer, you can fine-tune any pretrained model to infinite contexts. They fixed the quadratic memory issue. There are a few things to iron out, but... soon :) It's called "landmark" something, and it can offload the context to RAM while keeping an index in VRAM. That's an oversimplification, but yeah: context is now infinite for all intents and purposes.

    • @holdthetruthhostage · 1 year ago · +1

      @@michaelbarry755 Well, it depends on whether it can run locally.

    • @xiaojinyusaudiobookswebnov4951 · 1 year ago · +1

      @@holdthetruthhostage In the future, Microsoft's entry into the GPU market with their Athena chips will drive down prices and let us run LLMs on consumer-level GPUs. Gotta wait for it.

    • @michaelbarry755 · 1 year ago

      @@holdthetruthhostage It can run locally; you could also run it in the cloud. If you're referring to proprietary AI like OpenAI's, then of course you're out of luck, though they would likely implement it and make it available through the API.

  • @nattyzaddy6555 · 1 year ago

    For a FrugalGPT setup, how would you use Falcon? Something like: first try the 1B, then the 7B, then the 40B?

    • @michaelbarry755 · 1 year ago

      Why? It's free.

    • @nattyzaddy6555 · 1 year ago

      @@michaelbarry755 Lol, I was thinking of faster compute time, but I'm not sure that will be the case when you are running the same prompt multiple times and having a separate model evaluate each response. But multi-AI usually does produce better results. Something like llama -> vicuna -> falcon.

    • @michaelbarry755 · 1 year ago · +1

      @@nattyzaddy6555 I wouldn't bother. With the fast pace of research, whatever you build will be obsolete by the time you've finished building it. The models will improve, and soon you'll have infinite context length with blazingly fast inference, so I would just focus on prompting strategies and wait for the models to catch up. I would recommend fine-tuning Falcon 7.5B, then switching out the model each time a new one is released. Your "scaffolding" will remain the same, so you don't waste any effort, and each new, improved model can be "plugged in" and will immediately improve your product.

    • @nattyzaddy6555 · 1 year ago

      @@michaelbarry755 I think creating a vector database will also be worth it as AIs improve. I think I'll get to work on that, and on fine-tuning a smaller model as well. For now, hopefully I can get Falcon-7B to be a superior coder.

    • @michaelbarry755 · 1 year ago

      @@nattyzaddy6555 Yes, vector databases will be very important, especially with regard to grounding and continuous learning. Although, the way research is exploding at the moment, I wouldn't be surprised if that becomes obsolete too; it's so difficult to say right now. It's difficult to predict where it's going to end up. My "prediction" is this: everyone will have an AI on their smartphone. It will replace the browser. The internet will become effectively obsolete (websites, not the infrastructure) and all the AIs will be linked up in a peer-to-peer fashion. Let's say you want to know something about quantum mechanics: no need to search the web, ask your AI; your AI searches for prominent physicists, finds their AI "address" and then directly asks their AI your question. Every single person in the entire world will have a virtual AI version of themselves, and they'll each be able to communicate. Imagine dating: your AI knows everything about you, does a quick search, and finds your soulmate. Imagine democracy: direct democracy powered by AI. This is bigger than anyone can even begin to imagine.
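    The FrugalGPT-style cascade discussed in this thread (try a cheap model first, escalate to a bigger one only when the answer looks weak) can be sketched roughly like this. The model names, the threshold, and the confidence scores are placeholders, not a real API; in practice each stand-in would wrap an actual inference call:

    ```python
    from typing import Callable, List, Tuple

    # Each "model" is a stand-in: a function returning (answer, confidence).
    Model = Callable[[str], Tuple[str, float]]

    def cascade(prompt: str, models: List[Tuple[str, Model]],
                threshold: float = 0.8) -> Tuple[str, str]:
        """Try cheap models first; escalate while confidence is below threshold."""
        name, answer = "", ""
        for name, model in models:
            answer, confidence = model(prompt)
            if confidence >= threshold:
                break  # good enough: stop before paying for a bigger model
        return name, answer

    # Toy stand-ins for a llama -> vicuna -> falcon chain
    small = lambda p: ("small-answer", 0.5)
    medium = lambda p: ("medium-answer", 0.9)
    large = lambda p: ("large-answer", 0.99)

    chain = [("llama-7b", small), ("vicuna-13b", medium), ("falcon-40b", large)]
    print(cascade("What is QLoRA?", chain))  # stops at vicuna-13b: 0.9 clears 0.8
    ```

    If no model clears the threshold, the last (largest) model's answer is returned, which is the usual FrugalGPT fallback behaviour.
    
    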

  • @SinanAkkoyun · 1 year ago

    How many tokens/second can it do on an A100 (or any other GPU)?

  • @arunachalpradesh399 · 1 year ago

    How much VRAM is needed?

  • @ryanl5817 · 1 year ago · +1

    This is not an open-source license.

    • @michaelbarry755 · 1 year ago

      Yes, it is. The code is open source and it is licensed to you. All open source works this way; some licences are more or less permissive than others, but it's still open source. If you don't like it, you can always pay someone a few million dollars to create one, and then you can choose the license.

  • @aidemalo · 1 year ago

    *In 10 European languages only

  • @AndrewQPower · 1 year ago

    Nothing wrong with the royalty licensing for over $1M.

  • @Ez-se2dl · 1 year ago

    Context window?

  • @catalindeluxus8545 · 1 year ago

    There is no step-by-step instruction, like you said there would be, in the video you linked to this one.

  • @rakeshchowdhury202 · 1 year ago

    For those who don't know, the Unity license is hated by game devs. It's the reason Godot exists.

    • @1littlecoder · 1 year ago

      Does it have the same licensing as this?

  • @kalapita226 · 1 year ago · +1

    Be careful, it's Hindi.

  • @BennySalto · 1 year ago

    I will never pay for models. Why?
    You can't train a large model without using data you did not own. It is OUR data, therefore they are OUR models.

  • @Imran-Alii · 1 year ago

    @1littlecoder Always first... Great work!