I just received my invite to Devin. The cheapest plan offered "Personal (Devin Lite) users receive early access to Devin Lite for $50 / month, which includes 65 Devin Lite ACU / month built in. Additional Devin Lite ACUs can be purchased at our standard unit rate of $0.8 / ACU. Currently, each ACU is approximately equivalent to 10 minutes of active Devin Lite work." I'm having difficulty finding more information, but it seems to me, for $50 I get 650 minutes of computing. Looking at the lengths of time reported by Zack this seems like a very poor offer.
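For anyone who wants to check the math on that plan, here is a rough sketch of the arithmetic in Python. The prices and the 10-minutes-per-ACU figure come straight from the quoted plan text (which itself says "approximately"); the function names are made up for illustration, and rounding usage up to whole ACUs is an assumption, since the quote doesn't say how partial ACUs are billed.

```python
import math

# Figures taken from the quoted plan description.
PLAN_PRICE_USD = 50.0       # monthly price of the Personal (Devin Lite) plan
INCLUDED_ACUS = 65          # ACUs included per month
EXTRA_ACU_PRICE_USD = 0.8   # price of each additional ACU
MINUTES_PER_ACU = 10        # "approximately" 10 minutes of active work per ACU


def included_minutes() -> int:
    """Minutes of active work covered by the base plan."""
    return INCLUDED_ACUS * MINUTES_PER_ACU


def effective_hourly_rate() -> float:
    """Base plan price divided by the hours of active work it includes."""
    return PLAN_PRICE_USD / (included_minutes() / 60)


def monthly_cost(active_minutes: float) -> float:
    """Estimated bill for a month with the given active minutes.

    Assumption (not stated in the quote): usage is rounded up to whole ACUs.
    """
    acus_needed = math.ceil(active_minutes / MINUTES_PER_ACU)
    extra_acus = max(0, acus_needed - INCLUDED_ACUS)
    return PLAN_PRICE_USD + extra_acus * EXTRA_ACU_PRICE_USD


if __name__ == "__main__":
    print(f"Included minutes: {included_minutes()}")
    print(f"Effective rate: ${effective_hourly_rate():.2f} per active hour")
    print(f"Cost for 1000 active minutes: ${monthly_cost(1000):.2f}")
```

So the base plan works out to a bit under 11 hours of active work per month, and a multi-hour task like the ones in the video would eat a large share of it.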
Chapters (Powered by ChapterMe) -
00:00 - Devin: AI agent that claims to be the world's first fully autonomous
02:38 - Devin's app turns world into museum
03:30 - Devin app builder with questions, planning, updates
05:20 - Android app asks users for help
06:08 - Deployment time 134
08:29 - Two-hour CSS change to add features
08:51 - Devin walkthrough reveals power of commands
09:25 - Devin: excellent prototyping agent, impeccable UX
12:49 - AI software engineer Devin's slow performance
14:59 - Devin's powerful features, sign up now
15:22 - Lifted access for players
Are you sure it is not like the Amazon AI, a bunch of real people behind the scenes hahaha, it seems too slow for an AI
Good point, Amazon cyborgs 😅😅
If it's a fraud, no venture capitalist would invest in it. Thus it's just a matter of time till it crashes. And if that's the case, other people will try to get this idea off the ground in the future, because it's not impossible to do. Coding is less complex than human emotions, so this is not impossible.
@GrowAndScaleSOLUTION have you not seen the Elizabeth Holmes case? VCs invest in fraud and scams all the time
@GrowAndScaleSOLUTION human emotions are not complex, they are insanely easy to manipulate
@rjackstheartofwealth6152 not all people are easy to manipulate. Try doing sales and you will understand. Try both inbound and outbound sales
Awesome to see you're back, love the overview
I would love to see a video of Devin working in an existing project.
I'd love to see it contribute to an existing code base. Maybe try it on projects of different sizes/complexities
Thank you, honestly wild how this is the only review of Devin, the rest are just predictions lol.
fingers crossed that I get in and try it out.
Did you review the code to look for flaws in its functions' logic? Did you extensively test the results it produced to make sure it was not bugged? I have yet to see an LLM produce code that isn't flawed, either by being stupidly overengineered, really poorly structured in terms of performance (like for loops meant for huge lists with DOM calls inside handlers like dragover, not using any cached data), or simply not covering all use cases. I also see LLMs typically get stuck on a "solution" they think is correct, even if you tell them to start from scratch, and your only option is to start a new chat instance to make them change their approach. LLMs also seem to lack the capability to grasp simple logic that is observable to us humans. Try asking one to figure out the next number in a sequence: sometimes it will get it right, but once the sequence gets more complicated, it goes completely off track and misses the basic observable logic. This is obviously why ChatGPT and so on usually produce such flawed code. How did Devin perform in terms of actual good code? By good I mean stable, bug-free and performance-focused. I couldn't care less about readable code if AI is writing it and able to parse it. Speaking of which, can it parse complex code and understand how it works? In my experience, LLMs can usually grasp the overall purpose of the code they are asked to analyze, but won't understand (again) some of the logical reasons for certain parts of the code. Unless Devin can produce good code, I see it as no better than any other LLM option.
btw, I still haven't had a chance to use it. Could you tell me how you got access?
Great overview. I really appreciated this video.
Have you tried Devika?
Am I the only one who doesn't have access?
it looks like they're slowly starting to let more people in via the waitlist
Time to flip burgers now
Nope.
Hi, Zack! How quickly did you get off the waiting list?
about 4 months
Did this use gpt-4o?
Devin uses Indian developers in the backend to confirm the output. This is why it took like 3 hours
not sure what model they're using! they don't seem to disclose it anywhere
@wenquai Can you ask Devin? Through the API, GPT-4o answers that it is gpt-4-turbo, while the previous versions don't know their model version.
Can it code games?
How do I install Devin on Windows?
So maybe it felt different using it, but it looked pretty horrible. I think most devs with Copilot could do this infinitely faster. Not to mention they would remember the solution and re-implement similar projects very easily in the future.
The entire promise of Devin was a done-for-you software engineer. This looked horribly inefficient. And if it needs this much input from a dev... then why can't the dev just use Copilot to implement it himself?
Sure, but non-technical managers and execs would view this as one less team to pay salaries and benefits for. Sure, only awful managers and execs would think this way, but let's be real, that's the majority of managers and execs.
The big advantage is it mostly works by itself and you don't have to pay it a salary.
Also, real devs can take a long time to do seemingly simple things too, especially when it involves weird bugs, strange CSS, or having to figure out the design mostly by themselves (which was the case in this video from what I could tell)
@spaceowl5957 lmao you do have to pay it a salary. Running models isn't free, bro. At least as of now, it's causing a lot more issues than it's solving.
@wonderfulworldofmarkets9033 I mean I don't know the numbers but I think this will be orders of magnitude cheaper per hour of work compared to a human
@spaceowl5957 Maybe you should learn the numbers. To run this bad model 24/7 for a year costs ~ $2,628,000. Executing a LangChain prompt (which chains together multiple models and double-checks the work, similar to what Devin is doing) of "What is the 23rd episode of Spongebob" just cost $4.50 on AWS Bedrock and took 1 minute and 20 seconds. Imagine how much money it's costing to run this thing for 8 hours per task on a lot more complicated data. (And then get it wrong lol).
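To put those two figures on the same scale, here is a quick back-of-the-envelope sketch in Python. It only converts the numbers claimed in the comment above into hourly rates; none of them are verified, the function names are made up for illustration, and extrapolating a single 80-second prompt linearly to hours of work is a big assumption in itself.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def hourly_rate_from_annual(annual_cost_usd: float) -> float:
    """Hourly rate implied by a claimed 24/7 annual running cost."""
    return annual_cost_usd / HOURS_PER_YEAR


def hourly_rate_from_sample(cost_usd: float, duration_seconds: float) -> float:
    """Hourly rate implied by one sample run, extrapolated linearly."""
    return cost_usd / duration_seconds * 3600


if __name__ == "__main__":
    annual_rate = hourly_rate_from_annual(2_628_000)   # claimed yearly cost
    sample_rate = hourly_rate_from_sample(4.50, 80)    # claimed $4.50 for 1 min 20 s
    print(f"Implied by the annual figure: ${annual_rate:.2f}/hour")
    print(f"Implied by the sample prompt: ${sample_rate:.2f}/hour")
    print(f"8-hour task at the annual rate: ${annual_rate * 8:,.2f}")
```

Both claimed figures land in the hundreds of dollars per hour, so they are at least roughly consistent with each other, whatever their accuracy.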
There were 2 things I wanted to see about Devin:
How smart it is - which I guess is not amazing? I'm not sure how buggy the final product is, but it seemed to me like it ran into issues and required intervention.
In the end, the main thing I guess is that it can do stuff, but if it's not as smart as Claude or GPT, then I might as well just copy and paste from the smarter LLM instead of waiting for a dumber one to automatically do it for me.
Basically, if you need to solve a coding problem, it does not seem like Devin is the way to do it.
How big the context window is - not sure from this I guess. The current problem with LLMs is that it's hard to have an entire project as context, so you have to find where to fix/add something and give them the info. I doubt Devin solved this, so I kind of want to see it given (or generate) a big project (at least bigger than the regular LLM context windows) and told to fix something, to see how it handles that.
If those 2 things fail, then Devin is more or less a convenience thing - an AI that automatically runs what it generates, reads the error, and reprompts itself. I mean, these things have already existed with AutoGPT and other stuff for a while now, so I'm not too invested, especially considering I can just run the LLM-generated code myself and give it the errors.
So basically, it seems to boil down to convenience. As you said, it could take hours and give out a terrible buggy mess, but at least you didn't spend time on it. But if you truly want an actual product, it seems using smarter LLMs is still the way to go.
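For reference, the "runs what it generates, reads the error, and reprompts itself" loop mentioned above can be sketched in a few lines of Python. This is only a minimal illustration of the general idea, not how Devin or AutoGPT actually work: the `generate` callable and `fake_generate` are hypothetical stand-ins for a real model call, and the only feedback passed back is the script's stderr.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional

# Hypothetical interface: takes the task and the last error (or None)
# and returns Python source code as a string.
Generator = Callable[[str, Optional[str]], str]


def run_until_passing(task: str, generate: Generator,
                      max_attempts: int = 3) -> Optional[str]:
    """Return the first generated script that exits cleanly, or None."""
    last_error: Optional[str] = None
    for _ in range(max_attempts):
        source = generate(task, last_error)
        with tempfile.TemporaryDirectory() as tmp:
            script = Path(tmp) / "attempt.py"
            script.write_text(source)
            # Run the generated code in a separate interpreter process.
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True, text=True, timeout=30,
            )
        if result.returncode == 0:
            return source
        last_error = result.stderr  # fed back into the next attempt
    return None


def fake_generate(task: str, last_error: Optional[str]) -> str:
    """Stand-in for a model: broken first, fixed once it sees an error."""
    if last_error is None:
        return "print(undefined_name)"
    return "print('hello')"


if __name__ == "__main__":
    print(run_until_passing("say hello", fake_generate))
```

Note that "exits without an error" is a much weaker bar than "correct", which is pretty much the point being made above about buggy output.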
really great points. i agree - if you're stuck on a specific coding problem, it's probably a waste of time to use Devin versus trying to debug it yourself with Claude/ChatGPT/Cursor. after having access to Devin for a few weeks, I actually found myself using it less and less, which reinforces your point about convenience. it was also just hard to make a habit of opening Devin anytime I wanted to work on my projects.
@wenquai Then why did you say it is legit? I think you should take down this video or make another video about this. You shouldn't be misleading people like this.
great video, the fan noise is slightly annoying though :-)
ty for the kind words! sorry about the fan - will fix in future vids!
Why does no one who got access to Devin ever show real-time interaction with it? Probably because it would reveal how capable this ChatGPT wrapper really is...
will make a follow up vid that goes through a full run!
Oh no, somebody is so insecure and in denial here
So if I wanna be a developer I'm fucked, AI can do it all for me
Actually I think the opposite! I think developers/aspiring developers can benefit from Devin the most. Plus, observing Devin as it works is a great way to learn a new programming language or framework
@wenquai maybe it's a plus for experienced developers, no one wants juniors now
@axelvirtus2514 No one needs juniors. Both before AI and now with it, companies employ juniors in the hope of training them and keeping them at the company for a long time. So nothing changes.
@kubakakauko Tool makes developers faster -> More work done with the same number of developers -> Excess developers fired to save money -> Fewer jobs for developers
It's ok bro, just do it for free if it's your passion
I’m cooked
No, you're not.