I used the first AI Software Engineer for a week. This is happening.

  • Published: Jan 6, 2025

Comments • 95

  • @tonywhite4476 8 months ago +19

    It’s not going to replace all software engineers but they won’t need as many.

    • @paulocacella 8 months ago

      That is the correct point.

    • @moozooh 8 months ago +3

      The fact that it's disproportionately disrupting the entry-level jobs first is much more dangerous than simply removing the need for a percentage of jobs per se because it creates a barrier for entry in the profession that will affect every future generation (the further in the future, the more it will), to the point where it's just not financially viable for newcomers to keep investing in it (because you won't get your money back until after you've reached senior level).

    • @manuelmaxgonzalez2432 8 months ago

      I think it is still a little too early to tell. It depends on how fast these things get better. A drastic improvement in productivity per SWE might enable a lot of projects that were too expensive before and end up increasing demand. But if these tools improve very fast, then supply will flood demand.

    • @homewardboundphotos 3 months ago

      more likely it will just mean 10x as much software. don't forget, there's no limit to the number of pieces of software that are made.

  • @pensiveintrovert4318 8 months ago +4

    Summary: it is not usable for what it was claimed to do. I have now spent 4 days playing with gpt-pilot with Llama 3 70b. It goes around and around, making mistakes, trying to correct mistakes, then doing this infinitely.

  • @lokeshsharma4177 8 months ago +1

    I second every single word you said. I have a Computer Engineering background and 28 years in the industry (although the tech part was only the first few years), and I have seen the transformation through the OnPrem, SelfService, and Cloud journeys. As you rightly say, CHIEF, this is nothing but a marketing stunt at this time and an ambition for where we (the humans) want to be in the future. God Bless You

  • @scretney1 8 months ago

    Thanks, Santiago - excellent review of Devin. Appreciate you.

  • @pabloarroyo7952 8 months ago +2

    Very good video. One to look back on in a couple of years' time.

  • @inteligenciamilgrau 8 months ago +5

    The new kind of programmers are the ones who program AIs to program better for us!

    • @ShpanMan 8 months ago +3

      Yea, for 1-2 years. Then AI could do that too.

  • @demianclarke 8 months ago +2

    Thanks, Santiago, for such valuable content. A hug from Barcelona

  • @villanianalytics 8 months ago +28

    This just goes to show that while AI can complete many tasks, right now there is a huge dependency on the user being knowledgeable about what is being requested. You got as far as you did because you were able to help point the AI in the right direction. Someone with no coding background wouldn't even be able to get a fraction of the progress you were able to get

    • @goatpepperherbaltea7895 8 months ago +6

      Yeah but rn computers take up a large room but one day they’ll fit in your pocket and be thousands of times faster

    • @RavishankarAyyakkannu 8 months ago +1

      The same applies for generative music or image generation. You should be more proficient as an artist or musician to get what you want instead of some random cute generation.

    • @zedmor 8 months ago +5

      First customers of systems like this would be developers.

    • @Yomi4D 8 months ago

      That's rn. This will change.

    • @malartbecomes236 8 months ago +2

      You'd be surprised what beginner coders can get out of models with enough specificity, especially if you provide it with the right context. The issue is that the models aren't adept at finding, or more importantly, recognizing the correct, up-to-date and actionable information via search and RAG; without very specific instructions, they lack the sufficiently complex, robust memory and reasoning skills that humans do. I don't think we are ever going to get to the point where a human can provide a non-specific prompt and have the model intuit exactly what the human left out, unless we do something ludicrous like training models to be lifelong companions and pairing models and humans at birth. The whole approach is wrong.
      We should be encouraging hallucinations and handling them differently. Not sure exactly how, but I know the FLARE framework tries to assess when a model is unsure about a token and uses that as an opportunity to perform RAG generation, but I think a much more effective method would probably be to allow the model to follow the alternative thought path (tangent), with some sort of way to summarize and classify the contents of the tangent, have another model attempt to verify the information, return the model to before the state where the tangent started, inject the information (along with the verification attempt) into some sort of internal thought register, so the model can 'register' the thought without compromising the current output, and then reassess the model's confidence in the next token. I know variations of this are already implemented elsewhere, but I don't think anyone is doing exactly this. It would be sort of similar to the tree of thoughts, but probably more robust, because it would bring up all sorts of other considerations to keep in mind, based on the problems the model ran into on each tangent.
      This would obviously get very expensive so it's probably a crappy idea, but I like thinking of stuff like this.
      I'm a beginner coder, if it wasn't obvious.

  • @javaparainiciantes 8 months ago +2

    02:45 - This is Devin - 1st test - MNIST digit classification
    04:59 - Devin asks for help
    05:48 - Deploy to Heroku
    06:28 - Devin said it deployed but it didn't
    07:34 - Completed the exercise but left a lot of dead code
    09:15 - Second project: tic-tac-toe
    10:21 - Ask Devin to move the button below the board
    11:55 - Devin deployed to Netlify
    12:05 - Third project: Lunar Lander
    13:21 - Devin figured out that he had to migrate the TF version
    15:30 - Impressive but disappointing. Devin broke the code
    16:10 - Python backend implementation - take-home assessment
    17:54 - Improve the UI
    18:21 - Final example - RAG example - almost worked, but he had closed the session
    21:10 - Second try - complete failure
    23:00 - Devin feels very slow
    23:40 - Opinion: biggest value of Devin
    24:10 - Conclusion

  • @abudhabi9850 8 months ago +6

    So it kinda can solve somewhat easy problems, while the solutions it creates are likely hard to maintain and change. Nice for solutions which just have to work somewhat; however, when you require certainty you wouldn't want it to write your code.
    Maybe Devin would really benefit from a "project cleanup" command before it delivers a project?

    • @davidcrocombe1322 8 months ago +1

      I think these AI should always do a cleanup automatically, however if they don’t then we need to ask for it as standard procedure.
      Come to think of it, we probably need to be specific about what cleanup we need - remove dead code, runtime performance, human readable code style & comments, dependencies allowed.

    • @carinebruyndoncx5331 8 months ago +2

      As soon as you have a working program with automated tests, you can start refactoring and improving. Look at the focus area of Codium.

  • @brucerosner3547 8 months ago +6

    I think this misses the whole point. Coding is a mechanical process, readily automated. Software engineering comprises first generating requirements, that is, defining what is to be done, and then selecting the most appropriate solution to meet the requirements. Defining requirements requires knowledge of the problem space, not just computer knowledge.

    • @raymond_luxury_yacht 8 months ago +1

      Yup. It's concept Vs production. Production is just factory. And only robots work in factories now. Yup. High level conceptual work is the high value for work. Which means you need an imagination, just like Einstein said.

    • @bjrc 8 months ago

      This is exactly what I've concluded over the past few months. But it applies to many domains, not just coding. Retail for example: existing LLMs can give a lot of high level information about how to optimise a retail organisation, but without being spoon-fed very carefully constructed reports and tools, it won't get anywhere. I hope it will get better with future LLMs, but for now they need a lot of guidance.

  • @doshin2019 7 months ago

    Thanks for your review! I have a question regarding the LLM model used. While I understand these models are typically trained on open-source data (please correct me if I'm mistaken), I'm curious about the potential future implications.
    What if, down the line, LLMs are trained on massive amounts of proprietary code? What kind of outcomes might we expect from such a shift? I'm interested in your thoughts on this.

  • @rsivakanth 8 months ago

    Good one, Santiago, you put Devin to the test, for sure 🙂 Still, this is reassuring: SW developers/engineers aren't under threat, yet ;-) Thanks.

  • @AiNews-dq6ib 4 months ago

    how did you get your hands on devin ?
    in any case it seems to be similar to the chatgpt 4o programming

  • @charith493-4 5 months ago

    Thanks a lot for this awesome content❤ It would be super helpful if you could make a video for people starting their IT careers in 2024. Maybe cover what areas they should focus on. Thanks again!

    • @underfitted 5 months ago

      I recorded a video on how to start. A roadmap. Check my past content.

  • @henrymaddocks984 8 months ago +1

    This is a great video. "Some weird things inside" is not OK though. This is why we have senior developers

    • @moozooh 8 months ago +2

      This is the issue, though. Senior developers didn't start off senior; they were students, then possibly interns, juniors, middles, seniors. If AI disrupts this chain of skill cultivation by removing any need for internships and like 90% of juniors and some middles, how are they going to become seniors in the future? In fact, how would a future software engineer even enter the market and prove their competitive advantage?

  • @wwkk4964 8 months ago +1

    Looks like Devin wrote India's 2019 lunar lander code too, it crashed!

  • @AndrejsKarpovs 8 months ago +1

    Would definitely use Devin in its current form to boost my learning!

  • @FergusMeiklejohn 8 months ago

    What did it cost? I remember swyx said that Devin is expensive.. I wonder what the cost/performance would be if it used Llama3 70b through Groq

    • @underfitted 8 months ago

      I got free access to it.

  • @JD_2020 6 months ago +1

    Is this a paid promo?

    • @underfitted 6 months ago

      No, it is not a paid (or unpaid) promo.

  • @francescociulla 8 months ago

    Thanks for sharing, Santiago!

  • @jofus521 8 months ago

    Do you think for the lunar lander, it would be useful to have it write tests first, then refactor the code afterwards? Would it be capable of writing the tests based on its understanding of the code without running it?

    • @underfitted 8 months ago

      I’m not sure. For the lunar lander, it’s a neural network that powers everything, so it would be very hard to test it with unit tests. More generally, tests can definitely help a tool like Devin.

  • @henrymaddocks984 8 months ago +1

    After everything you saw I don't get why you think the quality of software will improve using these tools.

    • @underfitted 8 months ago +1

      Because today is Day 1. How much do you think this will change in 5 years?

    • @raymond_luxury_yacht 8 months ago

      What quality software? All the SW I use is crap: bugs, design issues, poor UX, worse UI. It can't be any worse than the nonsense we already have.

    • @goldmanguyok66292 8 months ago

      @underfitted You can make an agent to remove unused code and an agent to judge technology. All your comments in the video are easily fixable. In 1 year or less it will already be perfected.

    • @henrymaddocks984 8 months ago

      @raymond_luxury_yacht Then make better choices.

  • @germainrodrigue367 8 months ago +1

    Santiago, You're amazing 🎉

  • @patrickwhite9902 8 months ago

    Soz if I missed it, but what LLM is behind the demo? I think the Devin mechanism is good but its capability is model-bound, right?

    • @underfitted 8 months ago

      I’m not sure what LLM they use. I don’t know if they disclose that.

    • @24-7gpts 8 months ago +1

      It's GPT 4 Turbo 2024 04 09 version

  • @tomas0413 8 months ago

    Hey, Santiago, great video! I’m still on a waiting list for Devin, but I looked at OpenDevin a few weeks ago. It was perhaps a bit too early and I plan to have a look at OpenDevin again. Any thoughts / plans on making a Devin vs OpenDevin comparison?

    • @underfitted 8 months ago +2

      That’s a good idea!

  • @felixronnoh 8 months ago

    Nice review. Are you the first person to create the lunar lander?

  • @goldmanguyok66292 8 months ago

    add agent to remove unused code
    agent for judging technology, which will be faster and easier
    all your comments are easily solvable

  • @tarekabiramia913 8 months ago

    How long did they take to give you access?

    • @underfitted 8 months ago

      I reached out to them directly on social media. They probably gave me access because I have a large audience.

    • @tarekabiramia913 8 months ago

      @underfitted I highly appreciate your quick reply, so I need to wait in the queue 😅

  • @divyapadhiyar9470 8 months ago

    What a proper explanation! It helps us learn about AI.

  • @ndrcntrl 8 months ago +2

    Excellent, thanks for the detailed preview of Devin! It’s definitely the real deal. Now I can begin to understand the incredible valuation of such an early stage company. So many tasks from my current dev backlog could be assigned to multiple instances of Devin running in parallel. I can dream of being freed up from many of those mundane dev tasks to pursue the fun and interesting aspects of projects with the help of an AI assistant like Devin. Super excited to get access, hopefully in the not too distant future. Great video, love your content 🤩

    • @carinebruyndoncx5331 8 months ago +1

      I feel the same way, I think I am going to invest in a multisession setup to multitask with Devin, devika, ... the future of a software engineer desk will look more like a control room I think

  • @greg-guy 8 months ago

    Can you share how much you paid for tokens for each of the projects Devin was working on?

    • @underfitted 8 months ago

      I got free access to Devin.

  • @davidcrocombe1322 8 months ago

    It changed your request of recognising 0 to 10 numbers to 0 to 9.

  • @avi7278 8 months ago

    Can you ask Devin to integrate Branch deep linking into a cross-platform Flutter application for iOS, Android, and macOS? Their documentation is notoriously sh** and I want to see how it handles it. I must admit that your examples are closer to real-world tasks than those of most people out here trying to hype this thing, which is something that has always bothered me. The people trying it seem to have little to no real professional development experience. I'm not looking for a junior dev that I have to babysit.

  • @BhargavSolankisolankibhargav 8 months ago

    Do you always write such remarkable, grammatically correct prompts?

    • @underfitted 8 months ago +1

      Only when I’m drunk

  • @tsaminamina_eheh 8 months ago

    Do they use their own LLM or an existing one under the hood?

    • @riderjohnny5117 8 months ago

      They use GPT-4

    • @underfitted 8 months ago

      Personally, I don’t know.

    • @raymond_luxury_yacht 8 months ago

      It's all about the fine tune. I expect ppl are working on really getting specific models expert in specific languages to write apps for particular contexts.

  • @24-7gpts 8 months ago

    Awesome video!

  • @middle-agedmacdonald2965 8 months ago

    Thanks, first video I've seen. I don't share your optimism about the future. The idea is to eliminate paying for labor, or to get it as cheaply as possible.
    We're all guilty of wanting things cheap, so it's all of our faults.

  • @ShpanMan 8 months ago +2

    Haha, you are in the right direction but you don't appreciate how much smarter than humans AI will be in the coming years.
    There will be no need for a human anywhere in the flow (well except for setting the goal). Give me an example of something a human would be needed for and recognize that future AI will do that faster, better, and cheaper.
    Devin is just the beginning, it's cool, but you did recognize that improvements are a simple action of replacing the brain behind it with the smarter model - that's it.
    The singularity is near.

  • @prasadghumare 8 months ago

    Amazing!

  • @T___Brown 8 months ago

    now you know how frustrating it is to be a BA. lol maybe devin should fix the BA first.

  • @dfsadsaaad 8 months ago

    Well done. However, these tools will only improve over time, and eventually, humans will not need to write the code; they will only need to test it, assess it, and determine its usefulness. I have been writing code for more than 30 years and in more industries and more applications. Easy software jobs will disappear and only real "engineers" will remain. Self-taught techies or graduates from code academies should think about the trades. MORE software will not be needed in the future. AI's will do all of this on the fly on demand within 5 years. I have built a Devin-like system with CrewAI and it works better than Devin. Wait until GPT5.

  • @raymond_luxury_yacht 8 months ago

    The web is dead, thank god. In the future, publishing content will mean uploading data to an embedding model that is borged into an LLM for RAG. The output will be generated on the fly to suit the question, e.g. spoken, generated video, text, music, etc. No more web interfaces. Web designers better start retraining.

  • @surajm.s8561 8 months ago

    That's a lot of tokens

  • @robertosolari__ 8 months ago +2

    So guys, learn maths, learn code, learn AI...

    • @ShpanMan 8 months ago +3

      More like plumbing..

    • @robertosolari__ 8 months ago

      @ShpanMan Yeah, also. I was thinking about agriculture

    • @chimwemwechinamale6716 8 months ago

      You really don't need to learn AI; there's no need for that. Only a handful of individuals, mostly in research and at big corps, matter.

    • @EduardsRuzga 8 months ago

      AI will do math. AI will do code. AI will even do parts of entrepreneurship. AI already does AI :D Aka generates evals, synthetic data sets, picks model to fine tune, does that, runs evals, picks winners :D
      But like with Devin, the question is speed and price. Some tasks, like large-scale software engineering, need to deal with a lot of uncertainties: UX, GDPR, and a lot of random variables, from hardware to infra to OS to software, frameworks, and dependencies in the project and in 3rd-party modules. There is a lot of work to go through, even for a team of expert humans.
      I do think we will get there in the next 5 years. It feels like we are moving hard from imperative to declarative, and not just in coding. It's about figuring out what to do, not how.
      Question then is, what is not a commodity, where costs are, what is valuable.
      Chips, compute, energy? Intelligence will be exchangeable for those. Kinda like now you can spend money to get back time by making other humans do things.
      There are things AI will not change, though. It can't do anything about land, for example. Land is a finite resource and we care more about some places than others, and AI will not change that drastically. That is, AI will not change the laws of physics.
      Weird times. I wonder if we can get to net 0 where tech can allow to get basic necessities close to 0 like food/shelter/health/education.

  • @chudchadanstud 4 months ago

    I stopped watching at "Take home assignment". I simply don't support this practice.