Videos: 64
Views: 199,721

NEW Gemini 2 Flash Thinking 0121 | First Impressions (Coding vs DeepSeek R1)

In this video, we delve into the capabilities of Google's Gemini 2.0 Flash Thinking model and compare it with DeepSeek's R1 in the realm of coding. Gemini 2.0 Flash Thinking is an experimental AI model that explicitly showcases its thought process, enhancing transparency and understanding in problem-solving.
On the other hand, DeepSeek's R1 is an open-source AI model developed by the Chinese startup DeepSeek, demonstrating impressive reasoning capabilities, particularly in coding and mathematical tasks.
Join us as we explore their coding proficiencies, evaluate their performance, and discuss their potential impact on developers and the AI community. Don't forget to like, comment, and subs...

Videos

DeepSeek R1 vs OpenAI O1 & Claude 3.5 Sonnet - Hard Code Round 1

7:18

38K views, 1 day ago

DeepSeek R1 has emerged as a formidable contender, utilizing pure reinforcement learning to match, and in some cases surpass, the performance of OpenAI's O1, all while operating at 95% less cost. We test its coding abilities against the best SOTA coding LLMs! Join us as we delve into a series of complex coding challenges to evaluate which AI model stands supreme in terms of efficiency, accuracy...

Cursor vs Cline | 240k Tokens Codebase Side-by-Side AI Coding Battle

17:25

20K views, 14 days ago

🚀 In this video, we use a 240,000-token codebase to compare two top-notch AI coding tools against each other: Cursor and Cline. Watch as we compare their features, performance, and usability to determine which one stands out in the realm of AI-assisted coding. 0:00 Use case Intro 01:00 Bug Fix 1 04:31 Bug Fix 2 08:28 Main AI Feature! 15:52 Main Feature Showcase 16:53 Cline Costs Cursor is an AI-...

Aider vs Cline Using DeepSeek 3: Codebase 20k Lines

29:40

14K views, 21 days ago

🚀 Aider vs Cline Using DeepSeek 3: Open Source AI Model Testing The two best AI Coders are compared in a 20k Lines of Code Codebase. Discover how they handle a large codebase and which one would work for you. Test and compare the functionality of Aider and Cline 3.0 with the powerful DeepSeek 3 model. This video dives deep into open-source solutions for local AI development, enabling efficient ...

24:02

Aider + Deepseek 3 vs Claude 3.5 Sonnet

22K views, 1 month ago

🚀 Aider DeepSeek 3 vs. Claude 3.5 Sonnet In this video, we delve into the capabilities of Aider, a powerful AI code editing tool, when paired with DeepSeek 3 and Claude 3.5 Sonnet. We'll explore how these combinations enhance coding efficiency, accuracy, and cost-effectiveness. Key Highlights: DeepSeek 3 Integration: DeepSeek 3, an open-source AI model, has demonstrated performance comparable t...

Aider + Gemini 2 (Exp) versus Claude 3.5 Sonnet (AI Coding King!)

25:44

4.9K views, 1 month ago

In this video, we delve into a head-to-head comparison of two leading AI coding LLMs: Aider integrated with Gemini 2 Experimental versus Aider integrated with Claude 3.5 Sonnet. We'll explore their coding capabilities, performance benchmarks, and real-world applications to determine which stands as the ultimate AI Coding King! 👑 0:00 Introduction 0:58 Aider Setup 1:38 Tetris 5:38 Edit Code: Pom...

🚀 AIDER + Gemini 2 Flash vs Claude 3.5 Haiku (CODE TEST!) BEST Small LLM

16:06

1.1K views, 1 month ago

🚀 In this video, we put two leading small LLMs to the test: Google's Gemini 2.0 Flash and Anthropic's Claude 3.5 Haiku. 🖥️💡 0:00 Introduction 0:30 Aider Config 0:49 Tetris 4:12 Pomodoro Full Stack 10:31 Product Hunt clone 13:02 .NET URL Shortener API 15:40 Conclusion and Summary Key Highlights: Performance Benchmarks: Gemini 2.0 Flash delivers impressive results, rivaling even...

DeepSeek V2.5 1210 NEW Open Source LLM FULLY TESTED

16:24

799 views, 1 month ago

Introducing DeepSeek V2.5 (Build 1210) 🦾🚀 - the latest evolution in open-source Large Language Models (LLMs) from DeepSeek. In this video, we’ll dive deep into the newest features, performance benchmarks, and integration capabilities of our most robust release yet. Whether you’re a developer looking to enhance your apps with natural language understanding or a researcher pushing the boundaries ...

15:27

Amazon A.I. Nova Pro TESTED

1.2K views, 1 month ago

In this video we'll be testing Amazon's recently released flagship LLM, Nova Pro. We test it in Aider, in the playground, in math, general knowledge and reasoning. Amazon states that it is on par with other state-of-the-art models like GPT-4o and Claude 3.5 Sonnet. 0:00 Intro 0:17 Nova Variants 0:38 Nova Bench vs Claude vs GPT4o 0:51 Logic and Reasoning Tests 11:13 Aider Tests X: x.com/ma...

4:51

Windsurf Image Upload vs Cursor

408 views, 2 months ago

Codeium Windsurf Image Upload vs Cursor: Which One's Right for You? In this video, we’re diving into the features of Windsurf Image Upload and Cursor Image Upload, breaking down their functionality and how they can improve your workflow or user experience. Whether you’re a developer or just love a bit of customization, this comparison is for you! 💻🎨 Windsurf Image Upload Windsurf, Codeium's cut...

Qwen 2.5 Coder 32b + Aider | Desktop App + Database

14:30

1.4K views, 2 months ago

In this video we use Qwen 2.5 Coder 32B to create a Cross-Platform Productivity Desktop Application step-by-step. We use the Aider AI Pair Programmer and use a SQLite database for persistence. GitHub Repo: github.com/marvijo-code/pomodoro-desktop Reddit channel: www.reddit.com/user/marvijo-software

Cursor vs Windsurf: AI Editor ULTIMATE TEST!

20:34

2.8K views, 2 months ago

🚀 In this video, we conduct an in-depth comparison between two leading AI-powered Integrated Development Environments (IDEs): Cursor and Windsurf. 🤖💻 We evaluate their features, performance, and user experience to determine which IDE offers superior coding assistance. ⚡ ✨ Key Highlights: 🔍 Feature Comparison: Analyze the unique functionalities of Cursor and Windsurf, including AI-driven code su...

13:42

New ChatGPT vs. Human: Insane CHESS

43 views, 2 months ago

In this video, we test the strategic power of the latest ChatGPT against a human opponent in a competitive chess game. Can AI match the intuition and experience of a seasoned player? 🎯 What’s Inside: Clever moves and counter-strategies Key moments that could change the game A closer look at how AI approaches classic board games Perfect for chess enthusiasts and anyone curious about the future o...

Aider + Qwen 2.5 Coder 32B vs Claude 3.5 Sonnet (NEW)!

12:52

2.1K views, 2 months ago

In this video, we conduct a comprehensive comparison between Aider integrated with Qwen 2.5 Coder and the newly released Claude 3.5 Sonnet. Qwen 2.5 Coder, developed by the Qwen Team, is a scalable model tailored for coding tasks, supporting up to 128K tokens and offering variants like Qwen2.5-Coder and Qwen2.5-Math. It has demonstrated strong performance in benchmarks such as HumanEval and MAT...

NEW Claude 3.5 Haiku Code Test vs 3.5 Sonnet

10:48

1.5K views, 2 months ago

Anthropic just dropped their new flagship small model, Claude 3.5 Haiku! They claim it's better than OpenAI's flagship GPT-4o. But is it really? 🤔 In this thrilling showdown, we put Claude 3.5 Haiku head-to-head against Claude 3.5 Sonnet to see which model excels in coding and creativity. Watch as we dive into real coding challenges and compare their strengths, weaknesses, and unique approaches...

9:12

Google vs ChatGPT Search vs Perplexity

595 views, 3 months ago

Aider + NEW Claude 3.5 Sonnet FULL Stack Skill Sharing App | ReactJS ExpressJS + MongoDB

20:33

809 views, 3 months ago

FREE LLM Arena VSCode Extension! Best Copilot Replacement

3:45

974 views, 3 months ago

Nvidia Llama 3.1 Nemotron Code Tested in Cline

12:29

794 views, 3 months ago

Aider AI Agent Full Stack AI News App ASP.NET + Vite

14:46

302 views, 3 months ago

Aider AI Coding Installation + Claude 3.5 Sonnet + Terminal Pacman

2:44

459 views, 3 months ago

2:43

Official OpenAI library for .NET

271 views, 3 months ago

AI News OpenAI TOTALLY Free O1-mini | Anthropic Contextual Retrieval RAG

4:38

315 views, 3 months ago

8:14

This AI Tool is Next Level Learning!

26 views, 4 months ago

Tutorial: Azure Pipelines CI/CD from Scratch!

13:32

224 views, 10 months ago

Tutorial: .NET Aspire Local Orchestration Multi-Repo | Ep03

10:32

934 views, 1 year ago

Multi-Repo Microservice Communication using MassTransit + .NET 8 + Docker + RabbitMQ

8:10

661 views, 1 year ago

Tutorial: Deploy New .NET 8 Aspire Stack to Azure

7:58

315 views, 1 year ago

A C# Clean Architecture Production Template

16:35

81 views, 1 year ago

Deploy Docker Images to Azure Container Instance in a FLASH

12:12

918 views, 1 year ago

@railh7566 2 hours ago
Subbed, very useful info, thank you for your work
@Jadestonk 3 hours ago
anything built on top of vscode can only be crap
@serhiikrechko 1 day ago
Thanks for the great work. Windsurf is highly intriguing stuff. Would it be possible for you to conduct a comparison between Windsurf and Aider?
@MarvijoSoftware 5 hours ago
I'll get to it, I've compared Cursor to Windsurf in the meantime: ruclips.net/video/duLRNDa-CR0/видео.html
@makers_lab 1 day ago
Thanks for the testing. Also just went to your Windsurf video, and tbh, your own voice and natural delivery are better than the AI (web "scrapping" lol), though I get why some would find the AI voice clearer. Would be good to at least use a voice with a UK accent though! The new Kokoro models are stellar, and small enough to run locally.
@mad00insane 2 days ago
trae ai?
@MarvijoSoftware 6 hours ago
I've queued it up for coverage, thanks for the suggestion
@rahuldinesh2840 2 days ago
Why not use both simultaneously?
@MarvijoSoftware 6 hours ago
I sometimes do, I wanted to compare them for people who want to choose
@rahuldinesh2840 2 hours ago
@MarvijoSoftware Yeah. Good job. I think it will be a good idea to test all LLMs and IDEs with a single heavy prompt for which you know what the output looks like.
@yoda_zen 2 days ago
You should be using Cline in architect mode, with R1 for the plan and Sonnet for the response. Or use Sonnet for architecting then executing... The way you are using Cline is not the best approach.
@MarvijoSoftware 6 hours ago
I do, I'll cover it in a video
@yoda_zen 3 hours ago
@MarvijoSoftware Great! There are other things to consider when using Cline, such as MCP servers, etc. Take a look at these things so you test Cline at full potential vs Cursor.
@moidrugag 4 days ago
Can you compare OpenHands with the agents you've already tested? I’d like to see how it stacks up against them using DeepSeek V3 or R1.
@MarvijoSoftware 4 days ago
@moidrugag okay
@kabhiaao128 5 days ago
Very informative videos all the time, thank you brother 👍❤️
@adroid27 5 days ago
@3:11 Would you actually run a big Python script generated by AI like this immediately without reviewing it? Feels like a hack waiting to happen. Or maybe you are in a VM / didn't show the review part?
@MarvijoSoftware 5 days ago
@adroid27 I'd review it every time
@shadowlegend9751 6 days ago
I just subscribed for this, this right here is a gem! 💎
@MarvijoSoftware 6 days ago
@shadowlegend9751 thank you!
@Funky028 6 days ago
Capitalism has high-quality products, and socialism has prices that everyone can afford. No wonder the ugly American capitalists and politicians hate China so much.
@gayashanperera-w2e 6 days ago
The so-called fake model now performs better than the original one
@Kevin-vf5wv 6 days ago
for general questions, nothing beats chatgpt
@MarvijoSoftware 6 days ago
You reckon? Questions in which field? Did you try DeepSeek web search?
@jeffsteyn7174 7 days ago
The thing is that it's not really a bug. It's a flow issue, so you need to explain what flow you want. You told it the navigation is not showing the betslip but a blank page with a hamburger menu. You should actually tell it that currently you click on the nav and a blank page opens, then you have to click a menu button to show the betslip, but what you want is for the betslip to show immediately after clicking the nav. The less powerful a model is, the more detail and leading you need to provide.
@dabaowang4032 7 days ago
canceled openai subscription
@supplychainanalytics9114 7 days ago
I have been playing with writing apps, mainly with Sonnet, for many months. I have a computer engineering degree but am not an experienced coder and have never coded professionally. I find that abstract tests like these make it hard for inexperienced coders to judge the AI. I start out with clear English instructions for a subset of the project I have in mind, then I try to build functionality by fixing and updating. I have about 4 pages of what I expect in my profile. I have gotten the best results and the most complete code base with Sonnet; typical applications may have around 2k lines of code in all. It becomes impossible after a while, and you have to plan to stop a session and hand over to another session. I have been using DeepSeek for several days, and it's definitely better than any of the GPT models, but I get the best results with Sonnet. I have had a 40-year career in oil and gas exploration, and it is fascinating that I can create solutions for all the ideas I have struggled with over the past decades. I have hard drives full of data and reports, and it's just fascinating. At times I would use both GPT and Claude (mainly to help Claude solve problems he can't find the solution to because his API specs are out of date). Troubleshooting is by far the worst. I think these tests give the illusion that it can do very well. Once the project gets over a certain size, it does nonsense; there is a cliff it falls off.
@KashifKhan-iw2ns 7 days ago
Now I know why DeepSeek is so hated, it outperformed two of the most powerful AIs without even flinching.
@geografiainfinitului 5 days ago
It even started getting attacked in the last few hours, hmm, wonder why!
@SurvivalKompass 7 days ago
Thanks!
@MosesMatsepane 8 days ago
Claude Sonnet is the best at building medium to large code bases. I haven't found anything on the market that comes close. I will use DeepSeek to QA my Claude codebase just to see how well it does. Metrics and real world usage are two totally different things.
@MarvijoSoftware 8 days ago
@MosesMatsepane agreed, but it's closer with DeepSeek than people think. I primarily call Sonnet when DeepSeek 3 is stuck. DeepSeek R1 never seems to get stuck, so I've kinda replaced Sonnet with R1. Ke leboga comment monna wa geshu
@MosesMatsepane 8 days ago
@MarvijoSoftware He mogaetsho kante o dirisa AI monna. Nna keare ke reditse motho mo. A key didimale before le jesetsa setlhopa. 😂😂 O berekile go utlwala Morena.
@greendsnow 8 days ago
Sonnet still beats R1
@MarvijoSoftware 8 days ago
@greendsnow depends
@lipinglin1994 9 days ago
It reads 50 files. It wins.
@jasekraft430 9 days ago
Wondering if R1 was trained on this particular challenge. Since it's open source, it very well could be word for word in the training data. Either way, still impressive
@MarvijoSoftware 9 days ago
Yeah, though judging by the reasoning, it kinda makes sense that it solved it from first principles
@georgezorbas9036 9 days ago
I subscribed because you do a very good job. So you think that Cline is better than Aider and Continue Dev? Could we say that, top to bottom, it's Cursor AI - Cline - Aider - Continue?
@MarvijoSoftware 9 days ago
In my opinion, Aider is above Cline if used correctly. It's more consistent than all of them. People like Cline because it has a GUI
@ninjaxae 9 days ago
Please do a Cursor (DeepSeek-R1) vs Windsurf (Sonnet 3.5) video.
@MarvijoSoftware 9 days ago
Alright, I'll queue it up. The problem is that Cursor + R1 doesn't support Composer. Cursor vs Windsurf: ruclips.net/video/duLRNDa-CR0/видео.html
@ninjaxae 5 days ago
@MarvijoSoftware Yes, but you can plan with R1 and give the plan as a prompt to Composer, which will use Sonnet, just like Cline and Aider plan with one model and edit with another
@ninjaxae 9 days ago
Please use this voice in every video, and also please do a Cursor (DeepSeek-R1) vs Windsurf (Sonnet 3.5) video.
@MarvijoSoftware 9 days ago
The problem is that Cursor + R1 doesn't support Composer. Cursor vs Windsurf: ruclips.net/video/duLRNDa-CR0/видео.html
@aculz 9 days ago
So starting today, I will fully trust this Chinese company called “DeepSeek”, because they will slap every US company with their amazing LLM results that are cheap and OPEN SOURCE 😂 So in the future, when a US company releases something, I will just wait for DeepSeek to slap them 2-3 weeks later 😂
@ilyass-alami 9 days ago
No, DeepSeek R1 is number one, it's better than the new Gemini 2 Flash Thinking
@MarvijoSoftware 9 days ago
I agree. That's the LMArena leaderboard which is based on random people voting
@Arron_Mottram 9 days ago
I don't trust the chatbot arena, they put Claude 3.5 Sonnet in 11th place
@MarvijoSoftware 9 days ago
I also don't trust it, but I get it. It's not devs who vote on coding tasks, and people just cast random votes. That's why I believe actual benchmarks are needed, like the ones we do on the channel, and what Aider does. The problem with Aider benchmarks is that LLMs can train on them because they are public
@Ludecan 9 days ago
Hey man! Awesome channel you've set up here. So cool to have these comprehensive comparisons between coding tools in a real life Software Engineering project, even with comments about the expected complexity of the tasks and who could do it if it was a person 💯. I was wondering, have you tried Roo Cline vs Cline? Seems like Roo Cline is a faster developed version of Cline? And that they've now gone their separate ways. Also do you know if any of these are private if you run them with local models? I know Aider doesn't send anything to the internet but I'm curious about Cline/Roo Cline and Cursor too. Again many thanks. Will be following this channel closely.
@MarvijoSoftware 9 days ago
@Ludecan Thanks! I am planning on a Roo-Code vs Cline video hopefully this week. I haven't checked the Cline source to see if it collects user data, will check and let you know, it probably doesn't. Aider has an opt-in to send anon stats, but it defaults to off. Both should be safe with a local LLM; the hosted DeepSeek and Gemini Exp models definitely collect data
@Ludecan 9 days ago
@MarvijoSoftware that is awesome, thanks! I'll stay tuned for that comparison!
@Swooshii-u4e 10 days ago
Next time you should also include the little steps too: 1) 2:55 is this where the code is generated? 2) How did you save it? The file names didn’t seem to change, maybe you should’ve picked a whole new name. 3) Would it be better to just create a file first named testdeep.py and paste it in there, or doesn’t it matter much? Or does the file name need to be exactly “rest_api_test.py” for the test to work? 4) Do I need Aider to test this? 5) I think your next video should be the same test covering some of the things I mentioned, since I am about to attempt this even though I am a n00b. The difference would be that you would run this same DeepSeek test vs 5 Google AI Studio models with default settings + 5 models using value “0” for both temperature and top P, to see if this increases accuracy compared to the default 1 / 0.95 value settings. Also I want to see if your 2.3k subs will go up after this since it will bring the Google fans lol
@MarvijoSoftware 10 days ago
Thanks for the suggestions! 3) Yes, the unit test needs the filename to be rest_api.py verbatim. 4) I made this test so you don't need Aider, we just used the simple Aider prompt, but you can ask the LLM something similar. If you're stuck on anything, join the Discord and ask a question: discord.gg/SYn634DD
@Swooshii-u4e 10 days ago
I told you last month you are going to blow up because you are making videos no one else makes but people think about
@MarvijoSoftware 10 days ago
@Swooshii-u4e You did Swoosh! I will never forget you. Also, thanks for supporting this channel, I'll give you a shoutout in upcoming videos!
@ragibhasan2.0 10 days ago
This Open-source model is truly revolutionary!🔆🔆
@MarvijoSoftware 10 days ago
@ragibhasan2.0 it's not Open Source
@ragibhasan2.0 9 days ago
@MarvijoSoftware i mean deepseek r1 🤗
@AbdullahOllivierreIT 10 days ago
### Summary: Cursor vs. Cline AI Coding Battle

This video compares **Cursor** and **Cline**, two AI coding tools, as they tackle coding tasks in a large 240k-token codebase for a soccer prediction app. Here's a breakdown:

#### **Tools Overview:**
- **Cursor**: A proprietary Visual Studio Code fork with AI features. It has a Pro plan for $20/month (limited to 500 requests). Uses a vector database for context and embeds changes directly into the codebase.
- **Cline**: An open-source Visual Studio Code extension that works with most large language models. The demo uses **Claude 3.5 Sonnet**, which is cost-effective but less refined. Cline supports transparency by showing payloads sent to LLMs and has Model Context Protocol integration for system communication.

---

### **Task Comparisons**:
1. **Bug: Search Box Focus**
   - Cursor identified possible issues (focus being stolen/z-index issues) and implemented reinforcement code. The functionality worked perfectly.
   - Cline failed to fix the issue, even after multiple retries.
   - **Winner**: Cursor.
2. **Bug: Filtering Tournaments by Match**
   - Cursor used efficient embeddings to isolate context and fixed the bug effectively. Filtered search worked correctly.
   - Cline failed to display filtered results and repeated errors, even after additional attempts.
   - **Winner**: Cursor.
3. **Feature: Match News Summarization**
   - Cursor successfully implemented web scraping (using Serper and Playwright) and integrated Google AI's summarization for structured predictions. It added a functional frontend button to display the news and predictions.
   - Cline struggled with scraping, faced repeated errors, and could not complete the feature.
   - **Winner**: Cursor.

---

### **Key Observations**:
- **Cursor Advantages**:
  - Faster and more reliable bug fixes.
  - Fine-tuned small language models for quick tasks.
  - Effective use of embeddings for large codebases.
  - Generates structured summaries and predictions.
- **Cline Advantages**:
  - Open-source and model-flexible.
  - Transparency in API payloads.
  - Better integration with custom systems via Model Context Protocol.

---

### **Outcome**:
**Cursor outperformed Cline** in all three tasks, demonstrating higher efficiency, accuracy, and feature completion, despite Cline’s flexibility and open-source nature. For more detailed insights and future comparisons, the channel offers memberships for exclusive content.
@seoky6 3 days ago
You're a God!
@jishan6992 10 days ago
DeepSeek R1 is a gift to us, I am glad people are recognizing it
@MarvijoSoftware 10 days ago
Indeed it is
@Dom-zy1qy 10 days ago
I think the new Gemini models have been tuned pretty well in terms of human preference (at least I like the newer models more than their older ones). Claude imo is usually #1, then everyone else is about the same. However, from my usage of the model, it seems like Gemini 2.0 exp 1206's responses get pretty bad/mediocre after 60k tokens of context.
@MarvijoSoftware 10 days ago
All models get dumber as tokens increase. Also, they start to output random characters at a certain context length
@anandkanade9500 10 days ago
It's so satisfying to see the spreadsheet of performance at the end 👍
@MarvijoSoftware 10 days ago
😃
@ClipSeason3 10 days ago
Use Cline with DeepSeek R1. Very cheap. 👍
@mlsterlous 10 days ago
How the f is it number one on the arena? Like one or two days after release? I had to wait a lot longer to see phi4 in that list.
@MarvijoSoftware 10 days ago
I asked myself the same question after it was just released! So quick, who voted? Bots? Something might be off
@gemini_537 10 days ago
❤ Gemini 2.0
@holymemoly3833 10 days ago
I've been using DeepSeek V3 and it's so amazing. I turned off deep think and it became even better at coding. I hope it stays free forever
@rakly347 10 days ago
You know you can use vectorized data of your codebase with Cline too, right? With MCP tools. Best of all, you can ask the AI to do it for you, if you're willing to pay for tokens. Cursor does it natively.
@MarvijoSoftware 10 days ago
Yep, though it's a schlep
@blackpiller3777 10 days ago
I hope someday Cline with DeepSeek will be better than today's Cursor + Claude 3.5 Sonnet.
@MarvijoSoftware 10 days ago
@blackpiller3777 open source almost always wins
@the_proffesional1713 10 days ago
Isn't DeepSeek R1 not pretty insane at coding? I saw that it's a big improvement on math, GPQA and something else. Btw I'm not an AI researcher. But I think Claude 3.5 still beats any AI out there
@MarvijoSoftware 10 days ago
@the_proffesional1713 it's actually very very good at coding. In my tests and in the Aider polyglot coding benchmarks, it's only behind o1 high
@sergioduque94 9 days ago
@MarvijoSoftware So, between ChatGPT's options, is o1 better than o1 mini? I'm talking about o1 normal, not o1 pro. I thought o1 mini was better at coding than o1.
@andrewandreas5795 10 days ago
Thanks for the nice video. Please make a comparison of Roocline vs Aider vs Cursor
@MarvijoSoftware 10 days ago
@andrewandreas5795 A Roo-Cline video is incoming soon, after the R1 Architect video in a larger codebase
@mohegyux4072 10 days ago
DeepSeek's thought process is really impressive, I tested it on my own weird problems and it passed with flying colors
@geografiainfinitului 5 days ago
Yes, it's very good, and I noticed that it's not as verbose as the other tools.
@TheBuzzati 10 days ago
Appreciate your videos. Thanks for the comparisons and insights
@MarvijoSoftware 10 days ago
@TheBuzzati I appreciate your viewing 🙏🏾
@MrParad0x 11 days ago
#1 my ass. I tested it with 10+ prompts and it hallucinated a lot. It would suddenly stop generating a response in AI Studio. As of today, it's quite buggy and unreliable. I wouldn't recommend using this.
@MarvijoSoftware 11 days ago
@MrParad0x yep, I don't trust LM Arena in the slightest, especially after it ranked the weak o1-mini so high for coding

Marvijo AI Software
