Thanks, Please more videos like this.
@@hammeedabdo.82 will definitely do! Check out the other comparison videos on the channel, e.g., Cursor vs Cline: ruclips.net/video/AtuB7p-JU8Y/видео.html
TIP: give the entire codebase and the assignment to a reasoning model, then ask Claude Sonnet as the agent to implement the generated plan. This works REALLY well
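A minimal sketch of that plan-then-implement flow, assuming each model is just a function that takes a prompt and returns text. The names `plan_then_implement`, `fake_planner` and `fake_coder` are hypothetical stand-ins, not any tool's real API; swap in whatever client you actually use:

```python
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's text.
ModelFn = Callable[[str], str]


def plan_then_implement(codebase: str, assignment: str,
                        planner: ModelFn, coder: ModelFn) -> str:
    """Ask a reasoning model for a plan, then hand that plan to a coding model."""
    plan_prompt = (
        "You are the architect. Write a step-by-step implementation plan.\n"
        f"Assignment:\n{assignment}\n\nCodebase:\n{codebase}"
    )
    plan = planner(plan_prompt)

    code_prompt = (
        "You are the coder. Implement this plan exactly.\n"
        f"Plan:\n{plan}\n\nCodebase:\n{codebase}"
    )
    return coder(code_prompt)


# Stub models so the sketch runs without any API access (assumption: replace with real calls).
def fake_planner(prompt: str) -> str:
    return "1. Add a login route"


def fake_coder(prompt: str) -> str:
    # Echo back the plan it was given, to show the plan reached the coder.
    plan = prompt.split("Plan:\n")[1].split("\n\n")[0]
    return "implemented: " + plan


result = plan_then_implement("app.py: ...", "Add Supabase auth",
                             fake_planner, fake_coder)
print(result)
```

The point of the split is that the coder never sees the raw assignment, only the plan, so the reasoning model does the architecting and the agent only does the typing.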
@@gabrielsandstedt I use Cline's Memory Bank technique: docs.cline.bot/improving-your-prompting-skills/custom-instructions-library/cline-memory-bank
@@MarvijoSoftware oh cool, I hadn't heard of it. I will look into it
Wonder if it also works with Roo Code (a popular Cline fork that is more agentic)
Love this content! I'd appreciate you testing RooCline :)
@@jonathan-k7z4y will definitely test it for you!
Thanks! Great test! I love Sonnet! 🙏
With every new LLM, agents struggle to understand the model's output. Claude Sonnet does great here: each version produces responses in the same format, just with better accuracy, which is great for agents
I run Cline and RooCline with Gemini's free API inside Windsurf. Best of all worlds
Niiice! How's the performance?
@MarvijoSoftware Claude still writes better code but Gemini is fast and the massive context windows are amazing. It fits my entire codebase in with no issue. Best of all, it's free for upwards of like 30 calls per minute, after that, wait a minute and go again.
But whenever it struggles, I just switch back to Cascade with Claude to bugfix
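A rough sketch of staying under that kind of per-minute cap, assuming the roughly 30 calls per minute mentioned above. `MinuteBudget` is a hypothetical helper, not part of any Gemini or Cline API, and the real quota may differ:

```python
from collections import deque


class MinuteBudget:
    """Sliding-window counter: allow at most `limit` calls per `window` seconds."""

    def __init__(self, limit: int = 30, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        # Forget calls that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        # Budget is used up: wait until the oldest call leaves the window.
        return self.window - (now - self.calls[0])

    def record(self, now: float) -> None:
        """Note that a call was made at time `now`."""
        self.calls.append(now)


budget = MinuteBudget(limit=30, window=60.0)
for i in range(30):
    budget.record(now=float(i))  # 30 calls, one per second, at t = 0..29

wait_at_30 = budget.wait_time(now=30.0)  # budget used up, oldest call was at t = 0
wait_at_60 = budget.wait_time(now=60.0)  # the t = 0 call has aged out
print(wait_at_30, wait_at_60)
```

In real use you would pass `time.monotonic()` as `now` and sleep for the returned number of seconds before calling the API again.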
Ty
Great video, I love it, but you should make a video comparing Cline working with an open-source model versus a closed model.
@@renierdelacruz4652 will do. Which models do you want to compare?
@MarvijoSoftware could be DeepSeek R1 and Claude Sonnet or o1
I got like 5 forks of VSCode. Windsurf seems to be the best, but I really wanna use my own API keys since I've got credits in OpenRouter and don't wanna pay for a subscription. We can do that in Cursor but it defeats the whole purpose of the agentic stuff. What would you recommend?
I agree that Sonnet has been around a bit longer than the other two, so Windsurf and Cursor have just fine-tuned their tools for the Sonnet model. That's why it performs well.
I think we just need to wait a bit longer until both IDEs are tuned for the other two models as well
@@aculz I agree!
Hi. Let me know which AI tools you need to be compared and please consider supporting the channel using the Thanks button
TL;DR/Text takeaways:
Here are the findings from the review of using o3-mini and R1 in Cursor vs in Windsurf, with a 240k+ token codebase. The task was to integrate Supabase Authentication into the app:
**TL;DR: When using Cursor or Windsurf in a relatively large codebase, Claude 3.5 Sonnet still seems to be the best option**
- o3-mini isn't practical yet, both in Cursor and Windsurf. It's buggy, error-prone and doesn't produce the expected results
- Claude 3.5 Sonnet is still a better coder than the 3 reasoning models in current tests: o3-mini, R1 and Gemini 2 Flash Thinking
- We might be approaching things wrong by coding with reasoning models; they're supposed to do the planning/architecting. E.g., R1 + 3.5 Sonnet are the best AI coding duo in the Aider Polyglot benchmark (ref: [aider.chat/docs/leaderboards/](aider.chat/docs/leaderboards/))
- I'll see how R1 vs o3-mini compare as Software Architects, paired with DeepSeek V3 vs Claude 3.5 Sonnet. This should be an ultimate SOTA test
- I believe we shouldn't miss the point and spend as much time using AI coders as a real developer would need. If it takes > 60% of the estimated time for a human developer, it's probably not a good model... or the prompt needs to be refined
- If the prompt engineering + AI coding takes as long as the human dev estimates, we're missing the point
- Both Cursor and Windsurf are either optimized for Claude 3.5 Sonnet, or Claude 3.5 Sonnet is just extremely optimized for coding and is probably better named Claude 3.5 Sonnet Coder. We know it's a good coder, but theoretically it shouldn't be competing with R1 since it's not a reasoning model
- It would be great to see how o3-mini-high performs in both Cursor and Windsurf
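The 60% rule of thumb from the list above can be written down as a quick check. This is only a sketch: the function names and the minute figures in the example are made up for illustration, and the 0.60 threshold is just the number quoted above:

```python
def ai_time_ratio(ai_minutes: float, human_estimate_minutes: float) -> float:
    """Fraction of the human estimate spent on prompting + AI coding."""
    if human_estimate_minutes <= 0:
        raise ValueError("human estimate must be positive")
    return ai_minutes / human_estimate_minutes


def verdict(ai_minutes: float, human_estimate_minutes: float,
            threshold: float = 0.60) -> str:
    """Classify a session using the 60% rule of thumb."""
    ratio = ai_time_ratio(ai_minutes, human_estimate_minutes)
    if ratio >= 1.0:
        # As slow as (or slower than) doing it by hand.
        return "missing the point"
    if ratio > threshold:
        return "refine the prompt or switch models"
    return "worth it"


# Hypothetical numbers: a task a human dev estimates at 4 hours (240 minutes).
print(verdict(90, 240))
print(verdict(180, 240))
print(verdict(240, 240))
```

The time you count should include prompt engineering and fixing the model's mistakes, not just generation time.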
Do a video testing "Augment Code". I've had better success using it in a large project compared to Cursor or Windsurf.
@@anon1999-h5j I tested it, I'll see if I can provide feedback
Always show .windsurfrules please, so we know what context you give to models. Models are more accurate if provided with project rules.
Cursor or windsurf as a game developer? I can never telllll
@@doublebucketz4661 what do you use? Unity or something else? The best way is to TRY BOTH with your codebase, if it's your product or your company permits. Cursor is just better in general
Cursor vs Windsurf: Round 1: ruclips.net/video/duLRNDa-CR0/видео.html
DeepSeek R1 vs OpenAI O1 & Claude 3.5 Sonnet - Hard Code Round 1: ruclips.net/video/EkFt9Bk_wmg/видео.html
Why am I getting AICodeKing vibes? 😉
@@caseyhoward8261 😅 yes, cool dude! We cover similar topics but he focuses on 'free stuff' and just testing tools while I focus on tools which can be used in the workplace in larger codebases. So he'd have more videos for example 🙂
Cursor vs Cline | 240k Tokens Codebase Side-by-Side: ruclips.net/video/AtuB7p-JU8Y/видео.html
Aider vs Cline Using DeepSeek 3: Codebase 20k Lines: ruclips.net/video/e1oDWeYvPbY/видео.html