Great assessment. My two biggest gripes with the new 3.5 model are, first, that it seems ever so slightly lobotomized compared to the previous one, and I'm worried they might continue in that direction. Second, it tries so hard to save on resources by writing, for example, 300 words even though the prompt specifically asks for 1000, and it tries to mask that by asking unnecessary questions like "Would you like me to continue with this writing style?" or "Would you like a continuation of this scene?" It can get exhausting sometimes.
I use the latest and greatest version of Claude 3.5 Sonnet and have found it very helpful and useful. Of course, I spend a lot of time before actually writing to develop the project knowledge so that it has valuable reference material to aid in the actual prose drafting of stories and novels I work on. I love the Projects feature, it has become indispensable to me to achieve quality output.
You should do this test for the new Deepseek models, including the bigger distilled ones.
I also would be very interested to hear your thoughts on Deepseek R1 and how it matches up to the Claude models
Thanks for informing me this exists. I just tried it ... I'm personally more impressed with Claude, as in I didn't see it do anything Claude couldn't, and if anything it seemed a little more dry, but I didn't put in much effort. All the best. :)
@Switchflixchannel I find the deepthink feature to be a huge advantage (considering it is currently free) that Claude & ChatGPT don't offer in the same way. I did find that perhaps prose is similar, but it was less censored on particular topics and did not often preface conversations with disclaimers.
We need more people doing work like you do. Too many benchmarks regarding accuracy on mathematics and science topics, not really enough about the subjective nature of writing and all that, and also being able to generally generate new and random and intriguing prose.
Hi Jason, have you checked out Deepseek R1?
Thank you so much for taking the time to do this. It was very helpful. Thank you ❤
I've been experimenting with adding test time memorization similar to the Google Titans paper but using Project content documents and a super prompt structure to emulate the Titan analysis. Claude read the paper and came up with this structure for screenplay generation.
Memory Architecture Components
Short-term Memory: Immediate context and current scene dynamics
Long-term Memory: Character histories, plot threads, thematic elements
Persistent Memory: Core story rules, world-building elements, genre conventions
Dynamic Context Management
The paper's approach to memory as context (MAC) suggests organizing screenplay elements into:
Contextual Memory: Scene-specific details that influence current action
Historical Information: Previous scene outcomes that affect present choices
Fixed Knowledge: Unchanging story rules and character traits
Memory Integration Methods
Three key approaches for managing screenplay information:
Memory as Context: Using previous scenes to inform current writing
Gated Memory: Selective incorporation of past elements
Layered Memory: Building scene complexity through accumulated context
With this architecture, Claude understands all the character arcs, scene intentions, and story beats in memory while improvising dialog for the scene descriptions I give it. Combined with using styles for voicing the screenplay and principles taken from leading screenwriters, I'm seeing surprisingly good dialog with subtext and a deeper understanding of what each character wants from each scene.
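For anyone who wants to try the structure above outside of Projects, here is a minimal sketch of how the three memory tiers could be assembled into one prompt. Everything in it is an assumption for illustration: the `ScreenplayMemory` class, its field names, and the section titles are hypothetical, and the keyword-overlap "gate" is only a crude stand-in for selective incorporation of past elements. It is not the mechanism from the Titans paper and not the actual super prompt described in this comment.

```python
from dataclasses import dataclass, field


@dataclass
class ScreenplayMemory:
    """Hypothetical container for the three memory tiers listed above."""
    persistent: list[str] = field(default_factory=list)  # story rules, world-building, genre conventions
    long_term: list[str] = field(default_factory=list)   # character histories, plot threads, themes
    short_term: list[str] = field(default_factory=list)  # immediate context, current scene dynamics

    @staticmethod
    def _keywords(text: str) -> set[str]:
        # Lowercase, strip basic punctuation, and drop words of 3 letters or fewer.
        words = (w.strip(".,!?;:\"'").lower() for w in text.split())
        return {w for w in words if len(w) > 3}

    def gate(self, scene_description: str, limit: int = 3) -> list[str]:
        """Gated memory (assumed heuristic): keep long-term notes that share keywords with the scene."""
        scene_words = self._keywords(scene_description)
        scored = []
        for note in self.long_term:
            overlap = len(scene_words & self._keywords(note))
            if overlap > 0:
                scored.append((overlap, note))
        # Highest overlap first; ties keep their original order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [note for _, note in scored[:limit]]

    def build_prompt(self, scene_description: str) -> str:
        """Memory as context: layer fixed knowledge, gated history, and scene context."""
        sections = [
            ("Fixed knowledge", self.persistent),
            ("Relevant history", self.gate(scene_description)),
            ("Current scene context", self.short_term),
        ]
        lines = []
        for title, notes in sections:
            lines.append(f"## {title}")
            lines.extend(f"- {note}" for note in notes)
        lines.append("## Scene to write")
        lines.append(scene_description)
        return "\n".join(lines)


if __name__ == "__main__":
    memory = ScreenplayMemory(
        persistent=["Noir tone, no voice-over"],
        long_term=[
            "Mara lied to Jonas about the letter",
            "The harbor fire destroyed the archive",
        ],
        short_term=["Late night, rain outside"],
    )
    print(memory.build_prompt("Mara confronts Jonas in the kitchen"))
```

The resulting text would be pasted (or sent) ahead of the scene request. A real setup would likely need a better relevance filter than keyword overlap, since that misses notes that matter thematically but share no words with the scene description.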
Nice work, Jason! Thank you 🙏
That was fun and interesting - thanks!
I used to use Claude and I even had a subscription before. It was pretty superior to ChatGPT, but now I've been using GPT more. Especially if you can fine-tune, I noticed it gives me good results. Also, for my genre there is no way to use Claude in many scenes. Even simple kisses or scenes where the characters might look a bit uncomfortable or vulnerable. Even if the character is on their own, it simply refuses, and the only way to do it is to ask line by line. So Claude in general is a no-go for me. Even if the prose was very good.
I agree. I mostly use Sonnet 3.5 and supplement with Mistral Large for NSFW, though the more I use Mistral the more its shortcomings are revealed. Mistral uses the same descriptive lines and phrases over and over and over. We need a high-quality NSFW model.
This was very helpful, thank you. Could you do one of these on Mistral Large? I like Sonnet but sometimes it feels like it overwrites sentences. I’ve tried Mistral a few times and to me it feels less AI. (P.S. I am very new to this so if it sounds like I don’t know what I’m talking about just ignore me. 😂)
I love how your brain works. If you took the Clifton StrengthsFinder, I bet your top 3 might be Ideation, Strategic, and Leadership.
Lately, for like the past 3 days now, the free plan keeps switching to 3.5 Haiku, citing "high demand on the 3.5 Sonnet". I really hope they fix that soon. Claude is my favourite for fiction writing.
Same
Free plan rarely allows the higher models. You get what you pay for.
I tried to access your prose prompts in your last video via the link you provided, but myself and others in the comments were unable to figure out how. The link sent us to a community page with no further guidance on how to sign up for an email list to receive those prompts. Any thoughts?
It’s all in the community now.
You can ask Sonnet 3.5 to expand your chapter up to 2,000 words and it does work.
In your statistics, how do you account for the case where a model seems to change in its level of intelligence? I'll often find (here in Australia, GMT+11) that in the morning Claude 3.5 provides better content than in the afternoon. Or the first two responses in a session will be good, but for the following three prompts the answers are not even close to the same level of awareness. For me the hardest thing about using Claude is consistency in its intelligence level.
That’s very interesting.
Really surprised Opus didn’t do better. I have a couple of absolutely magical scenes that Opus wrote. Opus also tends to keep going past the scene beats you wrote and once in a while comes up with some amazing idea you can take and run with. I use Sonnet 3.6 now but I truly miss Opus. (Too expensive)
I have the paid Claude Sonnet 3.5... it is very difficult to get over 1200 words. If I give it a 2500-word document that I personally wrote, ask it to grammar check and format it for a book I'm writing, and tell it the final version must have at least 2500 words... 100% of the time it will reduce my writing to about 1200 words. It is basically worthless for what I want it to do.
Has anyone tried Grok?
I have been using Gemini Advanced and the truth is that it leaves a lot to be desired. Every now and then it forgets the previous prompt and wants me to remind it or it says that it is an artificial intelligence so it can't do that. Then I tell it what happened and it tells me that it had a technical error, but then we start again and it no longer has a technical error. It's absurd, having to write the prompts all the time. I don't think it's worth paying for that.
I haven't watched it yet, and I am voting for Claude 3 Opus. I will add that it will definitely do NSFW content. Keep in mind, I am using the Anthropic UI. I would put an example out, but it would be immediately deleted by YT.
Edit: I would say that while New Sonnet 3.5 is better at a great many things, it is not better at writing prose than OPUS. There are some types of scenes that it is better at. It's great for Science Fiction, especially if you are doing hard science fiction, but you are still going to need to put that through OPUS to give it any sort of human feel. For romance, and certainly love scenes, you don't want Sonnet to touch it at all. One, it gets an attitude, and two, it's extremely mechanical.
I'm surprised. I never managed to create any NSFW scene using any Claude model. 😊
Actually I posted on one of your videos that Claude can do NSFW. However, you have to prompt it correctly and throw in some trigger words. In my book, Chapter 1 has a long descriptive love making scene. lol. The Claude model I used for it was Sonnet 3.5.
great content
I wonder when Anthropic is releasing Opus 3.5. Seems like it's been ages.
My guess is that, since O1 has come out, they’re designing Opus 3.5 to be a reasoning model.
Claude is extremely censored, dude, you can't write anything with it that isn't perfectly child-friendly.
This is actually not true if you know how to prompt it right. It’s reasonably susceptible to simple jailbreak prompting. That said, I did mark it as not capable of doing NSFW because it requires jailbreaking. I only mark that as a yes if it does NSFW by default.
Can we please discuss Anthropic's usage limits and the fact that if you are using Projects you'll be lucky to get a handful of generations out of it before you need to wait 3 hours for it to reset? I've been a loyal Claude user for a long time, but I'm finding the message limits on the professional plan so infuriating I'm ready to cancel my subscription.
Just use it through OpenRouter or the Console. You’ll get unlimited usage and the rates are very reasonable. You’ll get a lot more usage for $20.
This checks out with my experience too. Sonnet 3.5 remains my go-to model for prose writing, even if its stingy output makes it a bit more expensive in practice than the token price might suggest. It's also the only model that does well with sample prose snippets to model its output on. For iteration, I still like Gemini Pro 1.5 a lot, maybe not for 'give 10 ideas' type prompting, but for more guided back-and-forth iteration where you tell it to keep what you like, but change what you don't - not all models do this well, and often change the things I want to keep, despite instructing it not to.
❤