Python 3.11 did not add just in time compilation. 3.11 is an improvement of the Python VM in how data is managed, especially object types. The bytecode is still interpreted though.
Thanks, I was wrong. But just in time compilation for small code fragments is planned to be implemented in 3.12, and for large ones in 3.13
@@CoderSpaceChannel oh wow, that's awesome. I didn't know they planned on doing this. I kept seeing them say they wanted to avoid JIT for as long as possible in search of other solutions.
Respect to the people who test things like libraries, etc.
Found that Arcade performance can be increased by 25% by replacing the line
*self.center_x, self.center_y = self.x, self.y*
in the SpriteUnit class with
*self.set_position(self.x, self.y)*
Is this a bug or a feature, I don't know how to interpret it.
And you can achieve even better performance if you don't care about collisions between sprites:
SpriteUnit:
super().__init__(hit_box_algorithm=None)
Hi, great video! I'm one of Arcade's maintainers. The difference in setting center_x and center_y vs using set_position is that internally the position in Sprite is represented as a tuple, and the center_x and center_y properties are mostly helper properties to make it easier to set/access just one of the values. So setting these both actually updates the position twice, whereas set_position only changes it internally one time.
I'm also curious if this is tested with Arcade 2.6? We have an upcoming 3.0 release that is approximately a year of development ahead of 2.6 and should offer many speedups, as well as alternative classes like BasicSprite which strips out a lot of the unnecessary weight of Sprite so you can build faster Sprites if you don't need some of the extra niceties of the full class. We're also experimenting with a package to provide speedups via Rust, but that is still very early stages and mostly focused on speeding up collision detection at the moment.
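A minimal sketch of the effect described above, using a hypothetical `TupleSprite` class (not Arcade's actual `Sprite`) whose `center_x` and `center_y` setters each rewrite a tuple-based position. The `position_updates` counter is an assumption added purely for illustration.

```python
class TupleSprite:
    """Hypothetical stand-in for a sprite; not Arcade's actual class."""

    def __init__(self):
        self._position = (0.0, 0.0)
        self.position_updates = 0  # counts how often the position is rewritten

    @property
    def position(self):
        return self._position

    @position.setter
    def position(self, value):
        self._position = (value[0], value[1])
        self.position_updates += 1  # a real sprite might also refresh buffers here

    @property
    def center_x(self):
        return self._position[0]

    @center_x.setter
    def center_x(self, x):
        # Helper property: rebuilds the whole position tuple
        self.position = (x, self._position[1])

    @property
    def center_y(self):
        return self._position[1]

    @center_y.setter
    def center_y(self, y):
        # Helper property: rebuilds the whole position tuple
        self.position = (self._position[0], y)


a = TupleSprite()
a.center_x, a.center_y = 10.0, 20.0  # two setter calls, two position updates

b = TupleSprite()
b.position = (10.0, 20.0)  # one position update

print(a.position_updates, b.position_updates)  # prints: 2 1
```

Both sprites end up at the same coordinates; only the number of internal updates differs, which is where the per-frame cost would come from with thousands of sprites.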
@@DarrenEberly Thank you for your contribution to the development of Arcade, this is a very promising and convenient library for creating 2D games.
Wow! pygame-gpu performance is unbelievable... on my MacBook Air M1 I reached 17401 sprites and holding 60 FPS.
And there are probably a lot of other tricks one could implement to improve upon that even further...
great video, and would like to see more with raylib!!!!
It's just the comparison I was looking for, I think it's great that you did it, thank you very much!
Your tutorials are amazing, and the speed-up after upgrading to 3.11 is really impressive!!!
great episode. i've always wanted to see a comparison between libs. good work!
I understand this is a Python comparison, but can't help to wonder what Raylib in C++ could've achieved.
I would imagine not much better because raylib python is calling c/c++ functions
I have seen a test somewhere else, with a simple setup, that showed a 3-5x increase in speed when using C++ with raylib.
Which is expected, since there is no interpreter.
I was curious as well so I coded it up in C with Raylib (I'm more comfortable with straight C than C++). The answer is: a lot faster. I downloaded his Python code to compare. Python could render 3,400 sprites at 60 FPS. C could render 44,500 sprites. At 3,400 sprites C was getting ~800 FPS. About a 13x performance difference.
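As a quick check of the figures quoted above, the two measurements can be turned into ratios. This is only arithmetic on the numbers in the comment (the 800 FPS value is stated as approximate), not a new benchmark.

```python
python_sprites_at_60fps = 3_400
c_sprites_at_60fps = 44_500
c_fps_at_3400_sprites = 800  # "~800 FPS" in the comment, so approximate

# Ratio of sprite counts each version sustains at 60 FPS
sprite_ratio = c_sprites_at_60fps / python_sprites_at_60fps
# Ratio of frame rates at the same sprite count (Python: 60 FPS)
fps_ratio = c_fps_at_3400_sprites / 60

print(f"sprite-count ratio: {sprite_ratio:.1f}x")  # 13.1x
print(f"frame-rate ratio:   {fps_ratio:.1f}x")     # 13.3x
```

Both ways of measuring land at roughly 13x, which matches the figure given.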
@@weirddan455 nice; thanks for sharing
Have just tried it here:
I tried 2 different ways: Using sprites (textures) and using shapes (drawing filled circles), using both Pygame and C Raylib. (Standard Python 3.7 interpreter, all with same algorithm (no sprite rotation, just borders collision), no numpy, no jit, nothing). As a bonus, also used p5JS for shapes.
First using circle shape:
Pygame reached 60 fps for 7.5K circles
C Raylib reached 60fps for 1.7K circles
p5JS reached 60 fps for 1.7K circles
So, Pygame, for circle shape drawing, was 4x faster than C Raylib or p5JS.
Then I changed for sprite (40x40 png, no rotation):
Pygame reached 60 fps for 3K sprites (performance dropped 50% from shapes to sprite, yes, I used convert_alpha())
C Raylib reached 60 fps with 95K sprites
p5JS not tested.
So, Pygame got half of its previous performance and C Raylib improved 53x, being 31x faster than Pygame when using textures.
For some reason, textures are easy for C Raylib and a pain for Python... and shapes are a bit easy for Pygame and a real pain for C Raylib.
Here is the code (simply change the comments to swap between textures and shapes). The sprite I used was a 40x40 png, a ball with a transparent background.
Environment: 7-year-old laptop: Win10 64-bit, i7 6th gen, NVIDIA 980M, 16 GB DRAM
****************************
C Raylib
****************************
#include "raylib.h"
#include <stdlib.h>  // malloc, free

typedef struct Ball {
    Vector2 position;
    Vector2 speed;
    Color color;
    int radius;
} Ball;

int main()
{
    const int W = 1600;
    const int H = 800;
    //SetConfigFlags(FLAG_VSYNC_HINT);
    InitWindow(W, H, "Sander");
    SetTargetFPS(240);
    const int N = 1700;
    //Texture2D textura = LoadTexture("ball2.png");
    Ball *balls = (Ball *)malloc(N*sizeof(Ball));
    // Give every ball a random position, speed, color and radius
    for (int i = 0; i < N; i++)
    {
        balls[i].position.x = (int)GetRandomValue(50, W-50);
        balls[i].position.y = (int)GetRandomValue(50, H-50);
        balls[i].speed.x = (float)GetRandomValue(-4, 4);  // GetRandomValue takes ints
        balls[i].speed.y = (float)GetRandomValue(-4, 4);
        balls[i].color = (Color){ GetRandomValue(50, 240), GetRandomValue(80, 240), GetRandomValue(100, 240), 255 };
        balls[i].radius = (int)GetRandomValue(5, 10);
    }
    while(!WindowShouldClose())
    {
        BeginDrawing();
        ClearBackground(RAYWHITE);
        for (int i = 0; i < N; i++)
        {
            // Move the ball, then bounce it off the window borders
            balls[i].position.x += balls[i].speed.x;
            balls[i].position.y += balls[i].speed.y;
            if (((balls[i].position.x + 30) > W) ||
                ((balls[i].position.x - 30) < 0)) balls[i].speed.x *= -1;
            if (((balls[i].position.y + 30) > H) ||
                ((balls[i].position.y - 30) < 0)) balls[i].speed.y *= -1;
            DrawCircle((int)balls[i].position.x, (int)balls[i].position.y, (int)balls[i].radius, balls[i].color);
            //DrawTexture(textura, (int)balls[i].position.x, (int)balls[i].position.y, balls[i].color);
        }
        DrawFPS(50,50);
        EndDrawing();
    }
    free(balls);
    //UnloadTexture(textura);
    CloseWindow();
    return 0;
}
****************************
Pygame
****************************
import pygame
import math
import random

pygame.init()
w, h = 1600, 800
window = pygame.display.set_mode((w, h))
font = pygame.font.SysFont("Arial", 30)
clock = pygame.time.Clock()
#imagem = pygame.image.load("ball2.png").convert_alpha()

def run():
    run = True
    q = 7500
    x = []
    y = []
    vy = []
    vx = []
    r = []
    cor = []
    # Give every ball a random radius, position, color and speed
    for i in range(q):
        r.append(random.randint(5, 10))
        x.append(random.randint(30, w-30))
        y.append(random.randint(30, h-30))
        cor.append((random.randint(10, 255), random.randint(10, 255), random.randint(10, 255)))
        vx.append(random.randint(-4, 4))
        vy.append(random.randint(-4, 4))
        if vy[i] == 0:
            vy[i] = 1
    while run:
        window.fill((0, 0, 250))
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                run = False
                break
        for i in range(q):
            # Bounce off the window borders, then move and draw the ball
            if y[i] > h-r[i] or y[i] < r[i]:
                vy[i] *= -1
            if x[i] < r[i] or x[i] > w-r[i]:
                vx[i] *= -1
            x[i] += vx[i]
            y[i] += vy[i]
            pygame.draw.circle(window,cor[i],(x[i],y[i]),r[i])
            #window.blit(imagem,(x[i]+20, y[i]+20))
        clock.tick()
        fp = int(clock.get_fps())
        txtsurf = font.render("FPS: " + str(fp), True, (0,0,0))
        window.blit(txtsurf, (25,45))
        pygame.display.update()
    pygame.quit()

if __name__ == "__main__":
    run()
********************************************
Can anyone explain the huge jump in C Raylib performance when using textures???
I'm curious whether you could look at Kivy for game programming, since pygame isn't supported on Android or iOS. There is Pygame Subset for Android, but that seems out of date and still doesn't support iOS. Mobile platforms are a major market, and Kivy seems to be the most recommended option, even for games. Also, what IDE are you using? Is it PyCharm? Still a Great Video! Great Job! :)
I think Ren'Py works on Android, but in my opinion it's not very optimised for that and is very visual-novel orientated. I think Kivy is still probably the best bet for Python on Android right now.
AWESOME video, congrats, if you can do more videos like this, pleassse.
Also, please make more 3D rendering things on python
Raylib is a pure C library with over 50 language bindings. I chose it in part for this reason. After developing a game in Python, you have the option of moving to another language for learning or performance reasons, while still keeping your knowledge of the library and all the same calls.
That's so true!
I plan to build a game engine for 2D games in Raylib.
Python and Raylib work really well together.
Also, Raylib is even simpler than PyGame and has a better implementation.
For example, if you want to get the key inputs only once in PyGame, you're dependent on pygame's own event loop. But in Raylib you have a function call for that and you can associate it with any class.
@@alexale5488 I'm probably switching to Godot. Raylib was fun, but Godot 4.x just has more features out of the box including easy exporting of your game to different platforms.
Nice trick with the cached rotations! I learned something today. If I hadn't watched this video, I'd never have thought of it, even though I'm familiar with caching.
Amazing video, thank you for sharing
Amazing test thanks a lot for it!
What technique do you use to type so steadily and fast?
Thanks, it was interesting to watch.
Damn! The pygame GPU is CRAZY! Do you know when this stuff was implemented? There seems to be essentially no information on this anywhere, but seems to be such a magical way to speed up pygame rendering yet no one is talking about it?
this is still under development and the final API may change
@@CoderSpaceChannel yeah, but it seems to be too useful of a feature to miss out on right now
@@danub5551 a big missing feature is loading textures with the gpu, but you can use the sdl2 functions with ctypes actually
They say the API is subject to change, but it hasn't changed in a year, so I'd say it's safe to at least test out. At worst you can adapt your code to any changes they make.
Your work is great! Keep it up!
😎 , cool video!
I've always meant to ask: What voice synthesizer do you use? It's so good :)
cloud.google.com/text-to-speech
Question: What would PyQt rank in your test?
How do I use the gpu with pygame?
Any updates on this SDL 2 Video?
Quick question - is it better to choose Pyglet or Arcade for game development?
If we talk about 2D games, then Arcade. But this is only for desktops - Windows, Linux, macOS. Arcade is not suitable for Android, you need something based on OpenGL ES or Pygame
This was unexpected. I love pygame, but I thought it would be the worst. Anyway, amazing video. One more thing: how did you install pygame on Python 3.11? Shouldn't it still be unsupported? I always get an error when using pip.
pip install pygame==2.1.3.dev8
what are the results for python 3.12 and what are they for the new 3.13 beta?
Hi! great job. Do you have any information about CPU and GPU ram consumption?
It would be interesting to also see some significant game logic running. These tests feel like profiling different c library wrappers.
Is there a good tutorial on how to use pygame sdl2?
Would PyGame GPU have any performance benefits from caching the same way PyGame CPU did?
How do you like the idea of creating Pacman on pygame?
Correct me if I'm wrong, but can't the performance of pygame be improved by using the .convert_alpha() method on the loaded images?
.convert_alpha() is used for surfaces in software (CPU) rendering. When using the GPU to create a texture from an image, .convert_alpha() will result in an error
did you consider pygame-ce ?
I wonder how that compares to something like Bevy or naylib (Nim)
Awesome tutorial, I tried pygame cpu and cpu cache using a sprite sheet instead of separate files and the performance is surprisingly the same, spritesheets are better for use with programs like Tiled.
Although I have one question, how would the numba and taichi libraries be used in this type of test?
using spritesheets only improves loading time, not performance while the program is running
You pay a lot just for the call overhead itself in python. By simplifying the code making less methods you can gain quite a bit of extra speed, but I think the classes you made are probably closer to what people make in real life. Maybe the separate translate and rotate method is overkill.
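A rough sketch of the call-overhead point, assuming two hypothetical classes (`SplitUnit` and `InlinedUnit`, neither taken from the video's code) that do the same arithmetic with and without separate `rotate`/`translate` methods. Timings depend on the machine and Python version, so no specific numbers are claimed here.

```python
import timeit


class SplitUnit:
    """Hypothetical sprite-like object with separate rotate/translate methods."""

    def __init__(self):
        self.x = self.y = self.angle = 0.0
        self.vx, self.vy, self.rot_speed = 1.5, -0.5, 2.0

    def rotate(self):
        self.angle = (self.angle + self.rot_speed) % 360

    def translate(self):
        self.x += self.vx
        self.y += self.vy

    def update(self):
        # Two extra method calls per update
        self.rotate()
        self.translate()


class InlinedUnit(SplitUnit):
    """Same arithmetic, done in one method body (two fewer calls per update)."""

    def update(self):
        self.angle = (self.angle + self.rot_speed) % 360
        self.x += self.vx
        self.y += self.vy


split, inlined = SplitUnit(), InlinedUnit()
for _ in range(1000):
    split.update()
    inlined.update()

# Both variants must end in exactly the same state
same_state = (split.x, split.y, split.angle) == (inlined.x, inlined.y, inlined.angle)

split_time = timeit.timeit(split.update, number=200_000)
inlined_time = timeit.timeit(inlined.update, number=200_000)
print(f"same state: {same_state}")
print(f"split: {split_time:.3f}s  inlined: {inlined_time:.3f}s")
```

The split version reads more like what people write in real projects, so whether the saving is worth the loss of structure is a judgment call.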
That's where a 4x4 matrix class with all of its operators and functions would come in handy.
Are the sprites not being batched? I made a sprite renderer in C and it can do 100K without breaking a sweat.
The problem is moving sprites, not drawing them. Arcade, for example, can easily draw several hundred thousand static sprites (albeit with a high load time), but moving them requires updating the data in the OpenGL buffers. Pygame pulls ahead when using the GPU here because it does parts of this in C, whereas Arcade is 100% Python.
Man, I am very good with Python, but I have never worked with game engines.
Your work is really good!
Where can I donate some money?
Thanks Button 👍
Do you recommend using Raylib for a commercial game?
I am considering Godot, but I am a Python engineer and I'd find it more pleasurable to keep everything in Python.
I have 1 year experience with PyGame, but Raylib seems more lightweight and more performant.
Imagine raylib with GPU.... Infinite power!
when are you going to make a video of raylib with angle vs pygame
Thank you
Very Nice .
I think you should use a sprite sheet instead of individual sprite images. Should make everything faster.
thanks, helpful advice 👍
it makes stuff faster only for stuff using the gpu i guess, using the same texture means less processing
@ThinkingSpace doesn't rotation and movement use GPU? Changing texture coordinates could be done in vertex shader
@@yds6268 gpu can rotate on the go yea, but in cpu cache mode the surface also already exists so it's the same
@@thinkingspace3438 Not exactly... If the data is already in the CPU cache, sure, the CPU could probably process a small data set faster than the GPU could. But when you have large data sets of the same type, the GPU is much faster because of the sheer number of its cores. You are limited by the bus transfer speed from the CPU to the GPU, so sending large batches of similar data to the GPU and letting it do those calculations is much faster than going back and forth between the CPU and the GPU hundreds or even thousands of times per frame. That back-and-forth causes a major bottleneck. It is not the same.
should test with pygame-ce (pygame community edition)
In Arcade you can cheat and hook up a compute or transform shader writing to the position and rotation buffer in the spritelist if you want to get millions of sprites. It's probably cheating... but you can do it :D
pls a tutorial about pygame.sdl2_video
In your Raylib class, every call to update() and draw() creates garbage: a list that is never used. An imperative "for" loop would be more efficient.
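A small sketch of the pattern being criticized, assuming the video's code used something like a list comprehension purely for its side effects (the `Unit` class here is hypothetical). The comprehension builds and throws away a list of `None` values on every call, while the plain loop allocates nothing extra.

```python
class Unit:
    """Hypothetical stand-in for a sprite object."""

    def __init__(self):
        self.updates = 0

    def update(self):
        self.updates += 1


units = [Unit() for _ in range(5)]

# Comprehension used only for side effects: builds a throwaway list of None
discarded = [u.update() for u in units]

# Plain loop: same side effects, no temporary list
for u in units:
    u.update()

print(discarded)                   # [None, None, None, None, None]
print([u.updates for u in units])  # [2, 2, 2, 2, 2]
```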
don't know why, but I have python 3.11 and the Arcade test went a little better than the pygame + gpu version, while the raylib was the same as pygame + gpu
It's probably the hardware that you are using.
Yeah, other Python game engines are also important.
But how about Raylib with C? Surely that is a lot faster?
Go with Nim (naylib) if you want Python-like syntax and speed
Your speed is like 7×. Can you please slow it down a bit? I would be very thankful. And can you please make a game with two ships, one under our control and the other controlled automatically by the computer?
you should add pygame2
Arcade is not better at everything, though.
I tried an n-body simulation with about 500 particles (each only 1 pixel).
Pygame did perfectly fine, at around 50-60 FPS.
But Arcade was at like 2-5 FPS.
It's definitely not a benchmark test if you use different hardware to get the results.
All of them should be benchmarked on the GPU or on the CPU and compared objectively. Otherwise you are fudging the results in favor of Pygame, because you kept reimplementing it just so it would outperform the others, while failing to ensure GPUs were used for the others as well.
Biased data is useless data.
By just changing self.center_x, self.center_y = self.x, self.y to self.position = self.x, self.y you can increase the Arcade test count from 3k to 4k (on my machine). You are moving each sprite twice per frame, and it looks like moving is an expensive operation in Arcade. It sounds stupid, I know, but it really works. You can easily achieve 6k on Arcade by skipping the position-change processing for each sprite.
To skip Arcade's position-change processing for each sprite:
Add to the SpriteUnit init:
    self.position = [self.x, self.y]  # use list for sprite coords
SpriteUnit:
    def update(self):
        self.rotate()
        self.translate()
        self.position[:] = self.x, self.y  # skip setter
SpriteHandler:
    def update(self):
        self.sprites.update()
        for s in self.sprites:
            self.sprites.update_position(s)  # update gl buffer
It looks like using Sprite just for drawing is overkill. It has too much logic in Python.
Indeed, it matters.
self.set_position(self.x, self.y)
i think it's ok to skip setter when you don't need the collision functionality
@@CoderSpaceChannel
set_position is just
self.position = (center_x, center_y)
so it still includes the same overhead.
PS. I found how to get 5k without "hacking"
just need to disable hit_box in SpriteUnit:
super().__init__(hit_box_algorithm=None)
@@itnabigator Thanks, that's pretty useful information when developing with Arcade
@@CoderSpaceChannel Thanks for your videos :) Without them I would not even have tried to investigate that.
The last time I checked, Raylib didn't have any Python support (and was almost strictly C++, so that was long ago...), so this news brings me joy.
You could also try using @lru_cache from functools for sprite operations. It could be helpful. Or not, I haven't checked.
(Or even better, just @cache, because it's the same as @lru_cache(maxsize=None).)
And as somebody before already pointed out -> JIT is not here yet, but some data management is...
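A minimal sketch of the caching-decorator suggestion above, using `functools.cache` (Python 3.9+, equivalent to `functools.lru_cache(maxsize=None)`). The `rotated_offset` function and the `calls` counter are hypothetical stand-ins for an expensive per-angle sprite operation; whether this actually helps in the video's test has not been measured here.

```python
import math
from functools import cache  # Python 3.9+; same as lru_cache(maxsize=None)

calls = 0  # counts how often the function body actually runs


@cache
def rotated_offset(angle_deg: int) -> tuple[float, float]:
    """Hypothetical expensive per-angle computation (stand-in for rotating an image)."""
    global calls
    calls += 1
    rad = math.radians(angle_deg)
    return (math.cos(rad), math.sin(rad))


# Two full turns in whole-degree steps: the second turn is served from the cache
for frame in range(720):
    rotated_offset(frame % 360)

print(calls)                             # 360
print(rotated_offset.cache_info().hits)  # 360
```

The arguments must be hashable, and quantizing the angle to whole degrees keeps the cache bounded at 360 entries; with arbitrary float angles an unbounded cache could grow without limit.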