I would truly appreciate a homelab setup from you, love your videos, God Speed!
Thank you so much!! I'm really looking forward to making the homelab content. Glad you are enjoying the channel as well!
Absolutely, thank you for making the content. You get exactly what you deserve :)
Oh yes please! Love this concept and the surprise factor!
Be aware that in AOF (Append Only File) persistence, Redis saves logs every second by default, not every write.
You are correct! I didn't explain this caveat well in the video.
I thought so too, but the video made me doubt whether I was confusing it for something else. The reason it's called AOF and not WAL is because it's not written ahead of the change itself 😁
You can make it log every write, but it's slow.
If Redis did that, it would lose much of its speed on write operations.
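To make the tradeoff in this thread concrete, here is a minimal sketch of the relevant settings using the redis-py client, assuming a local instance you are allowed to reconfigure at runtime (the same values can also go in redis.conf):

```python
import redis

# Assumes a local instance; adjust host/port for your setup.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Turn on the append-only file and pick an fsync policy:
#   "everysec" - default: fsync roughly once per second (up to ~1s of loss)
#   "always"   - fsync on every write, safest but slowest
#   "no"       - let the OS decide when to flush
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")

print(r.config_get("appendfsync"))  # e.g. {'appendfsync': 'everysec'}
```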
I found Redis to be too slow, so I sped things up by no longer storing any data
Hey! Nice one! Online Engineering Lead at Ubisoft here... Sorted Sets in Redis are the go-to solution to do leaderboards in the game industry actually, but you're constantly optimizing to battle server costs since everything stays in memory, and larger leaderboards need machines with more memory, which happen to be more expensive.
But honestly, as soon as you want persistence and you enable AOF for every write operation, you'll start to run into performance issues on Redis too. There's no magic bullet in engineering - weigh the pros and cons for each use-case and then choose the tool.
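For readers who haven't seen the pattern, a minimal sorted-set leaderboard sketch with redis-py; the key name, players, and scores are made up:

```python
import redis

r = redis.Redis(decode_responses=True)  # assumes a local instance

# ZADD keeps members ordered by score.
r.zadd("leaderboard", {"alice": 3120, "bob": 2890, "carol": 4050})

# Bump a player's score after a match.
r.zincrby("leaderboard", 150, "bob")

# Top 3 players, highest score first.
print(r.zrevrange("leaderboard", 0, 2, withscores=True))

# A single player's 0-based rank and score.
print(r.zrevrank("leaderboard", "alice"), r.zscore("leaderboard", "alice"))
```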
Nice to see a fellow Indian here. Preach brother. ❤
Hey! Awesome to hear from someone actually using Redis in this use case. If I may ask about your experience a bit more, how much memory was Redis utilizing per instance in a server or a pod in Kubernetes for a given X number of entries (X number of users most likely) in a leaderboard? I want to gauge if the in-memory tradeoff is worth it for the speedy performance compared to an RDBMS (I'm assuming Redis is much more lightweight compared to those databases as it doesn't have a lot of those abstractions and advanced structures built in). I'm very excited to hear about your experience on this.
Why not use a relational database for the leaderboards?
@@Peter-bg1ku write locks and index rebuilding would be a nightmare in this kind of scenario. The former is somewhat a solved problem (most storage engines have row-level locks) but the latter is detrimental to the underlying behavior of a relational DB.
@@bepamungkas good point. Yeah, indexes would be a massive pita to manage. But what if you stored your relational DB in memory? Would it not be quicker?
You don’t need a cache if your database has enough memory. You don’t need a database if your cache has enough disk space. You don’t need any of them if you have no users.
@@jack171380 you don't need it if the data is not dynamic and no writes are performed; such data can be stored in a file too. I've seen many people who don't really need writes and updates, only reads, yet store the data in a DB instead of a file.
@@jack171380 love it!
Perfect example for: Just because we can doesn't mean we should.
But content-wise this video is top notch
@@mythbuster6126 I've used Supabase (which is a software stack that includes PostgREST and Auth, etc. using PG) for a few projects and it really speeds up development to not have to create tedious CRUD endpoints for everything. Database logic using PLpgSQL is nice but yeah, we also had the problem that changing something in the procedure code is annoying because you have to recreate the whole method in a migration.
I see this more as for this kind of situation:
"Redis would be perfect for this use-case, except that for this little bit of functionality we need to do a 'relational' operation"
And instead of going "welp, then let's discard Redis", you can go "for that case we can just setup [workaround] and this design would still work"
Same feeling, you have to plan ahead and do so many workarounds to replace what in SQL is basically free.
@@RoyBellingan It's more than that, fully replicated Redis is so expensive. What would be ~$600 USD in SQL is like ~$1300 in Redis.
@@JohnWillikersmoney for infrastructure does not matter for small companies (in comparison with SE salaries). The worst thing happens when a system has a lot of cache that could be removed with some tricks and a better understanding of the system as a whole.
In my past experiments with using redis as a primary store it tended to not scale well once the project grew to a certain level of use and complexity. I often use it as a primary store for specific slices of a given application's data persistence strategy when and where it provides advantages over a traditional RDB. Inserts when using a RDB tend to be expensive (especially if the table has a bunch of indices) so Redis can really shine in very write heavy areas of your app like messaging, sensor data capture, etc.
Dedicated HomeLab pls
Absolutely!
Let's gooo 🥳
Lets goo!
Yes, so we can figure out the networking and how to do it securely.
yayyy
Can confirm: you were using redis the right way at the start. Enjoyed watching this but wowzer, I can't imagine inheriting a project setup this way...
yeah this feels like the work of the smart quirky programmer who didn't want to use traditional databases because "i can do it myself with redis and it will be faster because redis"
I don't think a serious software engineer would actually use redis like this for a work project
Yep, cool video and nice production quality, but it’s pretty irresponsible to seriously recommend doing this without many massive disclaimers.. I pity the junior devs who are going to watch this and blindly implement it in important projects in place of an RDB.
He is also missing another key point - if you give any traditional RDBMS (Postgres, Maria, whatever) enough memory for buffer cache, disable write-ahead logging and use index-organized tables, you will get what Redis gives you out of the box but with some free extra features.
True. At some point technologies become a vocabulary. So when you say "Redis", others read it as "Cache, I know how to use it". This dude just took the words and made his own Esperanto with them. That's like using SQL as a message queue (which I've also encountered and ran away from immediately).
@@harleyspeedthrust4013 without a database schema definition, how does one figure out what is going on in a redis database? With an RDBMS, you can inherit a database without docs and sort of figure out what's happening. Not sure if the same is true for Redis. Very good tutorial as I learnt some neat Redis tricks but I wouldn't recommend this to my worst enemy as the way to build apps with persistence.
Next video idea: most people scoff at the idea of using filesystem as a database, but did you know we can recreate Postgres with fs in c++ and achieve similar functionality and safety
This is a fun idea. Maybe even building a simple database from scratch!
basically a sqlite
I’ve used git as a versioned nosql fs db. Beware of inode exhaustion
use pastebin instead of the local file system and boom, cloud database! give me 500 million in funding plz k thnx
I used a file system as a key/value store DB about 8 years ago and it was very simple and very fast! Granted, looking back it had many holes, such as the potential for race conditions as addressed in this video, but it still did the job beautifully!
So instead of using postgres I can use redis and recreate most of the SQL functionality myself, with less safety and more complexity and space for error...
I struggle to see why to use this in any real world setting. Even for side projects, if I wanna set something up quickly, I don't wanna have to create indexing etc. myself before being able to develop the app. I'd also be interested in how the performance actually plays out once you start parsing long JSONs; maybe it will still be faster, but it's something you didn't consider. Also, if you eventually decide to switch to Postgres, that could be a really hard rewrite.
For years now, once in a while some devs say Redis can be used as more than a cache, but they themselves never use it as the main DB or even as a message broker (because of its limitations); they only use it as a cache. If they use it as a main DB, that's only for a short time and they end up switching back to relational DBs. So just because Redis can be used as more than a cache doesn't mean it should be. Redis is the ideal choice for caching, similar to how SQL DBs are ideal in supporting multiple tables and joining those tables.
I see this as more of an exercise in what a tool can do, less so a "you should absolutely do this". He even goes over at the end of when you wouldn't want to do this and when you "might" want to. I don't think he's trying to push Redis as a be-all-end-all database. He just really likes the tool and is going over the capabilities to get people excited about it.
As always it depends on the workload etc., there isn't a single correct database for all projects. If you need an SQL database then use an SQL database. The example here of directly comparing it to SQL is a bit extreme and only to show that they can be done, not that they necessarily should. Now on the other hand if your workload just requires blazing fast lookups/inserts without complex relations, eg. maybe an auth, permission or chat service then using redis as the main database starts to make sense.
@@novadea1643 using Redis for auth, permissions, etc. is mostly to support the main DB, not to use it as the main DB
I agree with your points. Redis isn't a full on replacement for postgres (I'm a postgres maximalist personally).
However, I do think for a lot of simpler applications, Redis is pretty viable as the starting data store before you even know what your data model may look like. It also helps to reduce complexity of your overall application in order to get deployed more quickly.
Amazing work! Not only the editing and presentation which is always so delightful, the idea itself of using Redis as a database and the persistency concerns are perfectly explained.
THANK YOU!
Thank you very much!
You've been doing great. The content is presented at a reasonable speed, well at least for me. And the content itself has been very interesting and informative. Personally, I'm really interested to hear about your home lab, so please do an overview of it and maybe down the line a deep dive ;-) Thanks again, bud!
Thank you! I'm glad you enjoyed it.
@@dreamsofcode I am also interested in your homelab. Everything very well explained in this video, a pleasure to watch and learn
This made me appreciate SQL more. Look what they need to mimic a fraction of SQL power
Weird flex... and incorrect.
Is this a joke comment?
Spoken like someone who has no idea what things even mean, bravo.
What Redis does is give lookups in sub-millisecond time, providing a significant number of data types (hash sets, linked lists) that can be used to do distributed operations, which are necessary in cloud native applications. It also provides very fast pub/sub and stream patterns that can be used for message buses, none of which SQL provides. Also distributed locking that can be used to synchronize microservices, along with probabilistic structures and much more.
Plainly said, the Redis use-case is entirely different from SQL database. Their overlap is minimal at best. If you use a database for all of the things I just listed you are braindead. If you use Redis for data operations such as (aggregations, grouping) you are braindead as well.
Likewise. I'm not sure why they all fear using SQL so much. Truly a strange phobia.
Broadly agree but if you have a very specific application with clear and (more or less) fixed data definitions you do pay at run time for SQL's flexibility. Almost all SQL systems can be replaced by faster solutions tailored for the specific context. SQL's real power is that it's "good enough" for a vast range of requirements.
You missed a key point. During a snapshot Redis will fork the process and dump what is in that forked process to disk. This means you need 2x the RAM to perform this action, but also it's a global DB lock during the forking process. If you have a small dataset this is generally fine, but if you have a large DB this process will cause large spikes in latency when the snapshot begins.
Thank you for covering it! Definitely worth understanding when considering Redis.
Another aspect of this is the time it takes to reload large snapshots into RAM. After rebooting my server, I often see errors for the first 30-60s as my application tries to read/write from Redis, but Redis rejects the commands while it's still loading the snapshot into memory in its entirety.
yikes! key indeed
On Unix, forking is fast - it only copies the page tables and sets them all to copy-on-write - you'll only incur 2x memory costs and anything resembling "global db lock" if you're regularly writing to the whole database. This is the reason Redis chose to implement snapshotting via forking in the first place.
Not entirely true. Forks, at least on Linux, only mark the pages as copy-on-write, so they are only copied when one of the two processes writes to them. It might be theoretically possible to end up with that happening, but very likely not. You are right that this will create some lag spikes when the page faults are triggered, but it does not need to copy all the memory.
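A small sketch of how the fork-based snapshot discussed above can be observed from a client, using redis-py against an assumed local instance:

```python
import redis

r = redis.Redis(decode_responses=True)

# BGSAVE forks the server; the child writes the RDB file while the parent
# keeps serving requests (pages are copy-on-write, as discussed above).
r.bgsave()

info = r.info("persistence")
print(info.get("rdb_bgsave_in_progress"))  # 1 while the child is still dumping
print(info.get("rdb_last_bgsave_status"))  # "ok" once a dump has succeeded
print(r.lastsave())                        # time of the last successful save
```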
0:01 not anymore 🥲
You are really great at this. Your explanations are concise and clear. The pacing is perfect. The editing is quite good.
I just started working on backend and we have a fairly complex setup. This video dropped just in time. Not only this tells you what redis can do or can't but is a small primer to starting with redis. Top content.
Totally agree with the video, the biggest drawback is the lack of higher level primitives but in some cases it's indeed worth implementing them yourself on top of redis.
The company I work for had a chat system as part of our django + postgresql api that could handle around 1k messages per second before it started choking up (can't remember the exact numbers, it's been years). Partly because of django's websocket implementation and partly because of the database model designs, they could have been optimized further but the performance would have still been far from ideal.
We reimplemented it as a nodejs microservice with redis as main database and based on tests a single instance could easily handle 10k+ messages per second, no specific optimizations just from switching language and database. After deploying to production it has never gotten high enough real world load to use more than a fraction of a single cpu core in the cluster.
"redis is an open source..."
hmm, unfortunately this aged bad
They done did us dirty
So now I know that I *could* sort of replicate a relational-ish storage model in Redis by reimplementing abstractions that are already in place in any RDBMS by hand at a lower abstraction level. Which is indeed quite interesting. The question is: Why should I, instead of just using an RDBMS?
Our company uses Redis as a runtime database for our live chat system, and it saves off transcripts into a real RDB at various checkpoints. The footprint in redis for all of that is 400MB, and we saved an immense amount of maintenance/development cost doing it this way..
I think you've convinced me to stick with RDBMS + SQL as the primary database for my relational data 😁
Really enjoyed this. It was cool to see the reasoning behind certain redis features. Thanks!
Wow. Just wow. You totally delivered, forwarded this to my colleagues.
Thank you so much!
I was already using redis as a primary database, so seeing this video cheered me up!
Perfect explanation. All of the useful usecases you introduced and easy to understand with real-world examples. Thanks!
For me, I fell in love with Redis when I had to do work on a logistics management system. Out of the blue, just reading the documentation, I found out that it has in-built geo location functionality and can easily determine the straight line distance between two points, accounting for the curvature of the earth (correct me if I'm wrong, but I believe under the hood, it uses the Great Circle function). Thanks for the vid.
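A rough sketch of the geo commands mentioned here, using redis-py; the key name and coordinates are made up, and the raw GEOADD command is used because the redis-py helper's signature has changed between library versions:

```python
import redis

r = redis.Redis(decode_responses=True)

# GEOADD takes longitude, latitude, member.
r.execute_command("GEOADD", "depots", 13.361389, 38.115556, "palermo")
r.execute_command("GEOADD", "depots", 15.087269, 37.502669, "catania")

# Great-circle distance between the two stored points, in kilometres.
print(r.geodist("depots", "palermo", "catania", unit="km"))
```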
My main concern of using Redis as a main database is memory, not persistence. Most people know that Redis can persist data on disk.
Memory is much more expensive and complicated to expand than simply adding storage.
And there's the fact that you need to think about scaling sooner rather than later. Scaling now depends more on the amount of data you need to always keep in memory (even if you're not using it), not on the number of users (who typically generate profit). If only there was a way to offload unused data and only keep frequently accessed entries in memory? Oh right, caching! That's why I prefer to use it as a cache. Redis used to have a Virtual Memory feature (to swap unused data to disk), which is now deprecated for some reason. One can use the OS's swap however, but it's never explained if that's a good idea.
P.S. Scaling Redis seems to be easier than scaling SQL databases, but this aspect is never mentioned when talking about using Redis as a main database.
Did you watch until the end?
@@dreamsofcodenot yet
I see some use cases in which you could use Redis as you said, but for me the main drawback is data consistency. Having a very well defined schema for your data, and the responsibility of shaping it right, makes things way more maintainable. That's my personal problem in general with NoSQL.
It might be good to add that redis transactions are not nearly as powerful as SQL transactions.
You can resort to using LUA scripts to fix that but this is really annoying to maintain and to write proper tests for.
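For context on how far Redis transactions get you without Lua, here is a minimal optimistic-locking sketch with redis-py; the "balance" key is hypothetical:

```python
import redis

r = redis.Redis(decode_responses=True)

def withdraw(amount):
    """Atomically decrement 'balance' only if enough funds remain."""
    with r.pipeline() as pipe:
        while True:
            try:
                pipe.watch("balance")               # optimistic lock
                balance = int(pipe.get("balance") or 0)
                if balance < amount:
                    pipe.unwatch()
                    return False
                pipe.multi()                        # start queuing commands
                pipe.decrby("balance", amount)
                pipe.execute()                      # fails if 'balance' changed
                return True
            except redis.WatchError:
                continue                            # someone else wrote; retry
```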
straightforward, well written, clear and simple videos, love them!
I can't believe how quick you made a 21 minute video about Redis feel. Great video
Thank you!
I’m reminded off the scene in Jurassic Park when Ian Malcolm says ‘Just because you can, doesn’t mean you should’. Or something like that.
I've already been using it as a primary database. Our data is very small and most of it is non-persistent (status hash tables that expire after 30 days). But we do keep a record of transactions, which is an integer timestamp and a simple 41 character id string. I also commit an apparent faux pas and use the Logical DB feature to organize my data. Apparently it's an anti-pattern because some people may implicitly think they're actually separate systems when they're not. But that doesn't really matter, since I use them almost like folders to help organize my data.
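A tiny sketch of the logical-DB setup described above, assuming redis-py and made-up key names; each client simply pins a different db index on the same server:

```python
import redis

# Logical databases are just numbered namespaces inside one server process;
# they share the same memory and configuration.
status_db = redis.Redis(db=0, decode_responses=True)
txn_db = redis.Redis(db=1, decode_responses=True)

status_db.hset("status:42", mapping={"state": "active"})
status_db.expire("status:42", 60 * 60 * 24 * 30)   # drop after 30 days

txn_db.set("txn:1700000000", "some-41-character-id-goes-here")
```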
Definitely interested in your setup :P - thanks for the hard work on this content!
My pleasure! Glad you enjoyed it!
Such high-quality content we need more content like this on yt
High quality content thanks Dreams of Code.
Glad you enjoyed it!
Was using redis as my primary database, but still learned a lot from watching this
Killing the pod will just send SIGTERM, and the DB can actually catch that and save the data. I think it might be much more interesting to have a program running in the background spam-writing, and then SIGKILL the DB to see how frequently it saves to disk, whether any data corruption arises, and how it handles it. It surely doesn't write every single action to disk without any batching; that would be too slow.
You can configure the AOF; on my instance it was flushing to disk every second. It does store every write operation and it does impact performance. If you want a fast cache and don't care about data loss then it's not worth enabling.
I use it to lock out a request across multiple instances of an app. Where money and transactions are completed, I couldn’t have a user start more than one thread across multiple apps, so it’s GREAT to lock out that process single file by user
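A simplified single-node version of that locking pattern with redis-py; the key names and TTL are made up, and a production setup where money is involved may want the full Redlock algorithm or a library that implements it:

```python
import uuid
import redis

r = redis.Redis(decode_responses=True)

def acquire_lock(user_id, ttl_ms=30_000):
    """Try to take a per-user lock shared by every app instance."""
    token = str(uuid.uuid4())
    # NX: only set if the key doesn't exist; PX: auto-expire after ttl_ms.
    ok = r.set(f"lock:payment:{user_id}", token, nx=True, px=ttl_ms)
    return token if ok else None

def release_lock(user_id, token):
    # Delete the lock only if we still own it (compare-and-delete via Lua).
    script = """
    if redis.call('get', KEYS[1]) == ARGV[1] then
        return redis.call('del', KEYS[1])
    end
    return 0
    """
    return r.eval(script, 1, f"lock:payment:{user_id}", token)
```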
Very good and informative video, thank you.
As a user of Postgresql since 2001, you are missing the biggest and most important feature of RDBMS -> data and referential integrity. This is unachievable by Redis or any other NoSQL solution.
Of course, if you have complex data models and you have the need and skills to write good SQL, redis falls short big time. (CTEs, Window functions, triggers, aggregates, etc)
But if you are building something really simple or without structure, redis is great.
ayyeee congrats getting a home lab set up that's sick!!
+1 For the homelab setup video!
Postgres transactional speed > Redis transactional speed. Also, Redis is single-threaded for reads/writes, so you need a complex cluster and one command could lock everything.
You used Redis correctly before; now you're using Redis for something it is not designed for. It seems like you're making bold claims without real usage of these approaches in big systems.
Yep, you definitely have to consider the use cases for each approach. That being said, we've got a decent cluster that does some real heavy lifting, but we are using a 5 node cluster with sharding to achieve it.
The awesome bomb you threw was *"don't use KEYS in production unless you're debugging, it's slow and blocking"*. I seriously didn't know that. I searched it up and it is absolutely true.
It's a rite of passage when debugging in prod at times!
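For anyone landing on this thread, a small sketch of the non-blocking alternative, assuming redis-py and a hypothetical "session:*" key pattern:

```python
import redis

r = redis.Redis(decode_responses=True)

# KEYS walks the whole keyspace in one blocking call on the single-threaded
# server; SCAN iterates in small batches so other commands keep being served.
for key in r.scan_iter(match="session:*", count=500):
    print(key)

# The blocking equivalent to avoid on a busy production instance:
# r.keys("session:*")
```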
Using Redis for a DB is similar to using MongoDB or any key/value NoSQL database. You outlined the cons quite well but addressed them with Redis datatypes and transactions. I think the big pro to using Redis as a NoSQL database is the simplicity, if one has a use case that doesn't need complex querying. This is referred to as impedance mismatch, I believe, where the data set of your UI + API matches that of the database.
Your videos are amazing. So clear and well-paced. This is the first one of yours that ive come across. Definitely subscribing. Do you use after effects for all your anims?
Thank you!
I'm transitioning more and more to Davinci Resolve (Fusion) entirely, but there's some things I still use After Effects for! Fusion is really powerful, but there's a lot less resources on how to use it.
@@dreamsofcode yeah I used to use AE a little. I use Resolve now but have been a bit slow to dive into Fusion. It's actually pretty good from what I can tell. Thanks for letting me know. I'm always interested to see how people make their videos cool.
For personal projects I use IndexedDB and then I sync it to sqlite on the back end. With everything on the front end it makes it super simple to have things nice and speedy. Granted, I don't need to worry about live data this way. So, it isn't a good model for all apps.
I love this concept, but that API on IndexedDB though... ooof, it's not great
Good to know. Sometimes you just happen to have a Redis instance and need a minimum of database functionality. In my former job I implemented a caching service where the invalidation logic was very complex. I ended up implementing it in Postgres, but we already had a Redis instance for caching auth sessions, so I could have done it there as well.
Really good introduction to Redis! Huge thank you! I see you're a man of culture as well with those Dragon Ball Z and Naruto references ❤
Awesome work! Thanks! Please also do a separate tutorial for "Python and Redis Queues"
this is a great video, i hope redis doesn't change its license to a non-open source license
they already did.
This 20 minute video taught me so much more than all the other redis videos on the internet, thanks ❤
If I had to choose a favorite DB of any kind it would probably be Redis. I also recommend anyone to build a simple CRUD app using only Redis as the database and implementing all the operations with that. It'll teach you so much about structuring data and how to query it.
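A bare-bones sketch of that exercise with redis-py, using one hash per record and a set as a hand-rolled index; the key names and fields are made up:

```python
import redis

r = redis.Redis(decode_responses=True)

# Create: one hash per record, plus a set acting as a hand-rolled index.
r.hset("user:1", mapping={"name": "Ada", "email": "ada@example.com"})
r.sadd("users", "user:1")

# Read
print(r.hgetall("user:1"))

# Update a single field
r.hset("user:1", "email", "ada@example.org")

# Delete, keeping the index consistent yourself
r.delete("user:1")
r.srem("users", "user:1")
```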
Thank you for the beautiful content, infos, effort... thank you
At this point, I would like to see a video on how you make your RUclips videos. Very clean and interesting. Please make a video on how you achieve this including all the fun and interesting animations.
Thanks
I always wondered what Redis was used for, this is a great video to go from zero to having some clue.
Glad you enjoyed it!
I will stick with a Relational Database
It's also nice to know about ACID to understand why redis is not a good idea for a primary database. Overall nice video, thanks
I should do a video on ACID!
believe it or not, many non-ACID databases are used as primary databases, such as MySQL (MyISAM), MongoDB, CouchDB!
Nice, I like Redis, but there is a lot to manage and that is a lot of data sets. A normal DB lets me abstract this away with table index magic, and in this case I might as well just use Python/TCL/C++ with hashes, sets and arrays (though TCL will let me push raw commands in, avoiding the need for an API) and write every transaction to a rotating log file with a threaded snapshot by time.
I'm just learning programming, so from my perspective I'd say cache always, as it seems to be really good at it, nothing can replace it, and it's fast to implement; as a database manager it makes no sense, as a mature DBMS can do the work way faster. So this would be my default config until I need to change it, because each is used where it does a good job, so the default should be the more efficient one; therefore a new tool, or expanding the capacity of either, should only be adopted when it's more efficient that way, in both senses: implementing and using.
I am very interested in your updated Home Lab Setup, please do a video or post about it.😀
Homelab video would be really interesting to watch!
Absolutely great video, I can't thank you enough 🙏 sir, you are a master Jedi in Redis... P.S. your Tom Bombadil part caught my eye :)
Thank you! I'm glad you noticed that!
silmarillion moment
I use Tarantool: it can support SQL, also has an append-only file, and can store more than RAM if using the vinyl engine (not the default memtx engine). The gotchas are that there is no date data type (so I use epoch/timestamp without timezone), and table/column names are all capitalized, so if you create a table with lowercase/mixed-case names you must quote them in queries.
So it can overcome all the Redis limitations.
For non-OLTP use cases I use ClickHouse, so logs (compressed, searchable), metrics (using materialized views), and all events are stored there; if I need to read something fast I periodically push it back to Tarantool, so all use cases can be fulfilled with a combination of both.
Oh, also there's one more gotcha in Tarantool: you cannot alter a table to insert a column in the middle (only at the end is possible), you have to copy the whole table, but since it's really fast (200k rps, like Redis/Aerospike), it's really easy to change the schema with a full copy.
Beautiful editing
Great insights, thank you!
Great video. I'm convinced Redis is only good for the caching layer.
I worked on a project where Redis transactions out of the box were insufficient. Fortunately Redis also supports loading Lua scripts onto the Redis server. I think this is the only way to do certain business logic transactionally? But it's cool, because in the analogy of this video, you can kinda point and say 'look. Stored procedures too!'
The Lua integration is super powerful. I believe you're correct and all Lua scripting is done in transactions, which does allow for more business logic. I think the major caveat is that it locks the database, so as long as it's not doing anything slow then it's all good.
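A minimal example of the "stored procedure" idea with redis-py; the key, field, and logic are hypothetical, and as noted above the script blocks the server while it runs, so it should stay small:

```python
import redis

r = redis.Redis(decode_responses=True)

# Runs atomically on the server: decrement stock only if enough remains.
reserve_stock = r.register_script("""
local stock = tonumber(redis.call('HGET', KEYS[1], 'stock') or '0')
if stock >= tonumber(ARGV[1]) then
    redis.call('HINCRBY', KEYS[1], 'stock', -tonumber(ARGV[1]))
    return 1
end
return 0
""")

ok = reserve_stock(keys=["product:42"], args=[3])  # 1 on success, 0 otherwise
```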
I would argue against using Redis as a primary database, although I'm sure there are use cases, as you pointed out. But in general, a primary database will need to be more flexible than what Redis can provide without creating too much complexity. This includes hundreds of tables, the various indexes required, the primary data needing to be easily accessible by traditional SQL queries for different users or departments, ease of generating reports, hundreds of millions of records, and replication. If you start with Redis and later migrate to a relational database, that is not trivial. One of the biggest things (that I think a primary database has to do) is generating reports with several joins and multiple criteria and really messy SQL, where it's just not practical to use Redis. I think it could be fine using Redis as a database, but not as the primary database.
Definitely do a home lab setup tutorial!
I really, really hope to never maintain anything you or anyone write in this way.
Also hope this is a video with "this is possible, but should not be done in real applications".
By the time you finish your program you will have an entire framework based on Redis to handle data.
This is not only reinventing the wheel, it is reinventing the whole history-line =)
I think if you need a very high throughput for read/write operations, sure. But making redis a drop in replacement for something like postgres or mongo db.. It's not the same tool, it wasn't really designed to do the things you are suggesting so I feel you are trying to force a square peg in a round hole.
1:47 Exactly! Please create video on setting up homelab. By the way, I have only single laptop. :)
Alright, I'll try it for the project I'm working. If this fails we'll be back with our pitchforks
Thank you for this video, you taught me some new things about Redis!
Did you leave out RedisJSON on purpose? Because it's an awesome feature; it allows for storing JSON documents as values, and creating indexes on the actual JSON content. We're using it in our CQRS setup, both as the message store as well as the query model.
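For readers who haven't seen it, a rough sketch of RedisJSON usage via redis-py; this assumes redis-py 4+ and a server with the RedisJSON module loaded (e.g. Redis Stack), and the key and document are made up:

```python
import redis

r = redis.Redis(decode_responses=True)

# Store a whole JSON document at the root path "$".
r.json().set("event:1001", "$", {
    "type": "OrderPlaced",
    "total": 249.99,
    "items": [{"sku": "A-1", "qty": 2}],
})

print(r.json().get("event:1001"))             # whole document back as a dict
print(r.json().get("event:1001", "$.total"))  # just one field via a JSONPath
```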
I'm working as a developer and I knew this stuff before I got there, and kept wondering why people aren't considering Redis as the DB instead of just a caching layer. I actually didn't know it could be used as a cache as well before then 😂
Interesting terminal, can you share what theme you used? 2:23
Great video. Thanks
Would like to see your video on your homelab. Was there any already? Or would be? Thanks!
Good video. I am so sick of hearing Redis described as a "cache" or as a "name/value store." It is both of these things, but it is SO much more.
This video taught a lot about Redis and thinking in redis. But, I will still choose PostgresSQL as the primary one
1:50 that would be great!
Dedicated HomeLab channel?!?
Yes, please
i second this so hard
Redis is a wonderful tool. I also used it for GIS (the geohash commands) and as a powerful reverse index.
Great!
Would love one like it on redis-stack please!
AOF persistence may be fine if you're doing a read-heavy workload, but it limits your write performance to no better than a database that's writing to disk.
On the other side of the equation, if you're willing to dedicate enough memory to your primary DB to hold your entire dataset in memory, you can accomplish that with fairly simple cache settings.
There are still some things that redis will do better, but it’s also going to come with a lot of trade offs. With a big one being that you can’t really use an ORM, and that a lot of really large-scale databasing strategies just don’t work well with redis (at least without a ton of work)
Awesome
I didn't understand half of what you said😅
Will rewatch it again 😅😅
Thanks for the content 😊
Great content! Subbed.
This should be official documentation to redis. Excellent.
Such a healing rendition.
Don't the Redis OM libraries take care of the first drawback you mentioned?
I think you're missing a lot of important drawbacks / missing features while comparing with traditional relational databases, which makes this video quite misleading imho.
Some examples: relations between tables (or generally powerful constraints) for ensuring data integrity, performant complex queries (+query planner), availability of tooling, data compression, read performance from disk, …
I can see Redis in the real world only as a high-performant database (for often used data) where data integrity is not a hard requirement and data structures are simple. Which in most instances is basically caching. The thing it’s built for.
That being said, the video had a lot of interesting insights into Redis and I appreciated it! I just want to make clear that I'd have wished for a more critical comparison with traditional primary databases, which do have a lot more requirements for most applications.
Next step: wrapper for MySQL to use Redis as storage engine.
Btw, relational databases HAVE built-in caches of various sorts and can pull data into memory for faster access. Zoomers just do not have time to learn SQL; they are busy stacking cool toys into their CVs.
@Dreams of Code, how would you classify a database as small or big? What if my writes to the database are for 20,000 users with around 3000 concurrent requests to the DB - will Redis be OK in this scenario or might Postgres work out better? I will definitely go for regression testing in both as the project itself is small, but I'd still like to know everyone's perspective as well.
Once worked with an application that persisted all data in redis, and had a catastrophic data loss event: all data for the company's customers went up in smoke when the redis server restarted because it'd been misconfigured to have only the password for the server set, so I'd really not recommend using redis as your primary data storage
I used Redis as a database for location data, which for a moving object is ephemeral by default. The geospatial store was super useful for finding objects that are close by.
amazing video
Thank you!
you should also cover the redis extensions. they're great
This is a great idea! I shall add it to my backlog
Great video 👍