MongoDB vs. PostgreSQL: Performance & Functionality

  • Published: Feb 8, 2025

Comments • 274

  • @AntonPutra
    @AntonPutra  2 months ago +28

    🔴 To support my channel, I'd like to offer Mentorship/On-the-Job Support/Consulting (me@antonputra.com)

    • @LotnyLotnik
      @LotnyLotnik 2 months ago +3

      I think you might have an error in the MongoDB configuration here. MongoDB writes all write operations to disk as well, for data consistency. Given that we only see spikes, that means you did not provide proper paths to all writable information.
      MongoDB writes and keeps data in memory, yes. But to preserve consistency it writes to the journal, and the journal is by default configured on a different path than the actual MongoDB data! Please make sure that you put the journal on the same external drive to check write usage! It might be that you sped up Mongo by allowing it to use two separate drives for all operations!

  • @svetlinzarev3453
    @svetlinzarev3453 2 months ago +133

    The Postgres test is not taking advantage of the GIN indexes. Each index in PG supports only specific "operator classes". GIN's jsonb_ops supports "? ?& ?| @>" and does not support "->". So when you are filtering, you should use "@>" instead. Also, instead of the default jsonb_ops, you can create the index with "jsonb_path_ops", which is smaller and faster but supports only @>.
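
    A minimal sketch of that advice, assuming a hypothetical products table with a jsonb column named data:

    ```sql
    -- Hypothetical table holding JSONB documents
    CREATE TABLE products (id bigserial PRIMARY KEY, data jsonb);

    -- Default opclass (jsonb_ops): supports ?  ?&  ?|  and @>
    CREATE INDEX idx_products_gin ON products USING gin (data);

    -- Smaller and faster alternative opclass: supports only @>
    CREATE INDEX idx_products_gin_path ON products USING gin (data jsonb_path_ops);

    -- Containment query: can use the GIN index
    SELECT * FROM products WHERE data @> '{"category": "books"}';

    -- Extraction with -> / ->>: cannot use the GIN index
    SELECT * FROM products WHERE data->>'category' = 'books';
    ```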

    • @luca4479
      @luca4479 2 months ago +8

      Create a pull request

    • @pytorche6206
      @pytorche6206 2 months ago

      Interesting, but how would you rewrite the query to use the index?
      The @> operator tests whether a document contains some path/values, from what I read. Here the filtering is done on a price lower than a given value, not on having a specific value.
      Maybe the product's price should be externalized into its own column and indexed... For every update you'd need to update the 2 columns.
      Sorry if the question is stupid; I know close to 0 about the document/JSON features of PostgreSQL.

    • @svetlinzarev3453
      @svetlinzarev3453 2 months ago +2

      @@pytorche6206 You cannot with a GIN index :) But one can create a BTREE index for that specific property to use with the "->" operator. The issue with BTREE indexes is that you cannot index the whole JSON blob with them; you have to create a separate index for each property. Also, you have to use the exact expression you used during index creation when filtering, otherwise the DB will not use the index.
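
      A sketch of that BTREE expression-index approach (table and field names are hypothetical):

      ```sql
      -- BTREE expression index on a single JSONB property
      CREATE INDEX idx_products_price
          ON products (((data->>'price')::numeric));

      -- Range queries can use this index, but only if the WHERE clause
      -- repeats the exact indexed expression:
      SELECT * FROM products
      WHERE (data->>'price')::numeric < 100;
      ```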

    • @ojonathan
      @ojonathan 2 months ago

      @@svetlinzarev3453 You can, with jsonb_path_query, which uses the GIN index. It was not possible in the past, but now we have a lot of new jsonb functions.

    • @professortrog7742
      @professortrog7742 2 months ago +3

      No sane dba would model the data such that the price field would be within a json blob.

  • @zuzelstein
    @zuzelstein 2 months ago +36

    In this test the databases work in different durability modes:
    1. MongoDB buffers writes in RAM and flushes them to disk every 100 ms by default.
    2. On the graph, PostgreSQL IOPS are equal to the RPS, which makes it clear that PostgreSQL writes to disk immediately.
    To make PostgreSQL act like MongoDB, set in postgresql.conf: "synchronous_commit = off" and "wal_writer_delay = 100ms"

    • @lucasrueda3089
      @lucasrueda3089 1 month ago

      so, these graphs are wrong?

    • @zuzelstein
      @zuzelstein 1 month ago +9

      @@lucasrueda3089 these graphs reflect how two databases work in different modes. The configuration is "wrong" here.

  • @codeSTACKr
    @codeSTACKr 2 months ago +1

    Thank you for this one! Great comparison.

  • @sudipmandal2497
    @sudipmandal2497 2 months ago +5

    This is the video I was looking for after studying DBMS scaling and the ACID and BASE properties.

    • @Zizaco
      @Zizaco 2 months ago

      MongoDB is ACID compliant btw

  • @Neuroszima
    @Neuroszima 2 months ago +11

    The most awaited battle in history!
    EDIT: OK, so I was in the PostgreSQL camp; I think you managed to convince me to start learning MongoDB as well, and that weird syntax they have.

    • @AntonPutra
      @AntonPutra  2 months ago +2

      if you want json, mongo is a no-brainer

    • @MrMikomi
      @MrMikomi 8 days ago

      @@AntonPutra Not necessarily

  • @alexvass
    @alexvass 2 months ago +1

    Thanks

  • @nidinpereira5898
    @nidinpereira5898 2 months ago +72

    Let's get this man to the top of YouTube

  • @spruslaks26
    @spruslaks26 2 months ago +2

    Thanks!

    • @AntonPutra
      @AntonPutra  2 months ago

      thank you for the support!! ❤️

  • @ramonpereira4460
    @ramonpereira4460 2 months ago +116

    Postgres is actually writing data to the disk. Mongo is not. That is why you see much higher throughput and lower IO/s.

    • @navidfiisaraee9992
      @navidfiisaraee9992 2 months ago +5

      Mongo is showing 20 GB of disk usage at the end.
      It's just that maybe its indexing mechanism does not need to be updated as constantly.

    • @twitchizle
      @twitchizle 2 months ago +3

      To RAM?

    • @GolderiQ
      @GolderiQ 2 months ago +16

      Where does Mongo write?

    • @andreydonkrot
      @andreydonkrot 2 months ago +1

      It is highly likely indexing. I think we will have another test later with optimized indexes.

    • @supercompooper
      @supercompooper 2 months ago +2

      @@GolderiQ Mongo has different types of write concerns, so you can say it's okay to just store it in RAM, or you can say "I want it written to disk", or "I want it written to a certain number of cluster members", etc. In case the power goes out you might lose something, so you choose the option when you write to the API.
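
      A hedged sketch of those options in the mongo shell (the collection name is hypothetical):

      ```javascript
      // w: 1          -> acknowledged by the primary (data may still be in RAM)
      // j: true       -> wait until the write reaches the on-disk journal
      // w: "majority" -> acknowledged by a majority of replica-set members
      db.orders.insertOne(
        { item: "book", qty: 1 },
        { writeConcern: { w: "majority", j: true, wtimeout: 5000 } }
      );
      ```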

  • @CodeVault
    @CodeVault 2 months ago +5

    Very nice video, very informative, thank you!
    I'll be keeping a keen eye on the channel for an update to see the differences when implementing the recommendations listed by the people in the comments. Also, it would be nice to see the difference between using SQL (on Postgres) and JSONB (on MongoDB). I feel that's where most people would like to find the differences (me included).

  • @daymaker_trading
    @daymaker_trading 2 months ago +7

    Bro, now you've come to databases, wow! :D These comparison tests are so fun!

    • @AntonPutra
      @AntonPutra  2 months ago +1

      i'll do more, redis vs dragonfly is next

  • @MrLinusBack
    @MrLinusBack 2 months ago +7

    Regarding the large amount of disk writes: as some people have pointed out, it is most likely related to the WAL; you could try turning off fsync to check.
    Another related thing is how updates are implemented in Postgres: an update is essentially a delete followed by an insert (not exactly, but it creates a new tuple and marks another one as dead).
    It would be interesting to know whether excluding the update test changed the amount of disk writes for any of the tests.
    I actually had a real-life example of a rather small table getting millions upon millions of upserts each hour, where almost all became updates that didn't change any values. It was about 100 MB if you copied the data to a new DB or did a full vacuum, but around 15 GB otherwise.
    Writing a WHERE clause that checked every single one of the 15 columns against the incoming values, and only updating the rows that differed, was actually much more performant in the long run (but probably a rather niche case, since the entire table fit into memory when not bloated). It would probably be worth trying, though.
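
    A sketch of that "only update changed rows" pattern (table and column names are hypothetical):

    ```sql
    -- Upsert that skips the UPDATE when nothing actually changed,
    -- avoiding dead tuples and WAL traffic for no-op writes
    INSERT INTO inventory (id, qty, price)
    VALUES (1, 5, 9.99)
    ON CONFLICT (id) DO UPDATE
    SET qty = EXCLUDED.qty, price = EXCLUDED.price
    WHERE (inventory.qty, inventory.price)
          IS DISTINCT FROM (EXCLUDED.qty, EXCLUDED.price);
    ```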

  • @cotneit
    @cotneit 2 months ago +25

    Maybe a bit niche, but I would love to see SurrealDB compared to mainstream databases: maybe an in-memory comparison with SQLite and a server comparison with PostgreSQL.

    • @antoniong4380
      @antoniong4380 2 months ago +1

      I would like it to happen at some later date. If we assume it can only happen once, then I would rather see it after some time than now (I'm betting that it might improve a lot by then, or that there will be a clearer understanding of where SurrealDB performs best).

    • @cotneit
      @cotneit 2 months ago

      @@antoniong4380 It has been quite some time since SurrealDB released. Heck, they released v2!

    • @def-an-f3q
      @def-an-f3q 2 months ago

      @@antoniong4380 Not just once, at least: there are two videos comparing nginx and traefik.

    • @TotalImmort7l
      @TotalImmort7l 1 month ago +3

      I was building a tool for a company. Initially we used SurrealDB, but we ended up going back to SQLite because SurrealDB is too slow.

  • @Future_me_66525
    @Future_me_66525 2 months ago +2

    Thanks Anton, invaluable video

  • @Nikhil-jz5nm
    @Nikhil-jz5nm 2 months ago +47

    Good video!
    Postgres prioritizes consistency, so it would always write to disk, and Mongo prioritizes availability and promises eventual consistency, so it writes to memory first... The same experiment with queries across shards and/or replica sets should give interesting results.

    • @LotnyLotnik
      @LotnyLotnik 2 months ago

      I think that's where the difference would matter. What is shown here is maybe what 99% of startups will get, but at some point you will need to scale, and it would be interesting to see how sharding/replica sets improve/degrade performance on both.
      Also transactions!

    • @andyb012345
      @andyb012345 2 months ago +15

      This isn't true. MongoDB's default configuration will always write the journal to disk on every write operation, the same as PostgreSQL.

    • @BosonCollider
      @BosonCollider 2 months ago

      @@LotnyLotnik Postgres has an excellent sharding story with Citus (which is used in Cosmos DB on Azure but is also self-hostable; I've briefly tried it with StackGres). Most people should just scale writes vertically and add replicas to scale reads, though. Sharding is only actually needed at the PB scale, and most people should just make sure they are not running on something slower than a bare-metal $100/mo Hetzner server.
      Postgres also has better insert performance than MongoDB if you are doing bulk inserts with COPY (it performs better than INSERT already at 2-3 rows per COPY statement), which is generally what you will use anyway if the data being written comes from a message broker like Kafka or something similar. For bulk operations, the overhead of ensuring a commit flushes to disk is negligible.
      In Go webservers, another common pattern is to use a channel to queue up writes instead of having the handler write straight to the DB, which is better than the plain repository pattern because it gives the repository full freedom to autobatch writes. With that said, in that case you have a strongly typed channel of struct types, and then you can usually just write straight to a normalized table instead of writing jsonb; Postgres is then much faster, because normalization often means you only need to write a tenth of the data or less (I've seen order-of-magnitude size reductions migrating Mongo/Elastic DBs to normalized Postgres).

    • @industrial-wave
      @industrial-wave 6 days ago

      It seems Postgres can be configured to do the same thing MongoDB is doing here, by disabling synchronous commits.
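
      A sketch of the relevant postgresql.conf knobs (values are illustrative; synchronous_commit = off risks losing only the last few transactions on a crash, while fsync = off risks corrupting the whole cluster):

      ```ini
      # postgresql.conf (illustrative values)
      synchronous_commit = off   # commit returns before the WAL hits disk
      wal_writer_delay = 100ms   # WAL writer flushes roughly every 100 ms
      # fsync = off              # far more dangerous: skips OS flushes entirely
      ```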

  • @JonCanning
    @JonCanning 2 months ago +8

    Thanks

  • @thesupercomputer1
    @thesupercomputer1 2 months ago +6

    I would really like to see MariaDB vs. MySQL.
    MariaDB is a drop-in replacement for MySQL, developed independently from MySQL for some years now.
    It would be interesting to see which has gotten the better performance since the split.

  • @PedrodeCastroTedesco
    @PedrodeCastroTedesco 2 months ago +1

    Anton, I love your videos because they highlight the trade-offs in software architecture. I don't know if it is too much to ask, but could you highlight the pros and cons of the databases, and the ideal scenarios for each one? That would be very helpful.
    Thanks!

  • @kamurashev
    @kamurashev 2 months ago +6

    For the disk usage: I'm not an expert in PostgreSQL configuration, but for MySQL/InnoDB, for example, there's innodb_buffer_pool_size, which gives you the ability to take advantage of RAM if you have a lot. In one of the projects I worked on we had 128-256 GB of RAM and were able to fit the entire dataset there; it gave us blazing fast sub-ms queries.
    Another issue might be that your dataset is small and you are reading and updating the same data all the time, thus evicting/eliminating the page caching.

    • @LtdJorge
      @LtdJorge 2 months ago

      For Postgres, you have shared_buffers (recommended baseline is 25% of RAM), which is the internal Postgres cache. Then there is effective_cache_size (recommended baseline is the rest of RAM, so 75% here), which hints to Postgres how much memory will be used by the page cache for files. That is the Linux cache itself, which Postgres benefits from when reading pages from disk. This last one is only taken into account by the optimizer, e.g. to parallelize aggregates and such, unless the parallelization would use so much page cache that it would invalidate the performance gains. Postgres also has work_mem, which is the memory used by each worker for things like sorting or hashing; it's a relatively low value compared to shared_buffers, at least on machines with ample RAM (more than 16 GB).
      Edit: TimescaleDB recommends setting work_mem to 25% of RAM / connections. This is a bigger-than-normal value, since Timescale uses more memory-heavy algorithms, but it gives an idea.
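
      Those baselines might look like this on a hypothetical 16 GB server (illustrative values only, not a tuning recommendation):

      ```ini
      # postgresql.conf for a hypothetical 16 GB server
      shared_buffers = 4GB          # ~25% of RAM: internal Postgres cache
      effective_cache_size = 12GB   # ~75% of RAM: planner hint for the OS page cache
      work_mem = 32MB               # per sort/hash per worker; multiplies quickly
      ```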

  • @stefanszasz6510
    @stefanszasz6510 2 months ago +7

    Anton, your statement regarding disk write IOPS ("Postgres always performs many more operations compared to many other databases [incl. MySQL]") contradicts your "Postgres vs MySQL benchmark", where you showed that MySQL writes ~3x more...

  • @mkvalor
    @mkvalor 2 months ago +10

    Best practice for high throughput with PostgreSQL is to mount yet another separate physical disk volume for the write-ahead log (WAL). The write I/O on the log volume will always be sequential, while the data volume will experience much more random read and write I/O as the database engine processes the updates, deletions, and reads associated with the SQL statements.
    It's not about the "speed" of an SSD; in this case it's about isolating the two dramatically different I/O patterns onto the different firmware controllers of the separate volumes.

    • @eddypartey1075
      @eddypartey1075 2 months ago

      "in this case it's about isolating the two dramatically different I/O patterns onto different firmware controllers" - why is this beneficial? Please explain where the benefit comes from. I'm curious.

  • @TheAaandyyy
    @TheAaandyyy 2 months ago +1

    What I would also like to see at the end is the bill from AWS :D A cost comparison between these two would be super nice, as cost is also a big factor when choosing a database. Great video anyway!

    • @AntonPutra
      @AntonPutra  2 months ago +2

      thanks, it's about $15-$20 for the test, including prep, for running it for 2-3 hours

  • @ErmandDurro
    @ErmandDurro 2 months ago +2

    Thank you for this video. It was very insightful 🙂

  • @AliMuharram-q3l
    @AliMuharram-q3l 2 months ago +94

    We need ScyllaDB vs Cassandra

    • @Алексей-о9б4г
      @Алексей-о9б4г 2 months ago +5

      Yes, and CouchDB.

    • @kamurashev
      @kamurashev 2 months ago +1

      Maybe it's worth looking at NewSQL solutions like TiDB. I might not understand your use case, but Cassandra is a sh!tty thing.

    • @RupamKarmakar-s8z
      @RupamKarmakar-s8z 2 months ago

      ScyllaDB vs ClickHouse

    • @rida_brahim
      @rida_brahim 2 months ago

      If these databases are that important, why would anyone teach MySQL, MongoDB or PostgreSQL? I don't get it. Is it a legacy-software thing, a big-tech-company thing, a use-case thing, or what?

    • @kamurashev
      @kamurashev 2 months ago +1

      @@rida_brahim Because in CAP terms, a single-node Postgres is CA whereas Mongo is CP, and yes, it's use-case specific. There are also many more things to consider, and that's why there are so many of them.

  • @sanchitwadehra
    @sanchitwadehra 2 months ago +1

    Dhanyavad

  • @brahimDev
    @brahimDev 2 months ago +3

    SurrealDB strikes a balance between MongoDB and PostgreSQL, but the only concern right now is that it's not fully ready for production.

  • @sh4lo
    @sh4lo 2 months ago +1

    Nice job, thanks!

  • @andyb012345
    @andyb012345 2 months ago +3

    I think it would have made more sense to compare against a B-tree index in PostgreSQL, as this is how indexes are implemented in MongoDB.

  • @jm-alan
    @jm-alan 2 months ago +2

    I'd love to see this test repeated with the Postgres table created UNLOGGED; I bet it would drastically reduce the disk usage.
    For tables with very hot writes, if you're willing to sacrifice some crash reliability, it can substantially speed up writes.
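
    A minimal sketch of that option (the table name is hypothetical; UNLOGGED tables skip the WAL and are truncated after a crash):

    ```sql
    -- UNLOGGED: no WAL for this table, so far fewer disk writes,
    -- but the table is emptied after an unclean shutdown
    CREATE UNLOGGED TABLE hot_events (
        id      bigserial PRIMARY KEY,
        payload jsonb
    );

    -- An existing table can be switched back and forth:
    ALTER TABLE hot_events SET LOGGED;
    ```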

  • @wardevley
    @wardevley 2 months ago +1

    I loved the dubbing

  • @Mr.T999
    @Mr.T999 2 months ago +4

    Could you create a video on how you build this infrastructure: test loads, configurations, dashboards, etc.?

    • @AntonPutra
      @AntonPutra  2 months ago

      well i have a bunch of those tutorials on my channel including terraform, kubernetes, prometheus etc

  • @jricardoprog
    @jricardoprog 2 months ago +2

    I believe it would be fairer, when inserting/... data into MongoDB, to use the writeConcern option so that it behaves more like PostgreSQL. Otherwise, it trades data durability for performance.

  • @cmoullasnet
    @cmoullasnet 2 months ago +4

    For disk writes, perhaps Postgres is configured more conservatively out of the box with respect to always making sync writes, compared to Mongo?

  • @georgehelyar
    @georgehelyar 2 months ago +10

    When you actually get to this scale, big Mongo for enterprise gets expensive.
    It's a bit apples and oranges anyway, though. If you have highly relational data, then use a relational database like Postgres and scale it with something like Citus. If you are just reading documents that don't have a lot of relationships with each other, then use NoSQL like Mongo.
    The real cost in NoSQL is the dev time when you need to query it in a new way and then need to denormalise it and write it all the different ways so that you can query it efficiently.

    • @ikomayuk
      @ikomayuk 2 months ago

      I left a comment about Citus and decided to look into whether I'm alone here with it. Good evening, sir ))

  • @BeeBeeEight
    @BeeBeeEight 2 months ago +1

    Finally, a match of the juggernauts! Although I somewhat expected this result, as I knew MongoDB works better with JSON documents than SQL databases do. The conclusion here is really that there is no one-size-fits-all database: while MongoDB works well with JSON, data with rigidly defined shapes is perfect for SQL databases, and SQLite is well suited for mobile/embedded applications.

  • @roccociccone597
    @roccociccone597 9 days ago

    I don't use PG for pure document-style databases. I do, however, really like to combine the two types where it makes sense; it's really nice.

  • @amvid3464
    @amvid3464 2 months ago

    Nice video, thanks man! Would be nice to see EdgeDB in upcoming videos

  • @MelroyvandenBerg
    @MelroyvandenBerg 2 months ago +3

    It's me again :). If you want to reduce the PostgreSQL writes to disk, try setting commit_delay = 300 or higher.

    • @luca4479
      @luca4479 2 months ago

      You could leave a pull request on the repo, or an issue. There's a bigger chance it gets seen that way.

    • @szpl
      @szpl 2 months ago

      My first impression was also that, at this order of magnitude of difference, either PG is doing something unnecessary or Mongo is cutting corners.

    • @MelroyvandenBerg
      @MelroyvandenBerg 2 months ago

      @@luca4479 Oh, I've done that plenty already. But in this case I couldn't find the right code, since he shared his postgresql config this time.

    • @AntonPutra
      @AntonPutra  2 months ago +1

      no, i'll keep it as is, but thanks for the tip! i'll take a look

    • @MelroyvandenBerg
      @MelroyvandenBerg 2 months ago

      @@AntonPutra no problem..

  • @LucasVinicius-ex4lr
    @LucasVinicius-ex4lr 2 months ago +1

    Thanks for providing an audio track in Portuguese!

  • @news3951
    @news3951 2 months ago +2

    Very nice video, love you brother 🥰🥰🥰🥰🥰🥰🥰🥰🥰🥰🥰🥰🥰

  • @utubiene
    @utubiene 5 days ago

    Anton, thanks for your work. Performance benchmarks are affected by the filesystem you are using. Did you run your tests on xfs or zfs (or another COW filesystem)?

  • @luizfernandoalves6625
    @luizfernandoalves6625 2 months ago +1

    What is the configuration of this PostgreSQL database? Number of connections, WAL, concurrency, etc.?

  • @MarmadukeTheHamster
    @MarmadukeTheHamster 2 months ago +17

    4:40 This database design is NOT okay for e-commerce; please don't design your e-commerce database like this!
    The concept of database normalisation demonstrated here is of course accurate and important to learn, but in this specific case your order table should contain the duplicated address and product price information. This is so that you can query historical orders and find accurate information about the price the customer actually paid at the time, and the address the order was actually delivered to. Using foreign keys exclusively, as in this example, would mean that your historical order information becomes inaccurate if you ever change product prices or a customer updates their address.
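
    A hedged sketch of that snapshotting idea (all table and column names are hypothetical):

    ```sql
    -- Orders store the shipping address as it was at purchase time
    CREATE TABLE orders (
        id               bigserial PRIMARY KEY,
        customer_id      bigint NOT NULL,
        shipping_address text   NOT NULL   -- snapshot, not a foreign key
    );

    -- Each order line keeps a snapshot of the price actually paid,
    -- in addition to the foreign key to the current product row
    CREATE TABLE order_items (
        order_id   bigint NOT NULL REFERENCES orders (id),
        product_id bigint NOT NULL REFERENCES products (id),
        quantity   int    NOT NULL,
        unit_price numeric(10,2) NOT NULL, -- copied at checkout time
        PRIMARY KEY (order_id, product_id)
    );
    ```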

    • @szpl
      @szpl 2 months ago +2

      I guess it is just a small sample database with a basic use case, meant to be easy to understand.
      As you correctly emphasize, accountability for past transactions is a business requirement in many cases, especially in commerce / banking / bookkeeping, etc.

    • @tom_marsden
      @tom_marsden 2 months ago

      Having an order date/time and a product history table would also solve this. It's the same approach you'd use for anything that needs to be audited.

    • @sanchitwadehra
      @sanchitwadehra 2 months ago

      Dhanyavad for commenting this. I was working on one such project (it's my first time coding such a full-stack project) and I was about to make the mistake you mentioned. You saved me from hours of frustration.

  • @MusKel
    @MusKel 2 months ago +45

    When NATS vs Kafka?

  • @DavidDLee
    @DavidDLee 2 months ago +3

    The guarantees here are quite different. Postgres has strong guarantees, which means you can't lose data and transactions are consistent.
    I'm not an expert on MongoDB, but I think a successful write has weak default guarantees, e.g. the write may not be flushed to disk, and there are no consistency guarantees beyond a single collection write.
    That said, I am sure Postgres can be tweaked here to perform better.

    • @Zizaco
      @Zizaco 2 months ago +1

      That was the case 10 years ago, but now MongoDB defaults to strong guarantees too.

    • @AntonPutra
      @AntonPutra  2 months ago +1

      interesting, i'll take a look

  • @Takatou__Yogiri
    @Takatou__Yogiri 2 months ago +1

    finally it's here.

  • @jacsamg
    @jacsamg 2 months ago

    It's in Spanish! Incredible! Thank you so much 😬

  • @anuxtr1n
    @anuxtr1n 23 hours ago +2

    Do you have a video teaching how to make these types of tests?

    • @AntonPutra
      @AntonPutra  21 hours ago

      i have some tutorials covering monitoring

  • @luizfernandoalves6625
    @luizfernandoalves6625 2 months ago +2

    Postgres performs more disk operations because of the transaction logs, doesn't it?

  • @user-lv3hn6uz4e
    @user-lv3hn6uz4e 2 months ago +2

    What value does the default_toast_compression parameter have? You should try lz4 in PostgreSQL for the test with blob types.
    Storage and related parameters are also very important.

  • @anassfiqhi3517
    @anassfiqhi3517 2 months ago +3

    Very good video, thank you. Can you test scaling up both databases?

    • @AntonPutra
      @AntonPutra  2 months ago

      yes, i was thinking about performing some distributed tests. i'll start with redis cluster vs dragonfly next

  • @enginy88
    @enginy88 2 months ago

    MySQL also provides support for a native JSON data type from version 5.7. It supports optional indexing by auto-extracting values from JSON fields. It would be great if you included MySQL in this test or did a standalone MySQL vs Postgres.
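
    That auto-extraction is done via generated columns in MySQL; a hedged sketch (table and field names are hypothetical):

    ```sql
    -- MySQL 5.7+: index a JSON field through a stored generated column
    CREATE TABLE products (
        doc   JSON,
        price DECIMAL(10,2) GENERATED ALWAYS AS (doc->>'$.price') STORED,
        INDEX idx_price (price)
    );

    -- Queries on the generated column can use the index:
    SELECT * FROM products WHERE price < 100;
    ```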

    • @AntonPutra
      @AntonPutra  2 months ago

      thanks, i may do it in the future but i don't think most people would use postgres and mysql for storing json docs

    • @enginy88
      @enginy88 2 months ago

      @@AntonPutra Thank you for your response! In fact, people have realized that they don't want to maintain another database server in addition to their RDBMSs. That's why nowadays everyone is interested in storing unstructured JSON data in RDBMSs.

  • @johnsmith21123
    @johnsmith21123 2 months ago +7

    Apparently Postgres writes the WAL to disk

    • @IvanRandomDude
      @IvanRandomDude 2 months ago +11

      It has to. Otherwise you can lose data. The WAL is essential for consistency.

  • @chneau
    @chneau 2 months ago +1

    Awesome video

  • @minhphamhoang2894
    @minhphamhoang2894 2 months ago +2

    I have checked the repo, and it seems the database is not tuned. Also, it is better to use the alpine-based image, as the current one already consumes quite a lot of resources. Also, in PostgreSQL the WAL and archiving are still on throughout the test, so from my point of view we need to tune those configurations first, as both databases' default settings are not too good.

    • @AntonPutra
      @AntonPutra  2 months ago +1

      sorry, i forgot to upload the postgres config. it is optimized, i use pgtune for that. i don't use docker to run databases, i use systemd

    • @minhphamhoang2894
      @minhphamhoang2894 2 months ago

      @@AntonPutra pgtune actually tunes only the resources, specifically memory settings like shared_buffers, but it ignores the WAL. Since we aim to make the test reliable at the business level, please check and extend archive_timeout and checkpoint_timeout, disable full_page_writes, ... . For synchronous_commit, since we don't use replication, set it to local mode.

    • @minhphamhoang2894
      @minhphamhoang2894 2 months ago

      @@AntonPutra Also, could you switch to the alpine image to see if it performs better, as most production workloads are based on Alpine rather than Debian?

  • @anotherme4638
    @anotherme4638 15 days ago

    It is interesting to see that Postgres always has more writes to disk compared to other DBs.
    Also, how do you set up these tests? Could you please create a tutorial for that? It would be helpful.
    Thanks a lot, I really appreciate your work.

  • @renatocron
    @renatocron 2 months ago

    The reason Postgres uses more IO is that it's actually a DATA base: when it says it's written to disk, it is. If you disable fsync, boy, you're not going to find anything better on one node, except maybe SQLite or DuckDB, but that's another beast.

    • @AntonPutra
      @AntonPutra  2 months ago

      makes sense 😊

    • @jesusruiz4073
      @jesusruiz4073 2 months ago

      You do not have to disable fsync completely. I would say that doing in PostgreSQL the same "trick" that MongoDB is using here would increase performance by an enormous amount. MongoDB in the default standalone server config (like here) writes the journal to disk every 100 ms.
      If your use case allows some data loss, then PostgreSQL is a beast. However, do not try this for financial applications, please...

  • @SDAravind
    @SDAravind 2 months ago

    Please mention whether higher or lower is better for each metric in all the videos, and please provide your conclusion at the end.

  • @reybontje2375
    @reybontje2375 2 months ago

    It'd be interesting to see this compared to embedded databases that use LSM trees, such as RocksDB.
    The RocksDB Rust crate alongside serde_json or simd_json versus Postgres would be an interesting comparison.

  • @naczu
    @naczu 2 months ago +5

    Wow, I am really surprised. This is RDBMS vs NoSQL, and I thought NoSQL was always faster than all the RDBMSs out there. But I see PostgreSQL is really doing a great job here. Amazing video, thanks.

  • @prashanthb6521
    @prashanthb6521 2 months ago +1

    Postgres has a shared-buffer lock & yield problem!
    The problem gets more pronounced as you increase the number of threads.

    • @AntonPutra
      @AntonPutra  2 months ago

      ok interesting i'll take a look

    • @prashanthb6521
      @prashanthb6521 2 months ago

      @@AntonPutra Read this paper: "Why Parallelize When You Can Distribute?" by Tudor-Ioan Salomie, Ionut Emanuel Subasu, Jana Giceva, and Gustavo Alonso at ETH Zurich, Switzerland.

  • @wm5265
    @wm5265 17 days ago

    Looks like the amount of data was small, as Mongo normally operates with lookups. Could you add some more data and retest? In my experience, once a collection has millions of docs, lookups in Mongo become painful.

  • @shahriarshojib
    @shahriarshojib 2 months ago +1

    Would've loved to see a more apples-to-apples comparison, and memory usage.

  • @galofte6067
    @galofte6067 2 months ago

    Hey, this is a great video. How did you create those infographics, btw?

  • @jesusruiz4073
    @jesusruiz4073 2 months ago +33

    This confirms (I already knew it) that I DO NOT NEED MongoDB for my applications, even if they use JSON.
    PostgreSQL provides the flexibility of SQL (which I need), and also BETTER performance. The default MongoDB container stores the WAL (write-ahead log) in memory, writing it to disk periodically, impacting durability and improving performance. As the MongoDB manual describes: "In between write operations, while the journal records remain in the WiredTiger buffers, updates can be lost following a hard shutdown of mongod."
    By the way, this is probably the reason for PostgreSQL writing more to disk: default PostgreSQL provides better durability, which is what I need in my applications.
    Even with this "trick", both are similar in performance below 5,000 qps (which is more than what I need in practice).
    If MongoDB is configured with the same ACID properties (especially the D in ACID), I expect MongoDB to have worse performance even below 5,000 qps.
    Good job, Anton, and thanks.

    • @jesusruiz4073
      @jesusruiz4073 2 months ago +1

      To add my summary: for applications which cannot tolerate data loss, go with PostgreSQL. When you can afford data loss on some occasions, you can use MongoDB.
      For me, the situation is clear (it has been for a long time, by the way).

    • @andyb012345
      @andyb012345 2 months ago +10

      Your durability argument is misleading; this is only true when journaling is set to false on a write operation. The default configuration sets journaling to true when the write concern is majority, which is the default write concern.
      This is like saying "I can turn fsync off in PostgreSQL, so it isn't durable!"

    • @supercompooper
      @supercompooper 2 months ago

      @@jesusruiz4073 When you use MongoDB you can specify the write concern, so you can totally ensure it's written to disk just fine and safely.

    • @jesusruiz4073
      @jesusruiz4073 2 months ago +7

      @@andyb012345 No, it is your argument that is misleading, because you confuse journaling with where and when the journal is written.
      For a deployment with a single server with the default configuration (like in this benchmark), journaling is enabled but the default write concern is { w: 1 }.
      But the journal is flushed to disk every 100 milliseconds (because w: 1 does not imply j: true).
      With 5,000 writes/sec, up to 500 transactions can be lost.
      When a replica set is used, then I agree with you that by default the write concern is w: majority, which implies j: true, and the journal is flushed to disk immediately.
      But this is not what was compared to PostgreSQL in this benchmark.

    • @Zizaco
      @Zizaco 2 месяца назад

      ​​​​@@jesusruiz4073 Nah. writeConcernMajorityJournalDefault is the default. And the default writeConcern is majority.
      Even for a single node replicaset
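
      The flush-interval arithmetic in the exchange above can be checked in a few lines. This is a sketch, not driver code: the dicts merely mirror the shape of MongoDB's write-concern document, and the 100 ms figure is the default journal commit interval cited in the thread:

```python
# Back-of-the-envelope check of the data-loss window discussed above:
# with write concern {w: 1} and no j: true, acknowledged writes sit in the
# WiredTiger journal buffer until the periodic flush, so a hard crash can
# lose everything acknowledged since the last flush.

FLUSH_INTERVAL_SEC = 0.100   # default journal commit interval (100 ms)
WRITES_PER_SEC = 5_000       # throughput discussed in the thread

max_lost_writes = round(WRITES_PER_SEC * FLUSH_INTERVAL_SEC)
print(max_lost_writes)  # 500, matching the "up to 500 transactions" claim

# Plain dicts standing in for the driver's write-concern document:
fast_but_risky = {"w": 1}               # ack once applied in memory
durable = {"w": "majority", "j": True}  # ack only after the journal is fsynced
```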

  • @jashakimov8578
    @jashakimov8578 2 месяца назад +6

    Couchbase vs MongoDB

  • @omarsoufiane4evr
    @omarsoufiane4evr 2 месяца назад +1

    according to some benchmarks i saw online, postgres outperforms MongoDB in paginated queries

    • @AntonPutra
      @AntonPutra  2 месяца назад +2

      well maybe but i hardcoded a limit of 10 for both, as soon as i get enough feedback i'll refresh it

  • @nstrepp
    @nstrepp 2 месяца назад +1

    Video idea: a postgres vs postgres comparison with fsync turned off :) that'd be very cool!

  • @kamurashev
    @kamurashev 2 месяца назад

    If we are talking dbs, there's a new class of players out there called "NewSQL". E.g., several big companies I worked/work for are using TiDB and switching their existing solutions to it. It'd be a bit hard to set up a cluster, but I'd do anything to see a performance comparison with both a classic single-node relational db and e.g. mongo.

  • @the-programing
    @the-programing 2 месяца назад +1

    I still don't understand if he did a join and what settings are used on Postgres. If he used the default settings, then this is not a good test at all. MongoDB caches so much more, and Postgres never caches more than a few KBs with the default settings. This is dumb... comparing disk writes to an in-memory database...

    • @AntonPutra
      @AntonPutra  2 месяца назад

      i used pgtune to optimize postgres, just forgot to upload it to github

    • @the-programing
      @the-programing 2 месяца назад

      @ Thank you for the video though. Another suggestion: I don't know any projects that store JSON objects in relational databases. The data is usually stored in columns and only turned into JSON when you need to send it to the client side. You shouldn't query JSON objects inside a relational database, because it is made to work with rows and columns only.

  • @vishnusai4658
    @vishnusai4658 2 месяца назад

    Helpful, thank you. Can you make a video on real-time incremental backups for databases (MongoDB, MariaDB) in EKS?

  • @artiomoganesyan8952
    @artiomoganesyan8952 2 месяца назад +1

    I wanted to use Neo4j for my pet project; however, it is so hardware-hungry that I decided to use postgres. But I would really like to see a comparison of Neo4j and postgres.

    • @AntonPutra
      @AntonPutra  2 месяца назад

      ok i used it in the past, it is an optimized graph database but i think i can use a postgres plugin as well

  • @milagos09
    @milagos09 2 месяца назад +3

    Postgres vs. Microsoft sql server, please 🙏

    • @AntonPutra
      @AntonPutra  2 месяца назад +3

      ok, it's on my list

  • @professortrog7742
    @professortrog7742 2 месяца назад

    No sane DBA would model the data such that the price field is inside a JSON blob.
    The price and all other non-optional fields should have their own columns and be indexed with a normal btree index.
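
    A self-contained sketch of that modeling advice, using Python's bundled SQLite (3.31+ for generated columns) purely because it runs anywhere; in Postgres the equivalent column would be `price numeric GENERATED ALWAYS AS ((doc->>'price')::numeric) STORED` plus an ordinary btree index. Table and column names are made up for the example:

```python
import sqlite3

# Keep the document as JSON, but promote the hot non-optional field (price)
# into a stored generated column with a normal btree index on it.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE products (
        id    INTEGER PRIMARY KEY,
        doc   TEXT NOT NULL,  -- the JSON blob
        price REAL GENERATED ALWAYS AS (json_extract(doc, '$.price')) STORED
    )
""")
conn.execute("CREATE INDEX products_price_idx ON products (price)")

conn.executemany(
    "INSERT INTO products (doc) VALUES (?)",
    [('{"name": "pen", "price": 1.5}',),
     ('{"name": "desk", "price": 120.0}',),
     ('{"name": "mug", "price": 7.25}',)],
)

# Range filters can now use the btree index instead of parsing JSON per row.
cheap = [name for (name,) in conn.execute(
    "SELECT json_extract(doc, '$.name') FROM products "
    "WHERE price < 10 ORDER BY price"
)]
print(cheap)  # ['pen', 'mug']
```

    The point is that range predicates like `price < x` then hit a plain btree on a scalar column, which is exactly the access pattern btrees are built for.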

  • @rohithkumarbandari
    @rohithkumarbandari Месяц назад

    Please make another comparison video after setting up Postgres so that it acts like MongoDB.

  • @PavitraGolchha
    @PavitraGolchha 2 месяца назад +1

    Would have been nice to also include SQLite here

    • @AntonPutra
      @AntonPutra  2 месяца назад +1

      i'll do sqlite soon as well, using a unix socket instead of tcp/ip as i did in the previous test

  • @Dr-Zed
    @Dr-Zed 2 месяца назад +6

    This once again proves that PostgreSQL is the only thing you need for most applications. It can replace document DBs, caches like Redis and even message brokers.

    • @AntonPutra
      @AntonPutra  2 месяца назад +1

      yes it covers 99% of all you need

  • @Ilja903
    @Ilja903 2 месяца назад +4

    Addictive channel, I literally check your channel every day. Plz MariaDB vs MySQL. Great channel, these tests are killer; there's simply nothing like them anywhere.

  • @szpl
    @szpl 2 месяца назад +1

    11:20 Were you able to find the root cause of those disk writes? Or maybe the absence of those writes in MongoDB?

  • @amaraag9435
    @amaraag9435 25 дней назад

    When I watch backend tutorials, everyone is using PostgreSQL and Drizzle these days. Why do developers use PostgreSQL more often than MongoDB?

  • @inithinx
    @inithinx 2 месяца назад +3

    Interesting video. I'm sure postgres/mongo has done something smart here; someone will make a PR to improve postgres, i think.
    Also, Ktor just hit 3.0, so i will really appreciate a speed test between kotlin and golang, with Ktor as the backend framework for kotlin and maybe Echo as the golang backend framework.
    Thank you for your awesome work!

  • @artursradionovs9543
    @artursradionovs9543 2 месяца назад

    How do you make an animated display with Grafana and Prometheus?

  • @Nick-yd3rc
    @Nick-yd3rc 2 месяца назад +1

    Are you running Postgres with the vanilla default config again? 😢

    • @AntonPutra
      @AntonPutra  2 месяца назад +1

      no, i forgot to commit the postgres config; i use pgtune to optimize it based on the hardware i use

  • @AcidGubba
    @AcidGubba Месяц назад

    Who would have thought that writing to memory would be faster than to a hard drive? What level is this?

  • @EzequielRegaldo
    @EzequielRegaldo 2 месяца назад

    Ohhh spanish ❤ thank you so much !

  • @AdamPoniatowski
    @AdamPoniatowski 2 месяца назад +1

    can you do a cassandra vs mongodb comparison next?

    • @AntonPutra
      @AntonPutra  2 месяца назад +1

      columnar vs document database, maybe will do in the future 😊

  • @alienmars2442
    @alienmars2442 2 месяца назад +1

    any plan to compare cockroach db and yugabyte db ?

  • @iaaf919
    @iaaf919 2 месяца назад +5

    I wish you had memory compare too

  • @StanislawKorzeniewski-yr1ii
    @StanislawKorzeniewski-yr1ii 2 месяца назад +2

    Can you compare TimescaleDB vs QuestDB??

  • @exzolink
    @exzolink 2 месяца назад

    Still waiting for Postgres vs Maria

  • @DhavalAhir10
    @DhavalAhir10 2 месяца назад +2

    Apache Solr vs ElasticSearch

  • @tkemaladze
    @tkemaladze 2 месяца назад +1

    Can you do MongoDB - MariaDB? please

    • @tkemaladze
      @tkemaladze 2 месяца назад

      or even ScyllaDB - MongoDB

    • @AntonPutra
      @AntonPutra  2 месяца назад +1

      yes mariadb will be next

  • @thepuma8558
    @thepuma8558 2 месяца назад

    Can you test DuckDB vs SQLite?

  • @agustikj
    @agustikj 2 месяца назад +1

    Mongodb vs elasticsearch? :)

  • @kryacer
    @kryacer Месяц назад

    can you compare performance of PostgreSQL vs MS SQL Server?

  • @reactoranime
    @reactoranime 2 месяца назад +1

    OLS vs Nginx, I was surprised how OLS makes sites fly...

  • @mavriksc
    @mavriksc 22 дня назад

    How about TigerBeetle? And run some financial transaction tests to check the claimed 1,000x improvement.

  • @ikomayuk
    @ikomayuk 2 месяца назад +1

    Well, it's like apples and oranges. One allows you to store data with a flexible schema that is well suited for prototyping and low-latency selects, while the other enables you to query your data in any way. One is suitable for horizontal scaling, and the other has excellent tools for securing your data. One is proprietary-ish, and the other is framework-ish for other proprietary-ish products. Have a look at Citus if you think that Postgres hardly scales =)

    • @Zizaco
      @Zizaco 2 месяца назад +1

      Last time I checked, MongoDB also allowed you to query your data in any way

    • @ikomayuk
      @ikomayuk 2 месяца назад +1

      @@Zizaco that's true, but in SQL you can use complex joins, while in mongo it's not so efficient to join multiple collections.
      As far as I remember, transactions are limited to one collection in mongo, while pg allows you to run transactions across multiple tables.
      And so on, and so on.

  • @animeverse5912
    @animeverse5912 2 месяца назад +1

    Redis vs scaletable (rust alternative)

    • @theo-k4i8m
      @theo-k4i8m 2 месяца назад

      Hi, do you mean ScaleOut? I couldn't find any Scaletable

    • @animeverse5912
      @animeverse5912 2 месяца назад

      @ I’m sorry, I meant Skytable