QuestDB 🛡️ The Time Traveler's Database (Open Source)

Поделиться
HTML-код
  • Опубликовано: 7 сен 2024
  • A concise introduction to QuestDB and how it can make working with time series data so much easier and less costly. Includes ingesting data using the official QuestDB Rust crate.
    The main QuestDB page: questdb.io/
    Try it now with an online interactive demo: demo.questdb.io/
    The QuestDB Github repo: github.com/que...
    ---
    Camera: Canon EOS R5 amzn.to/3CCrxzl
    Monitor: Dell U4914DW 49in amzn.to/3MJV1jx
    Keyboard: Keychron Q1 amzn.to/3YkJNrB
    SSD for Video Editing: VectoTech Rapid 8TB amzn.to/3hXz9TM
    Microphone 1: Rode NT1-A amzn.to/3vWM4gL
    Microphone 2: Seinheiser 416 amzn.to/3Fkti60
    Microphone Interface: Focusrite Clarett+ 2Pre amzn.to/3J5dy7S
    Tripod: JOBY GorillaPod 5K amzn.to/3JaPxMA
    Mouse: Razer DeathAdder amzn.to/3J9fYCf
    Computer: 2021 Macbook Pro amzn.to/3J7FXtW
    Lens: Canon RF24mm F1.8 Macro is STM Lens amzn.to/3UUs1bB
    Caffeine: High Brew Cold Brew Coffee amzn.to/3hXyx0q
    More Caffeine: Monster Energy Juice, Pipeline Punch amzn.to/3Czmfox
    Building A Second Brain book: amzn.to/3cIShWf

Комментарии • 66

  • @ScottKohlmann
    @ScottKohlmann Год назад +7

    Thanks! This video saved our company a lot of time!

  • @jimmangefrida5207
    @jimmangefrida5207 Год назад +42

    Postgres query: select date_trunc('month', time), avg(PriceUSD) from btc group by 1. To be clear, QuestDB looks interesting and I can see some advantages to it, but I don't think you were being fair to Postgres.

    • @flwi
      @flwi Год назад +7

      I think his example put 30d buckets in consecutive order and aggregate the rows belonging to that bucket. Your example would only group by the month (and they have a different number of days).

    • @javierenlanube
      @javierenlanube Год назад +11

      QuestDB Developer Advocate here. I always say that if you can use only ONE database, go with Postgres. If you have some specific use cases in your application, you might want to use a purpose-built database (document, key/value, graph, time-series...). There are certainly many different ways of running a query and if you want to truncate by month probably your query should work. Now if you want to change aggregation and go down to every 35 minutes or every 5 microseconds or every 2 seconds, or things like those, in PostgreSQL you have to rethink your query and in QuestDB you only change the amount and unit in the SAMPLE BY interval. I cannot really think of one query in QuestDB that wouldn't be possible in PostgreSQL with more or less effort, but often simplicity is important. The other aspect is scale. QuestDB is designed to deal with analytics for quite large tables, so doing aggregates in unindexed tables containing billions of rows is a regular use case. For PostgreSQL that would be not typical and you'd need to invest a considerable effort to fine tune. Having said so, QuestDB is only designed for TimeSeries. PostgreSQL has more data types, more functions, a huge community and the maturity of having being out there for decades, and can be used for absolutely anything and everything. So in the end it is a trade off between using a specialised tool or a more generic one.

    • @codetothemoon
      @codetothemoon  Год назад +4

      As others stated, I think this would work if you're looking for the calendar month - but it's not flexible enough to allow for arbitrary intervals, like 2d, 5d, etc

    • @AllanSavolainen
      @AllanSavolainen Год назад +1

      @@javierenlanube Just remember to also use RRD so you can visualize the data easily.

  • @markolsson2740
    @markolsson2740 Год назад +13

    I really like the way you present your videos, the reviews are great, but I'd like to see more tutorials on learning and using rust. Maybe a series on developing an actual app that uses state machines, threads/async etc. Giving details about the rust way of doing those things.

    • @codetothemoon
      @codetothemoon  Год назад +5

      Thanks Mark - more Rust is on the way!

  • @StevenHodder
    @StevenHodder Год назад +5

    Super useful - dealing with time series data is always a pain.

    • @codetothemoon
      @codetothemoon  Год назад +2

      Agree! Using a specialized DB definitely helps

  • @rajbiswas776
    @rajbiswas776 6 месяцев назад +1

    As a MongoDB fan, I would absolutely love to test it against Mongo now that MongoDB has write and read optimised TS collections.

    • @codetothemoon
      @codetothemoon  6 месяцев назад

      i would love to see these results as well - do you think Mongo is well suited for time series data?

  • @TobiasFrei
    @TobiasFrei Год назад +9

    Certainly a great video, and the first time I hear about QuestDB. Now as a Postgres fan I have to look up/experiment how QuestDB compares to TimescaleDB (which I didn't touch in a while). 🤓

    • @codetothemoon
      @codetothemoon  Год назад +3

      Thanks! Yeah I haven't tried TimescaleDB, but my understanding is that QuestDB is known to have better ingestion performance. But I can definitely see the appeal of Timescale for those who have existing Postgres infrastructure.

    • @prashanthb6521
      @prashanthb6521 Год назад

      Lot more efficient than timescale but still more to be done to call it mature. Some more time is needed.

  • @kwako21
    @kwako21 Год назад +2

    This was very informative. Great video!

  • @qm3ster
    @qm3ster Год назад +3

    1. How does it compare and contrast with InfluxDB?
    2. What does the wire format of submitting data look like?

    • @javierenlanube
      @javierenlanube Год назад +1

      disclaimer: questdb developer advocate here. InfluxDB and QuestDB are both time-series databases, so they are quite similar up to a point. It is a bit like comparing PostgreSQL vs MySQL. Both options are great choices for time-series. QuestDB tends to perform a bit better, and we regularly see users at our public slack moving away from InfluxDB when they have very high ingestion throughput, or tables/queries with large cardinality. You query QuestDB with standard SQL, including joins and it offers SQL extensions for time-series. There are some SQL extensions that some users find friendlier on QuestDB. But I have also heard of super happy Influx users who like their Flux language or the newer InfluxQL. Regarding the wire format for QuestDB, you can either send CSV using the REST/HTTP endpoint, use the postgresql protocol to do standard SQL INSERTs, or use the ILP protocol, which actually uses the message format that Influx open sourced. ILP is recommended for high throughput and there are libraries in seven programming languages so you don't need to deal with low-level details and just send your data over.

    • @qm3ster
      @qm3ster Год назад

      @@javierenlanube possible nit: did you mean "InfluxQL or newer Flux"? (the other way round)

    • @javierenlanube
      @javierenlanube Год назад

      @@qm3ster you are right. The point is I have heard from users who are happy using both Flux and FluxQL, and of course of many users that prefer the standard SQL plus extensions of QuestDB. Both DBs are very cool

  • @nikensss
    @nikensss Год назад

    omg! port is over 9000! (6.28)
    Aside from that, excellent video, looks pretty interesting

  • @mogest
    @mogest Год назад +5

    Ever since PostgreSQL 14 you've been able to do this:
    SELECT date_bin('30 days', time, '2000-01-01'), avg("PriceUSD") FROM btc GROUP BY 1
    Or to group on months, instead of 30 days, it's even easier:
    SELECT date_trunc('month', time), avg("PriceUSD") FROM btc GROUP BY 1
    There are other benefits of time series databases, but PostgreSQL is pretty great!

    • @codetothemoon
      @codetothemoon  Год назад +1

      thanks for pointing this out! date_bin wasn't on my radar!

  • @Zzznmop
    @Zzznmop Год назад +3

    Oh so I should’ve used a DB to analyze my perf test with millions of timestamped rows?
    I think this video (and excel crashing) are the push I needed to actually dive into databases below surface level

    • @codetothemoon
      @codetothemoon  Год назад

      Nice, let us know how QuestDB works out for your use case!

    • @Zzznmop
      @Zzznmop Год назад

      @@codetothemoon the *easier* solution was to use Python for EDA. The downside is that excel histograms are much easier to work with than matplotlib
      Upside is that no database interaction was required. Cloud computing can easily scale to my memory requirements for the times needed

  • @aperson4051
    @aperson4051 Год назад +1

    Your keyboard sounds really good!

    • @codetothemoon
      @codetothemoon  Год назад

      Thanks, it's a Keychron Q1, link is in the description!

  • @logeshkumarrajkumar861
    @logeshkumarrajkumar861 Год назад +5

    Very useful. Thank you. Do you think questsb is a good choice to store and retrieve social media analytics like impressions, engagements etc? Also can I use it as a replacement for a traditional relational database as well?

    • @javierenlanube
      @javierenlanube Год назад +9

      Hi Logesh. QuestDB developer Advocate here. QuestDB is an excellent choice to store social media raw data and get analytics out of it. Regarding if it can be used as a replacement for a traditional database (I am assuming RDBMS like MySql, PostgreSQL, SQLServer...), it really depends. The simple answer is NO, it cannot replace your traditional database, but depending what you are doing with it, it might. QuestDB is designed for fast ingestion and analytics, not really as an operational database, so it works great when your data is append-only. You can update and delete data, but those operations are not as efficient. The concept of a transaction in QuestDB is also much simpler than the traditional ACID model you will find in conventional databases. And QuestDB doesn't enforce any constraints for foreign keys. QuestDB is great when you want to store data (fast or not) which has a timestamp, and where the timestamp is usually important for most of your queries (you query a time range, you want to aggregate by time, you want to bulk export time ranges...), but it is not a great choice if you want to search by ID, work with a small number of records at a time, or want to enforce transactions and referencial integrity. I hope this helps, but if you have more questions please visit our slack, where QuestDB developers and end users hang out slack.questdb.io/

    • @logeshkumarrajkumar861
      @logeshkumarrajkumar861 Год назад +5

      @@javierenlanube thank you for the detailed response

  • @alandto3996
    @alandto3996 Год назад +2

    i would love a video ccomparing surrealDB and questDB

    • @codetothemoon
      @codetothemoon  Год назад +1

      interesting idea! They do take polar opposite approaches - Surreal aims to handle all use cases while Quest aims to be specialized specifically for time series data. My guess is that Quest would be better for time series data, but I guess the only way to find out is to test them!

    • @raysteve3321
      @raysteve3321 Год назад

      Could you test and make a video

  • @seannewell397
    @seannewell397 Год назад +1

    Pretty cool! Any more typesage SDKs for QuestDB in rust or other langs (ts?) cause i really like diesel/sqlx/prisma/drizzle

    • @codetothemoon
      @codetothemoon  Год назад

      For really fast data ingestion (ILP clients) - it looks like there are Rust and JS clients questdb.io/docs/reference/clients/overview/ and I think you should be able to use something like sqlx (and maybe Diesel) for querying, since it has a Postgres compatible interface. You could theoretically use those for data ingestion too, but you'd be foregoing the fast data ingestion capabilities of an ILP client.

  • @mhmdkzr
    @mhmdkzr Год назад

    How would you compare it to TimescaleDB?

  • @forrestbajbek1475
    @forrestbajbek1475 Год назад +3

    Couldn't you just use `SELECT time, AVG(PriceUSD) OVER (RANGE BETWEEN INTERVAL 30 DAY PRECEDING AND CURRENT ROW)` in Postgres? Not discounting how concise the QuestDB syntax is. Will definitely check this out.

    • @flwi
      @flwi Год назад +2

      Nice, this looks quite concise. But it would not reduce the number of rows, right? It's more like a sliding average.

  • @testingstuff6111
    @testingstuff6111 Год назад

    Thank you for this !

  • @gamcd
    @gamcd Год назад +4

    The keyboard sounds nice, is it a custom?

    • @codetothemoon
      @codetothemoon  Год назад +4

      Thanks - it's a Keychron Q1 which I absolutely love. Though lately I've been using a Keebio Iris and I have a Corne coming in the mail. But if you're into more traditional keyboards, definitely recommend the Keychron.

    • @gamcd
      @gamcd Год назад +1

      @@codetothemoon that’s dope I’ve heard good things about the Q1

    • @ivenwinkelmann9783
      @ivenwinkelmann9783 Год назад +1

      @@codetothemoon What switches do you got in the Q1? I got a low profile Corne with choc reds which I absolutely love!

    • @codetothemoon
      @codetothemoon  Год назад

      @@ivenwinkelmann9783 Gateron G Pro Reds. I liked them so much I have another set of them ready for when my Corne arrives. I've heard so many good things about the LP Corne, very cool to hear that you like yours. I'm thinking one of those might be next for me - or maybe a 5x3 if I can get home row mods working....

  • @AlbertCloete
    @AlbertCloete Год назад +1

    Is the time series database also a lot faster with 'sample by' queries, or is it just that the query is easier to write?

    • @codetothemoon
      @codetothemoon  Год назад +2

      I believe read performance is roughly on par with traditional DBs - the place you’d see the most benefit performance wise is in writes, for which QuestDB is significantly faster

    • @javierenlanube
      @javierenlanube Год назад +2

      @@codetothemoon Read performance is very fast if you are using a time filter in your query. If not, it depends, but QuestDB tries hard to vectorize /parallelize instructions to make it pretty fast. And it keeps fast even with billions of records in a single table

  • @oneito947
    @oneito947 Год назад +1

    how much different is this from also new surrealdb

    • @codetothemoon
      @codetothemoon  Год назад

      Substantially - Surreal aims to be a sort of database jack of all trades, QuestDB focuses solely on efficient handling of time series data.

  • @hoegge
    @hoegge Год назад

    Not so smart to not add a leading zero to the months of partitions so they are sorted in the directory.

  • @u007james
    @u007james Год назад

    no text matching comparison?

    • @javierenlanube
      @javierenlanube Год назад +2

      Questdb supports like and ilike functionality, but no full text search

  • @vzuzukin
    @vzuzukin Год назад

    What is the most beginner-friendly database for a new developer for a small project? Would it be MongoDB or a similar document store?

  • @CarterColeisInfamous
    @CarterColeisInfamous Год назад +1

    influxdb 2.0

  • @YadavalliRajashekar
    @YadavalliRajashekar Год назад +1

    Did anybody said you look like charlie puth

  • @SahilP2648
    @SahilP2648 Год назад +1

    Why not use NoSQL? After using NoSQL I don't see why one would still use SQL because NoSQL is just so damn developer friendly.

    • @codetothemoon
      @codetothemoon  Год назад

      There are some tradeoffs with NoSQL and whether it's a good fit really depends on your use case. Some NoSQL databases are really terrible for time series data because of their lack of query flexibility and aggregation functions - DynamoDB would be a good example of this.

    • @SahilP2648
      @SahilP2648 Год назад

      @@codetothemoon I just looked up what I was using in my previous professional project. It was SQLAlchemy which is an ORM for Flask/SQL which I think acts just like NoSQL because it saves queries in objects and the operations are translated to SQL queries but is still extremely developer friendly. So in case SQL is absolutely needed, going this route is still better than vanilla SQL.

    • @prashanthb6521
      @prashanthb6521 Год назад +1

      NoSQL forces you to write more application code which could be done easily with SQL by the DB engine.

    • @SahilP2648
      @SahilP2648 Год назад

      @@prashanthb6521 and that's my point. You don't need to write complicated SQL commands if you can just write queries in your project's code and also save the query objects. It's better definitely.