Projection and Predicate pushdown in Apache Parquet

  • Published: 17 Oct 2024

Comments • 10

  • @pawarbi4675 1 year ago +2

    Enjoying the videos, Mark, keep going. One suggestion: slow down a bit and highlight or briefly explain the code (not in detail) so we get context.

    • @learndatawithmark 1 year ago +1

      Thanks and I'll try! Although I have been told to slow down for many years and I'm clearly not great at doing that!

  • @galsl 8 months ago

    Can you provide examples where predicate pushdown is not possible with Parquet?

  • @dhaval1489 1 year ago +1

    Can you do a tutorial on DuckDB + Ibis? I am totally new to databases. I am familiar with Excel, pandas, and Polars, and am just starting out.

    • @learndatawithmark 1 year ago +1

      Hi! Thanks for your comment. I haven't used Ibis before, but I'm gonna take a look and will try to make an intro tutorial 🙂

  • @michaelhunger6160 1 year ago

    Does the pandas API also support predicate pushdown?

    • @learndatawithmark 1 year ago +1

      It does projection pushdown via the columns parameter (pandas.pydata.org/docs/reference/api/pandas.read_parquet.html).
      And I think if you use the pyarrow engine, you get predicate pushdown too.
      Pandas delegates the Parquet reading to fastparquet or pyarrow under the covers; it doesn't have a reader of its own, as far as I'm aware.

    • @jeevan88888 1 year ago

      @learndatawithmark Great insights, Mark. Just slow down the explanation in the main part of the code.

  • @DerekMahar 1 year ago

    Where may I find pqrs?

    • @learndatawithmark 1 year ago +1

      Sorry, I forgot to include the link. Here you go: github.com/manojkarthick/pqrs