5 reasons to adopt Apache Iceberg over Hive | Starburst Icehouse Architecture

  • Published: Oct 29, 2024

Comments • 4

  • @andreyolv-dataengineering5211 5 months ago

    But to use the Iceberg connector for Trino, you need a Hive Metastore. Why?

    • @ManfredMoser 5 months ago +3

      You don't need a Hive metastore. You need a metastore; that's how Iceberg works. Trino supports Hive, Glue, Nessie, JDBC, REST, and Snowflake metastores with the Iceberg connector.
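
To make the choice of metastore concrete, here is a minimal sketch that renders a Trino catalog properties file for two of the options listed above. The property names (`connector.name`, `iceberg.catalog.type`, `hive.metastore.uri`, `iceberg.rest-catalog.uri`) are taken from the Trino Iceberg connector documentation as best understood here; the helper `render_catalog_properties`, the `CATALOG_BACKENDS` table, and every host name and port are hypothetical placeholders, not values from the video or the comments.

```python
# Illustrative only: hosts, ports, and helper names are assumptions.
# Each entry maps a Trino `iceberg.catalog.type` value to the extra
# property that points the connector at that metastore.
CATALOG_BACKENDS = {
    "hive_metastore": {"hive.metastore.uri": "thrift://metastore.example.com:9083"},
    "rest": {"iceberg.rest-catalog.uri": "http://rest-catalog.example.com:8181"},
}


def render_catalog_properties(catalog_type):
    """Return the text of a catalog properties file for the given metastore type."""
    lines = ["connector.name=iceberg", f"iceberg.catalog.type={catalog_type}"]
    lines += [f"{key}={value}" for key, value in CATALOG_BACKENDS[catalog_type].items()]
    return "\n".join(lines)


if __name__ == "__main__":
    # Switching metastores changes only the catalog type and its connection property.
    print(render_catalog_properties("rest"))
```

The other metastore types mentioned in the comment would be configured the same way, each with its own connection properties; they are left out here rather than guessed.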

    • @StarburstData 5 months ago +1

      This is correct! Although you can use the Hive metastore, you have many other options. This openness is a big part of Iceberg's draw: you get to decide which components to use, including the metastore.

    • @LesterMartinATL 5 months ago +1

      You're probably thinking that the metadata is stored on the data lake along with the data. Well, it is, but the Iceberg spec calls out that the name of the specific metadata file that contains the current snapshot is also stored in the catalog. This is crucial to optimistic concurrency when a write happens. Basically, a new version needs to be based on the prior version (or a descendant of it in certain circumstances), and the catalog aids in keeping this under control.
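
The role of the catalog described above can be sketched as a compare-and-swap on a single pointer. This is a toy, in-memory illustration under stated assumptions, not how any real catalog is implemented: `InMemoryCatalog`, `CommitConflict`, `commit_with_retry`, and the metadata file names are all hypothetical. The writer reads the current metadata file name, prepares a new one based on it, and the catalog accepts the swap only if the pointer has not moved in the meantime.

```python
import threading


class CommitConflict(Exception):
    """Raised when the table pointer moved since the writer read it."""


class InMemoryCatalog:
    """Hypothetical catalog: maps a table name to its current metadata file name."""

    def __init__(self):
        self._pointers = {}
        self._lock = threading.Lock()

    def current(self, table):
        return self._pointers.get(table)

    def swap(self, table, expected, new):
        # Atomic check-and-set: commit only if the writer's base is still current.
        with self._lock:
            if self._pointers.get(table) != expected:
                raise CommitConflict(f"{table}: expected {expected!r}")
            self._pointers[table] = new


def commit_with_retry(catalog, table, write_metadata, max_attempts=3):
    """Optimistic commit: rebase on the latest pointer and retry on conflict."""
    for _ in range(max_attempts):
        base = catalog.current(table)
        new = write_metadata(base)  # caller derives new metadata from the base version
        try:
            catalog.swap(table, base, new)
            return new
        except CommitConflict:
            continue  # another writer won; re-read the pointer and try again
    raise CommitConflict(f"{table}: gave up after {max_attempts} attempts")


if __name__ == "__main__":
    catalog = InMemoryCatalog()
    catalog.swap("orders", None, "v1.metadata.json")
    print(commit_with_retry(catalog, "orders", lambda base: "v2.metadata.json"))
```

A writer that still holds a stale base (for example, one that read `v1.metadata.json` before another commit landed) gets a `CommitConflict` from `swap` instead of silently overwriting the newer version, which is the guarantee the comment attributes to the catalog.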