Core Databricks: Understand the Hive Metastore

  • Published: Aug 15, 2023
  • A core part of the Databricks ecosystem is the Hive Metastore which enables Spark SQL. But how does Hive work and how do you use it? How does Hive relate to the new Unity Catalog? Join me as I answer these questions and more.
    Support Me on Patreon Community and Watch this Video without Ads!
    www.patreon.com/bePatron?u=63...
    Link to slides, data, and code (Databricks Notebook in dbc format):
    github.com/bcafferky/shared/b...
  • Science

Comments • 33

  • @andrewpotts9948 1 month ago +3

    That's the right level of detail that I needed. Well explained. Thank you.

  • @haseebjehangir3249 10 months ago +7

    Finally, a well-explained video on the Databricks Hive metastore. Thanks, Bryan.

  • @kvin007 10 months ago +1

    Love the direct and clear content! Keep it going!

  • @YiminWei-z6w 7 days ago +1

    great explanation. Thanks!

  • @JLRocco43 10 months ago +2

    I was just pondering on doing a deep dive in this today and reading a lot of docs and then you put out the video 😂 awesome work Bryan!

  • @soumyavema6515 10 months ago +2

    Pretty clear. Very much needed before exploring Unity Catalog. Waiting for the next one.

  • @martalopezjurado 10 months ago +1

    I love this video!! thanks a lot.
    Waiting for the unity catalog video!

  • @danhai7276 10 months ago

    Great video. Waiting for the next one on Unity Catalog. 🙌

    • @BryanCafferky 10 months ago

      Yeah. There's a lot to Unity Catalog. Also doing Databricks AI Assistant which is very cool.

  • @joshuawagner5350 1 month ago

    Exceptional explanation. Thank you.

  • @sujitunim 10 months ago

    Thanks Bryan for this amazing session

  • @etianemarcelino5706 10 months ago

    Great content... Like always

  • @rabeMa 6 months ago

    Deadly clear, awesome 👌👌👌💯💯💯

  • @mehulkhare8278 4 months ago

    Thanks for making it simple to understand.

  • @nargesrokni6348 10 months ago

    very good explanation, thank you very much man

  • @renegade_of_funk 10 months ago

    You’re doing the Lord’s work. 👌

  • @ngneerin 8 months ago

    This gave me a really good idea.

  • @CaponordRevHappy 6 months ago

    Superb! thank you.

  • @ravinarang6865 3 months ago

    Very Good.

  • @GhernieM 6 days ago

    Hey Bryan, do you plan to create something about Unity Catalog?

  • @jbab9618 4 months ago +1

    Hi @BryanCafferky, if the metadata of a CSV file changes, does the Hive metastore update its own metadata automatically, or are there steps we need to take to refresh it?

    • @BryanCafferky 4 months ago +1

      A Hive table definition over a CSV file is read-only. To get the metadata reloaded, I believe you would need to drop and re-create the table.
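
      As a rough sketch of the drop-and-re-create step described above, the Python helper below only builds the two Spark SQL statements as strings; it does not run them. The table name `sales_csv`, the path `/mnt/data/sales`, and the helper name `recreate_csv_table_sql` are all hypothetical, and the `header` and `inferSchema` options assume a CSV with a header row.

      ```python
      def recreate_csv_table_sql(table_name: str, location: str) -> list[str]:
          """Build Spark SQL statements that drop and re-create a table over CSV files.

          Hypothetical helper for illustration only: it returns strings and
          never touches a metastore.
          """
          return [
              # Remove the existing table definition, if there is one.
              f"DROP TABLE IF EXISTS {table_name}",
              # Re-register the table so the schema is derived from the files again.
              (
                  f"CREATE TABLE {table_name} USING CSV "
                  "OPTIONS (header 'true', inferSchema 'true') "
                  f"LOCATION '{location}'"
              ),
          ]


      # Made-up table name and path, used only to show the generated SQL.
      statements = recreate_csv_table_sql("sales_csv", "/mnt/data/sales")
      for statement in statements:
          print(statement)
      ```

      On a cluster, each statement could be passed to `spark.sql(...)`. Because the table is defined with an explicit `LOCATION`, dropping it should remove only the metastore entry and leave the CSV files in place.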

  • @pal3201 7 months ago +1

    Can you tell us when you are releasing your take on Unity Catalog? Looking forward to it.

    • @BryanCafferky 7 months ago

      So many things to cover these days. Hopefully, soon. Thanks!

  • @benjaminwootton 9 months ago +1

    Good video. Though I understand Hive Metastore, it confuses me why everything in data has a dependency on it. For instance, Iceberg seems to need it for everything even though it’s supposed to be a self describing table format.

    • @BryanCafferky 9 months ago

      Technically, you don't need the Hive metastore to read Delta tables. But it provides a lookup from the table name to where the table is physically stored. Otherwise, you need to provide the full path to the storage location. It also stores schemas for files that don't have built-in schemas, like CSV and text files.
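
      To make the lookup idea concrete, here is a toy Python model of what the reply describes: a mapping from table name to storage location, plus a stored schema for formats that carry none of their own. It is a conceptual sketch, not the Hive or Databricks API; the names `TableEntry`, `metastore`, and `resolve`, and all table names and paths, are invented for illustration.

      ```python
      from dataclasses import dataclass, field


      @dataclass
      class TableEntry:
          """Simplified view of what a metastore records about one table."""
          location: str
          file_format: str
          # Column name -> type. Only needed for formats with no built-in
          # schema, such as CSV and text files.
          schema: dict = field(default_factory=dict)


      # Toy stand-in for the metastore: table name -> metadata (made-up entries).
      metastore = {
          "sales.orders": TableEntry("/mnt/lake/sales/orders", "delta"),
          "sales.raw_csv": TableEntry(
              "/mnt/lake/sales/raw", "csv", {"order_id": "int", "amount": "double"}
          ),
      }


      def resolve(table_name: str) -> TableEntry:
          """Map a table name to its stored metadata, or fail if it is unregistered."""
          try:
              return metastore[table_name]
          except KeyError:
              raise LookupError(
                  f"{table_name} is not registered; supply the full path instead"
              ) from None


      orders = resolve("sales.orders")
      print(orders.location)  # the path a query by name would read from
      raw = resolve("sales.raw_csv")
      print(raw.schema)       # schema kept in the metastore because CSV has none
      ```

      In Spark terms, the name-based route corresponds roughly to `spark.table("sales.orders")`, while the path-based route is `spark.read.format("delta").load("/mnt/lake/sales/orders")`. The second works without a metastore entry because Delta keeps its schema in its own transaction log.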