S2024 #03 - Data Formats & Encoding Part 2 (CMU Advanced Database Systems)

  • Published: Oct 23, 2024

Comments • 4

  • @trungdinh7833, 8 months ago, +3

    Great lecture! What would it take for a new file format to become mainstream? Parquet/ORC are so popular; is it possible for a new format to rise?

  • @kevinkristensen8939, 3 months ago

    Thanks for this! I've always found the semistructured stuff hard to understand. I just want to point out, though, that the example in the referenced paper for shredding has different values in the columnar decomposition. In particular, for value 'en' in Name.Language.Code, the repetition level is 2, because it is a repetition of the 2nd repeated field (according to the paper).

  • @oneofpro, 8 months ago

    Thank you.

  • @JamesRouzier, 8 months ago

    The link for the notes is not there.
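
On the repetition-level point raised in the comments above, a small sketch may help. It assumes the referenced paper is the Dremel paper and that its example records look like the dicts below (the dict layout, the function name `shred_name_language_code`, and the variable names `r1` and `r2` are illustrative assumptions, not anything from the lecture). It handles only the single path Name.Language.Code, where Name and Language are repeated and Code is required, so the maximum definition level is 2.

```python
# Sketch: repetition/definition levels for the path Name.Language.Code.
# Record layout is an assumption modeled on the Dremel paper's example.

def shred_name_language_code(record):
    """Return (value, repetition_level, definition_level) triples."""
    names = record.get("Name", [])
    if not names:
        # Nothing on the path is defined: one NULL with r=0, d=0.
        return [(None, 0, 0)]
    out = []
    for i, name in enumerate(names):
        # First Name starts a new record (r=0); later Names repeat at level 1.
        r_name = 0 if i == 0 else 1
        langs = name.get("Language", [])
        if not langs:
            # Name is defined but Language is not: NULL with d=1.
            out.append((None, r_name, 1))
            continue
        for j, lang in enumerate(langs):
            # A second or later Language inside the same Name repeats at
            # level 2, because Language is the 2nd repeated field in the path.
            r = r_name if j == 0 else 2
            out.append((lang["Code"], r, 2))
    return out


r1 = {
    "DocId": 10,
    "Name": [
        {"Language": [{"Code": "en-us", "Country": "us"}, {"Code": "en"}],
         "Url": "http://A"},
        {"Url": "http://B"},
        {"Language": [{"Code": "en-gb", "Country": "gb"}]},
    ],
}
r2 = {"DocId": 20, "Name": [{"Url": "http://C"}]}

for value, r, d in shred_name_language_code(r1) + shred_name_language_code(r2):
    print(value, r, d)
```

Under these assumptions, 'en' comes out with repetition level 2, matching the comment: it is the second Language inside the same Name, so the repetition happened at the second repeated field on the path.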