Create an external data stage (Azure) on Snowflake

  • Published: 1 Oct 2024

Comments • 25

  • @KahanDataSolutions
    @KahanDataSolutions  3 years ago +1

    Get my Modern Data Essentials training (for free) & start building more reliable data architectures
    www.ModernDataCommunity.com

  • @satyagouda9494
    @satyagouda9494 4 months ago +1

    Thank you so much! Very simple example and easy to understand!

  • @KahanDataSolutions
    @KahanDataSolutions  3 years ago +3

    **NOTE**:
    In this video I indicate there are storage costs associated with holding on to the staged data. However, that would actually not apply to "External" stages (such as Azure Blob, S3, etc.). Rather, this type of stage acts as a pointer to that cloud location. You would, however, incur storage costs if it is an "Internal" stage, where Snowflake would actually be holding the data.
    Also, purging/removing from an "External" stage WILL DELETE the file in the cloud storage as well. This example was meant to show how that syntax works, but be cautious before running that step if it is critical data!
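
    For reference, here is a minimal sketch of how an external Azure stage is typically declared; the names, container URL, and SAS token are placeholders, so substitute your own values:

      -- Hypothetical names and credentials, shown only to illustrate the syntax.
      CREATE OR REPLACE STAGE my_azure_stage
        URL = 'azure://myaccount.blob.core.windows.net/mycontainer/path/'
        CREDENTIALS = (AZURE_SAS_TOKEN = '?sv=...')
        FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);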

  • @user-ug64e3st87op
    @user-ug64e3st87op 6 months ago

    Hi! How do I copy incrementally? For example, I have 1,000,000 rows in a table in Snowflake, then one more row is added in the Azure DB, so now the Azure DB has 1,000,001. How can I add just that one last row to Snowflake without copying all the data?

  • @mkillough67
    @mkillough67 9 months ago +1

    Great video, thank you, but how do you automate the addition of your input CSV (in this case) file to Azure without having to manually upload it?

  • @RichardMartinez-o2e
    @RichardMartinez-o2e 5 days ago

    And this is all done with wide-open network access to your blob?

  • @ihafidh
    @ihafidh 3 years ago +3

    Great video again. Question though, are you sure that the stage will add to Snowflake cost? You created an “external” stage which was a pointer to an Azure blob. That should not add to the storage cost in Snowflake.
    Had you created an “internal” stage (which is stored in Snowflake) then it would add to storage cost.

    • @KahanDataSolutions
      @KahanDataSolutions  3 years ago +2

      Thank you for pointing this out! You are correct. For the external stage it would only be storing the metadata of the pointer, not the data itself. I will add a note to the description calling that out. Appreciate you watching!

  • @masoudghodrati3770
    @masoudghodrati3770 2 years ago +1

    Great vid, thanks.
    Can't the copy/insert be done automatically? Like, as soon as a new file is added, we see the data added to the table?

  • @nishavkrishnan4271
    @nishavkrishnan4271 3 years ago

    I am getting an error when I list the stage. The error is: Failure using stage area. Cause: [Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. (Status Code: 403; Error Code: AuthenticationFailed)]
    Can you please help?

  • @pilarbaldominos2505
    @pilarbaldominos2505 2 years ago +1

    Thanks for sharing… you do not know how much you have helped me to start my journey with Snowflake!

    • @KahanDataSolutions
      @KahanDataSolutions  2 years ago

      That's great to hear. I'm glad to hear it has been helpful for you - thanks for watching!

  • @satyaraj1427
    @satyaraj1427 3 years ago

    Hey, the video is great... but I have a doubt. Is there a way to load the CSV files directly? Like, without creating tables in the worksheet and then loading them, is there a way to do a direct file load from blob storage? Please help me with this... Thank you

  • @OfficialSnowsix
    @OfficialSnowsix 3 years ago

    I noticed on the create stage portion there was a Snowflake-managed option. What if we use Snowflake but not Azure or AWS? Would we then just use Snowflake-managed? Would it just be a repository via Snowflake instead of S3 or Blob storage?

    • @KahanDataSolutions
      @KahanDataSolutions  3 years ago

      That's correct. This option refers to Internal Snowflake stages, which would indicate that the data is not originally stored in an external location such as S3 or Blob storage, but rather uploaded directly to Snowflake. Here are some links with more information:
      - docs.snowflake.com/en/sql-reference/sql/create-stage.html
      - docs.snowflake.com/en/user-guide/data-load-local-file-system-create-stage.html
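
      For comparison, here is a minimal sketch of an internal (Snowflake-managed) stage, where the file is uploaded with PUT instead of living in Blob/S3; the stage name and file path are placeholders:

        -- Internal named stage: Snowflake itself stores the uploaded files (and bills for that storage).
        CREATE OR REPLACE STAGE my_internal_stage
          FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

        -- PUT runs from SnowSQL or a client driver, not from the classic web worksheet.
        PUT file:///tmp/my_file.csv @my_internal_stage AUTO_COMPRESS = TRUE;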

  • @hyuntaekjoo6034
    @hyuntaekjoo6034 3 years ago

    Good tutorial! Thank you!
    May I ask a question? Is it possible to copy tables, not just data?
    Compared with importing a CSV file itself, that looks better; it saves one step.

    • @KahanDataSolutions
      @KahanDataSolutions  3 years ago

      Thank you for watching, and yes it is! It sounds like you may be looking to use the "clone table" feature. This video will help explain / walk you through it - ruclips.net/video/aVZxk8tz5p8/видео.html
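
      For reference, a zero-copy clone is a single statement; the table names here are placeholders:

        -- Clones only metadata at first; storage is shared until either table diverges.
        CREATE OR REPLACE TABLE orders_clone CLONE orders;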

  • @noel_g
    @noel_g 3 years ago

    Does purging remove it from the blob storage container?

    • @KahanDataSolutions
      @KahanDataSolutions  3 years ago +2

      In this scenario, yes, purge = true would also remove it from the External stage (blob). If you do not want that to happen you can leave that part out, and Snowflake will still be smart enough to only work on the newly added files.
      Note: I accidentally misspoke in this video by saying there are storage costs associated with "External" stages (Blob, S3, etc.), which there are not. Costs would only apply to "Internal" stages. An External stage is simply a pointer to that Blob container and does not hold the data itself. Therefore, it is less critical to be constantly purging that type of stage.
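
      A minimal sketch of the two variants described above; the stage and table names are placeholders:

        -- Without PURGE: files stay in the stage location, and Snowflake's load
        -- history ensures already-loaded files are skipped on the next run.
        COPY INTO my_table
          FROM @my_azure_stage
          FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

        -- With PURGE: successfully loaded files are deleted from the stage location
        -- (for an external stage, that means deleting them from Blob/S3).
        COPY INTO my_table
          FROM @my_azure_stage
          FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
          PURGE = TRUE;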

  • @SuperICHmagdich
    @SuperICHmagdich 1 year ago

    Hi Kahan great video as always. Am I correct that copying into a table means that under the hood the data is transferred to the internal table stage from the external stage?

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago +1

      I'm not 100% positive, but my understanding is that data from an external stage is not transferred to an internal stage, but rather copied directly into your table from the endpoint location (ex. an S3 bucket).

    • @SuperICHmagdich
      @SuperICHmagdich 1 year ago

      The copy command is then kind of unfortunately named, as a table is not a real storage location. Consequently, deleting data from e.g. S3 should result in this being reflected in the Snowflake table, am I right? Maybe some clarification regarding this is worth raising a Snowflake ticket for.

  • @noel_g
    @noel_g 3 years ago

    How would this differ if the file were Parquet? E.g., is there a way to create the table automatically, since the column info is already in the file metadata?

    • @KahanDataSolutions
      @KahanDataSolutions  3 years ago

      Hi Noel - The process would be very similar, except you would need to create a File Format for "Parquet" instead of "CSV" and the copy script would require some parsing. I will have a video next week on working w/ JSON which will cover similar functionality.
      For now, here is a link to the Snowflake docs that explain it in more detail. - docs.snowflake.com/en/user-guide/script-data-load-transform-parquet.html
      If your goal is to automate this process, I also suggest looking into auto-ingesting data w/ Snowpipe.
      Hopefully between this video and the documentation it will get you going in the right direction!
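
      A rough sketch of both ideas (a Parquet file format plus parsing in the COPY, and a Snowpipe for auto-ingest); the column names, stage, and table are placeholders to adapt to your own files:

        -- Parquet arrives as a single variant column ($1), so the COPY selects and casts fields out of it.
        CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = PARQUET;

        COPY INTO my_table (id, amount)
          FROM (SELECT $1:id::NUMBER, $1:amount::FLOAT FROM @my_stage)
          FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format');

        -- Snowpipe version: new files in the stage location are loaded automatically.
        -- Auto-ingest also requires cloud event notifications (and, on Azure, a
        -- notification integration), which are omitted here.
        CREATE OR REPLACE PIPE my_pipe AUTO_INGEST = TRUE AS
          COPY INTO my_table (id, amount)
          FROM (SELECT $1:id::NUMBER, $1:amount::FLOAT FROM @my_stage)
          FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format');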

    • @noel_g
      @noel_g 3 years ago +1

      @KahanDataSolutions thanks. I guess we can't get away from defining the table first. I thought that since the schema info and data types are already there, a step could be skipped.