Difference between Database vs Data lake vs Warehouse

  • Published: 30 Jun 2024
  • Want to learn Big Data by Sumit Sir?
    Check out the Big Data course details here: trendytech.in/?referrer=youtu...
    𝗝𝗼𝗶𝗻 𝗺𝗲 𝗼𝗻 𝗦𝗼𝗰𝗶𝗮𝗹 𝗠𝗲𝗱𝗶𝗮:🔥
    🔅Sumit LinkedIn - / bigdatabysumit
    🔅Sumit Instagram - / bigdatabysumit
    Database
    =========
    Stores transactional data.
    OLTP (online transaction processing).
    Structured data.
    Holds recent, day-to-day data.
    Example: online banking transactions.
    Products: Oracle, MySQL.
    Schema on write: the structure is defined before data is inserted.
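To make "schema on write" concrete, here is a minimal sketch. It assumes SQLite (from the Python standard library) as a stand-in for Oracle or MySQL, and the table name `transactions` and its columns are hypothetical, chosen only for illustration. The point is that a row which does not fit the declared structure is rejected at insert time.

```python
import sqlite3


def run_schema_on_write_demo():
    """Return (rows_stored, bad_row_rejected) for a tiny OLTP-style table."""
    # Assumption: an in-memory SQLite database stands in for a real OLTP database.
    conn = sqlite3.connect(":memory:")

    # The schema is declared up front, before any data is written.
    conn.execute(
        """
        CREATE TABLE transactions (
            txn_id     INTEGER PRIMARY KEY,
            account_id TEXT NOT NULL,
            amount     REAL NOT NULL CHECK (amount > 0)
        )
        """
    )

    # A well-formed row matches the schema and is accepted.
    with conn:
        conn.execute(
            "INSERT INTO transactions (account_id, amount) VALUES (?, ?)",
            ("ACC-1", 250.0),
        )

    # A row with a missing amount violates NOT NULL and is rejected on write.
    rejected = False
    try:
        with conn:
            conn.execute(
                "INSERT INTO transactions (account_id, amount) VALUES (?, ?)",
                ("ACC-2", None),
            )
    except sqlite3.IntegrityError:
        rejected = True

    stored = conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
    conn.close()
    return stored, rejected


if __name__ == "__main__":
    print(run_schema_on_write_demo())
```

Only the valid row ends up in the table; the malformed one never gets stored, which is the trade-off schema on write makes in exchange for clean, predictable data.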
    Data Warehouse - DWH
    ====================
    Built for analytical processing, where we need a lot of historical data to find insights.
    If we run complex analytical queries directly on the database, the day-to-day transactions slow down.
    So we take the data from the databases and migrate it to the data warehouse for analytical processing.
    The data can come from multiple sources.
    Structured data, schema on write.
    Example: Teradata.
    Storage cost is high, but lower than for the database.
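    ETL process (Extract, Transform, Load):
    Suppose the data is in a database.
    Extract the data.
    Transform it (this is a complex process).
    Load it into the data warehouse.
    This approach reduces flexibility, because the structure has to be decided before the data is loaded.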
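The ETL steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the source database and the warehouse, and the table names (`transactions`, `daily_account_totals`), the sample rows, and the choice of a per-account daily total as the transformation are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def extract(source):
    """Extract: pull raw transaction rows out of the source database."""
    return source.execute(
        "SELECT account_id, txn_date, amount FROM transactions"
    ).fetchall()


def transform(rows):
    """Transform: aggregate to one total per account per day."""
    totals = defaultdict(float)
    for account_id, txn_date, amount in rows:
        totals[(account_id, txn_date)] += amount
    return [(acc, day, total) for (acc, day), total in sorted(totals.items())]


def load(warehouse, summary_rows):
    """Load: write the already-shaped rows into the warehouse table."""
    warehouse.execute(
        """
        CREATE TABLE IF NOT EXISTS daily_account_totals (
            account_id   TEXT NOT NULL,
            txn_date     TEXT NOT NULL,
            total_amount REAL NOT NULL
        )
        """
    )
    with warehouse:
        warehouse.executemany(
            "INSERT INTO daily_account_totals VALUES (?, ?, ?)", summary_rows
        )


def run_etl_demo():
    # Assumption: hypothetical sample data in an in-memory source database.
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE transactions (account_id TEXT, txn_date TEXT, amount REAL)"
    )
    with source:
        source.executemany(
            "INSERT INTO transactions VALUES (?, ?, ?)",
            [
                ("ACC-1", "2024-06-01", 100.0),
                ("ACC-1", "2024-06-01", 50.0),
                ("ACC-2", "2024-06-01", 75.0),
            ],
        )

    warehouse = sqlite3.connect(":memory:")
    load(warehouse, transform(extract(source)))

    result = warehouse.execute(
        "SELECT account_id, txn_date, total_amount "
        "FROM daily_account_totals ORDER BY account_id"
    ).fetchall()
    source.close()
    warehouse.close()
    return result


if __name__ == "__main__":
    print(run_etl_demo())
```

Because the transformation runs before the load, the warehouse only ever sees the aggregated shape. Asking a different question later (for example, individual transaction amounts) would mean going back to the source and changing the pipeline, which is the loss of flexibility mentioned above.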
    Data Lake
    ==========
    Used to get insights from huge amounts of data.
    The data is kept in its raw form. It can be structured or unstructured.
    Example: a log file can be stored directly in the data lake in its raw form.
    ELT process: Extract, Load, then Transform.
    Examples: HDFS, Amazon S3.
    Cost effective.
    Schema on read: the structure is created only when we read the data, to visualize or query it.
    This gives much more flexibility.
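A small sketch of "schema on read", assuming a plain Python list of strings stands in for a raw log file sitting in HDFS or Amazon S3. The log format, the sample lines, and the regular expression are hypothetical. The raw lines are stored untouched, and structure is applied only at the moment they are read.

```python
import re
from collections import Counter

# Assumption: these lines represent a raw log file stored as-is in the lake.
RAW_LOG_LINES = [
    "2024-06-01 10:00:01 INFO login user=alice",
    "2024-06-01 10:00:05 ERROR payment failed user=bob",
    "not a well-formed line",
    "2024-06-01 10:01:12 INFO logout user=alice",
]

# The "schema" lives in the reader, not in the storage layer.
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def read_with_schema(raw_lines):
    """Apply the structure at read time; lines that do not fit are skipped."""
    for line in raw_lines:
        match = LOG_PATTERN.match(line)
        if match:
            yield match.groupdict()


def count_by_level(raw_lines):
    """One possible question asked of the raw data: events per log level."""
    return Counter(record["level"] for record in read_with_schema(raw_lines))


if __name__ == "__main__":
    print(count_by_level(RAW_LOG_LINES))
```

The malformed line was still accepted into storage; it is only ignored by this particular reader. A different reader with a different pattern could interpret the same raw lines another way, which is where the flexibility comes from.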
    #bigdata #dataengineering

Comments • 50

  • @sumitmittal07
    @sumitmittal07  1 year ago

    Check out the Big Data course details here: trendytech.in/?referrer=youtube_bd12

  • @rishabh31415
    @rishabh31415 2 years ago +12

    Got this question in Amazon DE interview round 1

  • @abhijeetghosh8073
    @abhijeetghosh8073 1 year ago +5

    Tried looking for a definition in many videos, but the way you explained it really made it easy to understand. Thanks for sharing knowledge 🙏

  • @ashutoshsharma3396
    @ashutoshsharma3396 1 year ago +2

    Thanks for this video! In our project we loaded data from an Oracle DB into Teradata staging tables using Informatica one-to-one mappings. After loading the staging tables, we ran the transformations and loaded the dimension and fact tables, so it was an ELT process.

  • @ishaqkhan8653
    @ishaqkhan8653 1 month ago

    Thank you for the wonderful explanation. It was simple and crisp

  • @JuniorMarci27
    @JuniorMarci27 1 year ago +1

    Perfect explanation! Thanks!!!
    You changed my mind about this subject.

  • @andreanlobo7373
    @andreanlobo7373 1 year ago

    Extremely useful. Thanks for this video and all the valuable information

  • @ushavr
    @ushavr 10 months ago

    Very thorough and helpful! Easy to understand. Thanks

  • @gerardt.5172
    @gerardt.5172 1 year ago

    Great explanation and very easy to follow!👍 Thanks!

  • @udaypatilSiR
    @udaypatilSiR 1 year ago

    Sumit sir really hats off to you.

  • @nklee1980
    @nklee1980 1 year ago

    Your explanation is 100% clear and easy to understand.

  • @devasaniganesh8196
    @devasaniganesh8196 2 years ago +5

    You made this easy to understand, Thank you sir.

  • @rakish81
    @rakish81 1 year ago

    Superb & Informative! nicely articulated.

  • @MSG_22
    @MSG_22 1 year ago

    It was simple and clear, thank you

  • @funtime12345
    @funtime12345 1 year ago +3

    What an amazing explanation sir!!! Superb!! Love it!!! You are great sir!!!

  • @avinashshinde2659
    @avinashshinde2659 1 year ago

    Excellent, got deep knowledge!

  • @abhijeetpatil8958
    @abhijeetpatil8958 2 years ago +2

    Now I understand better

  • @arpangupta2162
    @arpangupta2162 2 years ago +4

    it was a much needed session

    • @sumitmittal07
      @sumitmittal07  2 years ago

      Happy that you found the session useful!

  • @shreyashwaghmare9136
    @shreyashwaghmare9136 8 months ago

    Bro your video helped me a lot. Thanks man

  • @ganish5431
    @ganish5431 2 years ago +1

    Good explanation..!

  • @nandansingh1482
    @nandansingh1482 1 year ago

    Very well explained thanks a lot

  • @ravirajchenna612
    @ravirajchenna612 1 year ago

    Loved the explanation

  • @gcpchannelforbegineers7080
    @gcpchannelforbegineers7080 1 year ago

    Thanks for the amazing content :)

  • @madhavareddy3488
    @madhavareddy3488 1 year ago

    Nice one!!
    Thanks!

  • @MJoe-fb9ps
    @MJoe-fb9ps 1 year ago

    Awesome

  • @user-ql8ct7bx5e
    @user-ql8ct7bx5e 1 year ago

    Nice explanation

  • @ameerullah2260
    @ameerullah2260 4 months ago

    Super awesome

  • @alessandroceccarelli6889
    @alessandroceccarelli6889 1 year ago

    Can MongoDB, with its analytical engine and Time Series collections, be considered a hybrid DB/Warehouse? Would it be inherently wrong to store, e.g., the historical and current sales transactions of a shop in a Time Series collection on Mongo?
    If not, which alternative would you consider for it?
    Thank you so much for your video 👏🏻👏🏻

  • @prajitkarande5214
    @prajitkarande5214 2 years ago +1

    Can we use ELT for a data warehouse?
    Can we store a log file as it is in a data warehouse? If yes, then what exactly is the difference between a data warehouse and a data lake?

    • @ANKITKUMAR-nv8ur
      @ANKITKUMAR-nv8ur 1 year ago

      No, in a data warehouse we cannot store a log file as it is, because it is unstructured data. In a data lake we can have both unstructured and structured data. That is exactly the difference.

  • @omkarm7865
    @omkarm7865 2 years ago +2

    Can we also use Oracle as a data warehouse? As far as I know, Teradata is also just a database.
    So can any database available in the market be used as a data warehouse? Am I correct?

    • @sonu-if4ym
      @sonu-if4ym 1 year ago

      Yup. We have used Oracle as a DWH.

    • @nvasudeva
      @nvasudeva 1 year ago +1

      Technically you could, but it would come at the cost of performance.
      Please note that traditional databases work on rows, whereas a typical DWH database is configured to operate in a columnar fashion.

  • @dhruvpandey1189
    @dhruvpandey1189 1 year ago

    Are the data warehouse and the data lake different sections of the same setup? For example, if we have a Snowflake DW, can the data lake also be within the same Snowflake setup, but in an isolated environment?

    • @vandanasharma4738
      @vandanasharma4738 1 year ago +1

      Snowflake is hosted on any of the three clouds we have. So, as per my understanding, Snowflake is a data warehouse but it can be used as a data lake as well. When used as a data warehouse, it stores the data in Snowflake itself, but when you use it as a data lake, it makes use of, for example, an Amazon S3 bucket for storage.

    • @vandanasharma4738
      @vandanasharma4738 1 year ago

      Please correct me if I am wrong, for better understanding.

  • @AbhishekTiwari-vf9ru
    @AbhishekTiwari-vf9ru 1 year ago

    I have a doubt: does a data warehouse support ACID properties? Also, is only historical data stored in a data warehouse?

    • @vandanasharma4738
      @vandanasharma4738 1 year ago

      The idea behind data warehouses is to keep historical data. You can relate it to a normal warehouse that people use to store goods: it is a way of storing a huge amount of data.

  • @gopinathg318
    @gopinathg318 2 years ago +1

    If we cannot store historical data in a database, then how do we copy that data to a data warehouse for analysis?

    • @vandanasharma4738
      @vandanasharma4738 1 year ago +2

      When you have data in a database, you move it to a data warehouse at regular intervals using an ETL tool. So the data flows from your database -> ETL tool -> data warehouse.
      The frequency at which you move data from the database to the data warehouse is generally decided by the clients or the business users.

  • @dishashetty866
    @dishashetty866 1 year ago

    Is Google Cloud Storage a database or a data lake?

  • @deepanshujain4119
    @deepanshujain4119 1 year ago

    Can a database become a data warehouse and vice versa?

    • @MaheshTiwari-hq7lr
      @MaheshTiwari-hq7lr 5 months ago

      Both have certain limitations, as discussed in the video.

  • @ashutoshyadav8348
    @ashutoshyadav8348 2 years ago

    Why is the cost of storing data in a database high?

    • @ramm3020
      @ramm3020 1 year ago

      Maybe because of on-premises storage. Now, with the cloud, the storage cost has come down. Please comment on whether this is correct. Thanks.

  • @azingo2313
    @azingo2313 1 year ago

    Wonderful explanation