Deduplication for Dummies - What is deduplication?

  • Published: Oct 26, 2024

Comments • 55

  • @faisalfares5112
    @faisalfares5112 3 years ago +10

    I have spent a few hours watching many people just to understand how deduplication works... and this simple video is the best video I have ever seen.
    You have explained the whole thing in just a few minutes so any layman like me can understand.
    Thank you very much.

  • @luislednick1413
    @luislednick1413 4 years ago +2

    Very simple and straight explanation. Good job.

  • @mindsting
    @mindsting 9 years ago +3

    I know this video is old, but it is very well done! Thanks! I hope it is okay that I share this to help train storage/DC sales people who need help on this topic so they truly can 'get it.' Will always make sure you get credit for sure! Great work.

  • @mrkevinlwright
    @mrkevinlwright 12 years ago

    Awesome summation! I'm in the process of selecting a SAN solution, and this simple video added a great piece of knowledge to my overall understanding!
    Thanks!

  • @pakodasingh
    @pakodasingh 3 years ago +1

    In the MS world, each file keeps a pointer to its blocks, called a reparse point, stored in the original file. The place on the server where all the unique blocks are stored is called the chunk store... awesome video.

  • @newbie8051
    @newbie8051 2 months ago

    This guy did his undergrad in journalism woah

  • @zeeshan-tp5hp
    @zeeshan-tp5hp 4 years ago

    Wow. You have explained this so well. 👍

  • @nemonemo6285
    @nemonemo6285 2 years ago

    Perfect. Thank you.

  • @borkisoufiane
    @borkisoufiane 1 year ago

    Perfect, so simple to understand. Thank you.

  • @johnmclaughlin4218
    @johnmclaughlin4218 9 years ago

    Clear and concise explanation - thanks

  • @MelroyvandenBerg
    @MelroyvandenBerg 7 months ago

    Still relevant to this day. Remember that kids ;)

    • @MelroyvandenBerg
      @MelroyvandenBerg 7 months ago

      P.S. You also have DB deduplication, e.g. via a memory cache, so it also applies in other parts of the software or in a network, not only on disk.

  • @happysnapperman
    @happysnapperman 13 years ago

    Explained in plain English. Well done. Thanks.

  • @reilagji4752
    @reilagji4752 3 years ago

    Why does every video from the early '10s look like it was from the '80s? Boy, has technology changed us.

  • @fordgt8847
    @fordgt8847 5 years ago

    Awesome explanation!

  • @encikbett
    @encikbett 10 years ago

    Thank you. Now I have more understanding.

  • @Therockingww
    @Therockingww 2 years ago

    thank you !!

  • @kannan991
    @kannan991 9 years ago

    Great explanation. Thanks

  • @abcd123181
    @abcd123181 10 years ago +3

    We understand file level, but in block-level deduplication, if anyone changes their data, how will it get stored in the data center? And what if no two people have the same data?

  • @nph24
    @nph24 11 years ago

    This helped me so much! Great job!!!!

  • @newajay100
    @newajay100 5 years ago

    Thanks, Nicely Explained

  • @MrMariog2681
    @MrMariog2681 12 years ago

    Thank you. Well done.

  • @anisaa8752
    @anisaa8752 2 years ago

    good explanation

  • @gatewayer1
    @gatewayer1 11 years ago +1

    Hi! The only thing you don't tell: how does deduplication on block data know where each block goes? I mean, 12345 is just a row of numbers for 1) different users and 2) different blocks; so it's not Me=12345M; Ted=12345 ... so, how does deduplication know where each block goes? And how much space does this information require compared to the original block information?
    Hope you can explain that, maybe either in a video or with a comment. THANKS!

  • @sambitbehera8835
    @sambitbehera8835 4 years ago

    That helped a lot

  • @troller4jesus
    @troller4jesus 8 years ago +2

    Doesn't a block change if a file within the block changes?

  • @avimzrh
    @avimzrh 10 years ago

    great explanation.

  • @AryaPrasetya
    @AryaPrasetya 8 years ago

    well explained, thank you!

  • @ericmiller7213
    @ericmiller7213 11 years ago

    That was Bad Ass.... great explanation and summation

  • @mihas101
    @mihas101 11 years ago

    Well done. That is understandable even if you're non-technical like me. Thanks

  • @karthikpillai2378
    @karthikpillai2378 11 years ago

    Thanks, Well Explained.

  • @nduwana
    @nduwana 12 years ago

    Nice work. Any recommendation on which backup software is better than the others for block-level dedupe?

  • @TPHBLIB
    @TPHBLIB 11 years ago

    Thank you very much!

  • @AmazinglyAwkward
    @AmazinglyAwkward 6 years ago

    This is a great explanation. I just have 1 question. Why?

  • @anamfarooqui3795
    @anamfarooqui3795 4 months ago

    Thanks

  • @PankajVerma-fw5ww
    @PankajVerma-fw5ww 11 years ago

    Thanks for sharing this info...

  • @chrisengelbrecht9996
    @chrisengelbrecht9996 9 years ago

    Great! Thanks

  • @poloboy
    @poloboy 4 years ago

    thanks brah

  • @ewasteonline
    @ewasteonline 12 years ago

    Nice job

  • @metalaarif
    @metalaarif 8 years ago

    Great explanation, thanks, but I have a question regarding block-level deduplication. If two guys have the same song and a third guy has a different song, how would block-level deduplication work there? Let's say two of them have Coldplay - Yellow and the other guy has Iron Maiden - Fear of the Dark. How would block level work there?

    • @AmazinglyAwkward
      @AmazinglyAwkward 6 years ago

      Maybe the file is recognized as a music file, so one block shared across all of the files would say that it IS a music file, but then there would be separate blocks for the artist and songs?

    • @robbstark8692
      @robbstark8692 6 years ago

      I think a music file was a bad example, only because it makes it hard to visualize the bits being used to dedupe and most music files are already compressed. There are some other videos that show how it works, but basically the software recognizes patterns of bits in the data being backed up. Using this video's example, let's say block 1 is 0110 in binary and block 2 is 0101. Maybe it's just metadata or file headers that tell the computer it's an MP3 file (I'm not sure, just trying to use an example). This wouldn't change for ANY MP3 file being backed up, so it would be redundant to store a copy of those blocks for every MP3 file being backed up. Block 3 could be 1010 in binary, block 4 1100, and block 5 1001. These could contain the specific audio codec being used, different bit rates, or other components of an MP3 file that vary from file to file. Let's say block 3 says the bitrate is 128 Kbps, block 4 says the bitrate is 160 Kbps, and block 5 says the bitrate is 256 Kbps. The rest of the file is contained in 100s of other blocks, so those blocks will be largely unique and couldn't be deduped very well (compressed file formats like MP3 are terrible at deduping, and many times the file actually becomes larger). These binary patterns are stored in a dedupe engine used by the software, and every time a specific pattern is recognized the software points to the location in the file and determines what binary pattern can be inserted into that location in the block.
      All 3 files are MP3s, so we don't need to keep saving that part of the file, but we do need to know the other pieces of information to ensure the file is usable when it's restored. Over time, these redundancies can add up to huge amounts of data. We don't need to save blocks 1 and 2 for every file; we simply need to know what block 1 and block 2 look like (0110 and 0101) and what they represent. Then, when the deduplication engine sees these patterns, it knows it can skip backing them up and use a pointer to indicate where the pattern exists in the specific file. I'm far from an expert, that's just my understanding of how the deduplication process works.
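
A minimal Python sketch of the idea in this reply, under stated assumptions: fixed 4-byte blocks (real systems use far larger blocks), SHA-256 fingerprints as the duplicate test, and made-up byte strings standing in for the songs. `DedupStore`, `put`, `get`, and the file names are hypothetical and not taken from the video or any product.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny blocks so the example stays readable


class DedupStore:
    """Toy block-level deduplicating store (illustrative only)."""

    def __init__(self):
        self.chunks = {}   # fingerprint -> block bytes, each unique block stored once
        self.recipes = {}  # file name -> ordered list of fingerprints (the "pointers")

    def put(self, name, data):
        recipe = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Store the block only if this fingerprint has not been seen before.
            self.chunks.setdefault(fingerprint, block)
            recipe.append(fingerprint)
        self.recipes[name] = recipe

    def get(self, name):
        # Rebuild the file by following its pointers into the chunk table.
        return b"".join(self.chunks[fp] for fp in self.recipes[name])


# Made-up stand-ins: two people hold the same song, a third holds a different one.
yellow = b"HDR1YELLOW__DATA"
fear = b"HDR1FEAR_OF_DARK"

store = DedupStore()
store.put("yellow_copy1.mp3", yellow)
store.put("yellow_copy2.mp3", yellow)
store.put("fear.mp3", fear)

logical_blocks = sum(len(recipe) for recipe in store.recipes.values())
print(logical_blocks, "logical blocks,", len(store.chunks), "stored blocks")
```

In this toy run the second copy of the first song adds no new blocks, and the different song shares only the common `HDR1` block, so 12 logical blocks are kept as 7 stored blocks. Each file still restores byte for byte through `get`.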

    • @pakodasingh
      @pakodasingh 3 years ago +1

      Deduplication only works on duplicate files not on unique files.

  • @ovimt
    @ovimt 11 years ago

    You keep a table, transparent to the user (who can ignore it), that maps the logical view (what you think there is on the backup storage) to the physical one (what you actually have on the storage). Let's take the last case from the video: you think you have 1234, 1245, 1235, but in fact you have 12345. The mapping table might contain: Block 1 represents the 1st, 5th and 9th logical blocks. Block 2 represents the 2nd, 6th and 10th logical blocks... Block 5 represents the 8th and 12th logical blocks.
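
The mapping table described in this comment can be sketched in a few lines of Python. This is only an illustration under assumptions: each digit stands for one block, the three backups are laid out back to back, and `build_mapping` is a hypothetical helper name.

```python
from collections import defaultdict


def build_mapping(logical_blocks):
    """Return (unique blocks in first-seen order, block -> 1-based logical positions)."""
    positions = defaultdict(list)
    for index, block in enumerate(logical_blocks, start=1):
        positions[block].append(index)
    physical = list(positions)  # dicts keep insertion order, so this is first-seen order
    return physical, dict(positions)


# The three backups from the comment, one digit per block.
files = ["1234", "1245", "1235"]
logical = [block for f in files for block in f]

physical, mapping = build_mapping(logical)
print("physical blocks:", "".join(physical))
for block in physical:
    print("block", block, "-> logical positions", mapping[block])
```

The table costs one small entry per logical block, while each distinct block is stored only once, which is where the space saving comes from when many logical blocks share the same content.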

  • @mubbashirjavaid7486
    @mubbashirjavaid7486 4 years ago

    Great

  • @khaledsoliman7936
    @khaledsoliman7936 8 years ago

    What is deduplication?

  • @rajatsharma01
    @rajatsharma01 12 years ago

    Grossly underestimated file deduplication: forgot Rabin fingerprints, chunking files?

  • @karatbarsinfo345
    @karatbarsinfo345 10 years ago

    nice

  • @GaryMiller-r5n
    @GaryMiller-r5n 1 month ago

    Mohr Junction

  • @salander8729
    @salander8729 1 year ago

    Doesn't have indian accent; watchable.

  • @indawgwetrust4255
    @indawgwetrust4255 8 years ago

    i suggest a shave and a haircut.

  • @ringhp
    @ringhp 11 years ago

    Well done. Thank you!
