Getting data off my failing RAID array

  • Published: 27 Jun 2024
  • My main server had a failing ZFS RAID array, and in this vlog-style video I try to fix the array and get all the data off it. I go over my process and all the little issues I run into along the way.
  • Science

Comments • 35

  • @idle_user · 28 days ago · +16

    The knowledge you have of the steps you take is astounding.
    I personally have to look things up every step of the way with every new error.

    • @ricsip · 26 days ago

      The problem here is that if it was meant to be an educational video rather than an entertaining one, all the steps could have been explained with a small pause and maybe a drawing/diagram. Otherwise it remains entertaining to other ZFS experts, but serves no educational content to those who aren't ZFS experts.

    • @ElectronicsWizardry · 25 days ago · +4

      I think I was aiming for entertaining, and as you pointed out, people with ZFS knowledge are a pretty slim crowd. Teaching how to fix these types of issues can be difficult, as there can be multiple things wrong, and a general guide is hard to make. I thought this video would be an interesting case study into one specific issue I had, for some.

    • @Mr.Leeroy · 24 days ago

      @@ricsip How about pausing the video at each unfamiliar moment and doing your own homework?
      If you are unable or unwilling to do at least that, no guide or tutorial will ever download a critical-thinking diskette into your brain like they did in The Matrix.
      This video has everything you need to put together coherent info on the case study.

  • @marc3793 · 27 days ago · +4

    Man, you have a lot of drives (and other hardware) hanging around! 😄

    • @ElectronicsWizardry · 25 days ago · +1

      Yeah, I have really accumulated a pile of drives over the years. I probably should get rid of some of them, but I am a bit of a hoarder.

  • @Skukkix23 · 28 days ago · +4

    ElectronicsWizardry: I lost 2 drives, it's a raidz2, it's fine!
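
    For anyone unfamiliar with the joke: a raidz2 vdev carries two drives' worth of parity, so the pool stays readable with up to two failed disks. How close a pool is to that limit shows up in zpool status; a minimal check, assuming a pool named tank (a placeholder):

        zpool status tank
        # state: ONLINE    -> all members healthy
        # state: DEGRADED  -> failures within the parity margin, data still accessible
        # state: UNAVAIL   -> too many failures, insufficient replicas to open the pool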

  • @johnmerryman1825 · 27 days ago · +2

    Great video, love the vlog style! You inspired me to replace the degraded boot-pool on my homelab TrueNAS server.
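
    Replacing a degraded pool member generally comes down to zpool replace plus a resilver; a minimal sketch, assuming a pool named boot-pool and a new disk at /dev/sdb (both placeholders; TrueNAS can also drive the same steps from its web UI):

        zpool status boot-pool                        # identify the failed member
        sudo zpool replace boot-pool <old-disk> /dev/sdb
        zpool status -v boot-pool                     # watch the resilver progress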

  • @gg-gn3re · 28 days ago · +2

    "0:28 I been a bit lazy with my personal stuff" you and me both brother. Years ago I ended up just syncthing several of my family members stuff all together giving us each access to only their respective stuff and calling it a day.

  • @MStrong95 · 28 days ago · +2

    Glad it mostly worked out and that you had a backup copy as well for some extra data protection and redundancy. I'm currently trying to work with customer service on a Seagate Barracuda 8TB hard drive that failed inside the warranty period. Its warranty expiration date is sometime in 2025.

  • @lifefromscratch2818 · 28 days ago · +1

    Definitely cool seeing real world stuff like this.

  • @magmaxgus · 28 days ago · +2

    Great stuff. Keep it up!

  • @silversword411 · 28 days ago · +1

    More of these videos, good to see some tips.
    sudo !!
    That was a new one to me.
    You also mentioned some tools at 17:48; what are those? Don't forget to mention the names of stuff! :) More pls

    • @ElectronicsWizardry · 28 days ago · +9

      The tool I used there was btop. I should do a video on the different usage-monitoring tools on Linux.
      I have to balance video length against the amount of content, but I'm glad to hear people like the extra info.
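
    For reference on the two items in this thread: in bash, !! expands to the previous command line, so sudo !! re-runs the last command with root privileges; btop is a terminal resource monitor available in most distro repositories. A quick illustration, assuming a Debian-style system:

        $ systemctl restart smartd
        # fails with a permission error when run as a regular user
        $ sudo !!                    # bash expands !! to the previous command
        sudo systemctl restart smartd

        $ sudo apt install btop      # then run 'btop' for CPU/RAM/disk/network usage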

  • @moebius2k103 · 26 days ago · +2

    With so many spare drives, just swap the problematic ones out straight away next time. Copy all the data off first, though; the read I/O of making the copy is preferable to rebuild I/O, and if the rebuild goes bad you've got the copy. Then you can read the writing on the wall, throw some cash at a whole new pool of drives, and retire the old ones for good.
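
    For the "copy first" step, a plain file copy works, and ZFS can also replicate everything with snapshots; a minimal sketch, assuming pools named tank and backup (both placeholders):

        # snapshot all datasets, then stream them to the spare pool
        sudo zfs snapshot -r tank@pre-rebuild
        sudo zfs send -R tank@pre-rebuild | sudo zfs receive -F backup/tank

        # or a file-level copy
        rsync -aHAX --info=progress2 /tank/ /backup/tank/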

  • @Tumleren · 28 days ago · +4

    I may have missed it, but do you have a video on monitoring and alerts? Like knowing when a disk fails or there's a problem with the pool?
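
    Not answering for the channel, but the usual building blocks on Linux are ZED (the ZFS Event Daemon) for pool events and smartd for drive-level SMART alerts; a minimal sketch of the config, assuming mail delivery already works and you@example.com is a placeholder:

        # /etc/zfs/zed.d/zed.rc -- email on pool events (degraded vdev, resilver done, etc.)
        ZED_EMAIL_ADDR="you@example.com"

        # /etc/smartd.conf -- monitor all drives, mail on failure,
        # short self-test daily at 02:00 and long test Saturdays at 03:00
        DEVICESCAN -a -m you@example.com -s (S/../.././02|L/../../6/03)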

  • @truckerallikatuk · 28 days ago · +3

    The 3TB drives with known issues were the big red flag there. Personally, I'd have been slowly swapping them out for 4TB+ drives over time, just to avoid this exact situation.

    • @ElectronicsWizardry · 28 days ago · +5

      The strange part to me is that both of my working ST3000DM001 drives (I had 5 originally back in the day) didn't seem to have any issues during this rebuild, and they pass badblocks tests now. I'll keep a close eye on them, but I'm curious how much longer they will last. I think the drives in this array were averaging 50k hours, and 'trying to use up the old drives' isn't a good idea for the main fileserver.

    • @Van-l2r · 28 days ago · +1

      A few years ago, I tossed any mechanical drives under 10TB. I wish I still had those now!

    • @truckerallikatuk · 28 days ago · +1

      @@ElectronicsWizardry The typical bathtub curve of failures... they die most at the beginning and end of life.
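
    On the badblocks tests mentioned above: the write-mode test is a destructive full-surface pass, so it is only for drives holding no data you want to keep; a typical invocation, with the device name as a placeholder:

        # -w: write-mode test (destroys data), -s: show progress, -v: verbose
        sudo badblocks -wsv /dev/sdX
        # very large drives may need a larger block size, e.g. -b 4096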

  • @peteradshead2383 · 27 days ago · +1

    My Synology DS918+ said to me the other day "drive 2 has failed", with I/O errors, and said it had 906 bad sectors: replace the drive.
    So I switched the NAS off and ordered a new drive. The new drive came the next day, so I switched the NAS back on and all drives showed healthy. I told drive 2 to do a SMART short test and it passed; the extended test passed, the IronWolf test passed, and a scrub too.
    So I've gone from "the sky is falling in" to it passing all tests just from a switch-off. I didn't pull the drive or anything.
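
    The same self-tests can be run by hand with smartctl on any Linux machine, for anyone wanting to reproduce this outside a NAS UI (device name is a placeholder):

        sudo smartctl -t short /dev/sdX    # quick self-test, a few minutes
        sudo smartctl -t long /dev/sdX     # full-surface test, several hours
        sudo smartctl -a /dev/sdX          # view results and attributes such as
                                           #   Reallocated_Sector_Ct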

  • @jeromehage · 19 days ago

    Thank you for this nice video.

  • @byrd203 · 24 days ago

    Drives going crazy happens with SMR RAID too; they will make other drives fail.

  • @byrd203 · 26 days ago

    You probably had SMR drives; they fail on rebuild. Yes, you want to check the spec sheet and use CMR drives in RAID, like IronWolf Pros.

    • @ElectronicsWizardry · 25 days ago

      I'm pretty sure my drives were SMR, looking at the model numbers I had. I have also done a few rebuilds with SMR drives in the past, and they're normally just extremely slow; they don't fail the rebuild.

    • @byrd203 · 24 days ago · +1

      @@ElectronicsWizardry Failing on rebuild is a common issue with SMR drives. WD and Seagate got sued over this; look it up, it was a whole mess. There are even videos on YouTube about it. Do not use SMR, ever.
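
    Drive-managed SMR disks generally do not advertise their recording technology, so the practical check is the one suggested above: read the exact model number and compare it against the vendor's published CMR/SMR lists (device name is a placeholder):

        sudo smartctl -i /dev/sdX | grep -i model
        # then look the model up on the manufacturer's CMR/SMR list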

  • @darthkielbasa · 28 days ago

    The moniker is true.

  • @Van-l2r · 28 days ago

    My computer decided to dump my Windows Storage Space on me. I was able to copy about half of it. I went back to cold storage.

  • @shetho1 · 28 days ago · +1

    I couldn't be bothered with that command-line crap. I would rather have a GUI; so much easier. The command line just seems like too much hassle.

    • @ElectronicsWizardry · 25 days ago

      Yeah, the command line is a lot of learning, and I get why many people want an easy-to-use GUI. Unfortunately, a lot of tools are command-line only.