A Chat about Linus' DATA Recovery w/ Allan Jude

  • Published: 21 Nov 2024

Comments • 219

  • @ThePlebicide 2 years ago +88

    It's nice to hear a technical discussion like this where you aren't having to stumble over client confidentiality every 5 minutes. It was really cool.

  • @smoshGaming 2 years ago +290

    If Linus can drop GPUs, then why does it matter if he drops tables too?

  • @daviedaviedave 2 years ago +90

    This is a match made in heaven, Allan and Wendell on the same video.

  • @mbarrio 2 years ago +14

    46:21 Waiting for that talk link! What truly amazing guys.

    • @malexejev 1 year ago

      Found this: ruclips.net/video/v8sl8gj9UnA/видео.html

  • @Marc.Google 2 years ago +13

    I want to say thank you Wendell for being the expert that you are and sharing your adventures on RUclips for all of us to learn from and enjoy.

  • @diesieben07 2 years ago +41

    Having just gotten into ZFS with TrueNAS, if this video teaches me anything it's that I have made the right choice. If it can get "90% recovery out of the box" with that terribly degraded pool... I think I am safe with my 5-drive NAS :D
    Thank you for this great video.

    • @musicforlifemc1006 2 years ago +4

      Especially considering your pool should never get this degraded. The main cause of degradation in Linus' case was using an old version of ZFS with no regular scrub set up. TrueNAS sets this up by default, apparently (see the command sketch at the end of this thread).

    • @abzzeus 2 years ago

      They had RAIDZ2, so two drives' worth of redundancy rather than the single drive of redundancy you have, so be aware of that.

    • @abzzeus 2 years ago +1

      @Сусанна Серге́евна I'm not talking about Linus, rather @diesieben07's case.
      RAIDZ2/6 vs RAID10 has a key difference when it comes to drive loss and rebuild. A rebuild puts massive strain on the drives, so it is possible to lose another drive during the rebuild. If in RAID10 that loss is in the good mirror then you have lost your data, whereas in RAIDZ2/6 you still have your data and hopefully the rebuild will complete and you can then replace the other failed drive.
      As for Linus, he hasn't realised that there is a big difference between where he started, as a single person/small company, and the medium/large business that he is now. This may have taught him that the things you do as a small outfit are different from those of a large one. And being a tech channel does NOT exempt you from having standards that are driven by proper system management. He has realised this with PC builds for his growing staff: having a collection of random PCs is a nightmare to support, and things can go wrong and stop production.
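      A minimal sketch of the commands behind this thread, with hypothetical pool and disk names; a RAIDZ2 vdev survives any two drive failures, a striped-mirror layout survives one failure per mirror, and regular scrubs are what catch latent errors before a rebuild is ever needed:

        # Six-disk RAIDZ2 pool (two drives' worth of parity):
        zpool create tank raidz2 sda sdb sdc sdd sde sdf

        # The same disks as striped two-way mirrors instead:
        zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf

        # Run a scrub and watch its progress:
        zpool scrub tank
        zpool status tank

        # Schedule it, e.g. monthly via cron (TrueNAS sets up an
        # equivalent periodic task by default):
        # 0 3 1 * * /usr/sbin/zpool scrub tank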

  • @amateurwizard 2 years ago +12

    5 years since you two did this last, time flies.

  • @evildude109 2 years ago +43

    As an old TechSNAP fan, I'm loving this. So now we see the end of the line: when YouTubers need tech support they call Linus, who calls Wendell, who presses the red button to release Allan.

    • @Yandarval 2 years ago +6

      First mistake was not having a good, tested backup strategy. Second mistake is calling Linus at all.

    • @kaptenkrok8123 2 years ago +5

      If you call Linus for tech support you are gonna have a really bad time lol...

    • @fredEVOIX 2 years ago

      @@kaptenkrok8123 Can't disagree. A lot of people forget that Linus and other tech channels are media creators, sometimes actors who have no idea what they are talking about. In the case of Steve from GN, what, 1-2 years ago he "didn't know how to build a PC inside a case", which means he's a complete fake.

    • @kaptenkrok8123 2 years ago +1

      @@fredEVOIX What?! Where can I see that?

    • @allanjude 2 years ago +3

      If you liked TechSNAP, have you seen 2.5admins?

  • @JBothell_KF0IVQ 2 years ago +5

    This was an incredible talk with people who obviously know A LOT. Would love more content like this

  • @Norman_Fleming 2 years ago +3

    This was so enjoyable to listen to. The complexity of storage, and everything else, has multiplied so much over the decades. Nice there are people that still love working to improve it all from top to bottom.

  • @daltongrowley5280 2 years ago +16

    I don't know what any of this means, but there is something soothing about listening to two professionals discuss something they know quite well.

  • @primistandem6781 2 years ago +7

    It was very nice to see Allan with Wendell. I got TechSNAP nostalgia. That show was so good and informative (precisely because of Allan).

    • @evildude109 2 years ago +3

      Look at 2.5 Admins, Allan's more recent podcast project. I just binged those over the last month, they are wonderful.

  • @Brumljw 2 years ago +3

    You’re an inspiration Wendell. You gave me the strength to come out of my shell!
    Thanks for everything. Love you brother!

  • @mrlithium69 2 years ago +4

    I'm glad you got the main head honcho on to speak and talk about how he improved the recovery tooling. I know what it's like to do a recovery in ZFS, and it IS kind of fun just to experience the internals with zdb, and it's so flexible, but also hard to do everything manually. I've been using it since 2018; ZFS is the way to go.

  • @sumikomei 2 years ago +8

    Oh man, when I heard "years of no maintenance/scrubbing/etc" I was just kind of hands on head in shock LOL

  • @AdenMocca 2 years ago +3

    Really great video, thanks for the in-depth coverage. Glad the broken system could then be used as a research device to make ZFS even better!

  • @NuggDavis 2 years ago +9

    Wow, it's been a while since I last saw Allan at LFNW like... 8 years ago. Nice to see where he is now.

    • @MMSmithj 2 years ago +2

      Check out "2.5 Admins" or "BSD Now", podcasts featuring Allan Jude.

  • @wizpig64 2 years ago +5

    the repair-send and no-copy dataset moves are AMAZING

  • @kjlovescoffee 2 years ago +3

    I would love more in-the-weeds content on this, sitting, as I am, with a failed pool. I've just left it turned off until I have enough replacement drives, and I'm reading on the topic in the meantime. I've gleaned some useful things from this discussion, but it would be good to see something more in-depth.

  • @dasGieltjE 2 years ago +2

    Diffing the different data blocks to detect which bits are potentially at fault, and brute-forcing them automatically if requested, would be an awesome addition.

  • @kingneutron1 2 years ago +7

    27:45 This is what I've been saying for years, stagger your mirrored SSDs (and use different brands) so they don't wear out at the same time. You can also rotate them every X months if you have the budget

    • @nv1t 11 months ago

      This has been done with HDDs before that. We used different brands and different manufacturing dates for the HDDs in our NAS systems 15-20 years ago... because of different wear and tear.

  • @roadkill11000 2 years ago +21

    "Don't be like Linus" Truer words....

  • @attainconsult 2 years ago +2

    Great to see Allan talking about ZFS again.

  • @ThePopolou 2 years ago +8

    Linus Tech Tips, oh the irony there. I found it amazing watching their clip how they didn't even follow a best-practice guide on ZFS. They used FreeNAS/TrueNAS even, a healthy-sized community in itself. The irony is just remarkable.
    Anyway, great piece and great to hear about the new advances in the pipeline. Love hearing about this stuff from Allan. I have an all-NVMe project over 100GbE, so I am eagerly waiting for ZFS to be tuned to take advantage of such a platform. The "repair send" and the new Block Reference Tree features are great ideas that I hope can be implemented upstream ASAP.

    • @speedyjago 2 years ago +4

      When system administration is an afterthought for someone doing other - openly profitable - work this is what happens. I used to do all the system administration at my old job but was often rented out to customers. Then no one would do it and months later when I got back from the engagement I'd have to clean up a ton of problems.

    • @ThePopolou 2 years ago +2

      @Сусанна Серге́евна I am not so sure, because what we learnt is that they didn't even set up email alerts - this is all more an indication of a certain degree of arrogance... because surely we cannot believe that they are woefully inexperienced, "do as I say, not as I do" individuals running a technology channel that delves into enterprise when they clearly haven't got the credentials to properly do so?

    • @ThePopolou 2 years ago +2

      @Сусанна Серге́евна Couldn't agree more. They're simply entertainers. It is quite bizarre why a small proportion of the enterprise sector is seemingly aligning with them. I never thought I'd consider Supermicro or Toshiba/Kioxia in an "enthusiast" domain (not to lessen it, of course!) but if that's what they see of themselves then perhaps we should realign our impression of them too.

  • @DavidOlofsson 2 years ago +1

    I always need more long technical videos by Wendell!

  • @im.thatoneguy 2 years ago

    @23:14 Amazing advice! If you have RAIDZ and a drive fails, don't pull the bad drive until the rebuild is complete. What a good point that a drive might be occasionally erroring and should definitely be replaced prior to total failure, but it's probably still good enough to support the resilvering operation.
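    A minimal sketch of that workflow, with hypothetical pool and device names; zpool replace resilvers onto the new disk while the flaky one stays attached, so it can still serve good sectors during the rebuild:

      # Replace the failing disk without pulling it first:
      zpool replace tank sdc sdg

      # Watch the resilver; only physically pull sdc once it has completed:
      zpool status tank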

  • @YeOldeTraveller 2 years ago +2

    The most I ever made on a single task was a drive recovery for a client that was not allowed to let a service contract. Said service contract would have ensured the backups were good. Client managed to corrupt the backup resulting in a recovery when the drive failed rather than a replace/restore.

  • @abavariannormiepleb9470 2 years ago +4

    This conversation inspired a general recovery question regarding traditional “classic” Audio CDs. These don’t have error correction. Using software like AccurateRip you can check if your rip matches checksums from a database of other users that ripped the same Audio CD. Sometimes scratches on the CD prevent a track from being accurately ripped and the damaged sectors of a track get logged. Is it possible to use modern very fast hardware to fill these “holes” in tracks with all possible data combinations until the entire track matches the verified checksum from that AccurateRip database?

  • @Delease 2 years ago +1

    Very interesting conversation. Peeking into the ZFS internals was enjoyable.

  • @TheToasterPilot 2 years ago

    Hey, it's Allan! Love the 2.5 Admins and BSD Now podcasts!

  • @ipaqmaster 2 years ago +8

    I see backplane issues like the ones you're describing at 4:00 _quite often_, even in my personal ZFS setups, and it's quite frustrating watching the CKSUM (for example) counters of _all_ drives in a pool rise together with the exact same counts, knowing 100% that it's a backplane issue, while ZFS is happy to eject them instead of being a bit more careful with that kind of problem. Ideally backplane failures shouldn't be happening anyway, and it's going to look the same to ZFS's checking, but damn it can be annoying to deal with (see the sketch after this thread).

    • @_--_--_ 2 years ago +2

      What backplane are you using? Direct attach or expander? Manufacturer?
      Just asking because I'm currently shopping for an enclosure with a backplane, would be nice to know what to avoid.
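      A small sketch of how that pattern shows up, assuming a pool named tank; identical CKSUM counts climbing on every drive at once point at the shared path (backplane, HBA, cabling) rather than at the disks themselves:

        # Per-device READ/WRITE/CKSUM error counters:
        zpool status -v tank

        # After fixing the backplane/cabling, reset the counters and re-verify:
        zpool clear tank
        zpool scrub tank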

  • @MaxPrehl 2 years ago +2

    Wow, this was actually an awesome talk. So much good advice and insights into ZFS!

  • @spiralfractr 2 years ago

    I love that you guys are professional friends, or partnered with LTT in the way that you are... I love being able to see the deep dives from your interactions, their over-the-top reactions to the original problem, and the advancement of the tech itself...
    It's inspiring.

  • @willfancher9775 2 years ago +7

    You mentioned a talk Allan was going to give near the end of the video, and that it may have already happened by the time the video went up. Has it happened yet?

    • @malexejev 1 year ago

      Found this: ruclips.net/video/v8sl8gj9UnA/видео.html

  • @carl8790 2 years ago +1

    Hmmm, didn't know ZFS wasn't optimized for NVMe drives, thought it would be by now. Tbh, if I could get away with using tape drives, I would lol. Thanks for having Allan on, I've learned quite a bit.

    • @genrabbit9995 2 years ago

      I can't think of any filesystem optimized for NVMe drives.

  • @metaleggman18 2 years ago

    This is part of the reason I like using mirrors in home NAS situations. All the data is on all the drives, and as long as your backups are in order, you're good. And if you occasionally upgrade the mirrors with larger drives, you can add space fairly well, assuming the old drives in the vdev are still healthy. You can even keep a separate pool with triple mirrors for things that are smaller but really important, like family photos or financial documents. I get it for big chassis in large-scale production units: things like RAIDZ allow you to get more space for your money, and the computational complexity during rebuild shouldn't matter that much. But I prefer the simplicity of mirrors, along with the incredibly fast resilvers. Recently I had some checksum errors on a drive in my NAS, and it turns out it was likely just an issue with a cable getting disconnected because of some construction. But, because I was using mirrors, I just added a new drive to the mirror vdev (an upgraded 6TB EXOS to replace the two 4TB Barracudas in the vdev), had it resilver, took out the "bad" drive, then added another drive to replace the "good" drive. Once both 6TB drives were resilvered, I was able to take out the "good" 4TB drive, and poof, now I have two extra terabytes of space in my pool. Then, assuming both old drives are fine, I can either use them as a new vdev, or if I have other drives of the same size in the pool, I can attach them to the vdev as hot spares.
    It's part of why I plan to build a JBOD enclosure to attach to my NAS. I'm getting pretty low on drive space using standard ATX cases lol (though I'm able to do like 10 3.5" drives? Which is really great for an older, budget case).
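    A rough sketch of that mirror-upgrade dance, with hypothetical device names; attach adds a member to the mirror and resilvers, detach removes one, and the vdev only grows once every remaining member is the larger size:

      # Add the new 6TB disk as an extra member of the mirror and resilver:
      zpool attach tank sdb sdd

      # Once resilvered, drop the suspect 4TB disk:
      zpool detach tank sda

      # Repeat for the second 6TB disk, then let the vdev expand:
      zpool set autoexpand=on tank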

  • @imperssonator 2 years ago

    I remember Allan Jude from the TechSNAP days. Very smart guy.

  • @victoralander1398 2 years ago +3

    The uberblock creating errors in physically nearby sectors sounds like rowhammer.
    Edit: Could you write backup/snapshot copies of the SSD metadata to the mechanical disks?

  • @TomTalley 2 years ago +6

    Wendell. Best audio ever. Please give us a hint how you did it, and PLEASE consider paying the same level of attention to the audio on Level One. Thanks.

  • @johnmijo 1 year ago

    Wendell, I had to come back and watch this video again :)

  • @andre2k 2 years ago +1

    Wow, really great video! Especially interesting was the part about file cloning on copy and restore coming in the future. 😁

  • @Jango1989 2 years ago +1

    Great video!
    Love how this led to developing great tooling, and it was really interesting to hear this talk!
    Sounds like Linus could do with outsourcing his sysadmin to a reputable company, like one in Kentucky. For a tech channel, they are hilariously negligent with the administration of their own systems. 😅 It seems like every few months there is an IT disaster that was easily avoidable with some regular systems administration. [Edit] As Wendell said himself @50:08.

  • @PedroAlves0 2 years ago +1

    I'm surprised ddrescue wasn't mentioned anywhere. Was it never a part of the recovery process?

  • @kevh6303 2 years ago

    Dubstep Allan has returned!

  • @eDoc2020 2 years ago +1

    This is the first time I've heard that ZFS stores additional copies of metadata. I was under the impression that it did not, and I have been using btrfs on my small system largely because btrfs does store extra copies. So, the next question: can I have a multi-drive ZFS setup where the data has no redundancy but the metadata is still protected if one drive were to fail? With low-value data I don't mind if some files become inaccessible, but I at least want to know exactly what I lost.
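    A minimal sketch of the related knobs, assuming a striped two-disk pool; ZFS keeps extra "ditto" copies of metadata by default and tries to place them on different vdevs, but on a pool with no redundancy that placement is best-effort rather than a guarantee:

      # Striped pool, no data redundancy:
      zpool create tank sda sdb

      # Metadata redundancy policy for the pool's datasets (default is "all"):
      zfs get redundant_metadata tank

      # Ask for extra copies of the *data* too, per dataset:
      zfs set copies=2 tank/important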

  • @RogerBarraud 2 years ago

    Excellent vid, guys... Thanks!
    Where's the link to Allan's talk though?

  • @MaxPrehl 2 years ago +2

    Wendell, pls put Windows on Do Not Disturb when you're recording your system audio!

  • @abavariannormiepleb9470 2 years ago +1

    Did the faulty LTT backplane have Broadcom components in it?

  • @arimcbrown 2 years ago +3

    lol, Linus contributing to FOSS & Linux in ways he never expected XD

  • @damianerangey 2 years ago

    Block reference tree was one of the killer features of SimpliVity from day 1 :)

  • @robertpearson8546 1 year ago

    This is the first video that mentioned scrubbing. Recovery is very different from normal operation.

  • @Scrub_Ghost 9 months ago

    Excellent video. I hope LMG reimbursed you for the work done.

  • @dwaynearthur1476 2 years ago +3

    2 Legends !!!!!!!😃😃

  • @hanes2 2 years ago +1

    Haven’t seen Allan in a long time.

  • @Hellsfoul 2 years ago +1

    I still do not understand why Linus made a video about LTO backup but does not use it?!

  • @TheAnoniemo 2 years ago +1

    Those Outlook notification sounds threw me off there!

  • @accesser 2 years ago +1

    This is fantastic listening

  • @profosist 2 years ago +1

    Amazing info even for people just running stuff at home.

  • @LampJustin 2 years ago

    45:00 omg finally!! 🤩 Been using reflinks with btrfs and xfs for years! ^^

  • @adrien335 2 years ago +1

    Love my boy Allan!

  • @martinbreitbarth8674 2 years ago

    Damn, I can only click the like button once! Thanks for this, very informative.

  • @kyubre 2 years ago

    Is there any narrative that details how ZDB was used for the forensics?

  • @leexgx 2 years ago +2

    44:10 I am surprised that's not a part of ZFS already (b-tree); btrfs does that, as I can restore a snapshot to a folder path and it just mounts the snapshot.
    I would have thought the vdev is the lowest level and ZFS datasets are like btrfs subvolumes; just surprised that ZFS duplicates the data like that when restoring a snapshot to another dataset.
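    For comparison, a small sketch of the reflink workflow this comment describes on btrfs (or XFS with reflink support); file and path names are hypothetical, and the OpenZFS block cloning / block reference tree work discussed in the video aims at the same no-copy semantics:

      # Clone a file without duplicating its blocks; extents are shared
      # until either copy is modified:
      cp --reflink=always vm-disk.img vm-disk-clone.img

      # A btrfs read-only snapshot that can be browsed as a normal path:
      btrfs subvolume snapshot -r /data /data/.snap-before-upgrade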

  • @LA-MJ 2 years ago +1

    Where is the link to the talk?

  • @bertnijhof5413 2 years ago

    I have used ZFS for 4.5 years now and I'm very happy with it. The video was partly over my head, I have to study somewhat more :) I use ZFS on:
    1. My Ryzen 3 2200G; 16GB desktop running a minimal install of Ubuntu 22.04 LTS on OpenZFS 2.1.4. A minimal install, because I moved my "work" to VMs.
    It has 3 datapools mapped to an NVMe SSD; 2 HDDs (~9 power-on years, 1TB+500GB) and a 128GB SATA SSD used as L2ARC/ZIL (SSD cache for the HDD datapools). The datapools are:
    - 512GB on my NVMe SSD for the most frequently used VMs;
    - 2x 500GB striped partitions on the 2 HDDs for my data and for more VMs, but with copies=2 for my data (pictures, videos, music etc.);
    - 500GB at the end of the 1TB HDD for archives (ancient 16-bit software and outdated VMs, like e.g. Windows 3.11 for Workgroups or Ubuntu 4.10).
    This weekend my datapools were scrubbed again without any issues. Once my "data" dataset (copies=2) did get corrected during a scrub. Note that for the last 3.5 years the HDDs have been more or less retired; they are used for less than 4 hours/week and they are supported by the high hit rates of the ZFS caches. I have a hit rate of ~98% for my L1ARC (memory cache maxed at 4GB) and currently the L2ARC hit rate stands at 49%.
    2. My 1st backup is a 2011 HP EliteBook 8460p laptop with an i5-2520M; 8GB and an almost new 2TB HDD. It also runs Ubuntu 22.04 LTS on OpenZFS. The nice thing is that it runs exactly the same VMs as my desktop, ideal during holidays in Europe during family visits.
    3. My 2nd backup is the remains of a ~20 year old HP d530 SFF with 4 leftover HDDs, in total 1.21TB, but with less than 3 power-on years. It has a Pentium 4 HT (1C2T; 3.0GHz); 1.5GB DDR. It has 2 datapools, one zroot on 2 striped HDDs (IDE 3.5" 250+320GB) and dpool on 2 striped HDDs (SATA-1 2.5" 320+320GB). The case is an original Compaq Evo Tower with a Win 98SE activation sticker :) The system runs FreeBSD 13.1, also on OpenZFS 2.1.4. The only disadvantage is that the backup runs at 200Mbps of the 1Gbps link due to a 95% load on one P4 CPU thread. I did run the scrub last week.
    I see two effects:
    - I run both backups at the same time, but the backup to FreeBSD/Pentium 4 is faster than the backup to Ubuntu/i5, while all 3 involved PCs are connected to the same 1 Gbps switch. The P4 runs at a constant transfer speed between 20-25 MB/s, while the i5 starts at say 80MB/s, but then it gets slower and irregular, with long periods of say 0 to 4 MB/s interleaved with say 20 to 40MB/s. It gives me the feeling of fragmentation in the receive buffer. In the past, when the L1ARC was used for buffering, I did not have that issue. I did file an Ubuntu bug report, but no reaction.
    - Boot time of a Xubuntu VM is ~6 seconds from the SP NVMe SSD (3400/2300MB/s); the reboot times are ~5 seconds, I assume from the L1ARC. To get faster reboot times from L1ARC memory, I assume I have to buy a Ryzen 5 5600G. Maybe with the 2200G I could try to cache only the metadata for the NVMe SSD; the difference should not be very significant. After the NVMe queuing improvements announced here, it might even be faster, assuming that the decompression of each record is also assigned to another CPU thread?

    • @bertnijhof5413 2 years ago

      I have tried setting the primary cache of the NVMe SSD pool to metadata and the (re)boot time for Xubuntu is now ~7 seconds, but the L1ARC memory use drops from 4GB to 0.25GB :) Also for Windows 11 I did not lose more than say 10% in boot time, so I'll keep the metadata setting for the moment. I noticed that the load on the NVMe SSD increases, but I still have a 74% hit rate for the L1ARC cache.
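      A minimal sketch of that tuning, with a hypothetical dataset name; primarycache controls what the ARC holds for a dataset, and the ARC statistics show whether the hit rate survives the change:

        # Cache only metadata (not file data) in the ARC for this dataset:
        zfs set primarycache=metadata tank/vms

        # Confirm the setting and check ARC hit rates afterwards:
        zfs get primarycache tank/vms
        arc_summary    # or: cat /proc/spl/kstat/zfs/arcstats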

  • @justyours8766 2 years ago

    Xdiff for metadata, or maybe with a colour code so you can quickly see?

  • @LeeSeng 8 months ago

    Had a ZFS "invalid exchange" error during zpool import on 3 USB drives, maybe due to no export during shutdown? Well, that setup served me well for a few years, with 2 disks replaced previously.

  • @harleyarmstrong5947 9 months ago

    You mentioned not to use dd to clone the failing drives? Is there a better tool for cloning an unreliable drive sector by sector?

  • @LeeFall 2 years ago +2

    The question everyone wants to know is "Did Jake do it?"
    I'm British and even I got confused when Allan said Zed FS etc lol. Is he an expat? Or maybe Canadian?
    Edit: Oh, he is Canadian. I guess he accepted the English language without "making it their own" ;)

    • @PedroAlves0 2 years ago

      I thought he had said Zetta FS, from the original Zettabyte FS naming.

  • @abzzeus 2 years ago +1

    I would like to add a special device to my pool, but at the moment figuring out the exact size of drives to use seems complex and a bit voodoo.

  • @andymorris3250 8 months ago

    What would be your recommendation for SAS cloning or duplication? A standalone 1:1 product, or another SAS chassis/server and recovery/partition software?

  • @camofelix 2 years ago

    49:30 RIP Optane, I loved you whilst you were around 😢

  • @DaxHamel 1 year ago

    Wendell = gloriously insane!

  • @triplelifecentral 2 years ago

    Hey, anyone got a link to the Linus video Wendell is talking about?

  • @karolisr 2 years ago

    No scrubs in years just boggles my mind. How!? Whyyy?!

  • @churblefurbles 2 years ago

    Intel 530s had a write amplification bug as well, fixed long after it mattered.

  • @cryptearth 2 years ago

    ZFS really changed my way of looking at multi-volume storage stuff.
    I started my journey when I used the fake RAID of my motherboard's UEFI to set up a 5-disk RAID5... and I have played the restore game about 3 times now - lucky me, without data loss.
    As I got rather annoyed by the fact that I might lose data in a RAID5, I started to look at other options and first got hooked by Btrfs - which is no good for RAID5/6 - and so I learned about ZFS.
    Since M$ pulled the support for Win7 and I wanted to get rid of the fake RAID5 anyway, I switched over to Arch Linux, bought a big HBA to connect many drives to it, set up a RAIDZ2 pool and do regular scrubs.
    Sure - I had a look into Windows Storage Spaces and ReFS - but they come with the issue that this is all closed source and proprietary, and only expensive enterprise 3rd-party stuff is available - that's a no-go for me in a recovery situation: I want an easy and simple way to access my data - and with ZFS-on-Windows coming along quite a way, ZFS seems to be the one option for a multi-platform filesystem - because exFAT just doesn't cut it for many reasons, mostly because it's a FAT descendant.

  • @metaleggman18 2 years ago

    Yeah, with SSDs on servers, I'd probably have a backup setup, so that I have two copies of the vdev contents, one on the SSD to be used, and one on HDD as a backup, copied during downtime. I get having fast media on home servers for things like 4K video, but generally I wouldn't really want to store anything there and only there. In fact, it'd be better to have some sort of caching setup where a file is copied to the SSD vdev before playback, if possible.

  • @abavariannormiepleb9470 2 years ago +2

    Is the SMART data for "proper" SSDs like the ones from Samsung or Micron really reliable regarding their remaining NAND life?
    For example, get a Samsung 980 PRO with 2 TB, only use 1 TB of its capacity and write 1 PB to it. This should result in a significantly higher remaining life expectancy compared to using the entire 2 TB and writing 1 PB to it. I'm a bit worried that SSD manufacturers might game their SMART values with the total TBW the specific SSDs are rated for. All SSD defects I have personally witnessed since 2011 were sudden deaths with no prior SMART indication whatsoever :(
    The most recent one was a Micron 7450.

    • @_--_--_ 2 years ago

      "All SSD defects I have personally witnessed since 2011 were sudden deaths with no prior SMART indication whatsoever"
      That's because 95%+ of SSD failures are the controller dying while the NAND is still fine, at least for consumer/prosumer SSDs.

    • @abavariannormiepleb9470 2 years ago

      @@_--_--_ Well, the Micron 7450 isn't exactly the highest end, but I would have classified it at least as "proper enterprise". Regarding prosumer stuff, I hoped that at least vertically integrated manufacturers like Crucial (Micron), Intel and Samsung would have proper SMART monitoring - note: I've had the least amount of failures with Samsung SSDs. But I dislike that their enterprise SSDs don't offer any warranty for end customers; this "forces" one to Micron after Intel messed up their NAND SSDs with unfixable bugs before discontinuing SSDs for good :(

    • @_--_--_ 2 years ago

      @@abavariannormiepleb9470
      Well, it depends on the controller used, I guess; personally I never bothered with Micron, so I can't say anything about their drives.
      I just mentioned consumer/prosumer because lots of drives from manufacturers like WD, Kingston or SanDisk use controllers that are very notorious for just dying very frequently, and I have never seen those specific controllers used in enterprise SSDs.
      As far as I know, Samsung's U.2 PM9A3 uses the same controller as the 980 Pro. I have to agree Samsung controllers are significantly more reliable, but I have heard of dead 980 Pros and PM9A3s, also from sudden controller failure; usually if this new Samsung PCIe 4.0 controller dies, it at least tends to die within its first couple dozen operating hours.
      Kioxia also uses proprietary controllers like Samsung; I don't know how good their SMART reporting is, but I have never heard of any of them failing from a sudden dead controller.

  • @benhillard919 2 years ago +1

    Good video! Really neat stuff.

  • @thelistener4101 2 years ago +1

    Only a "hundred hours" to "melt" Wendell's brain??? OMG! That level waaaay outpaces me!

  • @pieterrossouw8596 2 years ago

    TrueNAS Scale on native hardware (Ryzen 3600, Radeon RX550, 48GB RAM) or virtualized TrueNAS on Proxmox?

  • @silentbob1236 8 months ago

    The whole time I watched this on my phone, I kept thinking I had dropped some crumbs on the screen.

  • @rudde7251 2 years ago

    Where is the talk referenced? You didn't link it, and it's been 2 months since this was posted.

  • @Rickymcdd 2 years ago

    OK, simple question: why did they not restore from backups? If they don't have backups, why not?

  • @robertpearson8546 1 year ago

    Compare that system to Apple's Time Machine. When Apple's "program" gets a read error, it truncates the "backup" and informs no one that data was lost. Then as time goes on, the Apple program deletes older non-truncated files to make room for new corrupted files. Years later, I am still finding files corrupted by Apple.

  • @Fahdalrabeayah 2 years ago

    It has been a long time since I've seen you, Allan, since the BSD show.

  • @jrr851 2 years ago

    zfs send with all safeties turned off... is that "Full Send"?

  • @petersvideofile 2 years ago

    This was really great! Thanks!

  • @abavariannormiepleb9470 2 years ago +5

    For the future of ZFS: Please introduce a solution that also works with system sleep (S3) so your ZFS storage doesn’t have to be on 24/7 or cold-booted all the time. This would help ZFS’s proliferation into more “normal people” homes.

  • @hersenbeuker 2 years ago

    Is Allan's talk already live?

  • @Wayofthelao 2 years ago

    Ah man this is awesome stuff!!!

  • @hololightful 1 year ago

    Good video, but you never actually put the link to Allan's talk in the description like you said you would in the video.

  • @Sommyie 2 years ago +7

    Allan and Wendell?! 🥰😘😂😊

  • @101m4n 2 years ago

    I fear the cold truth here is that it's always possible to be more negligent than your storage is smart!

  • @dunastrig1889 2 years ago +1

    Woot!

  • @kungfujesus06 2 years ago +1

    Are we going to shame the backplane manufacturer? This seems like some pretty crummy firmware that needs to be called out. Also probably in part because I'm guessing they used SATA drives on a SAS expander (this never ends well).

  • @northwanderer800 2 years ago

    Liked Allan on Jupiter back in the day

  • @jfkastner 2 years ago

    Awesome!

  • @Ruhjuh 2 years ago

    This was really interesting even though most of it went above my “pay grade” 😅