Setting Up Proxmox High Availability Cluster & Ceph

  • Published: 24 Nov 2024

Comments • 48

  • @romayojr
    @romayojr 10 months ago +17

    I've been running a Proxmox cluster with a Ceph pool on 3 Dell 7060s in my environment for about 6 months now. It's been working great and hasn't had any failures yet. I highly recommend doing this if you have the resources.

    • @techadsr
      @techadsr 10 months ago

      The i5 8500 has 2 more cores than the N100, giving your cluster 6 more cores than mine. How much usage, and how many cores, is Ceph consuming on your cluster?

  • @Cpgeekorg
    @Cpgeekorg 10 months ago +11

    At 10:50 there is a cut, and for folks who may be following along with this video I want to clarify what happened, because it's a gotcha that people new to Proxmox really should understand (I'm not trying to undermine Don here, he did a fantastic job with this video demo). The way the test container is shown being configured (specifically using the default storage location, which in this case is local storage) is INCORRECT for this configuration. You *must* choose the shared storage (in this case the Ceph pool that was created, called "proxpool") if you want to configure HA; it won't let you do it otherwise, because the guest isn't backed by shared storage. Do not despair, amigos: if you accidentally configure your VM or container on local storage, start deploying your workload and set it up, and then decide you want this VM/CT to be part of your HA config, you can move the storage from local to shared storage by:
    1. click on the ct you accidentally created on local storage
    2. click on "resources"
    3. click on "root disk"
    4. click on the "volume action" pull-down at the top of the resources window
    5. click on "move storage"
    6. select the destination (shared) storage you want to move it to
    Repeat this for all disks that belong to the container, and HA will work once every disk attached to the container is on shared storage. The same procedure works for VMs, except you'll find the storage configuration under the "Hardware" tab instead of the "Resources" tab: click on each disk and use the same "Volume Action" - "Move Storage" as with CTs. (A CLI sketch of the same move follows below.)
    *Pro tip*: at the time of this writing, Proxmox does NOT support changing the "default" storage location offered when you create new VMs and CTs. HOWEVER, the storage list is ALWAYS (at the time of this writing) in alphabetical order, and it defaults to the first entry. If you wish to set the default, name whichever storage you like so that it sorts first alphabetically. (For lots of people this ends up as something like "ceph pool", but for some strange reason Proxmox sorts capitalized storage IDs first, so I call my Ceph pool NVME, because it's built on my NVMe storage, and it shows up at the top of the list and is thus the default when I create storage.) Note: unfortunately you can't just change a storage ID, because your VMs won't know where their storage is. If you need to rename your storage, your best bet is to create a new Ceph pool with the new name (based on the same storage; don't worry, Ceph pools are thin provisioned), go to each VM/CT's storage, and move it to the new pool. When nothing is left on the old pool (you can verify this by looking at the pool in the storage section and making sure there aren't any files stored there), you can remove it from the cluster's storage section, then remove the pool from the Ceph options.
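
    For reference, the same move can be done from the CLI. Below is a minimal sketch, assuming a hypothetical CT 101 and VM 100 that were created on local storage and a shared Ceph-backed storage named "proxpool"; exact command spellings vary slightly between Proxmox VE releases, so check pct help / qm help first.

    # Move a container's root disk onto the shared Ceph-backed storage (CT 101 is a placeholder ID)
    pct move-volume 101 rootfs proxpool --delete 1
    # Move a VM disk the same way (VM 100 and disk scsi0 are placeholders)
    qm disk move 100 scsi0 proxpool --delete 1
    # Once every disk is on shared storage, the guests can be added to HA
    ha-manager add ct:101 --state started
    ha-manager add vm:100 --state started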

    • @remio0912
      @remio0912 10 months ago +1

      Would you be willing to make a video?

    • @Cpgeekorg
      @Cpgeekorg 10 months ago +1

      @remio0912 I'd be happy to, but it'll probably be a couple of weeks for that one. Currently rebuilding my network.

  • @TheRealAnthony_real
    @TheRealAnthony_real 1 month ago +1

    No time wasted 😊.. I feel like I got more info from this short summary than from the last 3 hours of watching other vids.

  • @GreatWes77
    @GreatWes77 9 months ago +2

    Excellent presentation. Didn't over or under explain anything IMO. I appreciate it!

  • @techadsr
    @techadsr 10 months ago +1

    I added a USB NVMe enclosure with a 1TB SSD to each node in my 3-node N100 Proxmox cluster; the nodes have USB 3.0 ports. I installed a Ceph pool using those three SSDs. Ceph and NFS (Synology DS1520+) use the second adapter on each node and on the NAS, so the storage network traffic is isolated from regular traffic. I moved a Debian VM to NFS and timed a migration from node 2 to node 1, then repeated the migration with the same Debian VM on Ceph. Like Don, I was pinging the default router from that Debian VM during the migrations and never lost a ping. The timing for the 48G Debian machine migration on NFS was 19 sec with 55 ms downtime; for Ceph, it was 18 sec with 52 ms downtime. Migration speed for both was 187.7 MiB/s. The HP 1TB EX900 Plus NVMe SSD is Gen3, but the SSK SHE-C325 Pro NVMe Gen2 enclosure is USB 3.2.
    Not much of a performance difference in my config for NFS vs Ceph. At least there's the benefit of not having the NAS as a single point of failure.

    • @ultravioletiris6241
      @ultravioletiris6241 8 months ago

      Wait, you can have high availability with migrations using Synology NFS? And it works similarly to Ceph? Interesting.

    • @techadsr
      @techadsr 8 months ago

      @ultravioletiris6241 The difference is that Ceph doesn't have a single point of failure like a Synology NAS does.

    • @NatesRandomVideo
      @NatesRandomVideo 6 months ago

      @ultravioletiris6241 All Proxmox HA cares about is whether the storage is shared.
      (Not even that, really. You can set up ZFS root during node creation and replication of a VM to one or more nodes and migration is screaming fast - but can lose data. Shared storage eliminates the data loss window between replication runs.)
      There are ways to get / build HA SMB/CIFS storage and HA NFS storage - with as much redundancy as you like and the wallet can afford.
      That said, a single Synology isn’t that. So it is the SPOF in using either SMB or NFS shared storage for Proxmox cluster HA.
      Quite a few home gamers going for “decent” recoverability may use a shared storage system with UPS and the built in “nut” UPS monitoring in Linux to distribute UPS power status to various things so they can all do a graceful shutdown before the battery fails.
      It’s not protection against a NAS hardware failure - but it covers the most common outage that most people see. Power.
      Another thing to consider when using a consumer-grade NAS for shared storage is how badly it slows down during disk replacement and recovery. Many find their VMs' performance to be extremely poor during that time.
      You can go very deep down this rabbit hole. Some even run into similar problems when their CEPH needs to rebuild after a failure. Giving it its own fast network for just itself is one of the ways to mitigate that.

  • @GT-sc5sk
    @GT-sc5sk 2 months ago +1

    Hmmm, should you not have chosen the prox-pool storage, and not local-lvm?

  • @mikebakkeyt
    @mikebakkeyt 10 months ago +2

    I run an HA cluster atm with two identically named ZFS pools, and as long as I put the CT disks in that pool it allows replication and full HA functionality. I don't see any need to add the extra complexity of Ceph just for HA. Ceph seems awesome, but it's an order of magnitude more complex than ZFS...

    • @Darkk6969
      @Darkk6969 10 months ago

      I use ZFS with replication for production servers at work. The biggest benefit of Ceph is real-time replication, while ZFS is based on periodic snapshots that then get sent to the other node in the cluster. The reason I am not using Ceph is performance. Newer versions of Ceph may have gotten a lot better, so I may have to revisit it at some point. (A sketch of what such a replication job looks like follows below.)
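
      For anyone curious, here is a minimal sketch of that snapshot-based replication using Proxmox's built-in pvesr tool, assuming a hypothetical VM 100 on a ZFS-backed store replicated to a node named prox2 every 15 minutes; verify option names with pvesr help on your release.

      # Create a replication job: VM 100, job number 0, target node prox2, every 15 minutes
      pvesr create-local-job 100-0 prox2 --schedule "*/15"
      # Optionally cap replication bandwidth (MB/s) so it doesn't saturate the link
      pvesr update 100-0 --rate 50
      # Check replication status (last sync, next run, failures)
      pvesr status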

    • @mikebakkeyt
      @mikebakkeyt 10 months ago +1

      @Darkk6969 Agreed. I have prototyped Ceph, but it needs to be run at a larger scale to make sense, and managing it is very complex, with opportunities to mess up data. I want to use it, but I need dedicated nodes and a faster network, so for now it waits.

  • @CaptZenPetabyte
    @CaptZenPetabyte 9 months ago +1

    Just to confirm, I think I missed something: each of 'prox1, prox2, prox3, prox.n' would be a different machine running Proxmox on the same network? I shall rewatch and maybe have a bit of a play around with the documentation. Thanks for all the recent Proxmox tutes mate, they have been very helpful indeed!

  • @dijkstw2
    @dijkstw2 10 months ago +1

    Just thinking about it and you're posting this 😂🎉

  • @ronm6585
    @ronm6585 10 months ago

    Thanks for sharing Don.

  • @fbifido2
    @fbifido2 10 months ago +1

    @4:35 - please dive deeper into the cluster networking part:
    on VMware vSAN you have a storage network, VM network, fail-over network, etc.
    What's the best way, in terms of networking, to build a Ceph cluster with 3 or more hosts?

    • @Darkk6969
      @Darkk6969 10 months ago +2

      Ceph is very flexible in terms of network redundancy. I had two dedicated 10-gig network switches just for Ceph. Hell, if both switches fail, it can use the public LAN network as a backup.
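
      As a rough illustration of that separation, here is a minimal sketch of initializing Ceph with dedicated networks on Proxmox, assuming hypothetical subnets 10.10.10.0/24 for the Ceph public network and 10.10.20.0/24 for the Ceph cluster (replication) network; flag names follow pveceph and may differ slightly by release.

      # Initialize Ceph with a dedicated public network and a separate cluster
      # (OSD replication/heartbeat) network -- both subnets are placeholders
      pveceph init --network 10.10.10.0/24 --cluster-network 10.10.20.0/24
      # Create a monitor on each node, then verify which networks Ceph picked up
      pveceph mon create
      grep -E 'public_network|cluster_network' /etc/pve/ceph.conf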

  • @JohnWeland
    @JohnWeland 10 months ago

    I have 2 nodes running myself (2 Dell R620s) and am waiting for some lower-end CPUs to arrive in the mail before I bring the third node online. It came with some BEEFIER CPUs, and that means jet-engine screams (1U server fans).

  • @remio0912
    @remio0912 10 months ago +1

    I got messed up when you created an OSD. I had no other disks available to use on any of the nodes.

  • @karloa7194
    @karloa7194 9 months ago

    Hey Don, are you RDP-ing in to manage your infrastructure? I noticed that there are two mouse cursors. If you're using some sort of jump box or bastion host, could you share how you're connecting to it and what your bastion host is?

  • @Melpheos1er
    @Melpheos1er 1 month ago

    Couldn't find the information anywhere, but do we also need 3 hosts for a cluster when using a SAN with iSCSI?
    I have a cluster of two for test purposes with automated migration enabled, but when I restart a host, the VMs are not migrated 😕

  • @GT-sc5sk
    @GT-sc5sk 2 months ago

    And migration mode: restart.. which maybe explains your loss of ping.. did the LXC restart?

  • @Cpgeekorg
    @Cpgeekorg 10 months ago +1

    3:32 "as far as migrating through here, you cannot do that yet until you set up a ceph" - this is incorrect. In this state you CAN migrate VMs from one node to another; they just have to be paused first. All that's required for moving VMs node to node is a basic cluster. HOWEVER, because the storage isn't shared between the nodes, it does take longer to move VMs between them in this state, since the entirety of the storage needs to move from one node to the next. The way it works if you have Ceph (or another shared storage; it doesn't have to be Ceph, it could be an external share or something, Ceph is just a great way to set up shared storage with node-level, or otherwise adaptable, redundancy) is that instead of moving full disk images when you migrate, the destination node accesses the shared storage volume, so the storage doesn't have to move at all. That means the only thing that needs to be transferred between nodes is the active memory image, and this is done in two passes to minimize latency in the final handoff: it transfers all blocks of the active VM RAM, then suspends the VM on the source node, copies any memory blocks that have changed since the initial copy, and then resumes the VM on the destination node. On a fast network connection this final handoff can be done in a couple of milliseconds, so any users of the services on the VM being transferred are none the wiser. You can start a ping, migrate a VM mid-request, and the VM will respond in time at its destination (maybe adding 2-3 ms to the response time). It's FANTASTIC! (A CLI sketch of a live migration follows below.)

  • @dzmelinux7769
    @dzmelinux7769 9 months ago

    So does this work with LXC containers too? They don't start after migration, but HA isn't considered migration.

  • @caiodossantosmoreiradasilv2851
    @caiodossantosmoreiradasilv2851 1 month ago

    But if I have a running server and it stops, because of a power outage or some other cause, will it start running on another cluster node?

  • @diegosantos9757
    @diegosantos9757 8 months ago +1

    Hello dear,
    Would it work with Pimox too??
    Thanks for the great videos!

  • @robert-gq5th
    @robert-gq5th 9 months ago

    What do I do if, when I add a drive, it makes a new LVM and I can't use it for an OSD?

  • @fbifido2
    @fbifido2 10 months ago

    @8:30 - can you have a Proxmox host/node for Ceph storage only, not for running VMs?
    E.g. you have your 3 compute nodes but are running low on storage; can you add a node or two with lots of storage to expand your Ceph cluster?

    • @Darkk6969
      @Darkk6969 10 months ago +1

      Yes you can. You'll need to install the Ceph tools on all the nodes; just create the OSDs only on the nodes you want to dedicate to Ceph.
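
      A rough sketch of that layout, assuming a hypothetical spare disk /dev/sdb on the storage-only nodes; adjust device names to your hardware.

      # On every node in the cluster: install the Ceph packages
      pveceph install
      # On the storage-only nodes: create OSDs from their spare disks
      pveceph osd create /dev/sdb
      # On the compute nodes: run monitors/managers but no OSDs, so they
      # consume the pool without contributing storage
      pveceph mon create
      # Verify where the OSDs actually live
      ceph osd tree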

  • @leachimusable
    @leachimusable 10 months ago

    1:15 min. Which system is that in the VM?

  • @Lunolux
    @Lunolux 10 months ago

    Thx for the video.
    6:22 I'm a little confused, where is this storage? On the Proxmox server? Remote storage?

    • @techadsr
      @techadsr 10 months ago +1

      Don created the OSD on /dev/sdc, but his three-node Proxmox cluster itself was virtualized, so I'm not sure whether /dev/sdc was also virtualized.

  • @LVang152
    @LVang152 6 months ago

    Can you delete the local-lvm?

  • @da5fx
    @da5fx 8 months ago

    I'm going to talk only about my experience with Proxmox, which I have tried to use several times. I do not like it; I think there are better options. My setup is a 3-node i5 13th-gen cluster with Ceph, 32 GB RAM, and two NICs per node: one for Ceph traffic and the other for all other traffic. I think it's very slow, and there is a problem, in my opinion, with stopping VMs when they get stuck or you made a mistake of some kind with the VM. Templates can only be cloned from one node and are attached to that node, unlike VMware, though of course you can migrate the templates. Installing a Linux VM in the traditional way takes a long time, something like 4 hours or more. The Ceph speed on SSD was around 13 MB/s. I made a test by moving all my 10 VMs from 3 nodes to only 2 so I could test the speeds on the third node. Maybe it's me and I'm not used to this kind of solution, because I was a VCP on 5.5 and 6; I normally prefer Fedora KVM because of Cockpit, but that doesn't provide any way to cluster 2/3 machines. In sum, I got tired of it and installed Harvester HCI, and now a VM installs in 5 minutes or a bit more, and Longhorn gives speeds around 80 MB/s.
    This is just my latest experience along with the previous ones. I hope this helps someone. Thank you.

  • @primenetwork27
    @primenetwork27 8 months ago

    I have 3 nodes in Ceph; is it possible to add 1 more server?

  • @LukasBradley
    @LukasBradley 5 months ago +1

    At 6:20, you say "you'll see the disk that I add." Where did this disk get defined? My config doesn't have an available disk.

    • @franchise2570
      @franchise2570 29 days ago

      I guess he doesn’t read comments. Wish he would have responded back.

    • @rogerfinch7651
      @rogerfinch7651 24 days ago

      You need another, separate disk for Ceph, not the Proxmox disk or the shared disks used for the VMs.
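
      A minimal sketch of checking for and preparing such a spare disk, assuming a hypothetical /dev/sdc that holds nothing you care about; wiping is destructive, so double-check the device name first.

      # List block devices and any existing partitions/LVM on them
      lsblk
      # Wipe the unused spare disk so Proxmox will offer it as an OSD target
      # (DESTRUCTIVE -- /dev/sdc is a placeholder, verify it first)
      ceph-volume lvm zap /dev/sdc --destroy
      # Then create the OSD on it
      pveceph osd create /dev/sdc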

  • @shephusted2714
    @shephusted2714 10 months ago

    This is one you needed to do, but please explore other network filesystems too. Also, what if you wanted to combine all the PCs and GPUs to look like one PC - can you explain or DIY that in a follow-up? #HA 40g #no switch #load balancing

  • @puyansude
    @puyansude 10 months ago

    👍👍👍

  • @fbifido2
    @fbifido2 10 months ago +1

    @7:20 - it would be nice if you explained what each field is for, or what the best practices are.

  • @techadsr
    @techadsr 10 months ago +1

    How impactful is Ceph on CPU resources in a three-node, N100-based Proxmox cluster? I put off trying Ceph when the resource requirements mentioned dedicating CPUs.
    So far, I have NFS storage on a separate network connected to the 2nd NICs on the Synology and the Proxmox nodes.
    It looked like Ceph could be set up with its management traffic on a separate network as well. But with only 12 cores available on my cluster, maybe Ceph isn't for me.
    Thoughts?

    • @ewenchan1239
      @ewenchan1239 10 months ago +2

      So, I have a three node HA Proxmox cluster (each node only has a 4-core N95 processor, so the less performant version of the N100 that you have), with 16 GB of RAM and 512 GB NVMe in each node.
      When I installed Proxmox, I had to re-do it a few times because I needed to create the 100 GB partition for Proxmox itself (on each node) + 8 GB swap partition and the rest of the drive can be used for Ceph.
      In terms of CPU usage -- Ceph RBD and CephFS itself, actually doesn't take much from the CPU side of things.
      It is HIGHLY recommended that you have at least a second NIC for all of the Ceph synchronisation traffic (my Mini PC has dual GbE NICs built in), which works "well enough" given that the rest of the system is only a 4-core N95 with 16 GB of RAM.
      Of course, Ceph isn't going to be fast with a GbE NIC in between them, but given what I am using my HA cluster for (Windows AD DC via turnkey linux domain controller, DNS, and Pi-Hole), it doesn't really matter to me.
      Nominal CPU usage on my cluster is < 2%, even when there's traffic going through the Ceph network.
      Nominal memory usage is whatever Proxmox already consumes (any additional memory consumption is negligible/imperceptible).
      What will matter more in terms of CPU/memory will be what you plan on running on it.
      And I have effectively the same setup as you, but with slower processors (also 12 cores total and 48 GB of RAM total, spread out amongst the nodes), which means I can't do anything TOO crazy, because it isn't like a VM can spawn itself across multiple nodes; everything has to run as if there were only one node available anyway.

    • @techadsr
      @techadsr 10 months ago

      @ewenchan1239 Sounds like the resource requirements doc had me too concerned. I wonder how Ceph compares to NFS running on Synology.
      I hadn't thought about explicitly creating partitions for the Proxmox OS and swap. The installer creates separate logical volumes (see **** below) in LVM for the Proxmox swap and root. Will logical volumes for swap/root not work for Ceph?
      root@pve2:~# fdisk -l

      Disk /dev/sda: 953.87 GiB, 1024209543168 bytes, 2000409264 sectors
      Disk model: NT-1TB 2242
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes
      Disklabel type: gpt

      Device       Start        End    Sectors   Size Type
      /dev/sda1       34       2047       2014  1007K BIOS boot
      /dev/sda2     2048    2099199    2097152     1G EFI System
      /dev/sda3  2099200 2000409230 1998310031 952.9G Linux LVM

      **** Disk /dev/mapper/pve-swap: 8 GiB, 8589934592 bytes, 16777216 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes

      **** Disk /dev/mapper/pve-root: 96 GiB, 103079215104 bytes, 201326592 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 512 bytes / 512 bytes

      Disk /dev/mapper/pve-vm--102--disk--0: 48 GiB, 51539607552 bytes, 100663296 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 65536 bytes / 65536 bytes
      Disklabel type: dos

      Device                                 Boot    Start       End   Sectors  Size Id Type
      /dev/mapper/pve-vm--102--disk--0-part1 *        2048  98662399  98660352   47G 83 Linux
      /dev/mapper/pve-vm--102--disk--0-part2      98664446 100661247   1996802  975M  5 Extended
      /dev/mapper/pve-vm--102--disk--0-part5      98664448 100661247   1996800  975M 82 Linux swap / Solaris

      Partition 2 does not start on physical sector boundary.

      Disk /dev/mapper/pve-vm--1000--disk--0: 8 GiB, 8589934592 bytes, 16777216 sectors
      Units: sectors of 1 * 512 = 512 bytes
      Sector size (logical/physical): 512 bytes / 512 bytes
      I/O size (minimum/optimal): 65536 bytes / 65536 bytes
      Does the ceph pool need to be created before creating VMs and LXCs or downloading ISOs or CTs?
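
      For reference, here is roughly what creating the pool looks like from the CLI once Ceph is initialized and the OSDs exist: a minimal sketch, assuming the pool name proxpool from the video and a CephFS for ISOs/CT templates (verify flag spellings with pveceph help on your release). The RBD pool only needs to exist before you create guests whose disks live on it; ISOs and CT templates need a file-level storage such as CephFS, NFS, or the node's local storage.

      # Create an RBD pool for VM/CT disks and register it as Proxmox storage
      pveceph pool create proxpool --add_storages
      # ISOs and CT templates need a file-level store, e.g. a CephFS
      # (requires at least one metadata server)
      pveceph mds create
      pveceph fs create --name cephfs --add-storage
      # Check pool and cluster health
      ceph df
      ceph -s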