Getting started with Ceph storage cluster setup

  • Published: Oct 1, 2024

Comments • 25

  • @paulfx5019
    @paulfx5019 3 years ago +6

    Hey Daniel, many thanks for the video. I've always struggled with the concept of CEPH vs SANs & NASes
    (ZFS) and would like some clarity around what I perceive as "the elephant in the room": power consumption. Based on my numbers, CEPH nodes would use more power than traditional storage equipment. My other concern is latency: if I understand correctly, one's application/VM needs to wait for the data to be successfully written to all nodes... if so, how is this concept better, especially in a cloud environment? Maybe I can't see the forest for the trees and am missing the obvious, and would value your feedback. Cheers

    • @DanielPersson
      @DanielPersson  3 years ago +5

      Hi Paul.
      Thank you for watching my videos.
      Well, I believe the solutions solve different problems. A NAS is mostly for slow storage, where you move data to a new location for long-term keeping and can RAID over multiple drives.
      CEPH is more of a solution for when you want to store data in one or more data centers and require guaranteed recovery. For instance, you can say that you require 2 copies of the data before a write is acknowledged (the default) and then want to reach 4-5 copies over time. You can also configure whether the failure domain should be disk, host, rack, row, datacenter, or region (there's a small sketch of these settings after this reply).
      This means you can ensure that data is saved in all regions of the world, so if a nuke lands in your country the data is still available overseas. These kinds of solutions are used by Google, Amazon, Microsoft, and others that run datacenters with central storage needs.
      Most NAS solutions add more disk space by adding another hard drive; the most efficient way to add capacity to a Ceph cluster is by adding another node. When it comes to load we actually increased our efficiency. Earlier we had one disk that handled all the data needs, and it was always in a state of IO wait. Now that we have spread the load over multiple physical disks across the network, the latency is lower.
      Of course, there will be network latency, but disks today, even SSDs, are slower than fiber networks.
      When it comes to power consumption it's probably higher, but then again our customers require instant access to their news, so that is what we want to provide. The new machines still have a lower footprint than the old ones, and while a NAS usually isn't replaced over the years, we will probably cycle hosts out when they reach their limit and replace them with new machines with a better footprint and performance.
      I hope I've addressed some of your questions and if I've missed something then please don't hesitate to ask.
      Best regards
      Daniel
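
      A rough sketch of how the copy counts and failure domains above map onto actual pool settings, using the Python rados bindings. The pool name, rule name, and ceph.conf path are hypothetical and you need admin credentials on the cluster, so treat this as an illustration rather than a recipe.

      import json
      import rados

      # Assumes the python3-rados bindings are installed and that
      # /etc/ceph/ceph.conf plus an admin keyring exist on this machine.
      cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
      cluster.connect()

      def mon_cmd(**kwargs):
          """Send a JSON-formatted command to the monitors and return the output."""
          ret, out, err = cluster.mon_command(json.dumps(kwargs), b"")
          if ret != 0:
              raise RuntimeError(err)
          return out

      # A replicated CRUSH rule with "host" as the failure domain: no two copies
      # of a placement group land on the same host. Rack, row, datacenter or
      # region work the same way if your CRUSH map defines those buckets.
      mon_cmd(prefix="osd crush rule create-replicated",
              name="replicated_per_host", root="default", type="host")

      # Keep 3 copies of every object and acknowledge writes once 2 are on disk.
      mon_cmd(prefix="osd pool set", pool="mypool", var="size", val="3")
      mon_cmd(prefix="osd pool set", pool="mypool", var="min_size", val="2")
      mon_cmd(prefix="osd pool set", pool="mypool",
              var="crush_rule", val="replicated_per_host")

      cluster.shutdown()

      The pool itself still has to exist first (created by whatever deployment tool you use), and whether extra replicas or erasure coding is the better fit depends on your capacity and latency budget.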

    • @paulfx5019
      @paulfx5019 3 years ago +2

      @@DanielPersson Hi Daniel, many thanks for taking the time to respond. We do operate out of 2 DCs in the same country, many hundreds of km apart, just in case of that nuke... We are always seeking to reduce power due to high running costs... Which is better, replication or erasure coding? We back up all VM guests every hour and replicate backups from the SAN to the second DC. To date all our SANs are tiered flash, 15k, 10k & 7.2k in RAID 10; no NASes. I have posted questions about latency with CEPH to forums on a number of occasions and no one has responded, which, to be honest, is ringing alarm bells for me. Every 6 months or so I review the latest features and news on CEPH, because in theory it looks awesome, although the change scares us. Cheers

    • @DanielPersson
      @DanielPersson  3 years ago

      Hi Paul.
      I understand your hesitation, and I can't say that this is the best option or that it will work for you. As you seem to have good hardware and probably some spare, my suggestion would be to start with a test set. For example, set up a smaller cluster, back up one customer to that cluster, and then gauge whether you see a change in performance or power utilization.
      Our situation was a bit dire, so we had to take the plunge to solve our performance issue, but you can do this in a more controlled fashion. I think the best Ceph solutions are the ones that grow over time, and in your case there should be a quick way to roll back.
      After running Ceph for a couple of months, we feel confident with our current setup and plan to expand to more nodes and move more customers and functionality. The data we have moved so far is the data that needs low latency; the data we will add later has looser latency requirements, which makes it safer to migrate.
      In your case, you could go the other way and use things you back up less often as a test group. Then again, you know your organization best and how to make a good test case.
      Best regards
      Daniel

  • @brandonpatzold7367
    @brandonpatzold7367 1 year ago +1

    Hi Daniel, cool video… I need to move my corporate storage to CephFS on Red Hat Enterprise Linux and I hope I'm making the right decision… GlusterFS has been problematic and I hope CephFS can perform better.

  • @pierreuntel1970
    @pierreuntel1970 9 months ago +1

    a script is not a tutorial

  • @aappiah1
    @aappiah1 1 year ago +2

    Excellent stuff and thanks for sharing

  • @bestrelaxingmusic21
    @bestrelaxingmusic21 11 months ago +1

    Can Ceph provide S3 blob storage for AI?

    • @DanielPersson
      @DanielPersson  10 months ago

      Hi Relaxing Music.
      Absolutely. But you should use fast drives for the best read speed, and storing multiple copies of the data can improve read speed even further. A fast network is also required. So, to use Ceph for AI training, you need to look at your setup (a small boto3 sketch follows this reply). Inference, on the other hand, is usually best done locally.
      Thank you for watching my videos.
      Best regards
      Daniel
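
      Since the question is about S3: Ceph exposes S3 through the RADOS Gateway, so any standard S3 client works against it. A minimal sketch with boto3, assuming a hypothetical RGW endpoint and a user already created with radosgw-admin:

      import boto3

      # Endpoint, bucket name and credentials are placeholders; the key pair
      # would come from `radosgw-admin user create` on the cluster.
      s3 = boto3.client(
          "s3",
          endpoint_url="http://rgw.example.internal:7480",  # default RGW port
          aws_access_key_id="REPLACE_ACCESS_KEY",
          aws_secret_access_key="REPLACE_SECRET_KEY",
      )

      s3.create_bucket(Bucket="training-data")

      # Upload a local shard and read it back, as a training job might.
      s3.upload_file("shard-0001.tar", "training-data", "shards/shard-0001.tar")
      obj = s3.get_object(Bucket="training-data", Key="shards/shard-0001.tar")
      print(len(obj["Body"].read()), "bytes read back")

      Read throughput then comes down to the drives, the replica count, and the network, as mentioned in the reply.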

  • @aaronperl
    @aaronperl 3 years ago +3

    I wasn't familiar with ceph at all, thanks for the introduction. Now if only I had more machines and disks to play with...

    • @DanielPersson
      @DanielPersson  3 years ago +1

      Hi Aaron.
      That is always an issue. :)
      I have some Raspberry Pis that could be used for some of the services, but running virtual machines is also a good option if you just want to try it out. It's seldom necessary to have this in a private environment, but we have a 100 TB+ network at work, so being part of the sysop team I thought it would be good to do some research. :)
      Best regards
      Daniel

  • @ruhnet
    @ruhnet 3 years ago +2

    Great video! Thanks for the good overview!

  • @UnleashingMayhem
    @UnleashingMayhem 2 years ago +1

    Should we deploy Ceph inside a Kubernetes cluster?
    I mean, if something happened to the cluster, then we wouldn't have access to our PVs and PVCs.
    Is it possible to integrate Ceph with Proxmox and then use it inside Kubernetes?
    What is the best practice here?

    • @DanielPersson
      @DanielPersson  2 years ago +1

      Hi Yaser
      Thank you for watching my videos.
      Should you deploy it in Kubernetes? No. I would not recommend deploying a hardware-dependent storage solution into an environment of images that can move around.
      Running some of the services in Kubernetes could be sensible, but the OSDs themselves are something I would run bare metal, or perhaps in the containerized setup that cephadm uses.
      I've not tried to run Ceph on Proxmox, but I don't see why that should be a problem. I have looked at similar solutions that spin up a whole orchestrated environment with Ceph as storage.
      Ceph as a whole doesn't have a single set of best practices, but each component has requirements, and as long as they are met you should be fine. For example, the best practice for monitors, managers, and metadata servers is to run them on multiple physical machines (3 or 5 monitors), so you have redundancy in your network.
      OSDs should map to physical drives for best performance, so if you have 20 drives provisioned, each of them should have its own OSD. Creating multiple partitions or using virtual drives is not recommended (a small sanity-check sketch follows this reply).
      I hope this helps.
      Best regards
      Daniel
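
      A small read-only sketch of how you might check those two rules of thumb (3 or 5 monitors, one OSD per physical drive) against a running cluster, again with the Python rados bindings and a hypothetical ceph.conf path:

      import json
      import rados

      # Assumes python3-rados and a readable /etc/ceph/ceph.conf with a keyring.
      cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
      cluster.connect()

      def mon_json(prefix):
          """Run a monitor command and decode its JSON output."""
          ret, out, err = cluster.mon_command(
              json.dumps({"prefix": prefix, "format": "json"}), b"")
          if ret != 0:
              raise RuntimeError(err)
          return json.loads(out)

      # Monitor quorum: the reply above suggests 3 or 5, on separate hosts.
      mons = mon_json("mon dump")["mons"]
      print("monitors:", len(mons), [m["name"] for m in mons])

      # CRUSH tree: each "host" bucket should hold one OSD per physical drive.
      tree = mon_json("osd tree")
      for node in tree["nodes"]:
          if node["type"] == "host":
              print(node["name"], "has", len(node.get("children", [])), "OSDs")

      cluster.shutdown()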

  • @dadelaar650
    @dadelaar650 1 year ago +1

    S U B B E D

  • @MarkConstable
    @MarkConstable 1 year ago +1

    Love Ceph, hate Ansible.

    • @slightfimulator4888
      @slightfimulator4888 2 days ago +1

      Everyone hates Ansible, but what do you use instead, Puppet? Everyone hates Puppet too.

  • @aappiah1
    @aappiah1 1 year ago +1

    Great stuff

  • @devnull256
    @devnull256 3 years ago +5

    Unfortunately, there is nothing about how to actually install Ceph. I'm not even sure what this video is about, apart from "See folks, I installed Ceph with Ansible!" Good for you... but that's not what I came for. One first needs to learn how to install it all at a low level, then play with Ansible...

    • @DanielPersson
      @DanielPersson  3 years ago +3

      Hi /dev/null
      You are right. I have some videos about Ceph at this point, but I never did a clean manual install. Great suggestion, I need to do that at some point.
      I've already planned a couple of videos, but subscribe and I will get to it soon enough :)
      Thank you for watching my videos.
      Best regards
      Daniel

  • @KangJangkrik
    @KangJangkrik 3 years ago +1

    How to integrate it with docker and kubernetes? Thanks!

    • @DanielPersson
      @DanielPersson  3 years ago +1

      Hi Athaariq
      This is something I really want to look into. I've started to prepare for this video, and it's something I need to understand for work, so I'm motivating myself to learn the topic because I need to know it anyway.
      Thank you for the suggestion and for watching my videos.
      Best regards
      Daniel

    • @KangJangkrik
      @KangJangkrik 3 years ago +1

      Hi Daniel, I appreciate the work you put into making videos. By the way, most people will start with Docker containers on their PC instead of multiple machines, so it would be really helpful to target audiences who don't have storage clustering experience, like me :)

    • @LampJustin
      @LampJustin 3 years ago +2

      @@KangJangkrik you can either use Rook to get storage on Kubernetes directly or deploy Ceph with ceph-ansible. If you deploy a standalone Ceph cluster, you'll have to manually set up the ceph-csi driver.

    • @KangJangkrik
      @KangJangkrik 3 years ago +1

      @@LampJustin thanks for sharing your knowledge :) I tried it on a Hyper-V VM but had no success with Ansible