What are Elasticsearch shards? Why do they matter? Elasticsearch cluster architecture explained.

  • Published: 5 Feb 2025
  • Elasticsearch is a fantastic tool but it's easy to muddle through without knowing the fundamentals. It's only a matter of time before your cluster performance drops, errors start happening, and you wonder just what a shard actually is.
    I talk and wave my hands about, explaining how we could have - maybe - built - some of - Elasticsearch ourselves. At the end of the video you'll know what a shard is, why they're so important, and want to learn more to really improve the performance of your cluster.
    Watch on (and watch the rest of the free course linked below) to figure out how to answer that mysterious question: How many primary shards do I need for my index?
    My course - Fundamentals of Elasticsearch architecture and shards - is available for free here: school.georgeb...
    You can read more about Elasticsearch and The Elastic Stack in general on my blog: georgebridgema...
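
A minimal sketch of where the "how many primary shards" question ends up in practice: the primary shard count is set in the index settings when the index is created. The function name and the index name `my-logs` are hypothetical; `number_of_shards` and `number_of_replicas` are the real index settings. The sketch only builds the request and does not send it.

```python
import json


def create_index_request(index_name, primary_shards, replicas):
    """Build the request line and JSON body for creating an index.

    Illustrative only: nothing is sent to a cluster.
    """
    if primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("the replica count cannot be negative")
    body = {
        "settings": {
            "index": {
                # Fixed at creation time for the life of the index.
                "number_of_shards": primary_shards,
                # Can be changed later on a live index.
                "number_of_replicas": replicas,
            }
        }
    }
    return f"PUT /{index_name}", json.dumps(body, indent=2)


if __name__ == "__main__":
    request_line, request_body = create_index_request("my-logs", 3, 1)
    print(request_line)
    print(request_body)
```

The printed request can be pasted into the Kibana Dev Tools console. Because the primary shard count is fixed once the index exists, it is the value worth thinking about up front.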

Comments • 72

  • @LearnwithAvinashDalvi
    @LearnwithAvinashDalvi 1 day ago

    One of the best explanations I have ever heard. Explaining one concept at a time, showing its problems, and then showing how the next concept is introduced is an amazing way to explain. Thanks for this explanation. I will surely use this method to explain things to others.

  • @nch77884
    @nch77884 3 years ago +48

    Hands down the best explanation and introduction to Elasticsearch. Can't thank you enough for making this video.

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 3 years ago

      Thank you for your lovely comment. I'm so pleased you enjoyed the video!

    • @zaeemahmedabbasi
      @zaeemahmedabbasi 1 year ago

      @GeorgeBridgemanData 😅😮

  • @HaisumUsman
    @HaisumUsman 2 years ago +4

    Man! You are not from this planet! You deserve a thousand thumbs up.

  • @harrisonleong4283
    @harrisonleong4283 3 years ago +6

    I really wish this was the first Elasticsearch video that I had watched. It would have saved me so much time watching other videos that could not teach me the level of information that I need. Thank you very much, and I shall check out your courses.

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 3 years ago

      That's the reaction I was hoping for - a useful first video on Elasticsearch. Thank you so much for posting!

  • @netlob
    @netlob 3 years ago +14

    Holy sh*t! This must've been the best tutorial I've ever seen on YouTube. High production, clear presentation and well thought through. +1 subscriber for sure!

  • @MikeBertelsenDK
    @MikeBertelsenDK 4 months ago +2

    I started going through a Udemy course on Elasticsearch and came to a section about Shards. When the chapter was complete I still didn't understand fully what a shard is.
    I searched on YouTube and ended up on this video. You do a great job of explaining it so I (as a complete beginner) have a better understanding.
    Kudos to you for providing this video :)

  • @heyheythecat
    @heyheythecat 3 months ago

    I crawled through many ES 101 videos that explain “index” and “shard”. This one did the best job.

  • @sridharnuthi1
    @sridharnuthi1 2 years ago +9

    This is a video that should go to a reference library about Elasticsearch. Thank you for putting such a good, clear and methodical overview of ES. Just brilliant!

  • @lawlade
    @lawlade 2 years ago +3

    Watched this 5 times, rewound several times, and I understand it FULLY. Thanks so much for such a clear explanation.

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 2 years ago

      You're welcome! I'm so pleased you understood everything. I hope it helps!

  • @medovanx
    @medovanx 5 months ago +1

    This is really one of the most useful videos that introduced ES to me.

  • @blossomwithcurls
    @blossomwithcurls 1 year ago

    I just started learning Elasticsearch and this is the best and clearest information on Elasticsearch architecture. Thanks for sharing!

  • @thomasanderson8478
    @thomasanderson8478 2 years ago +3

    This is the best explanation of elasticsearch I've ever seen. So many videos skip over the details, and it's been making it difficult to understand what elasticsearch is doing under the hood. I normally don't comment on videos, but this is too high quality not to. Please continue to put out content!

  • @yazzy9975
    @yazzy9975 1 year ago

    This video changed my life. No exaggeration.

  • @crujzojam7004
    @crujzojam7004 3 years ago +3

    Please post more videos… your videos are easy to understand and quite informative… please carry on the good work

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 3 years ago +1

      Done! New video just posted!
      There will be more. I have lots of ideas but I'm really trying to get my Elasticsearch course finished, and work the day job.

  • @Guille495
    @Guille495 5 months ago

    Awesome explanation, I love your narrative style, it really underlines the why and how of the current ecosystem!

  • @TheOtmane007
    @TheOtmane007 1 year ago +2

    What a clear and progressively explained architecture. Thank you so much

  • @Transactional
    @Transactional 1 year ago +1

    Thank you. It feels like my brain is getting clearer.

  • @jhoyl
    @jhoyl 2 years ago +1

    Thanks - the perfect introduction to Elasticsearch architecture.

  • @cliffmathew
    @cliffmathew 5 months ago

    Very clearly explained. Thanks

  • @jupudivinod
    @jupudivinod 3 years ago +1

    This is fantastic! Bricks till walls in a nutshell! Thanks much for this great presentation.

  • @thsu1
    @thsu1 2 years ago +1

    Thanks for the clear and awesome explanation of Elasticsearch and Lucene. Really appreciate this useful content

  • @hnyc1986
    @hnyc1986 2 years ago +1

    Awesome explanation about Elasticsearch!!!

  • @Milostrosic
    @Milostrosic 1 year ago +1

    Very clear explanation!

  • @rakeshkush1234
    @rakeshkush1234 2 years ago +1

    wonderful technical story.

  • @tkousek1
    @tkousek1 3 years ago +1

    Thank you very much sir for this information. Awesome people like you are what's good about this world!!! Much appreciated!!!

  • @sobhan285
    @sobhan285 3 years ago +1

    Wonderful. Looking forward to more courses from you.

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 3 years ago

      Thanks so much! Elasticsearch Engineer Essentials is in the works, and I'll be posting shorter content on here as well.

  • @arpit9163
    @arpit9163 2 years ago +1

    Thank you for making this fantastic video!

  • @НиколайБеляшов-в6к
    @НиколайБеляшов-в6к 10 months ago

    Many thanks for your work! It's an awesome video!

  • @moritzlgrs401
    @moritzlgrs401 2 years ago +1

    Absolutely fantastic!

  • @cicd
    @cicd 2 years ago +1

    Great content, thanks for sharing!

  • @danielsantiago11
    @danielsantiago11 2 years ago +1

    Premium content, thank you!

  • @hieungo-ai
    @hieungo-ai 1 year ago

    It's two years late, but the lesson is extremely valuable

  • @andy_ltluan
    @andy_ltluan 1 year ago +1

    I think a shard in ES is the same concept as a partition in Kafka, where all the partition replicas are on different nodes

  • @johnsonakanbi367
    @johnsonakanbi367 3 years ago

    Thanks so much for this great presentation.

  • @sammygun84
    @sammygun84 7 months ago

    Hi, thanks for the video.
    For example, we have: "unassigned_shards" : 40,
    When we run:
    GET _cluster/allocation/explain?filter_path=index,node_allocation_decisions.node_name,node_allocation_decisions.deciders.*
    {
      "index": "elastalert_past",
      "shard": 0,
      "primary": false
    }
    We receive the following answer:
    "explanation" : "a copy of this shard is already allocated to this node [[elastalert_past][0], node[JaLzrdasdajQ], [P], s[STARTED], a[id=OmY9kwpHTlybJfSrWvdsadada6g]]"
    We have only one node. What can we do in this situation?
    Also, we have "number_of_replicas" : "0", "auto_expand_replicas" : "false". What can we do in this situation?
    GET /.kibana/_settings
    {
      ".kibana_2" : {
        "settings" : {
          "index" : {
            "number_of_shards" : "1",
            "auto_expand_replicas" : "false",
            "provided_name" : ".kibana_2",
            "creation_date" : "1601664093",
            "number_of_replicas" : "0",
            "uuid" : "WKdIpzLFSP-ydObLw",
            "version" : {
              "created" : "7090299"
            }
          }
        }
      }
    }
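
For context on the question above: Elasticsearch does not place a replica on the node that already holds its primary, which is what the "a copy of this shard is already allocated to this node" decision reports. On a single-node cluster, one common remedy is to set `number_of_replicas` to 0 on the affected index. Note that the settings shown are for `.kibana_2`, not for `elastalert_past`. The sketch below only builds that settings update. The base URL `http://localhost:9200`, the absence of authentication, and the function name are assumptions for illustration.

```python
import json
import urllib.request


def zero_replicas_request(base_url, index_name):
    """Build (but do not send) a request that sets number_of_replicas to 0."""
    body = json.dumps({"index": {"number_of_replicas": 0}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_name}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Assumed local, unsecured cluster; adjust the URL and add auth as needed.
    req = zero_replicas_request("http://localhost:9200", "elastalert_past")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually apply the change: urllib.request.urlopen(req)
```

Adding a second node is the other option if the replicas are wanted for resilience.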

  • @riazbacchus3962
    @riazbacchus3962 1 year ago

    this is great content. thank you.

  • @lptarik
    @lptarik 4 months ago

    What if I have 2 primary shards and 1 replica shard? Will that replica store all docs from both primary shards?
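
A small sketch relevant to this question: the `number_of_replicas` setting counts copies per primary shard, so each replica mirrors exactly one primary and not the whole index. With 2 primaries and `number_of_replicas` set to 1, the index has 2 replica shards and 4 shard copies in total. The function below is a hypothetical helper that just enumerates those copies.

```python
def shard_copies(primary_shards, replicas_per_primary):
    """List every shard copy of an index as (shard_number, role) pairs."""
    copies = []
    for shard_number in range(primary_shards):
        copies.append((shard_number, "primary"))
        # Each primary gets its own set of replicas.
        for _ in range(replicas_per_primary):
            copies.append((shard_number, "replica"))
    return copies


if __name__ == "__main__":
    for shard_number, role in shard_copies(2, 1):
        print(f"shard {shard_number}: {role}")
```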

  • @ucthuannguyen6432
    @ucthuannguyen6432 3 years ago +1

    Wonderful. Thank you so much.

  • @pseudolimao
    @pseudolimao 9 months ago

    Where was this video 1 month ago? You should be paid by these software companies... bless your heart

  • @yazzy9975
    @yazzy9975 1 year ago

    If elasticsearch distributes the data between the shards of an index such that each lucene store roughly holds the same number of documents, when you run a search query, elasticsearch, despite the inter-node communication, only knows which shards hold that index and not which particular shard will have that document? So it has to run the query against all the shards and merge results, it cannot just search the one shard that contains that document? It does not know beforehand based on how documents are distributed among shards.
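
A rough sketch of the routing idea behind this question. Elasticsearch picks a document's shard by hashing a routing value (the document ID by default) and reducing it by the number of primary shards. A fetch by ID can therefore go to a single shard, while a search on document content fans out to every shard of the index unless a routing value is supplied. The code is a simplification: `zlib.crc32` stands in for the Murmur3 hash Elasticsearch uses, and both function names are hypothetical.

```python
import zlib


def shard_for(routing_value, num_primary_shards):
    """Pick a shard from a routing value (simplified stand-in hash)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


def shards_to_query(num_primary_shards, routing_value=None):
    """Return the shard numbers a request has to visit."""
    if routing_value is None:
        # A content search cannot know where matches live: ask every shard.
        return list(range(num_primary_shards))
    # With a routing value, the target shard is computable up front.
    return [shard_for(routing_value, num_primary_shards)]


if __name__ == "__main__":
    print("search without routing:", shards_to_query(3))
    print("request routed by ID:", shards_to_query(3, "doc-1"))
```

This is also why the primary shard count cannot simply be changed later: the modulo result, and so the location of every document, depends on it.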

  • @slapcanister
    @slapcanister 3 years ago +1

    This is so good.

  • @carlosroberto366
    @carlosroberto366 3 years ago

    Isn't the cluster the server (i.e. AWS EC2 instance) itself? To my mind, a node is not a server because you can create several nodes in the same machine. I was expecting to see MyCluster1 and MyCluster2 each having a single node, hence, high availability via cross-cluster communication.
    11:11 node = server in his example
    15:05 node = process

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 3 years ago +1

      A node is an Elasticsearch process running on a host. You're right that you can run multiple nodes on the same host (even not containerised), but it's not recommended and it's widely accepted that you only run a single node on a host.
      If you did run two nodes on a single host, you could have either one or two clusters on that host. The node is configured with the cluster name it's expected to join, so you could configure each node with a different cluster name and have two clusters on that host!
      Cluster formation can get quite involved. There are configuration settings that need to be applied specifically at the formation stage. I can do a video on how that works at some stage.
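
A toy model of the point made in this reply: each node carries the name of the cluster it expects to join (the `cluster.name` setting), so two nodes on one host with different cluster names end up in two separate clusters. The node and cluster names are made up, and real cluster formation also involves discovery and master election, which this sketch ignores.

```python
from collections import defaultdict


def group_into_clusters(nodes):
    """Group (node_name, cluster_name) pairs by their configured cluster name."""
    clusters = defaultdict(list)
    for node_name, cluster_name in nodes:
        clusters[cluster_name].append(node_name)
    return dict(clusters)


if __name__ == "__main__":
    # Two Elasticsearch processes on the same host, configured differently.
    same_host_nodes = [("node-a", "cluster-one"), ("node-b", "cluster-two")]
    print(group_into_clusters(same_host_nodes))
```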

  • @PhanTanThangTH
    @PhanTanThangTH 11 months ago

    Thank you so much :)

  • @DrewIsFail
    @DrewIsFail 3 years ago +1

    Is it fair to say you could build ES from DynamoDB? I'm trying to compare the two.
    I would love a video on the query language. Does it have a mathematical basis, like SQL does with sets?
    It goes without saying, but I'll say it: thanks for making this clear, concise, focused, high-level content.

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 3 years ago

      Hi there. I'm really pleased you enjoyed the video.
      I'm not sure you could build an equivalent of Elasticsearch using DynamoDB. There's a *lot* more to Elasticsearch than I talked about in this video!
      There's more content coming, including an introduction to the query language. The in-depth content will be in a training course instead of YouTube, though. I've never considered if there's a mathematical basis to the query language. I doubt there is in terms of what Elasticsearch offers, but all Elasticsearch queries are converted to Lucene query language, which may be more thoroughly researched. Interesting question!

  • @akshaychawla7413
    @akshaychawla7413 2 years ago

    I am not able to enroll for your course, tried with 2 different emails, please have a look into this.

    • @GeorgeBridgemanData
      @GeorgeBridgemanData 1 year ago

      Sorry for the very late response. I've had feedback from a couple of people using Firefox, who worked around it by using a different browser. I'm not sure if that's your issue but thought I'd mention it. Let me know if you're still having issues and I'll try responding quicker this time!

  • @gslyrics1507
    @gslyrics1507 3 months ago

    No one talks of indexing live updates from Relational Databases

  • @sv_n
    @sv_n 11 months ago

    1000th like 😅