Thanos (Multi Cluster Prometheus) Tutorial: Global View - Long Term Storage - Kubernetes

  • Published: Jul 31, 2024
  • 🔴 - To support my channel, I’d like to offer Mentorship/On-the-Job Support/Consulting - me@antonputra.com
    ▬▬▬▬▬ Experience & Location 💼 ▬▬▬▬▬
    ► I’m a Senior Software Engineer at Juniper Networks (12+ years of experience)
    ► Located in San Francisco Bay Area, CA (US citizen)
    ▬▬▬▬▬▬ Connect with me 👋 ▬▬▬▬▬▬
    ► LinkedIn: / anton-putra
    ► Twitter/X: / antonvputra
    ► GitHub: github.com/antonputra
    ► Email: me@antonputra.com
    ▬▬▬▬▬▬ Related videos 👨‍🏫 ▬▬▬▬▬▬
    👉 [Playlist] Kubernetes Tutorials: • Kubernetes Tutorials
    👉 [Playlist] Terraform Tutorials: • Terraform Tutorials fo...
    👉 [Playlist] Network Tutorials: • Network Tutorials
    👉 [Playlist] Apache Kafka Tutorials: • Apache Kafka Tutorials
    👉 [Playlist] Performance Benchmarks: • Performance Benchmarks
    👉 [Playlist] Database Tutorials: • Database Tutorials
    ▬▬▬▬▬▬▬ Timestamps ⏰ ▬▬▬▬▬▬▬
    0:00 Intro
    7:23 Deploy Prometheus to Kubernetes
    11:45 Deploy Minio (AWS S3 Compatible Object Storage)
    12:56 Thanos Remote Read with Sidecar
    16:50 Deploy Thanos Querier
    19:23 Deploy Thanos Store Gateway
    20:49 Deploy Thanos Compactor
    22:01 Configure mTLS for Remote Read
    31:22 Thanos Remote Write
    35:40 Deploy Thanos Receiver
    41:25 Configure Thanos Receiver Sharding
    44:31 Configure mTLS for Remote Write
    ▬▬▬▬▬▬▬ Source Code 📚 ▬▬▬▬▬▬▬
    ► GitHub: github.com/antonputra/tutoria...
    #kubernetes #devops #cloud
  • Science

Comments • 146

  • @AntonPutra
    @AntonPutra 1 year ago +5

    🔴 - To support my channel, I’d like to offer Mentorship/On-the-Job Support/Consulting - me@antonputra.com

    • @vishnuvardhanvandavasi9998
      @vishnuvardhanvandavasi9998 9 months ago +1

      When will it come? Please let us know, @AntonPutra.

    • @Fidellio369
      @Fidellio369 3 days ago

      Hey Anton, Is your mentorship still available? (And yes, I'd be happy to pay :) )

  • @mitesh6872
    @mitesh6872 7 months ago

    Perfect compilation of each use case, with detailed explanations. I haven't seen such a video on Prometheus and Thanos before. Thank you very much, really helpful 👍

  • @cokegen
    @cokegen 1 year ago +8

    Producing these had to take quite some time... thanks for doing them. I'm learning a ton from the Prometheus videos you recently posted. 10/10

    • @AntonPutra
      @AntonPutra 1 year ago

      Thank you, I'm glad that you find it useful!

  • @675FresH
    @675FresH 1 year ago +1

    Thanks Anton! Helpful as always!

  • @madbbb
    @madbbb 1 year ago +3

    Thank you, Anton! I was googling how to secure remote write on Thanos Remote Write/Prometheus Agent, and you released this video just 3 days ago.

    • @AntonPutra
      @AntonPutra 1 year ago

      You're welcome, the source code is in the description.

  • @Leonid.Shamis
    @Leonid.Shamis 1 year ago +1

    Hi Anton, thank you for the informative and educational content! I look forward to your next video about preparing a Production ready Thanos setup using Amazon EKS Cluster remote write endpoint and deploying stateless Prometheus in Agent mode and moving local alerts to a Centralized Thanos.

  • @denisrazumnyi6456
    @denisrazumnyi6456 1 year ago +1

    Well done!!!
    Thank you for your work!

  • @hachastico674
    @hachastico674 3 months ago +1

    This is super helpful, thank you a LOT

  • @minimalniemand
    @minimalniemand 1 year ago +1

    great tutorial, thanks!

  • @PattamaW
    @PattamaW 1 year ago +1

    super informative and useful tutorial, I can't wait for the production setup!

    • @AntonPutra
      @AntonPutra 1 year ago

      Thanks. It's very similar, but the implementation varies depending on the cloud you use.

  • @bender5326
    @bender5326 1 year ago +1

    As usual, another great tutorial!
    N.B. I hope you will do a video about VictoriaMetrics :) VM is more performant and has a great architecture.

  • @monitorcamera8850
    @monitorcamera8850 1 year ago +1

    awesome video

    • @AntonPutra
      @AntonPutra 1 year ago

      Thanks, took me a while to record =)

  • @denisrazumnyi6456
    @denisrazumnyi6456 1 year ago +1

    Thanks!

    • @AntonPutra
      @AntonPutra 1 year ago

      Thanks for the support @denisrazumnyi6456!

  • @sundeepgarg3502
    @sundeepgarg3502 1 year ago +1

    You are a genius.

  • @mamitadieng4166
    @mamitadieng4166 8 months ago

    Hi Anton Putra. Thanks for this video, very useful. If I decide to use the OpenShift cluster's native Prometheus Thanos Querier as a data source for an external Grafana that sits in a network outside the OpenShift one, do I first have to open the network flow?

  • @smitzaveri12
    @smitzaveri12 5 months ago +1

    Hi @Anton, why do we have 2 receivers? Is it to distribute the load?
    Also, can you clarify this:
    1. Let's say we have more than 2 Prometheus instances and 1 Thanos. Should we use multiple receivers, i.e. more than 3 or 4?
    2. Is it wise to assign 1 receiver to 1 Prometheus, meaning receiver 1 -> Prometheus 1, receiver 2 -> Prometheus 2, etc.?

  • @clementbenedetti724
    @clementbenedetti724 1 year ago +2

    Hi, thank you for the video, very clear as always. I can't wait to see the next video on Thanos in production, soon I hope.
    Are you going to use the Prometheus agent in the next video?

    • @AntonPutra
      @AntonPutra 1 year ago +1

      I'm not sure when I'll create a follow-up video, because I don't see a lot of interest in it here.
      I have one video on the Prometheus agent (ruclips.net/video/VyYrThINCjg/видео.html); setting up remote write is the same as with any other Prometheus.

    • @ukaszl.9943
      @ukaszl.9943 1 year ago +1

      @AntonPutra Amazing job! I'm waiting for Thanos in production. I've tried to find a tutorial on it, but with no luck. I'm new to the world of Kubernetes and Operators, and I don't know where to change the Prometheus configuration to create this agent and remote-write solution. Please help :)

  • @itspngu
    @itspngu 1 year ago +1

    we'll use an S3 bucket to store prometheus magics.
    :)

  • @emergirie
    @emergirie 1 year ago +1

    You got us used to doing everything in Terraform. Were you tired? 😂 Just a joke, thanks for sharing.

    • @AntonPutra
      @AntonPutra 1 year ago

      It's better to do it by hand for learning :)

  • @TAICHI1SCO
    @TAICHI1SCO 1 year ago

    Great tutorial as always. When is the production setup coming, please?

    • @AntonPutra
      @AntonPutra 1 year ago

      Thanks. It's not much different: just use external DNS and more shards =)

  • @pier_x0
    @pier_x0 1 year ago +1

    Hi Anton, great topic. Thank you so much!!!
    I can see #1 on the cover. Have you planned any further tutorials on this topic?
    Is there any chance of a federated Prometheus architecture tutorial?

    • @AntonPutra
      @AntonPutra 1 year ago

      Thank you. However, I believe that Prometheus federation may no longer be widely used. Is my understanding incorrect?

    • @pier_x0
      @pier_x0 1 year ago

      @AntonPutra Yeah, I think you're right. Maybe Thanos solves that problem of collecting metrics from different Prometheus instances.
      Is that correct?

    • @AntonPutra
      @AntonPutra 1 year ago

      @pier_x0 Yes, the main use case for Thanos is to give you a global view.

  • @user-hb4kl2wd8i
    @user-hb4kl2wd8i 4 months ago

    Nice video, Anton. Do you have something on the Thanos receive distributor and Thanos receive configuration, or is the one in this video the same thing? For me, remote write does not work if the distributor is enabled via the Helm chart. Any help is highly appreciated.

  • @saadullahkhanwarsi5853
    @saadullahkhanwarsi5853 11 months ago +1

    Thanks, Anton, for all your efforts.
    But we are still waiting for a remote Prometheus/cluster setup monitored through Grafana via Thanos.
    It would be a great help.
    Thanks again 🎉❤

    • @AntonPutra
      @AntonPutra 11 months ago

      Thanks, will do eventually =)

  • @VinayKumar-ek6bq
    @VinayKumar-ek6bq 5 months ago

    Hi, I am doing this on-prem and I don't have any storage I can use except PVCs. Is there any way I can use Thanos for this? If yes, what would the sidecar config be?

  • @nishasati8249
    @nishasati8249 7 months ago

    Awesome video. Thanks a lot for sharing this. Could you please share a production-ready multi-cluster monitoring setup for AKS?
    What components need to be set up in the centralized cluster and in the other individual clusters? Any document would be a great help.

    • @AntonPutra
      @AntonPutra 7 months ago

      sure - thanos.io/v0.6/thanos/storage.md/#azure-configuration

  • @starytomas
    @starytomas 10 months ago

    Hi, when are you planning to make a video about Thanos and Prometheus in agent mode, please? Thank you very much for your answer. The current video was great; I would also like to see more about Prometheus agent mode and Thanos.

    • @AntonPutra
      @AntonPutra 10 months ago +1

      Thanks, it's very similar, and I have examples somewhere in my GitHub repo. You just need to provide a couple of additional flags and you get a stateless (well, almost stateless) Prometheus.

  • @user-lo1sz6cz8r
    @user-lo1sz6cz8r 1 year ago

    Hi Anton, thank you for sharing your videos. It has been incredibly helpful and expanded my knowledge.
    I followed your tutorial and successfully deployed Thanos with remote write Prometheus. Everything appears to be functioning correctly, except that I am unable to view any targets in the Thanos UI. Could you suggest a reason for this issue?
    btw, I used helm charts instead of manifests, which are much easier
    Furthermore, I have another question regarding best practices for the Alert Manager. Should I transfer all alerts to the ruler or should I deploy the Alert Manager with each Prometheus instance?
    Thank you in advance for your assistance.

    • @AntonPutra
      @AntonPutra 1 year ago +1

      Thanks! Well, first of all, you should not see any "targets" in the Thanos UI; you should only see "Stores". Targets should be visible in the local Prometheus UI.
      Second, Thanos recommends using local Prometheus and Alertmanagers because they do not rely on external components for queries to succeed.
      In my opinion, if Thanos is well monitored and any issues can be immediately identified, then using a global Ruler is acceptable.

  • @umeshgholve35
    @umeshgholve35 8 months ago

    Hello Anton, thanks for the video. Can you please give the steps for AWS EKS cluster monitoring with a Prometheus and Thanos setup?

    • @AntonPutra
      @AntonPutra 8 months ago

      Well, you need to use IRSA to grant the Thanos store gateway read/write access to the S3 bucket. The rest of it is more or less the same. Are there any specific issues?

  • @user-cw8gt9yk3l
    @user-cw8gt9yk3l 11 months ago +1

    Thanks for your tutorial. I would like to know: in a real production scenario, is there any way to expose MinIO securely without port forwarding?

    • @AntonPutra
      @AntonPutra 10 months ago

      Sure, Minio is a distributed web server that mostly uses HTTP and websockets. Use an ingress or simply a load balancer to expose Minio.

  • @508mamidianilkumar6
    @508mamidianilkumar6 1 year ago

    Hi Anton, Could you please do a video on how to use Prometheus in agent mode and forward metrics to Thanos Receive with scaling on both Agent and Receive side?

    • @AntonPutra
      @AntonPutra 1 year ago

      I have one video on the Prometheus agent (ruclips.net/video/VyYrThINCjg/видео.html); setting up remote write is the same as with any other Prometheus. Just use the code snippet from this video.

  • @backpack2861
    @backpack2861 9 months ago

    Thanks for this video and for sharing your knowledge. I just have one question. You mentioned that remote write is your favorite solution, and I really like how it works, but when I looked into remote write, the documentation said it is recommended only when pushing metrics is the only option. What do you think about that? I don't see any problem with using remote write, but maybe I'm missing something here (I'm a newbie in this).

    • @AntonPutra
      @AntonPutra 9 months ago +1

      I've been running Thanos at scale (10+ environments) for a couple of years. Based on my personal experience, remote read tends to lag, so I found that, to provide a better experience for developers and other DevOps team members, it is better to use remote write.

  • @user-wt8ji9dg3t
    @user-wt8ji9dg3t 1 year ago

    Question: do we have any Helm charts or an Operator for installing the Receiver stack, or are CRDs the only way for now?
    I checked the official Bitnami chart, which supports the sidecar approach, but found nothing on the Receiver.
    Any ideas?

    • @AntonPutra
      @AntonPutra 1 year ago

      I don't think so, there is one for just prometheus - github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack

  • @SangeetaMeena-i5q
    @SangeetaMeena-i5q 12 days ago

    Awesome video! I was trying to follow along with the video and got stuck at a few points. In the first step, when Thanos was disabled and 2 pods of prometheus-staging-0 were created, I expected to see targets after port-forwarding prometheus-operated on port 9090 to my localhost. I didn't see any. Could you please let me know why? Second, after adding the Thanos sidecars without mutual TLS and port-forwarding the querier on 9090 to localhost, I did see the sidecar in my Stores, but I wasn't able to see any Prometheus metrics as you showed. Please guide me.

    • @AntonPutra
      @AntonPutra 12 days ago

      Thanks, there are a lot of labels that must match in order for the target to show up. I have another video on my channel covering Prometheus Operator, it may help. For the second question, if you don't have any targets, you won't have any metrics in Prometheus.

  • @anashamdan5237
    @anashamdan5237 1 year ago

    Please complete the series about devops interview questions

  • @evelayudham3247
    @evelayudham3247 11 months ago

    Hi Anton, I have a question. How is the Thanos querier connected to the Prometheus sidecar? What values should be passed in the querier.stores list?

    • @AntonPutra
      @AntonPutra 10 months ago

      It uses the gRPC protocol:
      thanos query \
      --http-address "0.0.0.0:9090" \
      --query.replica-label "replica" \
      --endpoint ":" \
      --endpoint ":"
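
The host and port inside each `--endpoint` value are missing from the command as posted. As a minimal sketch of how such a command could be assembled, the helper below builds the argument list with one `--endpoint` flag per Store API address. The function name and the two addresses are hypothetical placeholders, and port 10901 is only assumed here as the sidecar gRPC port; substitute the service names and ports from your own cluster.

```python
import shlex


def thanos_query_args(endpoints, http_address="0.0.0.0:9090", replica_label="replica"):
    """Build the argv for `thanos query`, adding one --endpoint flag per address."""
    args = [
        "thanos", "query",
        "--http-address", http_address,
        "--query.replica-label", replica_label,
    ]
    for endpoint in endpoints:
        args += ["--endpoint", endpoint]
    return args


# Hypothetical sidecar addresses (assumption, not taken from the video).
example_args = thanos_query_args([
    "prometheus-a.example.internal:10901",
    "prometheus-b.example.internal:10901",
])

# Print a shell-safe, copy-pastable command line.
print(shlex.join(example_args))
```

Each address should point at the gRPC port of a component that serves the Store API, such as a sidecar, a receiver, or a store gateway.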

  • @vishnuvardhanvandavasi9998
    @vishnuvardhanvandavasi9998 9 months ago

    We need a setup that collects metrics from two EKS clusters with Prometheus and Thanos in HA, stored in S3. We have been waiting for this for 4 months.

    • @AntonPutra
      @AntonPutra 9 months ago

      Sorry, I didn't get the question.

  • @ahmadbanisalameh9285
    @ahmadbanisalameh9285 9 months ago +1

    Thank you for your videos, very insightful.
    I have a question. I have multiple clusters, each with a Prometheus deployed to it, and a monitoring cluster with a centralized Thanos. The Prometheus instances are in agent mode with remote write enabled. Do I need to deploy a receiver for each Prometheus? I'm using the Thanos Helm chart, and kube-prometheus-stack for Prometheus.
    If so, does each receiver need its own querier, or can I connect all of them to the same querier? Each one is going to save to a different bucket; does the same need to happen for the store gateway? Any suggestions to help me architect this setup are welcome.
    Thank you again

    • @AntonPutra
      @AntonPutra 9 months ago +1

      Sure, you can deploy a single receiver (it can be sharded) and point all your Prometheus agents at it. You also need at least 1 querier to query that receiver for short-term metrics, plus a store gateway if you use an object store for long-term storage.
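
To make the "many agents, one receiver" layout concrete, here is a rough sketch that generates a Prometheus configuration fragment per cluster. Every agent gets the same `remote_write` URL and differs only in its `external_labels`. The receiver URL, the `cluster` label name, and the cluster names are assumptions made for illustration; they do not come from the video.

```python
import json

# Hypothetical public endpoint of the shared Thanos receiver (assumption).
RECEIVER_URL = "https://thanos-receive.example.com/api/v1/receive"


def remote_write_config(cluster, receiver_url=RECEIVER_URL):
    """Return a Prometheus config fragment for one cluster's agent."""
    return {
        # The external label is what lets you tell clusters apart in queries.
        "global": {"external_labels": {"cluster": cluster}},
        "remote_write": [{"url": receiver_url}],
    }


configs = {name: remote_write_config(name) for name in ["dev", "staging", "prod"]}

# Show one fragment; the others differ only in the cluster label.
print(json.dumps(configs["dev"], indent=2))
```

With this layout, a single querier pointed at the receiver can serve all clusters, and queries can be filtered by the external label.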

    • @ahmadbanisalameh9285
      @ahmadbanisalameh9285 9 months ago

      @AntonPutra But how is the receiver going to know which bucket to upload to if there is a single one and all four of my Prometheus instances write their metrics to it? For example, metrics from the dev Prometheus should be uploaded to a dev bucket, and so on.

  • @nitishchauhan7377
    @nitishchauhan7377 1 year ago +1

    Hello Anton, good video. A question: when are you planning to release the Thanos production setup?

    • @AntonPutra
      @AntonPutra 1 year ago

      I've been thinking about it since a lot of people have asked. Probably soon, but all the pieces for it are in this video. You pretty much just need to use public load balancers and, of course, configure IAM for S3.

    • @nitishchauhan7377
      @nitishchauhan7377 1 year ago

      @AntonPutra Oh, that's great. I also have a question: I'm storing data in Thanos using remote write. What changes should be made at the Grafana level so that tenant-A can see the metrics of clusters running in tenant-A, and tenant-B can see the metrics of clusters running in tenant-B?

    • @AntonPutra
      @AntonPutra 1 year ago

      @nitishchauhan7377 You can use a tenant label on that metric.
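
One way to picture the tenant-label approach is as a label matcher added to every query a tenant runs. The sketch below builds such a PromQL selector. The helper name is hypothetical, and the label name `tenant_id` is an assumption; use whatever label your receivers actually attach to incoming series.

```python
def tenant_selector(metric, tenant, label="tenant_id"):
    """Return a PromQL selector that restricts `metric` to a single tenant."""
    # Escape backslashes and double quotes so the value stays a valid PromQL string.
    escaped = tenant.replace("\\", "\\\\").replace('"', '\\"')
    return f'{metric}{{{label}="{escaped}"}}'


print(tenant_selector("up", "tenant-a"))
```

In Grafana this could be wired up as one dashboard variable or one data source per tenant. Note that a selector like this only filters what is shown; on its own it is not an access-control boundary.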

  • @njg120
    @njg120 7 months ago

    Hi, good video. You talked about videos on how to configure the Prometheus agent, IRSA with an S3 bucket, and an AWS ALB for exposing remote-write-svc. Where are those videos? Thanks

    • @AntonPutra
      @AntonPutra 7 months ago

      I haven't recorded them yet, sorry.

  • @ignatiussahmofor3385
    @ignatiussahmofor3385 7 months ago

    Nice video. I cloned your repo but didn't see the YAML files you were using.

    • @AntonPutra
      @AntonPutra 6 months ago

      Sorry for the delay. All of the YAML files are here: github.com/antonputra/tutorials/tree/main/lessons/163.

  • @dhartipatel8769
    @dhartipatel8769 3 months ago +1

    Super helpful video, @AntonPutra. I have replicated this setup for my project, deployed on a k8s v1.26.5 cluster. The only difference is that I am using higher versions of Prometheus (v2.48.1) and Prometheus Operator (0.71.2). I also upgraded the CRDs to match the operator. The issue is that I am not able to fetch any active Prometheus targets. I've checked all the configurations, and the deployed pods are up and running fine. The ServiceMonitor has the correct labels (same as your source code). Any help appreciated! Best regards.

    • @dhartipatel8769
      @dhartipatel8769 3 months ago +1

      Forgot to add, I am checking the target at the local Prometheus UI :) There are no error logs as well, neither in operator pod nor in the prometheus-staging pod. I see one expected log missing as below when I compare it with your video:
      level=info msg="discovery manager scrape" discovery=kubernetes config=serviceMonitor/monitoring/prometheus/0 msg="Using pod service account via in-cluster config"

  • @rafael.aloizio1769
    @rafael.aloizio1769 1 year ago +1

    Hey Anton, I'm really curious about this. In your videos you usually organize files with numbers in front of them. The question might be stupid, but I'll ask: do you use this only for the tutorials, or do you have a reason behind it and follow it in your production code as well?

    • @AntonPutra
      @AntonPutra 1 year ago +1

      Hi Rafael, no, it's just for the tutorials, to keep the files in order. Don't use numbers in your production code; it would look like a beginner's code =)

    • @rafael.aloizio1769
      @rafael.aloizio1769 1 year ago +1

      @AntonPutra Oh OK, thanks for the answer. I'm a beginner though, haha 😂 In the DevOps world at least.

  • @m18unet
    @m18unet 1 year ago +1

    Hi Anton. Thanks a million times for the awesome series. I have a question.
    Let's destroy the minio and set the "tsdb.retention" value like "180day" in the receivers. Is this bad practice? I don't want to use any object storage(S3, Minio etc.).
    I am asking this question for the Remote Write approach.

    • @AntonPutra
      @AntonPutra 1 year ago +1

      No, it's fine and it can eliminate the bottleneck. Just make sure you set up sharding from day one. Also, if you decide to use Thanos Ruler, it's more reliable without S3 and the store gateway.

    • @m18unet
      @m18unet 1 year ago

      I completely understand you. Thanks for your detailed explanation Anton 😊

    • @m18unet
      @m18unet 1 year ago

      By the way, why might I need to install the thanos-ruler? I don't fully understand this component.

    • @AntonPutra
      @AntonPutra 1 year ago

      @m18unet For example, if you run stateless Prometheus in 'agent' mode, no data is stored locally. This means you cannot run Alertmanager locally either. The Ruler is used to run alerts on the global state, which also has some drawbacks. You can read more about this in the official documentation.

  • @gabrielportela6544
    @gabrielportela6544 1 year ago

    Thanks for the tutorial, Anton! When will the next video be released?

    • @AntonPutra
      @AntonPutra 1 year ago +1

      It appears to be a very niche topic. Do you think there is interest in continuing and providing a prod-ready setup?

    • @gabrielportela6544
      @gabrielportela6544 1 year ago +1

      @AntonPutra I think so. I could not find much helpful content about Thanos implementation on YouTube, and you always provide great tutorials.
      On top of that, the only other tutorial I watched also mentioned a second video that was never released.

    • @AntonPutra
      @AntonPutra 1 year ago +1

      @gabrielportela6544 OK, but I still need to finish my Terraform series.

    • @XRoydX
      @XRoydX 10 months ago

      I'd also love to see the video ❤
      Thank you so much for your work

  • @harshitanand6209
    @harshitanand6209 11 months ago

    Hey Anton,
    I followed your video and implemented the Thanos remote write method. All the pods are running without errors, but I am still not able to see any metrics in the Thanos querier :( Can you point me to what I could be missing here?

    • @AntonPutra
      @AntonPutra 11 months ago

      I would suggest starting with the minimal setup and trying to inspect every component for any errors.

  • @madhankannan8022
    @madhankannan8022 10 months ago

    Hi, I am not able to find the video for the Thanos setup in EKS.

    • @AntonPutra
      @AntonPutra 10 months ago

      It's very similar to this one, but I haven't released it yet.

  • @BingedFriends
    @BingedFriends 3 months ago +1

    Hi @AntonPutra, thanks for the video. Did you get a chance to prepare the production-ready Thanos setup using an Amazon EKS cluster remote write endpoint, deploying stateless Prometheus in agent mode, and moving local alerts to a centralized Thanos?

    • @AntonPutra
      @AntonPutra 3 months ago +1

      Not yet, but I'll definitely create one soon

    • @xXKingofDiamondsXx
      @xXKingofDiamondsXx 1 month ago +1

      @AntonPutra I would love to see it done in AWS! I'm trying to continue from what I've learned in this video. Do I now need to expose the Querier from each cluster to Grafana? I'm so grateful for the content you put out. MANY thanks and cups of coffee for you, friend!

    • @AntonPutra
      @AntonPutra 1 month ago +1

      @xXKingofDiamondsXx I may refresh it soon, maybe with Thanos, Cortex, or even Grafana Mimir. I haven't decided yet which one to cover first.

  • @souvikdas2457
    @souvikdas2457 8 months ago

    @AntonPutra: Where can I get these YAML files?

    • @AntonPutra
      @AntonPutra 8 months ago

      I found them in the official repository a while ago and made a few updates. Are there any concerns?

  • @nickcarlton4604
    @nickcarlton4604 1 year ago

    Thanks Anton, will you be doing a video on a HA Thanos Receive setup? I can get it working with a single Receive pod but when I setup multiple I start getting internal server errors etc

    • @AntonPutra
      @AntonPutra 1 year ago

      I cover it in this tutorial. Here is an example - github.com/antonputra/tutorials/blob/main/lessons/163/hashring.yaml
      41:25 Configure Thanos Receiver Sharding
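
The difference between sharding and replication can be shown with a toy model. The sketch below hashes each series to a position in a list of receivers and, optionally, to the next receivers as replicas. It is an illustration only: Thanos Receive has its own hashing algorithms and its own rules for when a replicated write counts as successful, and the receiver names and metric names here are made up.

```python
import hashlib


def replicas_for(series, receivers, replication_factor=1):
    """Pick distinct receivers for a series by hashing its label string."""
    digest = hashlib.sha256(series.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(receivers)
    count = min(replication_factor, len(receivers))
    return [receivers[(start + i) % len(receivers)] for i in range(count)]


def has_live_replica(series, receivers, down, replication_factor=1):
    """True if at least one receiver chosen for the series is still up."""
    chosen = replicas_for(series, receivers, replication_factor)
    return any(receiver not in down for receiver in chosen)


receivers = ["receiver-0", "receiver-1"]
series = [f'http_requests_total{{pod="app-{i}"}}' for i in range(100)]
down = {"receiver-0"}

# Series with no reachable receiver when receiver-0 is down.
stranded_rf1 = [s for s in series if not has_live_replica(s, receivers, down, 1)]
stranded_rf2 = [s for s in series if not has_live_replica(s, receivers, down, 2)]
print(len(stranded_rf1), len(stranded_rf2))
```

With a replication factor of 1, every series that hashes to the failed receiver has nowhere to go, which is why sharding alone adds capacity but not availability.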

    • @nickcarlton4604
      @nickcarlton4604 1 year ago

      @AntonPutra Thanks for that. The issue I see is that when one of the receivers goes down for whatever reason, the metrics it holds are lost, and Prometheus stops remote writing and throws 500s because it cannot communicate with a host from the hashring. Does that make sense?

    • @AntonPutra
      @AntonPutra 1 year ago

      @nickcarlton4604 Yes, but when the receiver recovers, Prometheus will write all the missing metrics (if the receiver was down for a few hours). You can set the replication if you want - github.com/antonputra/tutorials/blob/main/lessons/163/receiver-1/statefulset.yaml#L53
      I'm not sure if HA is possible in this sense. You're right, it's receiver sharding for scalability and has nothing to do with HA (high availability when one replica goes down). I know that HA mode in Prometheus is possible by running multiple independent instances and using external labels for deduplication (they may or may not use different receiver shards...).
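
The deduplication idea for an HA Prometheus pair can be sketched as follows: two instances scrape the same targets and differ only in a `replica` external label, so dropping that label makes their series identical and their samples can be merged. This is a simplified union by timestamp, not the algorithm the Thanos querier actually uses, and the label values are invented for the example.

```python
def deduplicate(series_list, replica_label="replica"):
    """Merge series that differ only in the replica label."""
    merged = {}
    for labels, samples in series_list:
        # Identity of a series once the replica label is ignored.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        points = merged.setdefault(key, {})
        for timestamp, value in samples:
            # Keep the first value seen for a given timestamp.
            points.setdefault(timestamp, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


ha_pair = [
    ({"__name__": "up", "cluster": "staging", "replica": "prometheus-0"}, [(1, 1.0), (2, 1.0)]),
    ({"__name__": "up", "cluster": "staging", "replica": "prometheus-1"}, [(2, 1.0), (3, 1.0)]),
]

deduplicated = deduplicate(ha_pair)
for key, samples in deduplicated.items():
    print(dict(key), samples)
```

A gap in one replica (for example during a restart) is filled by samples from the other, which is the point of running the pair.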

    • @nickcarlton4604
      @nickcarlton4604 1 year ago

      Thanks, I guess it would be more a case of if I had 2 receivers and a hashring, when a receiver goes down, how can I still remote write to the remaining one? Or would it be better to have two receivers without a hashring, let them work independently via a clusterIP service but ship the data off to object storage faster (a few minutes) so then it can be consumed from store gateway?

    • @AntonPutra
      @AntonPutra 1 year ago

      @nickcarlton4604 Not sure about that. I use 5 shards in prod and don't have any issues. Please try it and let me know if it works.

  • @mitesh6872
    @mitesh6872 7 months ago

    Hi Anton, how do I configure Prometheus to handle a cluster with 4000 to 7000 pods? With our current config and remote write set up, Prometheus starts using 45GB+ of RAM. Can you please tell me the best way to optimize it?

    • @AntonPutra
      @AntonPutra 7 months ago

      Two ways:
      1. Manual sharding allows you to have multiple Prometheus instances querying different targets.
      2. Prometheus has a sharding feature, but it's in beta; you can take a look.
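
Manual sharding usually comes down to a stable rule that assigns each scrape target to exactly one Prometheus instance. A rough sketch of a hash-modulo rule is shown below. It is meant to convey the idea behind hash-based target selection and is not guaranteed to produce the same shard numbers as Prometheus itself; the target addresses are invented.

```python
import hashlib


def shard_for(target, shards):
    """Map a target address to a shard number in [0, shards)."""
    digest = hashlib.md5(target.encode("utf-8")).digest()
    # Use the last 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[8:], "big") % shards


# Hypothetical node-exporter targets (assumption).
targets = [f"10.0.0.{i}:9100" for i in range(1, 21)]

assignment = {}
for target in targets:
    assignment.setdefault(shard_for(target, 2), []).append(target)

for shard, members in sorted(assignment.items()):
    print(shard, len(members))
```

Each instance then keeps only the targets whose shard number matches its own, so the scrape load and the memory for active series are split across instances instead of being duplicated.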

    • @mitesh6872
      @mitesh6872 7 months ago

      @AntonPutra Thanks. 1. So it will be 2 Prometheus instances with 22 GB of RAM each, but total usage of 44 GB is still expected, right? One more thing, about adding the data source in Grafana: do I need to create 2 data sources, 1. Prometheus and 2. Querier? Is there a way to get all the data from Prometheus, the Querier, and the Store Gateway with a single data source?

  • @AloisioBilck
    @AloisioBilck 1 year ago

    Hi Anton,
    why don't you use Helm to install Prometheus?

    • @AntonPutra
      @AntonPutra 1 year ago +1

      You can use Helm, nothing wrong with it.

  • @emergirie
    @emergirie 1 year ago

    Do you think there is any value in using Thanos when you already run a managed Prometheus?

    • @AntonPutra
      @AntonPutra 1 year ago

      Based on my experience, self-managed Thanos is up to 100x cheaper. cAdvisor and node exporters create lots of metrics with high cardinality (lots of label combinations), which can be very expensive with managed services such as managed Prometheus, Datadog, SFX, you name it.

  • @leeren_
    @leeren_ 1 year ago

    Looking forward to the follow-up video for production endpoints! Is that hard to set up?

    • @AntonPutra
      @AntonPutra 1 year ago +1

      No, it's pretty much the same. By 'production', I meant that I was going to show how to use cloud-specific components, such as IRSA, to set up permissions to access an S3 bucket, etc.

    • @leeren_
      @leeren_ 1 year ago

      @AntonPutra Got it. To allow multi-cluster Thanos, is it as simple as making the receiver endpoint external? Would exposing it through an ingress and figuring out how to use cert-manager fix this? Any gotchas or resources? Thanks again, this tutorial was AMAZING

    • @AntonPutra
      @AntonPutra 1 year ago

      @leeren_ Yes, you need public DNS for each receiver endpoint, but those endpoints use custom protocols and ports, so you need a dedicated load balancer, or you can use a TCP service on Nginx to share one.

    • @leeren_
      @leeren_ 1 year ago

      @AntonPutra For AWS, do you recommend using ACM or just cert-manager?

    • @AntonPutra
      @AntonPutra 1 year ago

      @leeren_ Either way, you don't need a public certificate. You need to create a CA, and you can import it to Cert-Manager to automatically renew your private certificates. The same applies with ACM.

  • @the-learner-le5y
    @the-learner-le5y 7 months ago

    Hello Anton Putra,
    When can we expect the video on a production-ready Thanos setup?

    • @AntonPutra
      @AntonPutra 7 months ago

      I don't know yet; it's very similar. You just need to use IRSA to configure permissions for S3. Do you have any specific use case in mind, such as 1 cloud, 2 clouds, or perhaps 10 environments?

    • @the-learner-le5y
      @the-learner-le5y 7 months ago

      @AntonPutra
      Thanks for the reply.
      I'm planning to set it up in multiple AKS clusters where the receivers will be behind an ingress. Anyway, let me set it up.

  • @Kavinnathcse
    @Kavinnathcse 2 months ago

    Very nice tutorial. So what's the difference between Thanos and Grafana Mimir?

    • @AntonPutra
      @AntonPutra 2 months ago

      Thanks. I haven't used it yet, but I'll definitely test it and create a tutorial, maybe a direct comparison, soon.

    • @Kavinnathcse
      @Kavinnathcse 2 months ago

      @AntonPutra Thanks a lot. One more thing: according to the official docs, it looks like Thanos remote write is not recommended for a single tenant. If so, do you know the reason behind that?

    • @AntonPutra
      @AntonPutra 2 months ago

      @Kavinnathcse Prometheus generally does not recommend using push methods except when necessary. Based on my experience managing 20+ environments and almost 100 Kubernetes clusters, remote read is a bit slow since each time you open Grafana, Thanos needs to query the remote environment. To improve user experience and speed up queries, we decided to use remote write.

  • @phyohtetpaing44
    @phyohtetpaing44 1 year ago +1

    Hi sir, can you make a database benchmark (MongoDB vs MySQL vs PostgreSQL)?

    • @AntonPutra
      @AntonPutra 1 year ago +1

      Sure, I was working on one between MySQL and Postgres already

  • @salamander-101
    @salamander-101 1 year ago

    I think Grafana Mimir is better than Thanos at scale and easier to scale out.

    • @AntonPutra
      @AntonPutra 1 year ago

      I've never used it before but I'll give it a shot

  • @rahulchowdhury279
    @rahulchowdhury279 1 year ago

    Make videos on Cortex

    • @AntonPutra
      @AntonPutra 1 year ago +1

      In the works, along with VictoriaMetrics.

    • @rahulchowdhury279
      @rahulchowdhury279 1 year ago

      @AntonPutra Appreciate it. And if possible, please make it a detailed and long video.

    • @AntonPutra
      @AntonPutra 1 year ago +1

      @rahulchowdhury279 Sure

    • @rahulchowdhury279
      @rahulchowdhury279 4 months ago

      Still waiting @Anton

  • @sagarhm2237
    @sagarhm2237 11 months ago

    Which app are you using to edit?

    • @AntonPutra
      @AntonPutra 10 months ago

      Adobe Suite

    • @sagarhm2237
      @sagarhm2237 10 months ago

      @AntonPutra Hey bro, can you do a video on OpenTelemetry auto-instrumentation in a Kubernetes cluster? I have an application running in a pod and I need the response times of its endpoints. I'm a beginner and I can't figure out the approach and implementation.

    • @AntonPutra
      @AntonPutra 10 months ago

      @sagarhm2237 Sure, in the future. I already used it in one of my benchmark videos, but I don't remember which one exactly. However, it is not as mature as Prometheus. On the other hand, it's used by many commercial monitoring agents.

    • @sagarhm2237
      @sagarhm2237 10 months ago

      @AntonPutra Thanks, I'm a fresher.