Monitor EKS & EC2 instances with MANAGED Prometheus & Grafana (Terraform & Prometheus Agent & AWS)

  • Published: Sep 7, 2024
  • 🔴 - To support my channel, I’d like to offer Mentorship/On-the-Job Support/Consulting - me@antonputra.com
    👉 [UPDATED] AWS EKS Kubernetes Tutorial [NEW]: • AWS EKS Kubernetes Tut...
    ▬▬▬▬▬ Experience & Location 💼 ▬▬▬▬▬
    ► I’m a Senior Software Engineer at Juniper Networks (12+ years of experience)
    ► Located in San Francisco Bay Area, CA (US citizen)
    ▬▬▬▬▬▬ Connect with me 👋 ▬▬▬▬▬▬
    ► LinkedIn: / anton-putra
    ► Twitter/X: / antonvputra
    ► GitHub: github.com/ant...
    ► Email: me@antonputra.com
    ▬▬▬▬▬▬ Related videos 👨‍🏫 ▬▬▬▬▬▬
    👉 [Playlist] Kubernetes Tutorials: • Kubernetes Tutorials
    👉 [Playlist] Terraform Tutorials: • Terraform Tutorials fo...
    👉 [Playlist] Network Tutorials: • Network Tutorials
    👉 [Playlist] Apache Kafka Tutorials: • Apache Kafka Tutorials
    👉 [Playlist] Performance Benchmarks: • Performance Benchmarks
    👉 [Playlist] Database Tutorials: • Database Tutorials
    ▬▬▬▬▬▬▬ Source Code 📚 ▬▬▬▬▬▬▬
    ► GitHub: github.com/ant...
    #AWS #EKS #DevOps

Comments • 67

  • @AntonPutra
    @AntonPutra  11 months ago +3

    🔴 - To support my channel, I’d like to offer Mentorship/On-the-Job Support/Consulting - me@antonputra.com
    👉 [UPDATED] AWS EKS Kubernetes Tutorial [NEW]: ruclips.net/p/PLiMWaCMwGJXnKY6XmeifEpjIfkWRo9v2l&si=wc6LIC5V2tD-Tzwl

  • @Tszyu01
    @Tszyu01 1 year ago +3

    This is actually great, helpful real-world content. I’d expand on creating alerts using Grafana as well as distributed tracing using Tempo.

  • @nforlife
    @nforlife 1 month ago

    Wow, this is awesome, you saved my life, man!
    Could you please make another video using Grafana Cloud or OpenTelemetry, Prometheus, Tempo and Loki?

    • @AntonPutra
      @AntonPutra  29 days ago +1

      Thanks, I have one covering Tempo with OpenTelemetry if you are interested - ruclips.net/video/ZIN7H00ulQw/видео.html

    • @nforlife
      @nforlife 29 days ago

      @@AntonPutra thanks!

  • @user-nl6im4lq1n
    @user-nl6im4lq1n 11 months ago +1

    Hi Anton!
    Thanks for the video and your work for us.
    Do you have an example where we build a separate host to use as a monitor?
    That is, install Prometheus and Grafana on an instance and pull metrics from our nodes or instances there.
    The goal is to have a single host that the user accesses to monitor all our AWS resources, both nodes and instances.

    • @AntonPutra
      @AntonPutra  11 months ago

      Thank you! Well, I have a video on how to build centralized monitoring based on Thanos. However, the use case you described is not easy to implement, since any external Prometheus would not get Kubernetes service discovery. You can, however, set up a remote write and push metrics to a single instance.

  • @MrNewAmerican
    @MrNewAmerican 6 months ago +1

    Such great content, Thank you!

  • @thiagoscodeler5152
    @thiagoscodeler5152 11 months ago +1

    Great content as always Anton. Thanks for sharing! Do you already have the content for setting up alerts?

    • @AntonPutra
      @AntonPutra  11 months ago +1

      Thanks. Not for EKS yet.

    • @thiagoscodeler5152
      @thiagoscodeler5152 10 months ago

      @@AntonPutra Looking forward to that. Thank you.

  • @gkalangara
    @gkalangara 1 year ago +2

    Really great content. One question I have, though: what would be an easy way to access Grafana other than port-forward svc/grafana 3000 or a Service of type LoadBalancer?

    • @AntonPutra
      @AntonPutra  1 year ago

      Thanks! Typically ingress would be the best choice.

  • @George-mk7lp
    @George-mk7lp 1 year ago +1

    Always the greatest content, best channel ever

  • @shulyakav
    @shulyakav 1 year ago +1

    Excellent! As always. )

  • @hollywoodmoviesexplainedhi4519
    @hollywoodmoviesexplainedhi4519 9 months ago +1

    Hello Anton, thanks for the demo video. I would like to know if I can use a single managed Grafana to monitor multiple EKS clusters in different regions.

    • @AntonPutra
      @AntonPutra  9 months ago

      Sure, you just need to add a separate data source for each env/region.

  • @pikachu3686
    @pikachu3686 7 months ago +1

    best video

  • @shehzadmohammed6269
    @shehzadmohammed6269 1 year ago

    I love you! Thank you for these videos~

  • @christianibiri
    @christianibiri 1 year ago +1

    Love your channel :)

  • @JackReacher1
    @JackReacher1 1 year ago +2

    Wait, aren't we still installing the whole Prometheus on the cluster and just running it as a Prometheus agent?
    So what is the advantage of using the managed Prometheus service by AWS? It isn't even cloud agnostic.

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Managed Prometheus allows you to collect metrics in a centralized place. For example, if you have lots of environments, you don't have to switch VPNs and Grafana dashboards. Second, it has long-term storage of 180 days. Typical retention for a standalone Prometheus is 7-14 days. You can also use it from other clouds and on-premises. BUT it can be very pricey, so you have to filter out metrics before you ship them to managed Prometheus.
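
A rough illustration of the filtering step mentioned above: Prometheus `remote_write` supports `write_relabel_configs`, and a `keep` rule on `__name__` drops every series whose metric name does not match the regex before it is shipped. The sketch below only simulates that rule in Python. The metric names and the allow-list regex are hypothetical examples, not taken from the video.

```python
import re

# Hypothetical allow-list. In Prometheus this would be the `regex` of a
# write_relabel_configs rule with `source_labels: [__name__]` and `action: keep`.
KEEP_REGEX = r"up|node_cpu_seconds_total|container_memory_working_set_bytes"


def filter_series(series, keep_regex=KEEP_REGEX):
    """Return only the series whose metric name fully matches keep_regex.

    Prometheus anchors relabel regexes at both ends, so fullmatch is used.
    """
    pattern = re.compile(keep_regex)
    return [s for s in series if pattern.fullmatch(s["__name__"])]


# Hypothetical series, represented as plain label dictionaries.
samples = [
    {"__name__": "up", "job": "node-exporter"},
    {"__name__": "node_cpu_seconds_total", "mode": "idle"},
    {"__name__": "go_gc_duration_seconds", "job": "prometheus"},
]
kept = filter_series(samples)
print([s["__name__"] for s in kept])
```

Anything not on the allow-list never leaves the cluster, which is the point when ingestion is billed per sample.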

    • @JackReacher1
      @JackReacher1 1 year ago

      @@AntonPutra Wow, those are some solid points.

    • @hasanbingolbali9423
      @hasanbingolbali9423 9 months ago

      @AntonPutra Do we filter out the metrics at the AWS level or at the level of the services that expose the metrics?

  • @AntonPutra
    @AntonPutra  1 year ago +1

    Get Full-Length High-Quality DevOps Tutorials for Free - Subscribe Now! - ruclips.net/user/AntonPutra

  • @kayoutube690
    @kayoutube690 1 year ago +1

    Do you have a video on installing Grafana on EKS?

    • @AntonPutra
      @AntonPutra  1 year ago

      There is nothing special about EKS and Grafana, and yes, I have a bunch of examples using Terraform with Helm and plain YAML. This is the latest one - github.com/antonputra/tutorials/blob/main/lessons/173/terraform/6-monitoring.tf#L27-L43
      You can search for grafana in that repo.

  • @m18unet
    @m18unet 1 year ago

    Thanks for the great tutorial. I have two questions:
    1. Do we need a PV/PVC (any permanent storage) with Prometheus agent mode? Are the pods stateless?
    2. Can I set the StatefulSet's replica count to 2 for HA in Prometheus agent mode? I want to point these two agent replicas at a Thanos receiver. Is that a bad idea?

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Thanks.
      1. Even in "stateless" mode it uses some local storage for caching. I think you can try without it, but I still use a PVC. (Also, in stateless mode you need to use Thanos Ruler instead of a local Alertmanager.)
      2. Yes, you can; that's the only way to run Prometheus in HA mode. Just don't forget to include external labels so that Thanos can deduplicate metrics. Also, HA mode will double network traffic and storage, which is something to keep in mind in terms of cost.
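
To make the role of the external label concrete, here is a minimal sketch of the deduplication idea: two agent replicas scrape the same targets and attach a replica-identifying external label, and the querying side ignores that label so both copies collapse into one series. The label name `replica` and the sample data are assumptions for illustration only. This is not Thanos code; in Thanos, the label to ignore is typically passed to the querier with `--query.replica-label`.

```python
def deduplicate(samples, replica_label="replica"):
    """Collapse samples that differ only by the replica label.

    Each sample is a (labels_dict, timestamp, value) tuple. The first value
    seen for a given (labels without replica, timestamp) pair wins.
    """
    seen = {}
    for labels, ts, value in samples:
        # Build a hashable key from every label except the replica label.
        key_labels = tuple(
            sorted((k, v) for k, v in labels.items() if k != replica_label)
        )
        seen.setdefault((key_labels, ts), value)
    return seen


# Hypothetical samples written by two agent replicas.
samples = [
    ({"__name__": "up", "cluster": "dev", "replica": "agent-0"}, 1000, 1.0),
    ({"__name__": "up", "cluster": "dev", "replica": "agent-1"}, 1000, 1.0),
    ({"__name__": "up", "cluster": "prod", "replica": "agent-0"}, 1000, 0.0),
]
merged = deduplicate(samples)
print(len(merged))
```

Without a distinguishing label, the two replicas would write identical series and the backend could not tell the copies apart.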

    • @m18unet
      @m18unet 1 year ago

      @@AntonPutra I understand very well. Thanks for your explanation 😊

  • @AntonPutra
    @AntonPutra  1 year ago +1

    🔴Part 2 - Send Alerts to Slack, Email, PagerDuty - AWS Managed Prometheus (AMP) - ruclips.net/video/SvDpuVlJTDg/видео.html

  • @RachidMoysePolania
    @RachidMoysePolania 10 months ago +1

    Hi, I'm having trouble getting the ec2-node-exporter job to work. I've done everything as in the tutorial, but it doesn't work. Could you give me a hand?

    • @AntonPutra
      @AntonPutra  10 months ago

      What's the issue? Do you have targets in Prometheus that time out, or are they simply not visible?

    • @RachidMoysePolania
      @RachidMoysePolania 9 months ago

      @@AntonPutra I don't know whether the EC2 crawler works. I create my instance with all the user data and that works, but when I go to Prometheus and look for the ec2-node-exporter job, it never shows up in the latest dashboard that he imports. So I can't get metrics from my EC2 instances (not the Kubernetes cluster nodes) into Prometheus.

    • @sydefcon
      @sydefcon 3 months ago

      @@RachidMoysePolania Hey, did you solve the issue? I'm facing the same issue myself.

    • @RachidMoysePolania
      @RachidMoysePolania 2 months ago

      @@sydefcon Sure, let me know if you still need help, I'll be glad to help you.

  • @AntonPutra
    @AntonPutra  1 year ago +1

    🟢 [New] Terragrunt Tutorial: Create VPC, EKS from Scratch! (Step-by-Step) - ruclips.net/video/yduHaOj3XMg/видео.html

  • @gpj-qo9cb
    @gpj-qo9cb 1 year ago

    Can this configuration be used with your AWS EKS Fargate setup? Would additional setup be needed for Fargate?

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Yes, you can use it to deploy the Prometheus agent and managed Prometheus (including IAM for service accounts). But DaemonSets are not supported on Fargate yet, so you won't be able to deploy cAdvisor or node exporter: github.com/aws/containers-roadmap/issues/971

  • @ziaurrehman4738
    @ziaurrehman4738 1 year ago

    Hey, do you have a plan to make a video on GCP managed Prometheus with GKE and instance groups?

    • @AntonPutra
      @AntonPutra  1 year ago +1

      Hi Zia, yes one more for sending alerts and then GCP stuff

    • @ziaurrehman4738
      @ziaurrehman4738 1 year ago

      @@AntonPutra perfect, thanks

  • @sushmithashetty5324
    @sushmithashetty5324 1 year ago

    How can we check whether Fargate containers are down or up using Prometheus and Grafana?

    • @AntonPutra
      @AntonPutra  1 year ago

      On Fargate, node == pod; use the targets' "up" metric.
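
As a sketch of what using the "up" metric can look like: Prometheus records `up` as 1 for every target it scrapes successfully and 0 for every target it cannot reach, so the PromQL expression `up == 0` lists the targets that are down. The code below builds an instant-query URL for the standard `/api/v1/query` HTTP endpoint. The base URL is a hypothetical, unauthenticated Prometheus-compatible endpoint; Amazon Managed Prometheus additionally requires SigV4-signed requests, which this sketch does not handle.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def down_targets(base_url):
    """Return the label sets of all targets whose `up` metric is 0.

    Assumes base_url points at a reachable query endpoint without auth.
    """
    with urllib.request.urlopen(build_query_url(base_url, "up == 0")) as resp:
        body = json.load(resp)
    return [result["metric"] for result in body["data"]["result"]]


# Hypothetical endpoint; nothing is contacted until down_targets() is called.
url = build_query_url("http://localhost:9090/", "up == 0")
print(url)
```

The same `up == 0` expression can be pasted into a Grafana panel or an alert rule instead of being queried by hand.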

  • @aleksanderfidelus
    @aleksanderfidelus 1 year ago

    Could all of this be done with terraform to not include manual steps with kubectl?

    • @AntonPutra
      @AntonPutra  1 year ago

      Of course, Aleksander, just use the kubectl Terraform provider (or Helm):
      registry.terraform.io/providers/gavinbunney/kubectl/latest/docs

  • @thiagoscodeler5152
    @thiagoscodeler5152 11 months ago

    Anton, I'm able to visualize metrics only for the monitoring namespace (not for my other namespaces). Do you have any clue? I'm using both Prometheus and Grafana Managed Services

    • @AntonPutra
      @AntonPutra  11 months ago +1

      First check the targets in Prometheus; you may need to update the Prometheus selectors.

    • @thiagoscodeler5152
      @thiagoscodeler5152 11 months ago

      @@AntonPutra I can now visualize other namespaces metrics. Only not seeing node-exporter service monitor on Prometheus UI. When I go to Service Discovery I can see some undefined targets.

    • @thiagoscodeler5152
      @thiagoscodeler5152 11 months ago

      I managed to fix that.

    • @AntonPutra
      @AntonPutra  11 months ago

      @@thiagoscodeler5152 cool

    • @thiagoscodeler5152
      @thiagoscodeler5152 11 months ago

      @@AntonPutra any idea why I can only visualize Kube State Metrics related to the "monitoring" namespace?

  • @arjunkumarbetageri9791
    @arjunkumarbetageri9791 1 year ago

    I have created the resources as you described. All pods and services are running fine. I'm using a Linux machine (an EC2 instance) to deploy the Terraform templates. How can I see Prometheus and Grafana in the web UI? I tried all options, but Terraform creates the EKS nodes in private subnets and we have not provisioned any load balancer either.

    • @AntonPutra
      @AntonPutra  11 months ago

      Well, the easiest way is to port forward, for example "kubectl port-forward svc/prometheus-operated 9090 -n monitoring"

  • @ladfloss
    @ladfloss 8 months ago

    Do you know how to do it using both AWS managed Grafana and Prometheus?

  • @aybukecabuk06
    @aybukecabuk06 1 year ago

    Hi, how can I solve this error? (prometheus-agent-0 1/2 CrashLoopBackOff 3 (4s ago) 63s) ts=2023-02-10T07:48:26.216Z caller=main.go:1119 level=error err="error loading config from \"/etc/prometheus/config_out/prometheus.env.yaml\": one or more errors occurred while applying the new configuration (--config.file=\"/etc/prometheus/config_out/prometheus.env.yaml\")"

  • @user-ro7co6yr4d
    @user-ro7co6yr4d 1 year ago

    Hi, I have followed and executed every step, but I am unable to port forward Prometheus.
    ts=2023-07-13T04:24:16.794Z caller=refresh.go:99 level=error component="discovery manager scrape" discovery=ec2 msg="Unable to refresh target groups" err="could not describe instances: UnauthorizedOperation: You are not authorized to perform this operation.
    \tstatus code: 403, request id: 741f3c18-1c8a-4616-aed3-ac11f80d2ffc"
    I got this error and I am unable to connect to localhost for Prometheus. Can you please help with that?

    • @AntonPutra
      @AntonPutra  1 year ago

      This looks like the error that AWS returns when you don't have permissions. It's most likely that you have misconfigured Prometheus and it does not have permissions to access AWS.
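
For context on the permission in question: the log above shows the EC2 service discovery failing on "describe instances", and Prometheus EC2 discovery needs the `ec2:DescribeInstances` IAM action to list scrape targets. The sketch below prints a minimal IAM policy document granting only that action. How the policy is attached (for example to the role used by the Prometheus service account) depends on the setup and is not shown here; the function name is hypothetical.

```python
import json


def ec2_discovery_policy():
    """Return a minimal IAM policy document for Prometheus EC2 discovery.

    Only ec2:DescribeInstances is granted. Describe actions are not
    resource-scoped, so the resource is "*".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ec2:DescribeInstances"],
                "Resource": "*",
            }
        ],
    }


policy = ec2_discovery_policy()
print(json.dumps(policy, indent=2))
```

If the 403 persists after attaching such a policy, the next thing to check would be whether the pod is actually assuming the intended role.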