Design a Distributed Geospatial Data Platform | System Design

  • Published: Oct 14, 2024

Comments • 12

  • @joseavellaneda4921
    @joseavellaneda4921 8 months ago +4

    Thanks for the video! It would be great to also see how you would write it in a real application.

  • @rankala
    @rankala 8 months ago +2

    I would like to point out that there are database extensions for GIS data, such as PostGIS for Postgres. So in fact you could query a database. Other databases also have extensions or native features.

    • @interviewpen
      @interviewpen 8 months ago +1

      Yes, for our vector-based data this is a good solution. However, for raster data we don’t have any direct equivalent. We sort of glossed over this in the interest of time, so really good thoughts here!

  • @pmshadow
    @pmshadow 8 months ago

    Very good and explanatory video, thank you very much.
    I am currently building an internal data platform, and I was going to use Prefect on a VM, but after seeing your video I believe the best way to go would be: Prefect + Dask Scheduler + Dask Worker on Azure Kubernetes Service. Does that make sense to you? Then I could benefit from autoscaling of the workers.
    Thanks again!

    • @interviewpen
      @interviewpen 8 months ago

      Yep, that sounds like a great solution! There are also fully managed solutions like Snowflake and Databricks, if that suits your use case. Thanks for watching!

  • @pieter5466
    @pieter5466 8 months ago +1

    This made me wonder whether systems like Hadoop and MapReduce are still used/built.

    • @interviewpen
      @interviewpen 8 months ago

      Hadoop MapReduce could absolutely be used in place of Spark/Dask as our distributed data processing cluster. However, it would take a lot of manual work to build the types of aggregations we need from scratch. Good point!

  • @yashpandey7433
    @yashpandey7433 8 months ago +1

    Did something similar, but on a very large scale, at PayPal.

  • @ocean645
    @ocean645 13 days ago

    Hi, what exactly is this subject? Is it data science?

    • @interviewpen
      @interviewpen 8 days ago

      This is system design: we’re considering what services and infrastructure to use to solve a high-level problem. Thanks for watching!