Data Processing in Google Cloud: Hadoop, Spark, and Dataflow (Cloud Next '19)

  • Published: Sep 8, 2024

Comments • 5

  • @rimchatti3807
    @rimchatti3807 5 years ago

    Why can't we use Dataproc to process streaming data?
    Thanks :)

    • @quinglover3520
      @quinglover3520 5 years ago +4

      No, and you wouldn't want to. Dataproc is a managed Hadoop cluster, which performs best on huge datasets, and those are generally batch data. Plus, running a Dataproc cluster is very expensive. If you want to process streaming data, you are better off using Pub/Sub or a Kafka cluster.

    • @abdulqadirkhan9775
      @abdulqadirkhan9775 5 years ago

      Hi, could you please share how you added the BigQuery connector to Zeppelin? I get errors when I try to access a dataset in PySpark from Zeppelin.

  • @abdulqadirkhan9775
    @abdulqadirkhan9775 5 years ago

    Could you please share how you added the BigQuery connector to Zeppelin? I get errors when I try to access a dataset in PySpark from Zeppelin.

  • @nikhilbendre5162
    @nikhilbendre5162 5 years ago

    Use a better mic...
    😊😊☹🙃🙁💯