No, and you wouldn't want to. Dataproc is a managed Hadoop cluster, which performs best on huge datasets, and those are generally batch data. Plus, running a Dataproc cluster is very expensive. If you want to process streaming data, you are better off using Pub/Sub or a Kafka cluster.
Hi, could you please share how you added the BigQuery connectors to Zeppelin? I get errors when I try to access a dataset from PySpark in Zeppelin.
Why can't we use Dataproc to process streaming data?
Thanks :)
Use a better mic.
😊😊☹🙃🙁💯