At 13:36, why do we need to load into GCS first for batch loads? Can't we use Dataflow / BigQuery batch load / Data Fusion directly from GoldenGate?
At 13:35, what is the benefit of loading data from Oracle into GCS first and then into BigQuery (i.e. wouldn't loading directly into BigQuery from Oracle be faster)? I know GCS can serve as a staging area and we get another copy of the data for fault tolerance, but is there any other benefit? Thanks :)
At 11:58, why can Dataflow / Data Fusion speed up pipelines? The essence is to do the transformation either in BigQuery or in Data Fusion. Even if the query is complex, wouldn't it still be faster to do the transformation directly in BigQuery, rather than connecting to Data Fusion and transforming the data there?
It's because they can bill you for both services: BigQuery and also Dataflow.
The best 'best practices' sharing session I've ever had.
Excellent session!
Beautifully explained. Many thanks. Now, why wouldn't anybody want to move to the sunny side of France and land a job at Teads !! Both of the presenters did an excellent job.
How do you perform a sum over column values in BigQuery when the number of columns is not fixed? At runtime we need to decide how many columns to include in the sum, depending on user input.
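One way to approach this is to build the SQL string at runtime from the user-selected columns and submit it with the BigQuery client. A minimal sketch, assuming a whitelist of valid column names to guard against SQL injection; the table and column names (`sales_q1`, etc.) here are hypothetical:

```python
# Sketch: build a BigQuery SUM query over a runtime-chosen set of columns.
# Validate user-supplied column names against a whitelist before
# interpolating them into the SQL string.

ALLOWED_COLUMNS = {"sales_q1", "sales_q2", "sales_q3", "sales_q4"}

def build_sum_query(table: str, columns: list[str]) -> str:
    """Return a query that sums each requested column over all rows."""
    bad = set(columns) - ALLOWED_COLUMNS
    if bad:
        raise ValueError(f"unknown columns: {sorted(bad)}")
    select_list = ", ".join(f"SUM({c}) AS total_{c}" for c in columns)
    return f"SELECT {select_list} FROM `{table}`"

query = build_sum_query("my_project.my_dataset.sales", ["sales_q1", "sales_q3"])
print(query)
# The resulting string can then be submitted with
# google.cloud.bigquery.Client().query(query).
```

Since BigQuery's standard SQL has no native "sum an arbitrary set of columns" construct, generating the query text client-side like this is the usual workaround.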
Can you share slides?
Very useful!
6:30 out of nowhere a ten-buck note appears hahaha, loved it
Hello, will you also share the slides? Thank you
Has he yet?
Congratulations
How is this different from Spark?
Well it's auto-managed and much simpler to work with.
That's just SQL. You don't need to fine-tune the configuration like you would for a Spark job I think.
Kafka to S3, and then S3 to Cassandra via Spark, is the wrong approach.
Yes, @Sunil Jain. I feel the same too. The source application would directly write it out to S3 for batch processing. Kafka would be required only for the streaming jobs.
Hi, thanks so much, love you Amiee Wright and Bella Daws xxxxx
Dedicated to JMB!!!! :)