Thank you for the detailed presentation on Data Factory, especially the comparison between DFg2 and Data Pipeline.
No problem, glad you found it useful ☺️
it's really clear and useful! Thank you so much for sharing your knowledge!!
Hey, loving this channel! Where is the dataset so I can follow along?
Hey thanks, glad you're enjoying! Sorry but I actually can't remember, but it looks like an ourworldindata dataset!
Great overview !!
Thanks for watching! :)
Hi Will,
First of all, you are awesome.
I am working on a project where I am using the Copy Data activity in a Data pipeline, but it is buggy when it comes to datetime columns, so I need to find alternatives. I then tried Dataflow Gen2 and found it cannot be parameterized or made metadata-driven. Do you think there is a way to do that using Dataflow Gen2?
Any insight would be much appreciated.
What is the compute for DF Gen2? As in, why is it slower than, let's say, a Fabric pipeline? Don't they both use the MPP architecture? Thanks
Not entirely sure of the reason why it is slower tbh, and I haven't actually checked since posting this video, so it could have already improved. I expect by the time Fabric reaches GA it will be on a par with pipelines
Is it possible to get data into Fabric from multiple sources? If yes, how can we do that?
with a dataflow? yes, just make multiple queries. Or you can split them out into multiple dataflows
@@LearnMicrosoftFabric Thank you
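To illustrate what "multiple queries" looks like in Power Query M, here is a rough sketch; the server, database, and URL below are placeholders, not from the video. Each source simply becomes its own query in the same dataflow:

```
// Query 1 - a hypothetical Azure SQL source (server and database names are placeholders)
let
    Source = Sql.Database("myserver.database.windows.net", "SalesDb"),
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data]
in
    Orders

// Query 2 - a hypothetical CSV fetched over HTTP (URL is a placeholder)
let
    Source = Csv.Document(Web.Contents("https://example.com/products.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true])
in
    Promoted
```

Each query can then be given its own data destination, or left as a staging query for others to reference.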
How did you get to the PowerQuery interface? It's sort of understood but I'm having trouble getting to it.
Ah sorry yes I should have shown that!
Go to app.fabric.Microsoft.com
Click on data factory
Click New dataflow
You should now be in the dataflow powerquery editor 👍
Hi, great video! How do we load data from a REST API using Dataflow Gen2?
Hey, thanks! You can do it in Power Query using Web.Contents()
Thanks, that's what I am using (copy-pasting Power Query M code from Desktop into a blank query in Dataflow Gen2). If only Dataflow Gen2 had the same REST API connector as the data pipeline
Yes, it also has a UI-based connector called Web API. It's under the 'Online' section of the connectors when you do Get Data ☺️
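For anyone following along, a minimal sketch of the Web.Contents() approach mentioned above, assuming a hypothetical JSON endpoint (the URL, header, and field names are all placeholders):

```
let
    // Call the hypothetical REST endpoint and parse the JSON response
    Response = Json.Document(
        Web.Contents(
            "https://api.example.com/v1/orders",
            [Headers = [Accept = "application/json"]]
        )
    ),
    // Assumes the payload wraps its records in a "value" list
    Items = Response[value],
    AsTable = Table.FromList(Items, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    // Expand the record fields you need (field names are placeholders)
    Expanded = Table.ExpandRecordColumn(AsTable, "Column1", {"id", "orderDate", "amount"})
in
    Expanded
```

The expand step assumes the response wraps its records in a value list; adjust the field names to match the actual API.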
Which one would be faster and better in terms of compute cost and time saved: Dataflow Gen2 or running a notebook?
Notebook I believe! But I am planning to do a side-by-side comparison of capacity usage for each of them soon 🙌
In terms of comparison to other tools, what about Azure Data Factory, particularly Mapping Data Flows? Mapping Data Flows seem to be missing from Fabric
You’re right, there are no mapping dataflows in the Fabric Data Pipeline. In the documentation pages it says that these have been ‘replaced’ by Dataflow Gen2. I did a more recent video on data pipelines vs dataflows here and mentioned mapping dataflows: ruclips.net/video/t5mUKaLWpHE/видео.html
Great tutorial! How can we transform multiple CSVs in a dataflow and then sink them into a lakehouse or warehouse using the same dataflow? Is it even possible? I tried it, but when I select the destination it only takes one CSV to sink at the destination. I am stuck here
Yes that's possible! Although how you do it depends on where the CSVs are stored. If in the Lakehouse Files area, use the Folder source in Dataflow Gen2. If in SharePoint, use the SharePoint Folder source. And if in Azure Blob or ADLS, you can use the Blob connector or ADLS connector respectively
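As a rough sketch of that folder-based pattern using the ADLS connector (the storage URL and column handling below are assumptions, not from the video), combining every CSV in a folder into one table can look something like this in M:

```
let
    // List files from a hypothetical ADLS Gen2 folder (storage URL is a placeholder)
    Source = AzureStorage.DataLake("https://mystorageacct.dfs.core.windows.net/raw/departments/"),
    // Keep only the CSV files
    CsvFiles = Table.SelectRows(Source, each Text.EndsWith([Name], ".csv")),
    // Parse each file's binary content and promote its headers
    Parsed = Table.AddColumn(
        CsvFiles,
        "Data",
        each Table.PromoteHeaders(Csv.Document([Content], [Delimiter = ","]), [PromoteAllScalars = true])
    ),
    // Stack every parsed table into a single table for the destination
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```

Table.Combine assumes all the CSVs share the same columns; if they differ, align the schemas before combining, then point the single combined query at the lakehouse or warehouse destination.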
@@LearnMicrosoftFabric Thanks a lot, I successfully moved files from the Lakehouse to my warehouse using a dataflow. However, when I ran the dataflow two or three times using append, it duplicated and triplicated the records. For example, I have a Department table (CSV) that I appended using the dataflow. Initially the Dept table had 3 records: ID 1 DP1, ID 2 DP2, and ID 3 DP3. Now all 3 of these records are triplicated in my data. Is there any configuration to stop this from happening? I used the same CSV (nothing changed) and ran the dataflow 3 times. My expectation was that Fabric would not append any records since the IDs already exist from my current CSV (nothing changed in the CSV). Will appreciate your assistance
Where is part 2 of 2?
Dataflows end-to-end project (Microsoft Fabric) + Lakehouse + Power BI
ruclips.net/video/ZmgZPhl2LRU/видео.html