Hi Adam: I am 76 years old and very new to ETL technology, which is now clear to me. Keep developing these kinds of professional projects, which can be used in resume preparation.
-1979 and ,12
This is why complex logic is needed. Nice tutorial :)
Adam, I have been watching many of your videos. As someone new to Azure, I find your videos immensely valuable. Keep up your great work, really appreciate it!
Awesome, thank you!
Just discovered the channel. Your material is high quality. It's excellent work. I will go watch more. Thank you Adam!
Thank you. This means a lot :)
Hello Adam, please let me know how to connect to Dynamics CRM. Please send details to pradysg@gmail.com
Your way of explaining is outstanding. After watching it, I feel like Azure is very easy to learn. Kindly keep sharing good videos. Thank you.
Thanks a ton :)
Your channel is totally underrated, man
Great video! Most videos seem to focus mostly on the advertisement material straight from Azure. At best they show you the very dumb step of copying data from a file to a DB.
This is the first video I saw where you actually show how you can do something useful with the data, close to a real-life scenario.
Thank you.
This is quality stuff. Good for a quick upskill especially when prepping for an interview.
@ Work I'm having to build out a Data Mart with no training on my own. You are literally saving my hide with your videos. THANK YOU!
Glad to help! :)
I love you, Adam!
I have been struggling with using expression builder in Data Flow. I can't seem to figure out how to write the code. This video just made it look less complex. I'll be devoting more time to it.
I just found your videos while searching for ADF tutorials on YouTube. The materials are fantastic and are really helping me learn. Thank you so much!!
Happy to help! :)
Thank you so much Adam. I was able to crack an interview with the help of your videos. I prepared notes according to your explanations, and 3 hours before the interview I watched your videos again. It helped me a lot.
Fantastic!
Very, very detailed workflow. I tried this and was able to understand the Data Flow process so easily. Thank you for the wonderful session.
The best video about Azure Data Flows I can find. Thank you Adam!
Wow, thanks! :)
Nice one Adam. Cool one. Keep doing fabulous videos always fella.
Many thanks.
I am new to data and ETL stuff but your videos are too good. Excellent examples and very clear explanations, so anyone can understand. Thanks very much.
Thank you, always happy to help!
Very crisp and clear information, I watched many videos but Adam's contents are awesome!! Thanks dear!! All the best for future good work!!
Thank you so much 🙂
Awesome video. I've seen a lot of sites and videos and they are so complicated, but all of yours are crystal clear and anyone can understand them.
Thanks Omar :)
Your videos are very informative and practically oriented. Keep going.
Thank you, I will!
Outstanding! You just made Azure easy to learn. Thank you.
Awesome, thank you!
ADF is but just one part of about 100 significant tools and actions in Azure. :-(
Hi Adam, is it possible to create these pipelines as code as well? Or somehow create them from my actual Azure pipeline? It would be sheer insanity (but it is a Microsoft product) to require and maintain two pipelines: one that's your Azure pipeline for CI and CD, and one for ADF. I really want the Azure pipeline to be able to fill/create the ADF pipeline. But I haven't found anything yet.
Your video content is awesome!!! Your video is very useful for understanding Azure concepts, especially for me, as I have just started my Azure journey.
I would like to have one video where we can see how to deploy code from Dev to QA to Prod, and how to handle connection strings, parameters, etc. during deployment.
Thanks again for the wonderful video content.
ADF CI/CD is definitely on the list. It's a fairly complex topic to get right, so it might take time to prepare proper content around it. Thanks for watching and suggesting ;)
very good explanation Adam. keep it up.
Thanks, will do!
@@AdamMarczakYT Adam, is there a trial version of Azure for learning purposes?
Wow ! Fantastic explanation.
Glad you liked it!
Thanks!
These videos are great. Helping me so much! Thanks Adam
Glad you like them!
Another awesome video. The best part of Mapping Data Flow was the Optimization...where we could do Partitioning.
Thank you! Glad you like it :)
Adam, great tutorial! Kudos!
Glad you liked it!
excellent explanation with simple scenario. Thank you.
Glad it was helpful!
You have a real talent for explaining the data flow in a simple way. Thank you so much, Mr. Adam.
It must be very challenging to do all this thing in English for you I imagine, Adam! Congratulations for pushing through despite the difficulty. 🙂
Wow, I like your video. I tried it today and got a good result. Thanks for your good explanation.
Great job! Thanks!
Very good explanation of the Data Flow. Thanks, Mr. Adam.
As usual, another awesome video, Adam!!! Excellent. It was to the POINT!!! Keep up the good work you have been doing for plenty of users like me. Eagerly waiting for more videos like this from you!!!
Can you please have some videos for Azure Search ...
Thank you so very much :) Azure Search is on the list, but there is so much news coming from Ignite that I might need to change the order. Let's see all the news :).
best video on azure I have ever seen❤❤
Hi Adam, thanks for helping us learn new technologies. You are awesome 👌🏻👌🏻👌🏻👏👏.
My pleasure!
Nice video Adam. Professional as always
Wow, thanks!
Hello Adam, I just finished this video. Very well done indeed. Thanks and regards. Bharat
Thanks Bharat :)
Great video! Thanks Adam!
My pleasure!
Brilliant way of explanation
Subscribed to your channel
Thank you, appreciated 🙏
👍 It's amazing, a practical implementation of Data Flow.
Thank you, Adam. As always, you rock.
Impeccable to know reg Mapping Data Flow, Thanks Adam!
My pleasure!
This was explained very well. thank you.
You're very welcome!
Amazing Video, we want other parts !
Very well explained and demonstrated. Really helpful to get started with Data flows.
that was actually not so hard. thanks man, you're awesome.
No! you are awesome! :)
The features are very interesting. I want to try the different partitioning techniques. Thank you for sharing such amazing stuff.
My pleasure! Thanks! :)
Thank you so much for this. I subscribed immediately, very informative and straightforward azure info. Will definitely recommend your channel. Keep up the great work!
Awesome, thank you!
very well done on explaining principles of mapping data flows!!!
Thanks a lot!
I would say this is the best content I've seen so far!! Thank you so much for making it Adam!
Just wondering, is there a Ctrl+Z or Ctrl+Y command in case we made some changes in the data flow and want to restore it to the previous version?
Awesome, thanks! Unfortunately not, but you can use versioning in the data factory, which will allow you to revert to a previous version in case you broke something. Highly recommended. Unfortunately, there are no reverts for specific actions.
@@AdamMarczakYT Excellent!! Thank you so much for your reply!
@@549srikanth I publish each time I create a significant new step in the pipeline, and I use data preview before moving on to the next step. Also, I think you can export the code version of the entire pipeline. Presumably you can then paste that into a new pipeline to resurrect your previous version.
Adam you have an ability at explaining complex things, this tutorial made my day, thanks
Glad it helped! Thanks!
Great! You are the best Adam.
Thank you so much Adam! this was very clear and great video and a big help for my interview and knowledge.
Very welcome! Thanks for stopping by :)
So helpful! Thank you very much Adam!
Hi Adam, a few doubts. Please help me understand.
1. 10:04, After running the data flow the first time, 9125 rows got populated. There is no output sink or output dataset associated with the data flow yet, so where exactly are those ingested rows being saved/populated?
2. 15:04, After re-calculating "title" (by removing the year part), how come the original column (title) disappeared? The modified title column should appear in addition to the original column (title), right?
Hey, 1. It's the number of rows loaded. 2. If you create a new column with the same name, it will replace the old one. In this case we replaced the title column.
Adam, thanks for this excellent video. You explained almost every feature available in data flows. Looking forward to a video on Azure SQL DWH. I know it will be great to learn about it from you.
Glad it was helpful! I'm just waiting for the new UI to come to public preview, then the video will be done :)
Around 20:20, we can see there is just one partition. Does Azure automatically decide the number of partitions it needs to divide the dataset into? Also, is it done at some cost, i.e. do more partitions cost more, or is it complimentary?
Thank you for all the tutorials, I have been binge-watching them for 3 days now and thoroughly enjoying them! Would love to see some tutorials for Synapse as well :) !
Wow should have waited before making the comment as you have explained it later in the video itself. Thank you Adam !
Glad it helped, thanks! :)
Wow..lucid explanation..
Glad you think so!
Love these videos, so easy to understand. Do you have a video on the new XML connector?
Great, thanks! Not yet, maybe in near future :)
Very nice tutorial 👍
Thank you! Cheers!
Thank you Adam Dzienkuje, this is a great tutorial.
Thanks buddy ...Great work
My pleasure
best tutorial ever... 💪🏻💪🏻💪🏻
Very useful. Thank you so much.
Glad it was helpful!
Awesome videos Adam, your videos are great help to learn Azure. Keep it up :)
Thanks, will do!
Your videos are really great and helped me understand a lot of Azure concepts. Can you please make one using an SSIS package and show how to use that within Azure Data Factory?
thanks for the great content!! you are the man :)
I appreciate that!
nice & detailed video.
Thank you!
Lovely bro!!
Thanks 🔥
Thank you, very helpful tutorials
Appreciate your content. Thanks.
My pleasure! :)
Great video. Can you use data from a REST API as a source for a Mapping Data Flow, or does the source have to be a dataset on Azure?
Here is the list of supported data sources for MDF: docs.microsoft.com/en-us/azure/data-factory/data-flow-source?WT.mc_id=AZ-MVP-5003556 . Just copy the data from the REST API to Blob and then start the MDF pipeline using that blob path as a parameter.
Excellent tutorials
Good explanation there.
Great video Adam. What is the difference between Data Flow and Copy Activity within a pipeline? When should we create a Data Flow instead of using a Copy Activity?
Copy Activity just moves data in a 1:1 fashion (or a subset of columns). Data Flow allows for transformations, joins, and aggregations across multiple inputs/outputs in a single step.
Amazing videos.
Glad you think so! :)
Hi Adam, so glad I found your channel. Your videos were a big help for achieving the AZ900 certificate. Now I am studying a lot to uplift my knowledge and get the Azure data engineer certificate. However, I have an important question! Data flows are expensive, sometimes clients don’t want to use this, are there alternatives to achieve the same result in azure data factory? Thank you very much!
Well you can't have the cookie and eat the cookie :) In my opinion it's not that expensive compared to other available tools.
@@AdamMarczakYT True! I am currently struggling with CSV files that sometimes have extra spaces after the words in the header, which then causes an error when doing a copy activity to Azure SQL Database. Do you have any idea how to make my flow a bit more flexible so that it can deal with this? It needs some trimming in the header.
I thought of doing a SELECT in a data flow to change to the correct header titles, but for this I would need to know where the spaces will be in the future. So that is also not flexible.
Do you plan to make a video introducing each of the transformation components? Thanks
Adam, to use transformations, do I need to learn Scala? Or can I just refer to the documentation you specified for Scala functions and write the transformation?
Documentation should be enough. MDF is targeting simple transformations so in most cases documentation alone will suffice.
For anyone wondering how to make the year check (or any check) in the second step more robust, you can replace the expressions with the 'case' expressions below, which say: if this expression evaluates as true, do this, else do something else.
Worth noting here that the first expression provides only a true branch, while the second expression has both true and false branches. As per the documentation on the 'case' expression: "If the number of inputs are even, the other is defaulted to NULL for last condition."
/* Year column expression */
/* If the title contains a year, extract the year, else set to Null */
case(regexMatch(title, '([0-9]{4})'),toInteger(trim(right(title, 6), '()')))
/* title column expression*/
/* If the title contains a year, strip the year from the title, else leave the title alone */
case(regexMatch(title, '([0-9]{4})'),toString(left(title, length(title)-7)), title)
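To see what these two expressions are meant to do, here is a minimal Python sketch of the same "extract the year if present, else fall back" logic. It is not Data Flow expression code, and the function name `split_title` and the sample titles are made up for illustration. It assumes the year always appears as a trailing "(YYYY)". One thing that may be worth double-checking in the expressions above: in most regex dialects unescaped parentheses form a group, so '([0-9]{4})' would match any four digits rather than a year in literal parentheses; the sketch escapes them.

```python
import re
from typing import Optional, Tuple

# Assumption: the year, when present, is a trailing four-digit number in parentheses.
YEAR_SUFFIX = re.compile(r"\s*\((\d{4})\)\s*$")


def split_title(title: str) -> Tuple[str, Optional[int]]:
    """Return (clean_title, year); year is None when there is no trailing "(YYYY)"."""
    match = YEAR_SUFFIX.search(title)
    if match is None:
        # Same fallback as the second 'case' expression: leave the title alone.
        return title, None
    # Cut the title where the " (YYYY)" suffix starts and convert the year to int.
    return title[:match.start()], int(match.group(1))


print(split_title("Toy Story (1995)"))
print(split_title("Untitled Movie"))
```

Anchoring the pattern to the end of the string means a title such as "2001: A Space Odyssey (1968)" is split on the trailing year, not on the digits inside the title.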
Thanks Paul :) I used as simple an example as possible for people who aren't fluent in Scala, but of course you always need to cover all possible scenarios. Sometimes I prefer to fail the transformation rather than continue with fallback logic, as I expect some values to be present.
@@AdamMarczakYT Of course, I just wanted to see if I could take it a step further to align more closely with what would be needed in a production data engineering scenario and thought others may have the same idea. Thanks for the content! :)
Thanks, I bet people will appreciate this :)
Please also explain how to use data analytics in a pipeline flow.
This is the best content. Thank you so much.
Thank you. As to your question, can you elaborate on the data analytics part? What exactly would you like to see?
Anything like how to make a function or procedure and how to use it in a pipeline to execute, i.e. the basic flow of a pipeline using analytics.
Hey, I do plan to have implementation videos like this in the future. However, the pipeline of videos is long, so I can't promise anything right now. I added this to the list of potential topics :) thanks!
okay, Thanks
Hello Adam, thanks a bunch for this excellent video. The tutorial was very thorough and anyone new can easily follow. I do have a question though. I am trying to replicate an SQL query into the Data Flow, however, I have had no luck so far.
The query is as follows:
Select ZipCode, State
From table
Where State in ('AZ', 'AL', 'AK', 'AR', 'CO', 'CA', 'CT'...... LIST OF 50 STATES);
I tried using Filter, Conditional Split and Exists transforms, but could not achieve the desired result. Being new to the Cloud Platform, I am having a bit of trouble.
Might I request you please cover topics like Data Subsetting/Filtering (WHERE and IN Clauses etc.) in your tutorials.
Appreciate your time and help in putting together these practical implementations.
Thank you Adam.
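For readers with the same question, the WHERE ... IN query above boils down to a membership test per row, which is what a Filter transformation would need to express. If I read the data flow expression documentation correctly, there is an `in(array, item)` function that could serve as the filter condition, but treat that as an assumption to verify. The Python sketch below only illustrates the semantics; `ALLOWED_STATES` is a hypothetical subset (the full list is elided in the question and stays elided here), and the sample rows are made up.

```python
from typing import Dict, Iterable, List

# Hypothetical subset of states; the original list of 50 is not reproduced.
ALLOWED_STATES = frozenset({"AZ", "AL", "AK", "AR", "CO", "CA", "CT"})


def filter_by_state(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only ZipCode and State for rows whose State is in ALLOWED_STATES."""
    return [
        {"ZipCode": row["ZipCode"], "State": row["State"]}  # SELECT ZipCode, State
        for row in rows
        if row["State"] in ALLOWED_STATES  # WHERE State IN (...)
    ]


sample = [
    {"ZipCode": "85001", "State": "AZ", "City": "Phoenix"},
    {"ZipCode": "10001", "State": "NY", "City": "New York"},
]
print(filter_by_state(sample))
```

The projection (dropping City) corresponds to a Select step and the membership test to a Filter step, so in a data flow these would likely be two separate transformations.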
Thanks Adam !! very informative video.Liked it a lot..
Thanks, and you are most welcome! Glad to hear it.
Hi Adam, thanks for making these videos, very clear and concise. I have a question (sorry, not related to this video) regarding Conditional Split: can the output stream activities run in parallel?
They typically run in parallel as it's Apache Spark behind the scenes.
@@AdamMarczakYT Thank you !
Adam, Your content is always easy to grab, excellent work mate. Could you please explain how to create a pipeline which has a copy activity followed by a mapping data flow activity.
Thanks, just drag and drop copy activity and data flow blocks on the pipeline and drag a line from copy to data flow activity.
I really like your tutorials. I have been looking for a "table partition switching" tutorial but haven't found any good ones. Maybe you could do one for us? I am sure it'll be very popular as there aren't any good ones out there and it is an important topic in certifications :-)
Great video.
Question: Under "New Datasets", is there a capability to drop data into Snowflake? I see S3, Redshift, etc.
I appreciate the video and feedback!
A quick question: Azure datasets seem to support only already structured data, like CSV or JSON. What if my data source is an unstructured text file that must be transformed into CSV before being used? Is there a way to do this transformation (possibly with Python code) in Data Factory?
Hey, you can call azure databricks which can transform any file using Python/Scala/R etc. But data factory itself can't do it.
@@AdamMarczakYT Got it. Thanks a lot! It looks like I have to learn Spark :-)
Adam, is there a way to preserve the filename and just have it change the extension? For instance, I'm adding a column with datetime, but at the end I would like it to have the same file name, just parquet. Is there a way to do that?
Use expressions :) That's what they are for.
@@AdamMarczakYT Sorry if it was a dumb question, I'm still new to ADF. Ignore if it's too inane, but is filename in the @pipeline parameter? I found one online but couldn't get it to parse.
Video is excellent. I want to know the problem statement that Data Flow is solving.
Adam, excellent presentation of the ADF concept. I find all your videos really helpful in understanding the ADF concept. One question regarding the sink dataset in a data flow: how can I create a dynamic folder in my blob storage based on the year, month and day when this data flow was triggered?
Depends on what you want to achieve. You can either set partitioning by a date column, which will split the data by date, or, if you want to put the entire dataset in one folder named by date, use a formatDateTime expression like formatDateTime(utcNow(), 'yyyy/MM/dd') as the path.
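As a rough illustration of what the formatDateTime pattern above produces, here is a small Python sketch that builds the same year/month/day folder path. This is not ADF expression code; the helper name `dated_folder` and the base folder "output" are assumptions made for the example.

```python
from datetime import datetime, timezone


def dated_folder(base: str, now: datetime) -> str:
    """Build a folder path like "<base>/yyyy/MM/dd" from the given timestamp."""
    # %Y/%m/%d is the strftime counterpart of the yyyy/MM/dd format string.
    return f"{base.rstrip('/')}/{now.strftime('%Y/%m/%d')}"


# Fixed timestamp so the result is reproducible; in practice pass datetime.now(timezone.utc).
print(dated_folder("output", datetime(2020, 3, 7, tzinfo=timezone.utc)))
```

Month and day are zero-padded, so folders sort chronologically when listed by name.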
Hi Adam, please add some more contents about new features of dataflow, it's your channel only where I see azure add, no one teaches better than you do as I have compared with many channels.
Thank you! What kind of features would you think would be interesting to see?
Can you please explain how to connect a source dataset from Azure Data Lake Storage Gen2 tables in data flows of Azure Data Factory?
It's the same as blob storage: just create a linked service, select Azure Table Storage, and create a dataset for it. Note that this is not supported for Mapping Data Flows.
Thanks for such a good video.
Instead of Scala functions, is there a way we can use PySpark functions for debugging? BTW, these are great videos, thank you.
Unfortunately not at this time. If you need more complex constructs or different languages, you need to use Databricks or HDInsight :)
Great job! Thanks for all
Thank you too!
Very good explanation. Keep going.
Thanks. Will do.
Hi Adam, that's a great tutorial, many thanks for it. I have a question: can we write the transformation functions in a different language like Python or R instead of Scala? If yes, can you please share some details on it?
Unfortunately not right now :( If you need those then use Azure Databricks instead.
Or a Python/R script in a batch process, right? Databricks would be the better option if you need Spark, since it's also more expensive than batch.
How do you solve the parallel execution of your pipeline when triggered by events to avoid duplicates?
You need to do this as part of your flow design. Unfortunately some things can't be solved by tools. Thanks for watching! :)
@@AdamMarczakYT Ok maybe handle it by Run ID I guess :)