- 78 videos
- 156,463 views
endjin
United Kingdom
Joined 16 Jul 2015
We help small teams achieve big things.
We are a UK-based, fully remote consultancy specialising in Data, Analytics & AI, and Cloud Native App Dev on Microsoft Fabric & Azure, and we are a .NET Foundation Corporate Sponsor.
We produce two free weekly newsletters:
☁️ Azure Weekly - azureweekly.info for all things about the Microsoft Azure Platform,
📈 Power BI Weekly - powerbiweekly.info for all things data visualisation and Power Platform.
Keep up with everything that's going on at endjin via our blog:
👉 endjin.com/blog
👉 endjin.com/talks
👉 endjin
👉 www.linkedin.com/company/endjin
Information about our Open Source projects can be found at endjin.com/open-source
Find out more at endjin.com
#Microsoft #MicrosoftFabric #PowerBI #AI #Data #Analytics #DevOps #Azure #Cloud
Microsoft Fabric + Data Mesh - a perfect fit? ❤️ or 💔?
In this video, Barry Smart, Director of Data & AI, examines how Microsoft Fabric can support a Data Mesh vision for data and analytics.
Overview:
As we transition to a digital age powered by data, we discuss how data professionals can help their organizations thrive. Discover the capabilities of Microsoft Fabric, its role in creating data products, and how it aligns with the core principles of Data Mesh. Learn about the advancements in technology, the integration of open-source tools, and the importance of a socio-technical approach to data. Don't miss insights on federated computational governance and DataOps. Watch to understand how to drive value and innovation in your organization with...
501 views
Videos
Data Engineering Observability in Microsoft Fabric!
433 views • 1 month ago
In part 5 of this course Barry Smart, Director of Data and AI, walks through a demo showing how to improve observability of your data engineering processes using readily available technology and platforms in Microsoft Fabric and Azure. Barry begins the video by explaining that one of the pitfalls of using Fabric Notebooks in an operational setting is that it can impede observability of wha...
Visualise your Medallion Architecture with Task Flows in Microsoft Fabric
505 views • 1 month ago
Adopting Task Flows in Fabric: Titanic Diagnostic Analytics Series, Part 4. In this episode of our Titanic Diagnostic Analytics series, we dive into the new task flow feature in Fabric to optimize workspace organization and implement reference architectures. We'll recap our ongoing data product development aimed at creating an interactive Power BI report to analyze Titanic passenger survival pat...
Testing Notebooks with Microsoft Fabric - Titanic Survivor Predictive Analytics - Part 3
417 views • 2 months ago
In part three of the Titanic Diagnostic Analytics series, Barry Smart delves into testing data engineering functionality in Microsoft Fabric Notebooks. This video focuses on developing, testing, and automating code to project data to the gold layer of the lake, employing test-driven development principles. We also outline the benefits of using Fabric Notebooks and how to overcome potential pitf...
Power BI Data Stories: Global Brand Insights: 20 Years of Financial Trends
2.1K views • 2 months ago
This walkthrough of the Global Brand Insights Report, built using financial data ingested from the Yahoo Finance API, examines how design and visualisation choices make data stories with Power BI compelling. Explore the creative journey behind the Global Brand Insights Report on Power BI, showcased on the Power BI Data Stories Gallery: community.fabric.microsoft.com/t5/Data-Stories-Gallery/Global-Brand-Ins...
Compelling Data Storytelling with Power BI: Titanic Survivors
337 views • 4 months ago
Creative Walkthrough: Titanic Passenger Diagnostic Report in Power BI. Explore the Titanic Passenger Diagnostic Report created using Power BI and published to the Data Stories Gallery. In this video, Paul Waller walks you through the design decisions and data visualization techniques used, inspired by an interactive museum exhibit. Learn about the demographics, survival rates, and the aftermath...
10x Spark performance improvement in Microsoft Fabric
603 views • 4 months ago
Boosting Apache Spark Performance with Small JSON Files in Microsoft Fabric. Learn how to achieve a 10x performance improvement when ingesting small JSON files in Apache Spark hosted on Microsoft Fabric. Ian Griffiths, Technical Fellow at endjin, shares insights and techniques to overcome Spark's challenges with numerous small files, including parallelizing file discovery and optimizing data lo...
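The exact technique is shown in the video; as a rough sketch of the general idea - discover the many small files in parallel, then read them in a single pass with an explicit schema so Spark doesn't infer it file by file - something like the following PySpark could work. The folder layout, schema and table name here are hypothetical.

from concurrent.futures import ThreadPoolExecutor
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.getOrCreate()

# Hypothetical layout: one folder per day, each containing thousands of small JSON files.
day_folders = [f"Files/raw/telemetry/2024-06-{d:02d}" for d in range(1, 31)]

def list_json_files(folder):
    # mssparkutils is available in Fabric/Synapse notebooks; swap in your own lister elsewhere.
    from notebookutils import mssparkutils
    return [f.path for f in mssparkutils.fs.ls(folder) if f.path.endswith(".json")]

# Parallelise file discovery rather than enumerating folders one at a time.
with ThreadPoolExecutor(max_workers=16) as pool:
    all_files = [path for batch in pool.map(list_json_files, day_folders) for path in batch]

# An explicit schema avoids a second pass over every small file for schema inference.
schema = StructType([
    StructField("deviceId", StringType()),
    StructField("reading", DoubleType()),
    StructField("timestamp", TimestampType()),
])

df = spark.read.schema(schema).json(all_files)
df.write.mode("append").saveAsTable("bronze_telemetry")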
Microsoft Fabric: Good Notebook Development Practices 📓 (End to End Demo - Part 8)
3K views • 5 months ago
Microsoft Fabric End to End Demo - Part 8 - Good Notebook Development Practices. Notebooks can very easily become a large, unstructured dump of code with a chain of dependencies so convoluted that it becomes very difficult to track lineage throughout your transformations. With a few simple steps, you can turn notebooks into a well-structured, easy-to-follow repository for your code. In this vide...
Microsoft Fabric: Machine Learning Tutorial - Part 2 - Data Validation with Great Expectations
1.5K views • 6 months ago
In part 2 of this course, Barry Smart, Director of Data and AI, walks through a demo showing how you can use Microsoft Fabric to set up a "data contract" that establishes minimum data quality standards for data that is being processed by a data pipeline. He deliberately passes bad data into the pipeline to show how the process can be set up to "fail elegantly" by dropping the bad rows and conti...
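The video itself uses Great Expectations; since that library's API differs between versions, here is only a minimal plain-PySpark sketch of the underlying "fail elegantly" pattern - validate rows against a contract, quarantine the bad ones and continue with the rest. Table names and rules are hypothetical.

from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

df = spark.read.table("bronze_passengers")

# Hypothetical contract: key columns present, age within a plausible range.
contract = (
    F.col("PassengerId").isNotNull()
    & F.col("Age").isNotNull()
    & F.col("Age").between(0, 120)
)

good_rows = df.filter(contract)
bad_rows = df.filter(~contract)

# "Fail elegantly": quarantine the offending rows and carry on with the rest.
bad_rows.write.mode("append").saveAsTable("quarantine_passengers")
good_rows.write.mode("overwrite").saveAsTable("silver_passengers")

print(f"Dropped {bad_rows.count()} rows that violated the data contract")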
Microsoft Fabric: Machine Learning Tutorial - Part 1 - Overview of the Course
1.4K views • 6 months ago
In this video Barry Smart, Director of Data and AI, provides an overview of the end to end demo of Microsoft Fabric that we will be providing as a series of videos over the coming weeks. The demo will use the popular Titanic data set to show off features across both the data engineering and data science experiences in Fabric. This will include Notebooks, Pipelines, Semantic Link, MLflow (Experi...
Data is a socio-technical endeavour
37 views • 6 months ago
Our experience shows that the most successful data projects rely heavily on building a multi-disciplinary team.
No Code Low Code is Software DIY: How Do You Avoid a DIY Disaster?
77 views • 6 months ago
No-code/Low-code democratizes software development with little to no coding skills needed. But how do you evaluate if software DIY is the right choice for you? From the blog post: endjin.com/blog/2024/03/no-code-low-code-software-diy
How to Build Navigation into Power BI
71 views • 6 months ago
Explore a step-by-step guide on designing a side nav in Power BI, covering form, icons, states, actions, with a view to enhancing report design & UI. From the blog post: endjin.com/blog/2024/03/how-to-build-navigation-in-power-bi
Data & AI Engineering Maturity
23 views • 6 months ago
As data and AI become the engine of business change, we need to learn the lessons of the past to avoid expensive failures. From the blog post: endjin.com/blog/2024/03/data-ai-engineering-maturity
The Heart of Reactive Extensions for .NET (Rx.NET)
1.1K views • 7 months ago
Microsoft Fabric: Processing Bronze to Silver using Fabric Notebooks
6K views • 10 months ago
Microsoft Fabric: Role of the Silver Lakehouse in the Medallion Architecture
2.6K views • 10 months ago
Microsoft Fabric: Local OneLake Tools
2.8K views • 1 year ago
Show & Tell: A Brief Intro to Tensors & GPT with TorchSharp
778 views • 1 year ago
Microsoft Fabric: Creating a OneLake Shortcut to ADLS Gen2
6K views • 1 year ago
Microsoft Fabric and The Pace of Innovation - The Decision Maker's Guide - Part 3
760 views • 1 year ago
Microsoft Fabric & Generative AI - The Decision Maker's Guide - Part 2
1.1K views • 1 year ago
Hedging your Microsoft Fabric Bet - The Decision Maker's Guide - Part 1
2.3K views • 1 year ago
Microsoft Fabric: Ingesting 5GB into a Bronze Lakehouse using Data Factory - Part 3
7K views • 1 year ago
Microsoft Fabric: Inspecting 28 MILLION row dataset in Bronze Lakehouse - Part 2
7K views • 1 year ago
Microsoft Fabric: Lakehouse & Medallion Architecture - Part 1
16K views • 1 year ago
A 10 minute Tour Around Microsoft Fabric
6K views • 1 year ago
Microsoft Fabric Briefing - after 6 months of use on the private preview.
23K views • 1 year ago
Reactive Extensions API in depth: Marble Diagrams, Select() and Where()
115 views • 1 year ago
I've really enjoyed your videos in this series and would love to see more!
Really nice series, when is the next episode due?
TDD in Data Analysis 😍
You may enjoy another talk of ours: endjin.com/what-we-think/talks/how-to-ensure-quality-and-avoid-inaccuracies-in-your-data-insights
Great presentation and video!! The slides are fantastic!! Any hints on where to find a PDF of these?
Hello Ed, and thank you for the video. I have a question: is it possible to see Notebooks and the other "non data related" items in Azure Storage Explorer?
Awesome series. I work for a large organisation and was wondering how to implement the medallion architecture. Would it be best to have workspaces per domain/groups e.g. Transport/Finance/HR each with bronze/silver/gold?
There is no question that Jeffrey van Gogh painted the marbles, since his great-great-great-great-grandfather was the famous painter Van Gogh!!
Great video and explanation of the data mesh and DataOps principles (if anyone wants more info on this, watch the other Titanic videos here); most of the phases are covered.
Thank you for watching! If you enjoyed this episode, please hit like 👍 subscribe ✅ and turn on notifications 🔔 - it helps let the YouTube algorithm know that our content is worth watching! 🙏
How do I create such a great report? It's the best I have ever seen. Please share the technical session on the creation of this report with me.
What a great bit of investigative problem-solving - nice work!
For people like me who aren't gifted at user experience, there were lots of great ideas and value here; as someone else commented, less is definitely more!
Yes, we obviously very much agree!
This video is great; it helps me take principles and practices we understand and believe in from the C# world and apply them to our Fabric projects - nice work!
Excellent! Yes, that's very much the point. Many of the new cloud native analytics platforms allow data folk to adopt and embrace the engineering practices that software folk have enjoyed for the past 25 years! 🎉
A really valuable video to help with an often overlooked part of using notebooks!
We're really glad you found it useful!
Thank you for watching! If you enjoyed this episode, please hit like 👍 subscribe, and turn notifications 🔔 on - it helps let the YouTube algorithm know that our content is worth watching!
Thank you for watching! If you enjoyed this episode, please hit like 👍 subscribe, and turn notifications on 🔔 - it helps us more than you know. 🙏
Cool! Thanks for sharing this!
Very good practice, but the code files are not uploaded. Please help me understand the code - could you upload it to GitHub?
I have just completed this whole playlist, as I have to start a new project on Fabric. Thanks a lot for providing this framework-level information. Highly appreciated. Please do continue updating this playlist with more insights. Liked and subscribed 😊
That is brilliant -- thanks for making this series of such quality on Microsoft Fabric.
Is it possible to do a step by step video on this ? 🥹
What aspects are you particularly interested in?
We are interested to know how the onelake process works. Appreciate it very much.
Hello. Thanks for your interest. We'll try to find time to do another demo to show how we could move the solution onto Microsoft Fabric and use OneLake. There are two other video series in our channel which step through data engineering features in Microsoft Fabric including how we make use of OneLake.
Hi, do you have the source code somewhere on GitHub that I can experiment with?
A perfect example of how simplicity is powerful. Unlike other dashboards crowded with too much information, yours takes care of one element per page. Great learning.
Thanks for the feedback! I worked with my colleague Paul on this project who is a visual design expert. My inclination is to throw more and more data onto the page. But Paul always challenges that. He makes sure we adhere to a user experience (UX) that is accessible, intuitive, delivers a user journey and visually pops. When we build reports like this for clients we also have a domain expert involved. So we do lots of small iterations to get the right balance between those three concerns: business goals, analytics and UX. This multi-disciplinary approach really helps to make sure you get a good product at the end.
Where did you get the other data from?
Hi there. The Postcode data comes from: geoportal.statistics.gov.uk/datasets/a8a2d8d31db84ceea45b261bb7756771/about Ed
Hi, great video series. I have really enjoyed watching and learning from it. One thing I need to ask about: when you read the CSV file, it has no headers. Then you do the apply_transformations step, which I believe needs header info to work properly, or am I wrong? I can't see any steps or code that add header info before you do df = PricePaidWrangler.apply_transformations(df). Can you comment on this?
Great content! Fingers crossed you release next parts in the series soon
Part 3 is currently being worked on!
Hello, nice video. Could you please explain where exactly you got the dfs URL that you pasted into the ADLS shortcut connector? My URL has a blob part in it, and that is preventing me from making the connection. Thanks
Hi there, You should be able to just change "blob" to "dfs" if your storage account is ADLS Gen2 enabled. If it's not ADLS Gen2 enabled, then sadly you can't create a shortcut to a Blob Storage account at the moment. Ed
@@endjin Thanks for your reply. I found out that my storage account is not ADLS Gen2 enabled.
How do I move data from a table in one lakehouse to a table in another lakehouse using PySpark?
Connect to both lakehouses in the left pane of the notebook. Use Spark SQL to select and manipulate the data from your bronze lakehouse, then essentially do the merge similar to what is done in the video, targeting the silver lakehouse.
Further to Ben's response, I would recommend using two-part naming (<lakehouse_name>.<table_name>) when transferring this data. However, if you're just wanting to expose the data from one lakehouse in another lakehouse, remember you can avoid copying data altogether by using Shortcuts. And if you need to do this programmatically, there's an API for this: learn.microsoft.com/en-us/rest/api/fabric/core/onelake-shortcuts/create-shortcut?tabs=HTTP#create-shortcut-one-lake-target-example Ed
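A minimal sketch of moving data between lakehouses using the two-part naming suggested above, assuming both lakehouses are attached to the Fabric notebook; the lakehouse and table names (bronze_lh, silver_lh, passengers) are hypothetical.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Option 1: Spark SQL with two-part naming (<lakehouse_name>.<table_name>).
spark.sql("""
    CREATE OR REPLACE TABLE silver_lh.passengers
    AS SELECT * FROM bronze_lh.passengers WHERE Age IS NOT NULL
""")

# Option 2: the equivalent with the DataFrame API.
df = spark.read.table("bronze_lh.passengers")
(df.filter("Age IS NOT NULL")
   .write.mode("overwrite")
   .saveAsTable("silver_lh.passengers"))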
The video was ok, but it did not explain how the ADLS storage account networking needs to be configured. More specifically, how to configure it in a secure manner, without allowing access from all networks.
Hi there, Thanks for the feedback! This video was recorded at a time when it wasn't possible to connect to ADLS Gen2 if it wasn't publicly accessible. However, now that's changed: if you'd like to understand how to access a partially restricted ADLS account, please see the documents here: learn.microsoft.com/en-us/fabric/security/security-trusted-workspace-access. If you've fully disabled public access, there's no easy way to create a Shortcut that I'm aware of, sadly. Hopefully that functionality will come. Ed
Hi @endjin great videos, have you uploaded the architecture diagram file anywhere that I can download and reuse for my own projects?
Not yet - but it will come! Thanks for your comment, Ed
How can i contact you pls let me know Thanks
This is framework-level work. I'm not sure how many will understand and appreciate the effort you put into creating this video, but I highly appreciate your thoughts and work. At one point I was wondering how I would do it if I got the chance to create a framework, and you have given a very nice guideline here. Once again, thank you for the video; I would like to see your other videos too.
Thanks for the comment! Ed
Is there an option to connect from your local machine directly to the Synapse Spark cluster? It doesn't seem that debug-friendly, having to compile & upload it every time. It almost feels more sensible to host your own autoscaling Spark cluster in Azure Kubernetes Service. If I did that, I could interact directly with the cluster and build sessions locally. What do you think?
In this scenario, it would make more sense to run Spark locally. There are a few ways you can do that, but as you'd expect it's not entirely straightforward, and not something easily addressed in a comment.
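As a rough starting point (not a full answer to the local debugging question), a plain local PySpark session is often enough for iterating on transformation logic; it won't reproduce Synapse-specific behaviour such as linked services, mssparkutils or pool sizing.

# Minimal sketch of a local Spark session for debugging (pip install pyspark).
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local[*]")            # single-machine cluster using all local cores
    .appName("local-debugging")
    .getOrCreate()
)

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
df.show()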
Very clean explanation, I appreciate your efforts. Is there any chance we could get the code for each layer (Bronze to Silver, etc.)? Thanks in advance.
you are the best
Thanks for the kind words! Ed
Thanks for the video! Is there any alternative to run the test notebooks of synapse from the cicd pipeline in azure devops?
There's a couple of ways to achieve this - neither are immediately obvious but definitely possible! There's no API for just running a notebook in Synapse, but you can submit a Spark batch job via the API. However, this requires a Python file as input, so it might mean pulling your tests out of a Notebook and writing and storing them separately in an associated ADO repo: learn.microsoft.com/en-us/rest/api/synapse/data-plane/spark-batch/create-spark-batch-job?view=rest-synapse-data-plane-2020-12-01&tabs=HTTP Possibly an easier route would be to create a separate Synapse Pipeline definition that runs your test notebook(s) and use the API to trigger that pipeline run from your ADO pipeline. This is a straightforward REST API but operates asynchronously, so you'd need to poll for completion as the pipeline/tests are running: learn.microsoft.com/en-us/rest/api/synapse/data-plane/pipeline/create-pipeline-run?view=rest-synapse-data-plane-2020-12-01&tabs=HTTP Hope that helps!
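A rough sketch of that second route - trigger a Synapse pipeline that wraps the test notebook(s), then poll for completion from the ADO job. The workspace endpoint and pipeline name are hypothetical; the token is acquired with azure-identity (e.g. via the ADO service connection).

import time
import requests
from azure.identity import DefaultAzureCredential

endpoint = "https://my-workspace.dev.azuresynapse.net"   # hypothetical workspace endpoint
pipeline_name = "run-notebook-tests"                     # hypothetical pipeline running the test notebooks
api_version = "2020-12-01"

token = DefaultAzureCredential().get_token("https://dev.azuresynapse.net/.default").token
headers = {"Authorization": f"Bearer {token}"}

# Kick off the pipeline run.
run = requests.post(
    f"{endpoint}/pipelines/{pipeline_name}/createRun",
    params={"api-version": api_version},
    headers=headers,
).json()
run_id = run["runId"]

# Poll until the run reaches a terminal state.
while True:
    status = requests.get(
        f"{endpoint}/pipelineruns/{run_id}",
        params={"api-version": api_version},
        headers=headers,
    ).json()["status"]
    if status in ("Succeeded", "Failed", "Cancelled"):
        break
    time.sleep(30)

# Fail the ADO step if the test pipeline did not succeed.
assert status == "Succeeded", f"Test pipeline finished with status: {status}"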
Do you know how to view the definition of the view or stored procedures?
Hi - I don't believe there's a way in Synapse Studio to automatically script out the definitions like you can do in, say, SQL Server Management Studio. But you can see the column definitions for your View if you find your database under the Data tab and expand the nodes in the explorer. Hope that helps!
Looking forward to many more of these!
Barry has ~9 parts planned!
Great videos 👍👍 Microsoft advocates using separate workspaces for bronze, silver and gold, but that seems to be harder to achieve due to some current limitations. If we go with a single workspace and a folder-based setup like the example, will it be hard to switch to separate workspaces in the future? Is there any prep we can do to make this switch easier going forward (or would there be no need to switch to a multi-workspace approach)?
Hi there, Thanks for your comment! It's a great question. Personally, unless there are strong requirements (e.g. high data sensitivity/unique security requirements) for splitting zones apart into separate workspaces, I would default to one workspace. In my experience, the same team is often involved in managing all three layers of the Lakehouse, and "end-users" mostly only get access to semantic models (or the "Gold" layer at a push), so giving every zone its own workspace isn't really justified. Deployment also becomes trickier when multiple workspaces are involved. All that being said, I can appreciate there are scenarios where multiple is more suitable. W.r.t. how to design for the future: my first comment would be "only do it if you need to". No need to change tack just for the sake of it. If a single workspace is working for you then just stick with it. But if you do need to switch for whatever reason, then sadly there'll always be a significant migration overhead. The best thing you can do is have well structured workspaces and notebooks like I've shown here. Highlight the artifacts that are relevant to each layer of the architecture so you can get a picture of the interdependencies. Deployment pipelines / REST APIs will be your friend if you have to migrate too - worth getting familiar with these if you haven't already. The beauty of the medallion architecture is that the core structure of your data pipelines stays the same, whatever workspace infrastructure architecture you opt for! Hope this helps, Ed
Thanks a lot Barry. Great video. I couldn't find the repository for the series among your GitHub repos. Will there be one?
Thanks for the feedback. Glad to hear you are enjoying it. Yes - we are planning to release the code for this project on Git at some point soon.
Thanks for the great video, very useful. One question: you are using PySpark in your notebooks, but how would you recommend modularizing the code in Spark SQL? Maybe by defining UDFs in separate notebooks that are then called in the 'parent' notebook?
Sadly you don't have that many options here without having to fall back to Python/Scala. You can modularize at a very basic level using notebooks as the "modules", containing a bunch of cells which contain Spark SQL commands. Then call these notebooks from the parent notebook. Otherwise, as you say, one step further would be defining UDFs using some Python and then using spark.udf.register to be able to invoke them from SQL. Ed
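A small sketch of the UDF route mentioned above: define the helper in Python once (for example in a utility notebook that the parent notebook runs first), register it, and the rest of the transformation can stay in Spark SQL. The function, column and table names below are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

def normalise_port(value):
    # Hypothetical cleaning rule used by the SQL below.
    return value.strip().title() if value else None

spark.udf.register("normalise_port", normalise_port, StringType())

# The 'parent' notebook can now remain purely Spark SQL:
spark.sql("""
    SELECT PassengerId, normalise_port(Embarked) AS Embarked
    FROM bronze_lh.passengers
""").show()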
Really looking forward to this series, thanks for taking the time to put it together. I really enjoy the pace and level of detail of the content @endjin puts together.
Great series. What naming convention are you using in the full version of the solution? I noticed the LH is prefixed with HPA.
The HPA prefix stands for "House Price Analytics", although the architecture diagram on the second video has slightly old names, as you've probably noticed. The full version uses <medallion_layer>_Demo_LR, where LR stands for "Land Registry". Ed
@endjin - Thanks for the clarification Ben.
Finally someone made this video!! Thank you for doing this.
Hi Ed, we have loaded a few tables using Synapse Link into ADLS Gen2 and created a shortcut to access the ADLS Gen2 files in Fabric, but while loading the files into tables we are not getting the column names; they show up as c0, c1, ... etc., which is causing an issue. Can you please give some insights on how to overcome this and load the tables with the metadata as well?
Hi - thanks for the comment! Which Synapse Link are you using? Dataverse? If so, this uses the CDM model.json format which doesn't include header rows in the underlying CSV files. You would have to read the shortcut data, apply the schema manually, and then write the data out to another table (inside a Fabric notebook or something) if you wanted to use that existing data. However, if you're using Synapse Link for Dataverse, you should instead consider using the new "Link to Microsoft Fabric" feature available in Dataverse: learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-view-in-fabric. This will include the correct schema.
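A sketch of the "apply the schema manually, then write the data out to another table" workaround described above. The shortcut path, column names and types here are hypothetical and should come from your Synapse Link model.json.

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, TimestampType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("accountid", StringType()),
    StructField("name", StringType()),
    StructField("statecode", IntegerType()),
    StructField("modifiedon", TimestampType()),
])

df = (
    spark.read
    .format("csv")
    .option("header", "false")    # the underlying CSVs have no header row
    .schema(schema)
    .load("Files/synapse_link_shortcut/account/*.csv")
)

# Write out as a proper table so downstream users get named columns instead of c0, c1, ...
df.write.mode("overwrite").saveAsTable("account")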