- 20 videos
- 167,613 views
Data Engineering With Nick
United States
Joined 1 Aug 2020
Data Engineering, Azure, Data Pipelines, Python, Databricks
Teaching Data Engineering, Azure, Python, SQL, Databricks
How to Work with JSON Data in SQL Server (Simple and Complex JSON)
This video goes over how to work with JSON data in SQL Server. It covers how to work with and parse simple JSON, complex JSON (using OPENJSON and Cross Apply), and finally how to modify JSON data.
Links:
-Original Pokemon JSON data modified from: github.com/Biuni/PokemonGO-Pokedex
Views: 609
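For a quick feel of the OPENJSON / CROSS APPLY pattern the video demonstrates, here is a minimal sketch run from Python with pyodbc; the connection string and the tiny inline JSON are placeholder assumptions, not the video's exact data or queries.

```python
# Hypothetical sketch (not the video's exact queries): parse nested JSON with
# OPENJSON and CROSS APPLY in SQL Server, executed from Python via pyodbc.
# The connection string and sample JSON are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=localhost;DATABASE=master;"
    "Trusted_Connection=yes;TrustServerCertificate=yes;"
)

sql = """
DECLARE @json NVARCHAR(MAX) = N'{
  "pokemon": [
    {"name": "Bulbasaur",  "type": ["Grass", "Poison"]},
    {"name": "Charmander", "type": ["Fire"]}
  ]
}';

-- OPENJSON ... WITH parses each object in the array; CROSS APPLY then expands
-- the nested "type" array into one row per (pokemon, type) pair.
SELECT p.name, t.[value] AS poke_type
FROM OPENJSON(@json, '$.pokemon')
     WITH (name NVARCHAR(50) '$.name',
           type NVARCHAR(MAX) '$.type' AS JSON) AS p
CROSS APPLY OPENJSON(p.type) AS t;
"""

for name, poke_type in conn.cursor().execute(sql).fetchall():
    print(name, poke_type)
```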
Videos
Azure Data Factory CI/CD Process with Azure Pipelines Using Linked Templates (New Template Specs)
848 views • 3 months ago
This video goes over how to write code to package up and promote a DEV Azure Data Factory to a UAT Data Factory and PROD Data Factory using Linked ARM Templates via a new Linked Template Specs process. Linked Templates are needed when your Data Factory size is over 4MB. This process goes over how to deploy large Data Factories. If your Data Factory size is under 4MB, use the normal CI/CD proces...
Learn Azure Functions Python V2 (Part 2: Deploy, Configure, and Use in Azure)
6K views • 6 months ago
This video goes over how to deploy, configure, and use the Python V2 Functions in the cloud and builds on the part 1 video (setting up and using the functions locally: ruclips.net/video/I-kodc4bs4I/видео.html). The video covers the following: -What Azure resources are required before deploying to the cloud -How to deploy local Python V2 Functions to the cloud -How to use and run the functions i...
Azure Pipelines Download Latest Successful Artifact From Another Pipeline Automatically
936 views • 8 months ago
This video goes over how to write code to automatically find and download the latest successful artifact from an Azure Pipeline to another pipeline and an option to manually pass in a specific build instead. Links: -GitHub Code Repo: github.com/DataEngineeringWithNick/AzurePipelineDownloadOtherPipelineArtifact -Install Azure CLI Extensions: learn.microsoft.com/en-us/cli/azure/azure-cli-extensio...
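The video's script is built around the Azure CLI; as a rough alternative sketch (not the repo's code), the same idea can be expressed against the Azure DevOps REST API. The organization, project, definition id, artifact name, and PAT below are all placeholder assumptions.

```python
# Hypothetical sketch: find the latest successful run of another pipeline and
# download one of its artifacts via the Azure DevOps REST API (the video itself
# uses the Azure CLI). All names and ids below are placeholders.
import requests

ORG = "my-org"                   # placeholder organization
PROJECT = "my-project"           # placeholder project
DEFINITION_ID = 42               # placeholder pipeline (build definition) id
ARTIFACT_NAME = "drop"           # placeholder artifact name
PAT = "<personal-access-token>"  # supply a PAT with Build (read) scope

auth = ("", PAT)  # the PAT goes in the password slot; username stays empty
base = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/build"

# Latest completed + succeeded run for the definition
runs = requests.get(
    f"{base}/builds",
    params={
        "definitions": DEFINITION_ID,
        "statusFilter": "completed",
        "resultFilter": "succeeded",
        "$top": 1,
        "queryOrder": "finishTimeDescending",
        "api-version": "7.0",
    },
    auth=auth,
)
runs.raise_for_status()
build_id = runs.json()["value"][0]["id"]

# Download the named artifact from that run as a zip
artifact = requests.get(
    f"{base}/builds/{build_id}/artifacts",
    params={"artifactName": ARTIFACT_NAME, "$format": "zip", "api-version": "7.0"},
    auth=auth,
)
artifact.raise_for_status()
with open(f"{ARTIFACT_NAME}.zip", "wb") as f:
    f.write(artifact.content)
```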
Learn Azure Functions Python V2 (Local Setup and Examples)
24K views • 11 months ago
This video goes over how to set up and work with Azure Functions Python V2 locally. Includes HTTP and storage example functions. Links: -Part 2 RUclips Video (Deploy, Configure and Use in Azure): ruclips.net/video/_349bwtFkE8/видео.html -Install Azure Functions Core Tools: learn.microsoft.com/en-us/azure/azure-functions/functions-run-local?tabs=windows,isolated-process,node-v4,python-v2,http-tri...
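For reference, a minimal function_app.py in the V2 decorator-based programming model (not the exact code from the video) with one HTTP trigger and one blob trigger; the route and blob path are made-up example values. Run it locally with `func start` from Azure Functions Core Tools.

```python
# Minimal V2-model function_app.py sketch (decorator-based programming model).
# The route and blob path are example values, not the ones from the video.
import azure.functions as func

app = func.FunctionApp(http_auth_level=func.AuthLevel.ANONYMOUS)

@app.route(route="hello")
def hello(req: func.HttpRequest) -> func.HttpResponse:
    # GET http://localhost:7071/api/hello?name=Nick
    name = req.params.get("name", "world")
    return func.HttpResponse(f"Hello, {name}!", status_code=200)

@app.blob_trigger(arg_name="blob",
                  path="samples/{name}",
                  connection="AzureWebJobsStorage")
def process_blob(blob: func.InputStream):
    # Fires when a new blob lands in the "samples" container
    data = blob.read()
    print(f"Processed {blob.name}: {len(data)} bytes")
```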
Complete Azure Data Factory CI/CD Process (DEV/UAT/PROD) with Azure Pipelines
35K views • 1 year ago
This video goes over how to write code to package up and promote a DEV Azure Data Factory to a UAT Data Factory and PROD Data Factory. Links: -GitHub repo code: github.com/DataEngineeringWithNick/DataFactoryCICD -Data Factory automated publishing CI/CD documentation: learn.microsoft.com/en-us/azure/data-factory/continuous-integration-delivery-improvements -Npm Data Factory utilities package: ww...
How to Install Python on Windows (and Fix Errors)
5K views • 1 year ago
In this video, I'll show you how to install Python on Windows, which version to select and how to fix issues when Python is installed but not running correctly. Link for the Python Windows downloads: www.python.org/downloads/windows/
Azure Data Factory Pipeline Bad Request Null Error Fixed
1.4K views • 1 year ago
If you've worked with Azure Data Factory long enough, you will hit the bad request null error when trying to run your pipeline. In this video, I'll show you how to fix it including the most common reasons it happens and how to debug it in a larger pipeline with multiple activities. Hope this helps!
How To Use GitHub Action Secrets In Your Python Script In 3 Steps
7K views • 1 year ago
Step-by-step guide on how to use GitHub Action Secrets in a Python script that can be used for any Python script.
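The pattern, in short (a sketch with a made-up secret name, not the video's exact script): map the secret to an environment variable on the workflow step, then read it with os.environ in the Python script.

```python
# read_secret.py - sketch of reading a GitHub Action secret in Python.
# "API_KEY" is a made-up secret name; the workflow step maps it in with
#   env:
#     API_KEY: ${{ secrets.API_KEY }}
import os

api_key = os.environ.get("API_KEY")
if not api_key:
    raise SystemExit("API_KEY is not set - check the workflow's env mapping")

# Use the secret (its value is automatically masked in the Action logs)
print("Secret loaded, length:", len(api_key))
```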
How to Run Python Scripts in GitHub Action Workflows
17K views • 1 year ago
This video goes over how to run Python scripts in GitHub Action Workflows and how to do the following: - Use GitHub Action Secrets in your Python Script - Run the script with the Setup Python Action - Run with PowerShell - Run Multiple Python Versions and OS either one at a time or in parallel - Cache Dependencies GitHub repo: github.com/Ludwinic1/GitHub-Actions-Python
Purview Automation Python Library to Help Scale and Automate Azure Purview
1.7K views • 1 year ago
Purview Automation is a Python library designed to be simple to use and makes scaling and automating Azure Purview easier. Create, delete, scale, rollback and automate Purview collections and delete assets in collections! Links: Purview Automation Repo: github.com/Ludwinic1/purviewautomation Azure Purview Info: learn.microsoft.com/en-us/azure/purview/overview Azure Service Principal Info: learn...
Code Runner Visual Studio Code Python Error Fixed
15K views • 2 years ago
This video goes over how to configure Code Runner to run correctly in your Python virtual environments and how to fix some common errors you'll see when setting it up.
Azure Data Pipeline Overview
9K views • 4 years ago
This is the first video of an eight-part video series on how to build an Azure data pipeline from scratch. In this video, we go over what the completed pipeline will look like and execute the completed pipeline to see it in action! Below is the link to the entire video series: ruclips.net/video/7tvg-UMdes0/видео.html
Working with Databricks and Writing a Python Script to Clean/Transform Data
28K views • 4 years ago
This is the fourth video of an eight-part video series on how to build an Azure data pipeline from scratch. In this video, we load data from the Azure Data Lake, write a Python script to clean/transform it and then write the clean data back into a new Data Lake folder. We also go over some helpful Databricks tips. Below is the link to the entire video series: ruclips.net/video/7tvg-UMdes0/видео....
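A minimal sketch of that clean/transform step, assuming the Data Lake is already mounted under /mnt/datalake and the raw data is CSV; the paths and column names are placeholders, not the ones used in the video.

```python
# Databricks notebook sketch: read raw data from the lake, clean it, write it back.
# The built-in `spark` session is available in Databricks notebooks.
from pyspark.sql import functions as F

raw_df = (spark.read
          .option("header", True)
          .csv("/mnt/datalake/raw/sales/"))        # placeholder raw folder

clean_df = (raw_df
            .dropDuplicates()
            .na.drop(subset=["order_id"])                       # drop rows missing the key
            .withColumn("order_date", F.to_date("order_date"))  # normalize the date column
            .withColumn("amount", F.col("amount").cast("double")))

(clean_df.write
 .mode("overwrite")
 .parquet("/mnt/datalake/clean/sales/"))           # placeholder clean folder
```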
Setup AzCopy to Upload Local Files to Azure
1.2K views • 4 years ago
This is the seventh video of an eight-part video series on how to build an Azure data pipeline from scratch. In this video, we set up AzCopy (a command line utility) in order to upload files to the cloud easily and quickly. Below is the link to the entire video series: ruclips.net/video/7tvg-UMdes0/видео.html
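The upload itself is a single AzCopy command; here is a small sketch that wraps it from Python. The local path, storage account, container, and SAS token are placeholder assumptions, and the video may use a different authentication method.

```python
# Sketch: call AzCopy from Python to upload a local folder to Blob storage.
# Assumes azcopy (v10) is installed and on PATH; account/container/SAS are placeholders.
import subprocess

LOCAL_PATH = r"C:\data\raw"                               # placeholder local folder
DESTINATION = (
    "https://mystorageaccount.blob.core.windows.net/raw"  # placeholder account/container
    "?<SAS-token>"                                        # placeholder SAS token
)

subprocess.run(
    ["azcopy", "copy", LOCAL_PATH, DESTINATION, "--recursive"],
    check=True,  # raise if azcopy returns a non-zero exit code
)
```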
Create Data Factory and an Azure SQL Database
2.5K views • 4 years ago
Building Data Factory Pipeline Steps and Creating an Event Trigger
1.5K views • 4 years ago
Create an Azure Data Lake and Key Vault
3.7K views • 4 years ago
Create and Connect Databricks to an Azure Data Lake
5K views • 4 years ago
This did not work for a dedicated App Service plan; the last step of the publish is not syncing triggers. Tried manually restarting without any luck. Also tried adding pip package versions, still no luck. @MrJossSo
You're a life-saver. That looks like exactly what I need! Thank you so much!
Could you share a short video on how to do it via an Azure Pipeline (YAML)?
You're a life saver man, I was struggling to get started with the Python programming model, went through numerous docs and videos that were just a waste of time, and finally found your video ❤
This is an awesome explanation. I was having trouble getting the values when there's another JSON array inside, and I was finally able to do it after watching your video. Thank you so much. Cheers from Brazil.
Brilliant videos Nick.
Very nice two videos on functions, especially the managed identity connection here! I'll be deploying a function to get data from an API and ingest it into ADLS, let's see how it goes. Also, that video on the CI/CD pipelines for Functions would be great!
this video is gold, jumping into part 2 right now
Hi Nick, great video. I am getting the following error while running "func start". Could you please help me with this issue? "Error building configuration in an external startup class. [2024-11-26T09:11:16.992Z] Error building configuration in an external startup class. System.Net.Http: An error occurred while sending the request. System.Private.CoreLib: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.. An existing connection was forcibly closed by the remote host. [2024-11-26T09:11:17.096Z] A host error has occurred during startup operation 'e87ed335-a090-4df3-bc55-b9480f5f2d45'. [2024-11-26T09:11:17.098Z] Microsoft.Azure.WebJobs.Script: Error building configuratd by the remote host. Value cannot be null. (Parameter 'provider')"
One note: creating a secret in the Key Vault may not work for new users. First make sure to: select the resource group where you are creating the Azure Key Vault -> select "Access Control (IAM)" -> "Add" -> "Add role assignment", search for the "Key Vault Administrator" role, and select the member by searching for their name or email. Otherwise you may run into this issue: "The operation is not allowed by RBAC. If role assignments were recently changed, please wait several minutes for role assignments to become effective."
These videos are so good. Thank you
I tried it but no luck. I uploaded my trigger but it is not shown in the portal. Any idea or suggestion?
Were you able to get this to work? Each time I push a new deployment, my function is deleted in the function app. And the deployment logs say no HTTP triggers even though there was one on the first deployment.
@@wanderer-1986 Yes. What I noticed is that, if you are not using Docker, you should put the versions of the libraries used in the requirements.txt file. Once that is done, Azure recognizes the triggers. Hope this helps!
@@MrJossSo That fixed it for me. Thank you so much for responding! I was stuck on this for the longest time.
Very crisp and helpful. Thanks a lot.
Nick, did you also explore the CI/CD method with Jenkins? My organization does not support Azure DevOps.
Thanks for this. Would be interested in seeing how to deploy using Azure Pipelines with code that is stored in Azure Repos.
Amazing vids
Awesome! You explained very well a lot of scenarios 👍
Thank you Nick
We need more people like you
Very very helpful, thank you! I finally got to figure out what I was doing wrong, trying to install python for the last couple years!!! The joy I felt when the command line prompt "python --version" showed the python version! THANK YOU!!
Waiting for your next video on CI/CD for Azure Functions. Can you please make one? Also, could you consider making a Udemy course on Azure Functions with Python?
Bro this is so useful! I love humans like you! Keep it up!
I would like to thank you very much; it was the clearest lesson I have taken. I subscribed.
This is an excellent tutorial! I've built a few Azure functions before, but after taking a six-month break, I had forgotten most of the process. This guide helped me not only refresh my memory but also taught me even more along the way!
This tutorial is by far the best that's currently out here on RUclips. Subscribed.
Another question I have: what is the best way to manage the parameters.json files when working with multiple people? If I make a new linked service in dev, should I add it to the parameters file first?
Maybe a stupid question, but why do you not publish to the dev ADF first?
Can you explain why it's necessary to stop the current triggers?
Great
Thanks Nick. I was able to replicate this logic in my personal repo to test my CI/CD pipeline. One last question: since we build the ARM templates from the Dev environment to be deployed in higher environments, if I created a pipeline or some resources with a different name than the Dev resources, or created a new pipeline directly in a higher environment, then during the deployment process those resources will be deleted because they don't match the resources defined in the Dev ARM template. Is my understanding correct?
Buddy, it's very useful. Thank you.
I have another question: the adf-uat-template-parameters.json (or prod) file needs to be updated with the actual values for each data factory. For example, should I create the linked services in the UAT and PROD environments first and then reference them in my JSON file? Is that correct, or am I going in the wrong direction?
Hi. Wrong direction. The easiest way I've done it: when you add a new linked service (or global parameter or managed PE, etc.) in your dev data factory, select the Manage tab (toolbox icon), then ARM template, then export the ARM template and grab the parameters file that gets outputted. Then replace the resource name/s with the UAT and PROD names. For example, if a new linked service called LSKeyVault points to a Key Vault named devkeyvault01, then in your uat-template-parameters file replace the Key Vault name with the UAT key vault name (uatkeyvault01, etc.). Just make sure that UAT Key Vault exists first :).
@@dataengineeringwithnick7532 Thanks Nick for the clarification. But the ARMTemplateForFactory.json created in the artifact during the build phase is used throughout the UAT and Prod deployments; my parameter file for Prod has a different linked service name, so I am getting an error about the parameters passed vs. what is present in the ARM template file in the artifact. As long as the parameters in the ARM template file are the same as the template parameters file under the adf_cicd folder, the stage passes, which is what happened in the UAT deployment, but it failed in the Prod deployment because my linked service in prod is ls_data_prod whereas in the ARM template file it is defined as ls_data. Any suggestions!! 🙏
I think I figured it out. The parameter files under adf-cicd for prod and UAT should match what is generated in the artifact, but the related items like linked services/managed PEs etc. can have different names in Data Factory. Is my understanding correct?
I just did another trial. I kept my parameters for UAT and Prod the same as what was generated in the artifact, deleted all linked services from the Prod Data Factory, and during deployment everything was recreated automatically.
So many details and additional suggestions for how to build and manage an Azure Function in this video. Love it!
When you created this setup, did you create the publish branch while configuring the repo in Azure Data Factory?
Yes. It’s not needed for the CICD code/deployment but you usually will need it to publish in your DEV ADF if you want to test Triggers.
@@dataengineeringwithnick7532 So if I create the adf_publish branch and my build pipeline runs, my CD pipeline errors out with a message that it is unable to locate the ARM template. Do you have any suggestions on that?
Thanks, Nick, for this tutorial. It's the best tutorial on Function App I found on RUclips.
@dataengineeringwithnick7532 I was getting a buffer error and I'm using Node version 18... I followed exactly the same steps and it's failing at the last stage - Buffer() is deprecated due to security and usability issues
salute brother 🫡
If I use all the YAML files and other supporting files, do I still need to create the flow under Releases in the Pipelines section of Azure DevOps, or do I still need to create an Environment?
No, you don't need a separate release pipeline flow. The YAML files do all the CI/CD required for ADF.
Can I use library groups in place of the Key Vault for subscription IDs or any variable which has the highest confidentiality?
Yes, you can use secret variables or library groups or whatever you’d like.
Awesome job Nick, I followed through all the videos. I'm already a data analyst and engineer with a Fortune 500 in Canada.
Please make a video with the CI/CD pipeline.
Great video, very helpful! Thank you
Thanks a lot for sharing your knowledge
Was waiting for this for a long time!! Thank you buddy.
How do I override global params when selecting "Include global parameters in ARM template"? Do I have to override parameters in AzureResourceManagerTemplateDeployment@3 somehow?
In my example I override them in the template-parameters.json files. For example, see the cicd/adf-cicd adf-prod-template-parameters.json and adf-uat-template-parameters.json files. I override the global parameters (default_properties_GL_STRING_value and default_properties_GL_NUMBER_value) in those files. Those are global parameters that are updated (different values) in each environment. You can also override parameters in the AzureResourceManagerTemplateDeployment@3 (if you didn't want to use template-parameter files) using the overrideParameters input. For reference: learn.microsoft.com/en-us/azure/devops/pipelines/tasks/reference/azure-resource-manager-template-deployment-v3?view=azure-pipelines
Error: ##[warning]Can't find loc string for key: Info_GotAndMaskAuth in the pipeline
This is just a warning from the Npm@1 task and has recently been updated by the Microsoft team (just not released yet). See here: github.com/microsoft/azure-pipelines-tasks/issues/20120 and here: github.com/microsoft/azure-pipelines-tasks-common-packages/pull/345. Code update should be in the next release (github.com/microsoft/azure-pipelines-tasks-common-packages/releases). Either way, it should resolve itself and there's no code changes needed on the ADF pipeline.
This is one of the best videos on this topic! I have one question and very high hopes that I will get my answer here. Why is the dev ADF in the picture when the build is triggered on the main branch? Previously, in the manual process, when the dev branch was merged with main, one would "switch" to the main branch (i.e. the collaboration branch) in ADF to click the publish button. This ensures the ARM templates are generated from the main branch. However, the npm package method shown in the video has the main branch as a trigger but also references the Dev ADF. This confused me. The reason I would like to know this is because we have a slightly different setup. Our main goes into the prod ADF. We have a feature branch for collaboration as I have more than one data engineer working on their own dev branches. Once their code merges into the feature branch, we intend to deploy it in our test environment. Only upon successful testing will it be merged into main, which will then trigger the prod deployment. Please help!
Great question. The build (packaging up the ADF code) is done from the repo (main branch in this example) where all of the ADF JSON files are. The DEV ADF /subscriptions/.../adf name in the code Validate And Generate ADF ARM Template And Scripts seems to only help set the default values/info for the artifacts (ARM template, etc.) from what I've seen. For example, I've tested completely deleting the actual DEV ADF (with the ADF JSON files still in the repo main branch) and the pipeline still builds it. I've also tested using a random ADF DEV name that doesn't exist and it still builds it using the repo JSON files but will use the default value in the ARM Template and files as the ADF name that doesn't exist. So whatever your collaboration branch is (the feature branch you mentioned for example), as long as you checkout that branch in your build pipeline, you should be fine as that's where the code is. Then in a different pipeline (or different stage in the same pipeline) you can checkout the main branch and deploy to PROD separately. Hope this helps.
@@dataengineeringwithnick7532 thank you so much for getting back. I'm going to try it shortly and will let you know.
@@dataengineeringwithnick7532 Starting to implement and stuck at one place. Do we really need to use the adf-uat-template-parameters.json and prod files? One thing that I liked about the previous manual process was that I didn't have to worry about all the linked service etc. details of my ADF (we have many ADLS, Lakehouse, SQL, KV). It autogenerated everything. I used to enter override values for the parameters in the ADO UI. Is that not possible here? How can I achieve that by, say, not referencing these files at all? Thank you again! Please keep up the great work!
I don't have a UAT environment; can I use this for dev and prod only?
Yes you can. You would just remove the Deploy to UAT code in the pipeline.
@@dataengineeringwithnick7532 Thanks, did you also add the variables in the pipeline? The DevSubscriptionID and productionID?
@@mainuser98 Those are actually secret variables (so the values don't get logged/shown in plain text). To create secret variables, in Azure Pipelines click on your pipeline, then click Edit and then Variables.
An interesting solution is to set up a virtual environment, but there are still problems with adding it, and with outputting only clean code without unnecessary information such as the directory location. Configuring VS Code for Python is still a problem, and only Jupyter Notebook works fine.
It worked out, but the list became bloated compared to the video - 50 lines at least... It was great wisdom; now it's time to do it.
Saved my day, thank you for teaching this; it has a wide range of applications.