- Videos: 177
- Views: 57,918
CognitiveCoders
India
Joined 4 Jul 2021
Hello Coders,
This channel provides knowledge of programming languages, SQL, Azure Data Engineering, and related Azure services.
Please follow me on LinkedIn for more information.
1 Subscriber, 1👍🏻, 1 Comment = 100 Motivation 🙏🏼
🙏🏻Please Subscribe 🙏🏼
Dynamic column mapping in Azure Data Factory | Azure Data Factory | Real Time Scenario
Welcome to our comprehensive Azure Data Factory real-time scenario series, where we'll take you through the dynamic column mapping process in Azure Data Factory (see the mapping sketch after this entry). Whether you're a beginner or looking to expand your Azure skills, this video is designed to help you master ADF with practical, step-by-step instructions.
Why Use Azure Data Factory?
Azure Data Factory (ADF) is a powerful cloud-based data integration service that allows you to create data-driven workflows for orchestrating and automating data movement and data transformation. It's a key component for any data engineer working with big data, ETL processes, and data lakes in the Azure environment.
🚀 Key Topics Covered:
Real-Time Data Ingestion...
Views: 69
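As a rough sketch of the dynamic mapping idea above (a minimal example assuming the Copy activity's mapping is supplied as dynamic content; the column names and the metadata list below are hypothetical, while the TabularTranslator shape is the JSON form ADF expects for column mapping):

import json

# Build the Copy activity mapping JSON from a metadata-driven column list,
# e.g. the output of a Lookup activity. Column names are placeholders.
column_pairs = [("cust_id", "CustomerId"), ("cust_name", "CustomerName")]

mapping = {
    "type": "TabularTranslator",
    "mappings": [
        {"source": {"name": src}, "sink": {"name": dst}}
        for src, dst in column_pairs
    ]
}

# This string would be passed as a pipeline parameter and used as the
# dynamic-content value of the Copy activity's "mapping" property.
print(json.dumps(mapping, indent=2))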
Videos
How to implement multi-threading in Databricks Notebook | Pyspark Tutorial | Step-By-Step Approach
Views 88 · 14 days ago
If you like this video, please do like, share and subscribe to my channel. PySpark playlist : ruclips.net/p/PL7DrGo85HcssOo4q5ihH3PqRRXwupRe65 PySpark RealTime Scenarios playlist : ruclips.net/p/PL7DrGo85HcstBR0D4881RTIzqpwyae1Tl Azure Datafactory playlist : ruclips.net/p/PL7DrGo85HcsueO7qbe3-W9kGa00ifeQMn Azure Data Factory RealTime Scenarios playlist : ruclips.net/p/PL7DrGo85HcsulFTAXy2cgcS6bWRlIs...
How to execute a Stored Procedure with an Output Parameter using ADF | Azure Data Factory | Real Time Scenario
Views 101 · 21 days ago
Welcome to our comprehensive Azure Data Factory real-time scenario series, where we'll take you through the process of executing a stored procedure with an output parameter using Azure Data Factory. Whether you're a beginner or looking to expand your Azure skills, this video is designed to help you master ADF with practical, step-by-step instructions. 📚Queries: CREATE OR ALTER procedure dbo.usp_employees_dataC...
Query to get sales difference between quarters | PWC | Latest Data Engineering Interview Question
Views 139 · 21 days ago
If you like this video, please do like, share and subscribe to my channel. PySpark playlist : ruclips.net/p/PL7DrGo85HcssOo4q5ihH3PqRRXwupRe65 PySpark RealTime Scenarios playlist : ruclips.net/p/PL7DrGo85HcstBR0D4881RTIzqpwyae1Tl Azure Datafactory playlist : ruclips.net/p/PL7DrGo85HcsueO7qbe3-W9kGa00ifeQMn Azure Data Factory RealTime Scenarios playlist : ruclips.net/p/PL7DrGo85HcsulFTAXy2cgcS6bWRlIs...
How to copy multiple tables data using ADF | Azure Data Factory | Real Time Scenario
Views 106 · 28 days ago
Welcome to our comprehensive Azure Data Factory real-time scenario series, where we'll take you through the process of copying multiple tables' data dynamically from a SQL database using Azure Data Factory. Whether you're a beginner or looking to expand your Azure skills, this video is designed to help you master ADF with practical, step-by-step instructions. Why Use Azure Data Factory? Azure Data Factory (ADF) is...
Top SQL Data Engineering Interview Questions Asked at Uber | Ace Your Data Engineer Interview
Views 196 · a month ago
If you like this video, please do like, share and subscribe to my channel.
CREATE TABLE transactions (
  user_id INT,
  spend DECIMAL,
  transaction_date DATE
);
INSERT INTO transactions VALUES
  (111, 100.5, '2022-01-08'),
  (111, 55, '2022-01-10'),
  (111, 89.6, '2022-02-05'),
  (121, 36, '2022-01-18'),
  (121, 22.2, '2022-04-01'),
  (121, 67.9, '2022-04-03'),
  (145, 24.99, '2022-01-26'),
  (145 ...
Slowly Changing Dimension (SCD) Type 2 using Data Flow in ADF | Azure Data Factory | Real Time Scenario
Views 127 · a month ago
Welcome to our comprehensive Azure Data Factory real-time scenario series, where we'll take you through the process of implementing Slowly Changing Dimensions (SCD) Type 2 in Azure Data Factory. Whether you're a beginner or looking to expand your Azure skills, this video is designed to help you master ADF with practical, step-by-step instructions. Why Use Azure Data Factory? Azure Data Factory (ADF) is a p...
How to Access Azure Data Lake Storage (ADLS) from Databricks Notebooks | Step-by-Step Tutorial
Views 82 · a month ago
Learn How to Access Azure Data Lake Storage (ADLS) from Databricks Notebooks: In this tutorial, we’ll guide you step-by-step on how to access Azure Data Lake Storage (ADLS) directly from Databricks notebooks. Whether you’re working with ADLS Gen1 or Gen2, this video covers the entire process, from setting up your Azure environment to authenticating and integrating Databricks with ADLS for seaml...
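As a rough sketch of the ADLS Gen2 access pattern this tutorial walks through (a minimal example assuming a service principal with OAuth; the storage account, container, and secret-scope names below are hypothetical placeholders):

# Spark configuration for ADLS Gen2 (abfss) access with a service principal.
storage_account = "mydatalake"
client_id = dbutils.secrets.get("kv-scope", "sp-client-id")
client_secret = dbutils.secrets.get("kv-scope", "sp-client-secret")
tenant_id = dbutils.secrets.get("kv-scope", "tenant-id")

spark.conf.set(f"fs.azure.account.auth.type.{storage_account}.dfs.core.windows.net", "OAuth")
spark.conf.set(f"fs.azure.account.oauth.provider.type.{storage_account}.dfs.core.windows.net",
               "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set(f"fs.azure.account.oauth2.client.id.{storage_account}.dfs.core.windows.net", client_id)
spark.conf.set(f"fs.azure.account.oauth2.client.secret.{storage_account}.dfs.core.windows.net", client_secret)
spark.conf.set(f"fs.azure.account.oauth2.client.endpoint.{storage_account}.dfs.core.windows.net",
               f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Read a sample folder from the "raw" container to confirm access.
df = spark.read.csv(f"abfss://raw@{storage_account}.dfs.core.windows.net/sales/", header=True)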
How to store pipeline execution log in SQL table using ADF | Azure Data Factory | Real Time Scenario
Views 132 · a month ago
Top Data Engineering Interview Question From FAANG | SQL Interview Question | FAANG
Views 125 · a month ago
How to validate file schema before processing in ADF | Azure Data Factory | Real Time Scenario
Views 151 · a month ago
Query the list of candidates for the job | Data Engineering Interview Question | LinkedIn | Pyspark
Views 123 · 2 months ago
How to add new column with source file name using ADF | Azure Data Factory | Real Time Scenario
Views 164 · 2 months ago
Top Data Engineering Interview Question | KANTAR Group | Pyspark Interview Question
Views 208 · 2 months ago
How to get count of files in a folder using ADF | Azure Data Factory | Real Time Scenario
Views 121 · 2 months ago
Slowly Changing Dimension (SCD) Type 1 using Data Flow in ADF | Azure Data Factory | Real Time Scenario
Views 130 · 3 months ago
Top Data Engineering Interview Questions From Impetus | Pyspark | SQL | Interview Question
Views 457 · 3 months ago
How to create running total using Data Flow in ADF | Azure Data Factory | Real Time Scenario
Views 114 · 3 months ago
Top 4 Data Engineering Interview Questions | Accenture | Pyspark | Technical Round Question
Views 344 · 3 months ago
How to create incremental key using Data Flow in ADF | Azure Data Factory | Real Time Scenario
Views 126 · 3 months ago
How to remove duplicate rows using dataflow in ADF | Azure Data Factory | Real Time Scenario
Views 128 · 3 months ago
Latest Data Engineering Interview Question from PWC | BigData | SQL | Azure Data Engineer
Views 284 · 4 months ago
How to process fixed length text file using ADF Data Flow | Azure Data Factory | Real Time Scenario
Views 101 · 4 months ago
How to copy last n days data incrementally from ADLS Gen2 | Azure Data Factory | Real Time Scenario
Views 171 · 4 months ago
Delta Lake : Slowly Changing Dimension (SCD Type2) | Pyspark RealTime Scenario | Data Engineering
Views 408 · 4 months ago
How to copy latest or last modified file from ADLS Gen2 | Azure Data Factory | Real Time Scenario
Views 199 · 4 months ago
Latest Tiger Analytics coding Interview Questions & Answers | Data Engineer Prep 2024
Views 6K · 4 months ago
How to get source file name dynamically in ADF | Azure Data Factory | Real Time Scenario
Views 154 · 5 months ago
Top LTIMindtree SQL Interview Questions | Data Engineering Career Guide 2024 | Data Engineering
Views 4.1K · 5 months ago
How to upsert data into delta table using PySpark | Pyspark RealTime Scenario | Data Engineering
Views 198 · 5 months ago
from pyspark.sql.functions import first, last
from pyspark.sql.window import Window

# Carry the first origin and last destination across each customer's flights.
window = Window.partitionBy("cust_id").orderBy("flight_id") \
    .rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing)

df1 = df.withColumn("origin", first("origin").over(window)) \
        .withColumn("destination", last("destination").over(window))

df1.select(["cust_id", "origin", "destination"]).dropDuplicates().show()

This is a shorter approach.
What was the experience level of this candidate? I mean, with how much experience can people expect these questions?
Don't say everything is simple... we see difficulties here.
Thank you. Please keep posting videos on PySpark interview questions.
I have a question on thread pools in Spark. When we use ThreadPoolExecutor, are all the threads running on the same node, like only on the driver node? Or will it utilize all the workers in the cluster? Can you please clarify?
When you use ThreadPoolExecutor, all the threads run on the same node, and you might run out of memory as well. To tackle your problem, you can try running each notebook as a separate process and creating a Spark context within that process. Please try using the "subprocess" module in Python to spawn a new process for each notebook.
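As a rough illustration of the thread-based approach discussed above (a minimal sketch assuming a Databricks notebook where dbutils is available; the child notebook paths and the parameter are hypothetical):

from concurrent.futures import ThreadPoolExecutor

# The threads themselves run on the driver; the Spark jobs each child
# notebook submits still execute on the cluster workers.
notebooks = ["/Workspace/etl/load_customers", "/Workspace/etl/load_orders"]

def run_notebook(path):
    # dbutils.notebook.run(path, timeout_seconds, arguments)
    return dbutils.notebook.run(path, 3600, {"run_date": "2024-01-01"})

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_notebook, notebooks))

print(results)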
from pyspark.sql.functions import col, when, sum

# Count how many rows hold the literal string 'Null' in each column.
null_df = null_df.agg(
    sum(when(col('id') == 'Null', 1).otherwise(0)).alias('id'),
    sum(when(col('name') == 'Null', 1).otherwise(0)).alias('name'),
    sum(when(col('age') == 'Null', 1).otherwise(0)).alias('age')
)
null_df.display()
Pretty simple
Please try the others also
Bro, can you provide a roadmap for me? I'm a fourth-year student. I already have knowledge of MySQL, Python, pandas, Power BI and Excel, but I want to learn data engineering for better options. Can you guide me or suggest any courses?
You can start with PySpark; it's the most important part for a data engineer. You can go through the playlist below. ruclips.net/p/PL7DrGo85HcssOo4q5ihH3PqRRXwupRe65&si=j08vTTkVi7pUAaR8
Can you please post the dataset in the comment section? It breaks the flow and, sorry to say, is quite annoying.
Please collect it from our Telegram channel.
❤❤❤❤❤❤❤
Bro, also attach the CSV file in the description so we can practice; it will be better. Although you are doing a great job.
Please get the dataset from the Telegram channel.
Once we have migrated the data from source to destination, how do we validate that we migrated the correct data (i.e., how do we compare source and destination data)? Could you answer this?
Please watch this to get your answer. ruclips.net/video/_1fhG7H05aA/видео.htmlsi=RrWlVOxYjCR9fRqQ
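For readers who want a quick, code-level check in the meantime, here is a minimal PySpark sketch (assuming both sides can be read as DataFrames; the table names are hypothetical placeholders):

# Compare row counts and find rows present on only one side (full-row comparison).
src_df = spark.read.table("source_db.customers")
dst_df = spark.read.table("target_db.customers")

print("source rows:", src_df.count(), "target rows:", dst_df.count())

missing_in_target = src_df.exceptAll(dst_df)
extra_in_target = dst_df.exceptAll(src_df)

print("missing in target:", missing_in_target.count())
print("extra in target:", extra_in_target.count())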
Your voice is very low; please increase the audio.
Please change the video quality and check
Hi Pritam Saha, I'm also from TCS. Recently I've been continuously watching your PySpark interview series; it really helps a lot in developing my problem-solving skills in PySpark. Great work, brother. If possible, try to create an interview preparation series for Python as well. For SQL there are a lot of platforms and other channels, but for Python only a few, so I would appreciate it if you started one for Python as well. Thank you, brother.
Sure. We'll start a Python-specific interview series. Please share and support us.
Can you please make a similar video using the pytest framework for testing Databricks notebooks?
Will do that
Nice content, brother; useful for all the aspiring data engineers.
It means a lot to us. Please stay with us.
Hello... I am looking for data engineer roles. Most openings are at Accenture, Mindtree, Deloitte... but how do I check for these kinds of product-based openings?
A very disjointed explanation... it seems to be due to linguistic oddities.
1.
from pyspark.sql.functions import lead, lag
from pyspark.sql.window import Window

window_spec = Window.partitionBy("product").orderBy("sale_date")

wdf = df.withColumn("2nd day pre sales", lead("amount", 2).over(window_spec)) \
        .withColumn("3rd day pre sales", lag("amount", 2).over(window_spec))
Thanks for sharing🎉
Can you please create a video on DLT streaming tables? I'm facing issues while using SCD Type 1. My bronze notebook is separate and my silver notebook is separate, and I'm facing issues while reading the bronze table as a stream and loading it into silver.
We'll create and upload one. Stay tuned with us.
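In the meantime, a minimal sketch of the pattern being asked about (assuming the Delta Live Tables Python API, with the bronze table defined elsewhere in the same DLT pipeline; the table names, key, and sequence column are hypothetical):

import dlt
from pyspark.sql.functions import col

# Silver notebook: declare the streaming target and apply SCD Type 1 changes
# from the bronze table produced by the other notebook in the pipeline.
dlt.create_streaming_table("silver_customers")

dlt.apply_changes(
    target="silver_customers",
    source="bronze_customers",      # bronze table name (placeholder)
    keys=["customer_id"],           # business key (placeholder)
    sequence_by=col("ingest_ts"),   # ordering column (placeholder)
    stored_as_scd_type=1
)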
Thanks for sharing. Keep up the good work!!