Azure Data Engineer Mock Interview | PySpark | Delta Live Tables | Managerial

  • Published: Jan 8, 2025

Comments • 13

  • @imranhossain1660 9 months ago +8

    OPTIMIZE is a performance optimization command available for Delta Lake tables. Every DML operation on a Delta table writes new data files, so over time the table accumulates a large number of small files. That is overhead for the Delta engine: each query has to open and read many files, which increases I/O and compute usage. The OPTIMIZE command combines these small files into fewer, larger files, which improves query performance. After OPTIMIZE, the table's latest snapshot references the compacted files, so queries read those instead of the small ones. The obsolete small files can then be deleted with the VACUUM command to free up storage, once they are older than the retention threshold (7 days by default).
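The compact-then-clean sequence described in this comment can be sketched in PySpark as follows. This is a minimal sketch, not a recommended maintenance job: it assumes `spark` is an existing SparkSession with Delta Lake enabled, and the helper name `compact_and_clean` and the table name `sales_events` are hypothetical. `OPTIMIZE` and `VACUUM ... RETAIN n HOURS` are Delta Lake SQL commands; 168 hours corresponds to the default 7-day retention.

```python
def compact_and_clean(spark, table_name, retain_hours=168):
    """Compact small files, then remove files the table no longer references.

    `spark` is assumed to be a SparkSession with Delta Lake enabled.
    `retain_hours=168` mirrors Delta's default 7-day retention window.
    """
    statements = [
        # Rewrite many small data files into fewer, larger ones.
        f"OPTIMIZE {table_name}",
        # Physically delete unreferenced files older than the retention window.
        f"VACUUM {table_name} RETAIN {retain_hours} HOURS",
    ]
    for statement in statements:
        spark.sql(statement)
    # Return the statements that were issued, which is handy for logging.
    return statements


# Usage (hypothetical table name):
#   compact_and_clean(spark, "sales_events")
```

Running VACUUM after OPTIMIZE does not immediately remove the files that were just compacted away; they become eligible for deletion only after the retention window has passed, which is what keeps time travel and concurrent readers working in the meantime.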

  • @prakashtripathi270 9 months ago

    Thank you, Sumit sir, for arranging such an insightful session.

    • @sumitmittal07 9 months ago

      Always happy to help the community!

  • @VaidehiH-v2l 9 months ago

    Thank you, Sumit sir.

  • @maheshtiwari2297 9 months ago +1

    Hello sir, I have an interview for a Big Data/ETL developer role at Amazon. Please guide me for that.

  • @MsMohanj 9 months ago

    Is the join correct, or can we go for a left join?

  • @VaidehiH-v2l 9 months ago

    Sir, we need a video on what business requirements look like, how data engineers receive them, and the working strategy.

    • @sumitmittal07 9 months ago +1

      Noted. Will have a session around this aspect!

    • @tahiliani22 9 months ago

      I would like to add to this. If I understand correctly, @user_j% is talking about questions like "Design Yelp", but from a database perspective.

  • @codinggeek9992 8 months ago

    Fewer questions from Azure Synapse...

  • @Rakesh-q7m8r 9 months ago +3

    create table oldest_youngest(person varchar(10),type varchar(20),age int);
    insert into oldest_youngest values
    ('A1','ADULT',54),
    ('A2','ADULT',53),
    ('A3','ADULT',52),
    ('A4','ADULT',58),
    ('A5','ADULT',54),
    ('C1','CHILD',20),
    ('C2','CHILD',19),
    ('C3','CHILD',22),
    ('C4','CHILD',15);
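    -- Rank adults from oldest to youngest.
    WITH ranked_adult AS
    (
    SELECT person AS adult, ROW_NUMBER() OVER(ORDER BY age DESC) AS r_a FROM oldest_youngest WHERE type = 'ADULT'
    ),
    -- Rank children from youngest to oldest. The filter uses 'CHILD' to match the stored values where string comparison is case-sensitive.
    ranked_child AS
    (
    SELECT person AS child, ROW_NUMBER() OVER(ORDER BY age ASC) AS r_c FROM oldest_youngest WHERE type = 'CHILD'
    )
    -- Pair by rank; the left join keeps adults that have no matching child.
    SELECT adult, child FROM
    ranked_adult a
    LEFT JOIN
    ranked_child c
    ON a.r_a = c.r_c;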
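The query in this comment can be tried locally with Python's built-in `sqlite3` module. The following is a hedged sketch rather than the commenter's exact code: it filters on uppercase `'CHILD'` to match the inserted rows (SQLite compares strings case-sensitively by default), and it adds `person` as a tiebreaker inside `ROW_NUMBER()` plus a final `ORDER BY` so the output is deterministic. Both additions are assumptions that are not in the original. Window functions require SQLite 3.25 or newer.

```python
import sqlite3

ROWS = [
    ("A1", "ADULT", 54), ("A2", "ADULT", 53), ("A3", "ADULT", 52),
    ("A4", "ADULT", 58), ("A5", "ADULT", 54),
    ("C1", "CHILD", 20), ("C2", "CHILD", 19), ("C3", "CHILD", 22),
    ("C4", "CHILD", 15),
]

QUERY = """
WITH ranked_adult AS (
    -- Oldest adult gets rank 1; `person` breaks age ties (assumption).
    SELECT person AS adult,
           ROW_NUMBER() OVER (ORDER BY age DESC, person) AS r_a
    FROM oldest_youngest
    WHERE type = 'ADULT'
),
ranked_child AS (
    -- Youngest child gets rank 1.
    SELECT person AS child,
           ROW_NUMBER() OVER (ORDER BY age ASC, person) AS r_c
    FROM oldest_youngest
    WHERE type = 'CHILD'
)
SELECT adult, child
FROM ranked_adult a
LEFT JOIN ranked_child c ON a.r_a = c.r_c
ORDER BY a.r_a
"""


def pair_oldest_with_youngest():
    """Load the sample rows into an in-memory database and run the pairing query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE oldest_youngest "
            "(person VARCHAR(10), type VARCHAR(20), age INT)"
        )
        conn.executemany("INSERT INTO oldest_youngest VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


pairs = pair_oldest_with_youngest()
for adult, child in pairs:
    print(adult, child)
```

With five adults and four children, the left join keeps the fifth-ranked adult and pairs it with `None`. An inner join would drop that row, which is the practical difference behind the "left join or not" question raised earlier in the thread.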

  • @zaffer2024 9 months ago

    Tough SQL question 😭

    • @Rakesh-q7m8r 9 months ago

      Check this.
      create table oldest_youngest(person varchar(10),type varchar(20),age int);
      insert into oldest_youngest values
      ('A1','ADULT',54),
      ('A2','ADULT',53),
      ('A3','ADULT',52),
      ('A4','ADULT',58),
      ('A5','ADULT',54),
      ('C1','CHILD',20),
      ('C2','CHILD',19),
      ('C3','CHILD',22),
      ('C4','CHILD',15);
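      -- Rank adults from oldest to youngest.
      WITH ranked_adult AS
      (
      SELECT person AS adult, ROW_NUMBER() OVER(ORDER BY age DESC) AS r_a FROM oldest_youngest WHERE type = 'ADULT'
      ),
      -- Rank children from youngest to oldest. The filter uses 'CHILD' to match the stored values where string comparison is case-sensitive.
      ranked_child AS
      (
      SELECT person AS child, ROW_NUMBER() OVER(ORDER BY age ASC) AS r_c FROM oldest_youngest WHERE type = 'CHILD'
      )
      -- Pair by rank; the left join keeps adults that have no matching child.
      SELECT adult, child FROM
      ranked_adult a
      LEFT JOIN
      ranked_child c
      ON a.r_a = c.r_c;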