Data Science Interview Questions | Data Science Interview Questions Answers And Tips | Simplilearn

  • Published: 6 Jan 2025

Comments • 92

  • @SimplilearnOfficial
    @SimplilearnOfficial  6 years ago +15

    "🔥Data Scientist Masters Program (Discount Code - YTBE15) - www.simplilearn.com/big-data-and-analytics/senior-data-scientist-masters-program-training?JZsSNLXXuE&Comments&RUclips
    🔥IITK - Professional Certificate Course in Data Science (India Only) - www.simplilearn.com/iitk-professional-certificate-course-data-science?JZsSNLXXuE&Comments&RUclips
    🔥Caltech Post Graduate Program in Data Science - www.simplilearn.com/post-graduate-program-data-science?JZsSNLXXuE&Comments&RUclips
    🔥Brown University - Applied AI & Data Science - www.simplilearn.com/applied-ai-data-science-course?JZsSNLXXuE&Comments&RUclips"

    • @Albert-fe8jx
      @Albert-fe8jx 6 years ago

      All great content. On #11, reducing dimensionality, the inevitable followup question would be "So how would you reduce the 8 million word pairings to 'interesting' data?" #26, What does the entropy value tell you about the data?

    • @ssj4vrn
      @ssj4vrn 5 years ago +1

      Why is it a one-way ANOVA and not a t-test?

    • @M8B2L8
      @M8B2L8 5 years ago +2

      @ssj4vrn There are 2 coupons (A & B), so there will be three groups: Group A, Group B, and Group C (who didn't use a coupon to purchase).

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      @M8B2L8 Thanks for your valuable input!

    • @srs.shashank
      @srs.shashank 4 years ago

      In Q9, for imputing missing data, using the median instead of the mean would be a better solution, as the median is far less affected by outliers. What do you think?
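
      To illustrate the point about outliers, here is a minimal Python sketch. The column values are hypothetical (they are not from the video), with `None` marking the missing entry and `1000` acting as the outlier.

```python
from statistics import mean, median

# Hypothetical column: one missing value (None) and one outlier (1000).
values = [10, 12, 11, None, 13, 1000]

# Only the observed entries are used to compute the fill value.
observed = [v for v in values if v is not None]

mean_fill = mean(observed)      # pulled far upward by the outlier
median_fill = median(observed)  # stays close to the typical values

imputed_with_mean = [mean_fill if v is None else v for v in values]
imputed_with_median = [median_fill if v is None else v for v in values]
```

      With these made-up numbers the mean fill lands far above every typical value, while the median fill stays among them.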

  • @andyh.1882
    @andyh.1882 5 years ago +38

    For the 1st question, I did it differently. Step 1: Fill the 3-liter bucket and pour the water into the 5-liter bucket (2 liters of space remain). Step 2: Fill the 3-liter bucket again and pour into the 5-liter bucket until it is full (2 liters of space were available), so you have 1 liter left in the 3-liter bucket. Step 3: Empty the 5-liter bucket completely and pour the 1 liter from the 3-liter bucket into it (you now have 1 liter in the 5-liter bucket). Step 4: Fill the 3-liter bucket completely and pour the 3 liters into the 5-liter bucket (you now have 4 liters in the 5-liter bucket). This involves more steps but is also possible.
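
    The four steps in this comment can be checked with a short Python sketch. The helper names `pour` and `measure_four_litres` are made up for illustration; the only assumptions are that buckets can be filled completely and emptied completely.

```python
def pour(src, dst, dst_capacity):
    """Pour from src into dst until src is empty or dst is full."""
    moved = min(src, dst_capacity - dst)
    return src - moved, dst + moved


def measure_four_litres():
    small_cap, big_cap = 3, 5
    small, big = 0, 0

    small = small_cap                       # Step 1: fill the 3 L bucket
    small, big = pour(small, big, big_cap)  # 3 L now sits in the 5 L bucket

    small = small_cap                       # Step 2: fill the 3 L bucket again
    small, big = pour(small, big, big_cap)  # 5 L bucket full, 1 L left over

    big = 0                                 # Step 3: empty the 5 L bucket
    small, big = pour(small, big, big_cap)  # move the 1 L across

    small = small_cap                       # Step 4: fill the 3 L bucket
    small, big = pour(small, big, big_cap)  # 1 L + 3 L in the 5 L bucket

    return small, big
```

    Calling `measure_four_litres()` returns the final contents of the 3-liter and 5-liter buckets, in that order.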

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago +6

      True! We appreciate your effort.
      However, there are always multiple ways to reach the solution, but finding the best possible way is the deal here. Cheers!

  • @ruthesan
    @ruthesan 5 years ago +64

    Correct the error at 35:13 => Recall Rate = (True Positive) / (True Positive + False Negative)
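
    The corrected formula can be written as a small Python function. The counts in the example call are hypothetical.

```python
def recall(true_positive, false_negative):
    """Recall = TP / (TP + FN): the share of actual positives that were found."""
    actual_positives = true_positive + false_negative
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return true_positive / actual_positives


# Hypothetical counts: 80 positives detected, 20 positives missed.
example_recall = recall(true_positive=80, false_negative=20)
```

    Note that false positives do not appear in the denominator; they affect precision, not recall.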

  • @unnikrishnantp8308
    @unnikrishnantp8308 5 years ago +12

    In Random Forest, we bootstrap sample both features and training instances (rows). Very important point. Bootstrap sampling the features reduces bias error, and sampling the rows controls overfitting, though only to a slight extent.
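
    A rough Python sketch of the two kinds of sampling mentioned here, using only the standard library. It assumes the usual convention that rows are drawn with replacement while the feature subset holds distinct features; the function names and the sizes (10 rows, 9 features, k = 3) are hypothetical.

```python
import random


def bootstrap_row_indices(n_rows, rng):
    """Draw n_rows row indices with replacement (a bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices to consider as split candidates."""
    return sorted(rng.sample(range(n_features), k))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows_for_tree = bootstrap_row_indices(n_rows=10, rng=rng)
features_for_split = random_feature_subset(n_features=9, k=3, rng=rng)
```

    Because rows are drawn with replacement, some indices typically repeat in `rows_for_tree` and others are left out.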

  • @amsouvikghosh
    @amsouvikghosh 3 years ago +10

    Excellent and very informative video. For question number 19, I was thinking of mentioning the Augmented Dickey-Fuller test (ADF test), which is a common statistical test used to check whether a given time series is stationary or not.

  • @ajaykushwaha6137
    @ajaykushwaha6137 5 years ago +3

    One of the greatest videos so far in the field of data science.

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago +2

      WooHoo! We are so happy you love our videos. Please do keep checking back in. We put up new videos every week on all your favorite topics. Whenever you have the time, you must also check out our blog page @www.simplilearn.com and tell us what you think. Have a good day!

  • @ShadowknitezRS
    @ShadowknitezRS 4 years ago +7

    For the question at around the 45th minute:
    The first solution that comes to mind is your solution. However, if the rope is not uniform, doesn't that mean that folding it in half would not work? Let's say the left half burns completely in 20 minutes while the right half burns in 40 minutes; folding it in half would not really help you measure 30 minutes, and the same goes for folding it in 4.
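
    The thread itself does not offer a fix. One commonly cited alternative, which is an assumption here and not taken from the video, is to light the rope at both ends at once. The Python sketch below models the rope as segments, each with its own burn time in minutes, and assumes burning is uniform within a segment.

```python
def burn_from_both_ends(segment_minutes):
    """Minutes until two flames, lit at opposite ends, meet."""
    remaining = list(segment_minutes)
    left, right = 0, len(remaining) - 1
    elapsed = 0
    while left < right:
        # Advance until one of the two flames finishes its current segment.
        step = min(remaining[left], remaining[right])
        elapsed += step
        remaining[left] -= step
        remaining[right] -= step
        if remaining[left] == 0:
            left += 1
        if remaining[right] == 0:
            right -= 1
    if left == right:
        # Both flames are in the same segment and consume it from both sides.
        elapsed += remaining[left] / 2
    return elapsed


# The commenter's example: left half burns in 20 minutes, right half in 40.
uneven_rope = [20, 40]
time_both_ends = burn_from_both_ends(uneven_rope)
```

    Under this model the two flames together consume the 60 minutes of total burn time at twice the rate, so the meeting time does not depend on how the burn time is distributed along the rope.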

    • @kamilsmolak5793
      @kamilsmolak5793 3 years ago +1

      Just wanted to point it out. You're right, the halving solution is a bit too naive.

  • @EyiBillion
    @EyiBillion 5 years ago +5

    Thank you. No video has impacted me this much.

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago +1

      WooHoo! We are so happy you love our videos. Please do keep checking back in. We put up new videos every week on all your favorite topics. Whenever you have the time, you must also check out our blog page @www.simplilearn.com and tell us what you think. Have a good day!

  • @seymatas
    @seymatas 4 years ago

    As a physicist and data scientist, I first planned to make two pendulums using the ropes, find the period using T = 2π √(length / gravitational acceleration), and measure time with this pendulum clock. :)
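
    The period formula quoted in this comment can be sketched in Python as follows. It assumes the small-angle approximation and g = 9.81 m/s²; the function names are made up for illustration.

```python
import math


def pendulum_period(length_m, g=9.81):
    """Period in seconds of a simple pendulum: T = 2*pi*sqrt(L / g)."""
    return 2 * math.pi * math.sqrt(length_m / g)


def length_for_period(period_s, g=9.81):
    """Invert the formula: L = g * (T / (2*pi))**2."""
    return g * (period_s / (2 * math.pi)) ** 2


# A pendulum 1 metre long swings with a period of roughly 2 seconds.
period_one_metre = pendulum_period(1.0)
```

    Counting swings of a known period would then give elapsed time without burning anything.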

  • @marwaa.6759
    @marwaa.6759 4 years ago +4

    Very rich and informative video.. thanks for the great effort.

    • @SimplilearnOfficial
      @SimplilearnOfficial  4 years ago +1

      Glad you enjoyed our video! We have a ton more videos like this on our channel. We hope you will join our community!

  • @enes-the-cat-father
    @enes-the-cat-father 4 years ago +2

    Great video, thank you!
    Additional info: at 36:11, this is also called the pigeonhole principle.

    • @SimplilearnOfficial
      @SimplilearnOfficial  4 years ago

      Glad you enjoyed our video! We have a ton more videos like this on our channel. We hope you will join our community!

  • @siddheshshanker4162
    @siddheshshanker4162 6 years ago +4

    Excellent video. Compiled almost all the important aspects of Data Science interview.
    I have a doubt. For the recommendation, the algorithm that is being used is Decision tree.

    • @SimplilearnOfficial
      @SimplilearnOfficial  6 years ago +1

      Hi Siddhish, we appreciate your kind comment. Decision tree algorithm is used only for classification. For recommendation, there is a separate algorithm called "Recommender system".

  • @seshadris8323
    @seshadris8323 5 years ago +8

    In the final question: Does offering coupons impact the purchase decision?
    Here we have 2 categorical variables, 'Coupons' and 'Purchased'; both contain 0 and 1.
    Can't this be done using Chi-Square?
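
    As a sketch of what this comment proposes, the Pearson chi-square statistic for a 2x2 table of 'Coupons' against 'Purchased' can be computed in plain Python. The counts below are hypothetical, and no continuity correction is applied.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if coupon and purchase were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts. Rows: coupon offered / not offered.
# Columns: purchased / did not purchase.
coupon_table = [[50, 50],
                [30, 70]]
chi2 = chi_square_statistic(coupon_table)
```

    A 2x2 table has 1 degree of freedom, so the statistic would be compared against the chi-square critical value of about 3.84 at the 5% level.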

  • @faychen3872
    @faychen3872 5 years ago +5

    Thanks for sharing. Can you explain a little bit more about ANOVA/one-way ANOVA, when should we use ANOVA?

    • @ah2522
      @ah2522 4 years ago +1

      When you want to test whether at least one group differs from the others.
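
      A minimal Python sketch of the one-way ANOVA F statistic, which compares variation between group means with variation inside the groups. The three groups are hypothetical numbers, standing in for something like coupon A, coupon B, and no coupon.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical purchase amounts for three groups.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

      A large F value suggests that at least one group mean differs; it does not say which one, which is why follow-up comparisons are usually needed.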

  • @dudepamal
    @dudepamal 5 years ago +3

    Great video. Thanks for sharing. I think the answer to question 11 could have more to do with the curse of dimensionality than with computation and storage.

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      Hey Pamal, thank you for watching our video. We will definitely look into your doubt. Do subscribe and stay tuned for updates on our channel. Cheers :)

  • @harrison6082
    @harrison6082 3 years ago +3

    0:17 That first question proves you really have to be something else to take this major and career choice.
    Because the overwhelming majority of the population would get discouraged and quit after seeing a problem like this.
    Plus it's the first question.

    • @SimplilearnOfficial
      @SimplilearnOfficial  3 years ago

      Thanks for watching our video and sharing your thoughts. Do subscribe to our channel and stay tuned for more. Cheers!

  • @bobrick3491
    @bobrick3491 3 years ago +4

    Q1 is soo ambiguous. Can you empty water? How much water is there to begin with? Is it 3L, 5L of water or are the buckets that size? If not, do we just assume we even have 4L of water to begin with?

    • @SimplilearnOfficial
      @SimplilearnOfficial  3 years ago

      Thanks for watching our video and sharing your thoughts. Do subscribe to our channel and stay tuned for more. Cheers!

  • @conorsmyth12358
    @conorsmyth12358 3 years ago +8

    Let's talk about the pronunciation of hierarchical, a priori, and chi.

    • @SimplilearnOfficial
      @SimplilearnOfficial  3 years ago

      Thanks for watching our video and sharing your thoughts. Do subscribe to our channel and stay tuned for more. Cheers!

  • @antoniovazquez4900
    @antoniovazquez4900 4 years ago +2

    For question 15 you assume independence, which is (with the provided data) the only way to go, but it's a BIG assumption.

  • @shreeramshankarpattanayak7409
    @shreeramshankarpattanayak7409 1 year ago

    Great video !

  • @shobhamourya8396
    @shobhamourya8396 3 years ago +1

    At 17:22, instead of using the condition fizzbuzz % 3 == 0 and fizzbuzz % 5 == 0,
    use fizzbuzz % 15 == 0.
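
    A short Python sketch of the suggestion. It is not the video's code; the function name `fizzbuzz` is chosen here for illustration. The check for 15 must come first, because any multiple of 15 is also a multiple of 3 and of 5.

```python
def fizzbuzz(n):
    """Return the FizzBuzz label for a single integer."""
    if n % 15 == 0:   # divisible by both 3 and 5
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)


first_fifteen = [fizzbuzz(i) for i in range(1, 16)]
```

    The single `% 15` test is equivalent to the two-part condition because 3 and 5 share no common factor.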

    • @SimplilearnOfficial
      @SimplilearnOfficial  3 years ago

      Thank you for letting us know about this. Your feedback helps us get better. We are looking into this issue and hope to resolve it promptly and accurately.

    • @shobhamourya8396
      @shobhamourya8396 3 years ago

      @SimplilearnOfficial Here's my R code for it:
      fnFizzBuzz = function(x){

      y

  • @clevo4040
    @clevo4040 5 years ago

    " e to the base 2" might want to reconsider that one.
    You got it right the second time you said it!

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      Hi Christopher, thanks for checking out our tutorial and for sharing the information. We will definitely look into that. Do subscribe to our channel to stay posted on upcoming tutorials. Cheers!

  • @sislastew2693
    @sislastew2693 5 years ago

    Thank you so much. Keep the good stuff coming

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      Glad you enjoyed our video! We have a ton more videos like this on our channel. We hope you will join our community!

  • @RahulKumar-nt7go
    @RahulKumar-nt7go 5 years ago +2

    This is a great video! Thank you for sharing.
    Is association rule mining a type of content-based filtering?

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      Hi, content-based filtering is not similar to association rule mining. It is one of the types of recommender systems.

  • @VeereshDammur
    @VeereshDammur 4 years ago

    Correct the error at 42:32: the minus sign should be there for both the first and second terms.
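
    Assuming the formula at that timestamp is the two-class entropy, which the thread suggests but does not spell out, the sign convention can be checked with a small Python sketch: H(p) = -p·log2(p) - (1 - p)·log2(1 - p), with a minus sign on both terms.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a two-class split with class proportions p and 1 - p."""
    if p in (0, 1):
        return 0.0  # a pure node has zero entropy (0 * log 0 is taken as 0)
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


entropy_even_split = binary_entropy(0.5)   # maximally mixed node
entropy_skewed_split = binary_entropy(0.9)  # mostly one class
```

    Each logarithm is negative for a proportion between 0 and 1, so both minus signs are needed to keep the entropy non-negative.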

  • @mindsetnuggets
    @mindsetnuggets 4 years ago

    Excellent video. Very much helpful

    • @SimplilearnOfficial
      @SimplilearnOfficial  4 years ago +1

      Hey, thank you for watching our video. We are glad that you liked our video. Do subscribe and stay connected with us. Cheers :)

  • @hemantsharma276
    @hemantsharma276 4 years ago +1

    Dimension reduction does not take the redundant features into account; it only takes care of the variance.

  • @PejmanJ
    @PejmanJ 5 years ago +1

    Nice! There's an issue with Entropy formula though...

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      Hi Pejman, could you please elaborate on the issue with the entropy formula? Thanks.

  • @conorsmyth12358
    @conorsmyth12358 3 years ago

    Please explain what is meant at 4:59 by e to the base 2?

  • @BryanCheong
    @BryanCheong 3 years ago +2

    What's "postscriptive" at 13:24?

    • @SimplilearnOfficial
      @SimplilearnOfficial  3 years ago

      Hi, Simplilearn provides online training across the world. We would be happy to help you regarding this. Please visit us at www.simplilearn.com and drop us a query and we will get back to you! Thanks!

  • @vassilyn5378
    @vassilyn5378 4 years ago +2

    28:36 - it's not union, it's intersection

    • @SimplilearnOfficial
      @SimplilearnOfficial  4 years ago

      Thanks for the correction. We will share your feedback with our team.

  • @omgitsbenhayes
    @omgitsbenhayes 6 years ago +7

    Random forest algorithm randomly chooses 'k' features at each split not just within a decision tree.

    • @SimplilearnOfficial
      @SimplilearnOfficial  6 years ago

      Random Forest algorithm randomly chooses 'K' features to build a decision tree. If you want to have a clear understanding of it, check this video: ruclips.net/video/HeTT73WxKIc/видео.html. If you have any questions related to these videos, you can post in the comments section, we will clear your queries/doubts.

  • @ashveentube
    @ashveentube 5 years ago +1

    Thanks

  • @redarabie7098
    @redarabie7098 6 years ago +1

    Thank you a lot for your videos. I have a question:
    What is the best regression method for a dataset with a huge number of variables (1000 variables)? We may also find a lot of redundant variables.

    • @SimplilearnOfficial
      @SimplilearnOfficial  6 years ago

      You can use both linear and logistic regression while dealing with data that has 1000 variables. It depends on the type of your target variable. If you are trying to predict a numerical value, you can use multiple linear regression and if your target variable is categorical, then go for logistic regression. Hope that helps!

    • @redarabie7098
      @redarabie7098 6 years ago

      @SimplilearnOfficial Thank you. Actually, I have a continuous value to predict, but when I do the MLR, calculating the beta coefficients becomes impossible: beta = (X'X)^-1 (X'y) has no solution because (X'X) is not invertible.

    • @Beny123
      @Beny123 6 years ago +1

      Reda Rabie, don't invert that matrix. It is unwieldy, to say the least. Both Python and R have libraries for multiple types of regression.

    • @abhinavjain8873
      @abhinavjain8873 6 years ago

      @redarabie7098 Try using a regularized equation; it will even help you get around the non-invertibility of the matrix.
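
      One reading of "regularized equation" is ridge regression, which solves (X'X + λI) beta = X'y instead of (X'X) beta = X'y; that reading is an assumption here. The pure-Python sketch below uses a hypothetical two-feature dataset whose second column is exactly twice the first, so X'X is singular until the ridge term λ is added. There is no intercept, and the function name is made up.

```python
def solve_normal_equation_2d(X, y, ridge=0.0):
    """Solve (X'X + ridge*I) beta = X'y for exactly two features."""
    # Entries of the symmetric 2x2 matrix X'X + ridge*I.
    a = sum(r[0] * r[0] for r in X) + ridge
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X) + ridge
    # Entries of X'y.
    c1 = sum(r[0] * t for r, t in zip(X, y))
    c2 = sum(r[1] * t for r, t in zip(X, y))

    det = a * d - b * b
    if det == 0:
        raise ZeroDivisionError("X'X is singular; no unique solution")
    # Closed-form inverse of a 2x2 matrix applied to X'y.
    return [(d * c1 - b * c2) / det, (a * c2 - b * c1) / det]


# Hypothetical data: the second feature is a redundant copy (times 2) of the first.
X = [[1, 2], [2, 4], [3, 6]]
y = [5, 10, 15]

beta_ridge = solve_normal_equation_2d(X, y, ridge=1.0)
```

      Calling the function with `ridge=0.0` on this data raises the error, which mirrors the situation described in the question. For real problems with 1000 variables, a library solver would be used rather than an explicit inverse.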

    • @SimplilearnOfficial
      @SimplilearnOfficial  6 years ago +1

      Thanks for the valuable input!

  • @KixMayne91
    @KixMayne91 4 years ago

    If I'm not mistaken, can't Apriori fall into either category? Because you can augment it to use class labels.

  • @AlexK-uh5my
    @AlexK-uh5my 4 years ago +2

    Explanation of sigmoid function (5:15..) is shady. Casts doubt on the quality of the whole presentation and creating organization.

  • @andrew1257
    @andrew1257 5 years ago +7

    Chai square hehe ;D

  • @AnishaAlluru
    @AnishaAlluru 5 years ago

    24:29 how did you get X=1?

  • @NitinGuptalko
    @NitinGuptalko 5 years ago

    Good video. Just one point: the entropy answer does not look correct. The second term should also be negative, shouldn't it?

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      You get such error when you are giving a decimal input in your code. So use float() in your code and run it.

  • @simba8073
    @simba8073 3 years ago

    Thank god. Virendra Sehwag did not have to learn the data science and such teasers. He just believed that ball is there to hit....

    • @SimplilearnOfficial
      @SimplilearnOfficial  3 years ago

      Hi, Simplilearn provides online training across the world. We would be happy to help you regarding this. Please visit us at www.simplilearn.com and drop us a query and we will get back to you! Thanks!

  • @shouryateja4667
    @shouryateja4667 5 years ago +1

    In the bucket example, I will fill each bucket halfway: 2.5 L and 1.5 L, which is 4 L.

    • @srinivasrao6866
      @srinivasrao6866 5 years ago +3

      Well, there is no demarcation of half or 60% full or 70% full. If there were then the simplest would be 80% of the 5L bucket.

  • @mustafabohra2070
    @mustafabohra2070 5 years ago

    What do we mean by Feedback mechanism?

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago +3

      A feedback mechanism is a loop system wherein the system responds to a particular feedback. The response may be in the same direction or in the opposite direction based on the received feed. Hope that helps!

    • @rolandheinze7182
      @rolandheinze7182 5 years ago

      @SimplilearnOfficial Please clarify in the context of an example; this answer is very general.

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago +2

      Feedback mechanism is a concept used in data science to feed the results back to the model. This is done to improve the accuracy of the model and minimize the errors in it.

  • @nessessence8261
    @nessessence8261 5 years ago +2

    Are you the same guy as the instructor for linear algebra on Khan Academy?

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago +2

      No, he is not the same instructor. We have our own dedicated in-house instructors. Thanks.

  • @troys9099
    @troys9099 5 years ago

    Where is part 2?

    • @SimplilearnOfficial
      @SimplilearnOfficial  5 years ago

      Hi Troy, we don't have Data Science Interview questions part - 2. Thanks.

  • @harishatejg268
    @harishatejg268 3 years ago +1

    Thank you