Dan Bochman
  • Videos: 15
  • Views: 92,488
LinkedIn Algorithm Test
Same content, different exposure? 🤔
Lately, many influential content creators on LinkedIn are wondering why their viewership isn't proportionate to the number of followers they have.
External links seem to be the prime suspect for severely reducing exposure.
Being the data geek that I am, I've decided to tackle this hypothesis in the form of A/B testing.
I've released 2 identical posts (one of which you're seeing right now), where in one the video is uploaded natively and in the other it is linked from my YouTube channel.
I will give weekly updates on the view count for both of these posts.
If you think this experiment is useful for the community, please help scale it up, making it mo...
Views: 216

Videos

ML Model Deployment with Flask - Part II - Hosting on Heroku | ML & DS Open-source Spotlight #9
411 views · 4 years ago
Host ML apps on the Web with ease using Heroku 📤 See the first part of this video here - ruclips.net/video/Od0gS3Qeges/видео.html Heroku is a platform as a service that enables developers to build, run, and operate applications entirely in the cloud. If you don't have any hardware or driver requirements like GPU or CUDA for your ML app to run, it's so convenient to host your app on Heroku! It b...
ML Model Deployment with Flask | Machine Learning & Data Science Open-source Spotlight #8
740 views · 4 years ago
Deploy your ML models easily with Flask! ⚗️ Successfully deploying a trained machine learning model for other people to use and enjoy is an increasingly important skill, but one that is often neglected in the curriculum of a data scientist. In many companies, the task of model deployment is the sole responsibility of the software engineering team, but I believe that as this field advances, this privilege ...
HoloViews | Machine Learning & Data Science Open-source Spotlight #7
1.3K views · 4 years ago
Question: What's the simplest way to make high quality plots in Python? Answer: 👉 HoloViews! 👈 " HoloViews is an open-source Python library designed to make data analysis and visualization seamless and simple. With HoloViews, you can usually express what you want to do in very few lines of code, letting you focus on what you are trying to explore and convey, not on the process of plotting. " - ...
Datashader in 15 Minutes | Machine Learning & Data Science Open-source Spotlight #6
6K views · 4 years ago
Real-time interactive(!) big data visualizations with Datashader and Bokeh 👁️ The main challenge of visualizing huge datasets is NOT computing power or memory! It's actually making meaningful plots which are able to highlight the dense sections in your data together with the outliers. Datashader pre-renders even the largest datasets into a fixed-size raster image that faithfully represents the data...
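
The idea of reducing an arbitrarily large dataset to a fixed-size raster can be sketched in plain Python. This is only an illustration under stated assumptions, not Datashader's API or implementation: `rasterize_counts` is a hypothetical helper, and the data is synthetic Gaussian noise generated just for the demo.

```python
# Sketch only: bins 2D points into a fixed-size grid of counts.
# `rasterize_counts` is a hypothetical helper, not part of Datashader.
import random


def rasterize_counts(xs, ys, width, height, x_range, y_range):
    """Count how many (x, y) points fall into each cell of a height x width grid."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # skip points outside the viewport
        # Map the coordinate to a cell index; clamp the upper edge into the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        grid[row][col] += 1
    return grid


# Synthetic data for the demo (assumption: 100,000 Gaussian points).
random.seed(0)
n_points = 100_000
xs = [random.gauss(0.0, 1.0) for _ in range(n_points)]
ys = [random.gauss(0.0, 1.0) for _ in range(n_points)]

grid = rasterize_counts(xs, ys, width=40, height=20, x_range=(-4, 4), y_range=(-4, 4))

# The grid stays 20 x 40 regardless of how many points were aggregated into it.
print(len(grid), len(grid[0]), sum(map(sum, grid)))
```

The per-cell counts can then be mapped to colors, so dense regions and isolated outliers both remain visible while the size of the plotted object depends only on the grid, not on the number of points.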
Dask in 15 Minutes | Machine Learning & Data Science Open-source Spotlight #5
50K views · 4 years ago
Should you use Dask or PySpark for Big Data? 🤔 Dask is a flexible library for parallel computing in Python. In this video I give a tutorial on how to use Dask for parallel computing, handling Big Data and integration with Deep Learning frameworks. I compare Dask to PySpark and list the relative advantages I see of choosing Dask as your primary choice for Big Data handling. Link to Notebook: nbv...
Plotly & Cost Function Visualizations | Machine Learning & Data Science Open-source Spotlight #4
10K views · 4 years ago
Stop doing 3D plots with Matplotlib! ❌ Plotly's Python graphing library makes interactive, publication-quality graphs. It's based on JavaScript and similar to Bokeh which I covered last week. I especially like using Plotly for creating 3D interactive plots. It has an intuitive API for passing the data and creating the grid required for 3D plotting. Much simpler and flexible compared to other li...
bokeh | Machine Learning & Data Science Open-source Spotlight #3
1.1K views · 4 years ago
Are you using bokeh? 📊 bokeh is an interactive visualization library for modern web browsers. Although it's an already well-established Python package with sponsors, I think not many people are choosing this tool for data visualizations. With this short tutorial I'm aiming to help you get started with this great package, so you can easily start making professional-looking interactive plots. Wit...
Featuretools | Machine Learning & Data Science Open-source Spotlight #2
8K views · 4 years ago
Are you using Featuretools? 🔎 Featuretools is a Python open-source library which offers automated feature engineering. What I particularly liked in this library is the ability to elegantly extract features from multiple tables and aggregate them to one final dataset. Video notebook: github.com/danbochman/Open-Source-Spotlight/tree/master/Featuretools Featuretools: github.com/FeatureLabs/feature...
Pandas Profiling | Machine Learning & Data Science Open-source Spotlight #1
563 views · 4 years ago
Are you using Pandas Profiling? 🐼 There are amazing open-source Python libraries out there for machine learning and data science which deserve to be mainstream staples in every professional's toolkit. With these new "Machine Learning & Data Science Open Source Spotlight" weekly videos, my objective is to introduce many game-changing libraries, which I believe many people can b...
Five Steps to Becoming a Data Scientist
185 views · 5 years ago
In this video I describe, from personal experience, what I consider the most relevant and effective steps one can take to enter the field of Data Science and/or Machine Learning, depending on background and prior experience.
How Do Instagram Filters Work?
1K views · 5 years ago
Have you ever wondered how an Instagram filter really works behind the scenes? This video presents basic principles of image processing and simple tools in the Python programming language for creating an Instagram filter.
Decision Trees
1.5K views · 5 years ago
A video presenting the topic of decision trees. This is the first part of a more comprehensive lecture on Tree Models and Ensembles given as part of the Future Learning training program. Program website: futurelearning.ai/
Real-time Action Recognition with Non-local Network
1.3K views · 5 years ago
Inference demo of a project by me and Daniel Shafer (the person in the video). The webcam feed is passed to a ResNet101 w/ Non-local Blocks trained on the UCF-101 dataset. Precision can be improved by utilizing optical flow; however, creating optical flow data and running it on a parallel network hinders real-time performance substantially. Source code on GitHub: github.com/danbochman/Real-...
One Hot Encoding with Python | Handling Categorical Data
11K views · 6 years ago
In this tutorial you can see how one hot encoding is applied in order to handle categorical data, step-by-step, in a real world data problem environment. You can check out the whole project from A to Z on my GitHub page: github.com/danbochman/FARS_LEARNING If you have any questions, feel free to ask in the comments! Please let me know if there are any specific machine learning tutorials you wis...

Comments

  • @anantharamaniyer9135
    @anantharamaniyer9135 2 months ago

    Watching this again reveals even more ideas! Many thanks for this, Dan. Also, do you have similar ones using Datashader and hvplot with pandas / polars / dask for plotting line charts, bar charts, etc.?

  • @4096fb
    @4096fb 10 months ago

    Thanks for the video! One question, what to do when I have z as pd.Series and not as a matrix? Not sure what would be the right way to convert it to matrix. I can use reshape, but I'm not sure it will shape the matrix as required.

  • @agracian1
    @agracian1 1 year ago

    Hi Dan, got the following error in cell [7] when trying to run your script in Colab:

    TypeError                                 Traceback (most recent call last)
    <ipython-input-7-f76c603a8d2f> in <cell line: 1>()
    ----> 1 feature_matrix_customers, features_defs = ft.dfs(entities=entities,
          2                                                  relationships=relationships,
          3                                                  target_entity="customers")

    1 frames
    /usr/local/lib/python3.10/dist-packages/featuretools/utils/entry_point.py in function_wrapper(*args, **kwargs)
         30             # call function
         31             start = time.time()
    ---> 32             return_value = func(*args, **kwargs)
         33             runtime = time.time() - start
         34         except Exception as e:

    TypeError: dfs() got an unexpected keyword argument 'entities'

  • @-mwolf
    @-mwolf 1 year ago

    awesome intro

  • @apratimdey6118
    @apratimdey6118 1 year ago

    Hi Dan, this was a great introductory video. I am learning Dask and this was very helpful.

  • @ecemgungor6208
    @ecemgungor6208 1 year ago

    Thanks for the videos. I have a question about multivariate data. I have three independent variables and would like to see their occurrences by coloring the data based on their probability densities (plot type can be contour, surf etc.) Which function should I use? Could you please help me with this?

  • @saidmkc3069
    @saidmkc3069 1 year ago

    Hey Dan, can you explain how to change the color scale in green - yellow - red?

  • @realsushi_official1116
    @realsushi_official1116 1 year ago

    Great video, worth trying the notebook after reading the paper

  • @joseleonardosanchezvasquez1514

    Great

  • @FabioRBelotto
    @FabioRBelotto 1 year ago

    I've added dask delayed to some functions. When I visualize it, there are several parallel tasks planned, but my CPU does not seem to be affected (it's only using a small percentage of it).

  • @FabioRBelotto
    @FabioRBelotto 1 year ago

    Thanks man. It's really hard to find some information about dask

  • @parikannappan1580
    @parikannappan1580 1 year ago

    4:12, visualize(): where can I get documentation? I tried to use it for sorted() and it did not work.

  • 1 year ago

    Hi 😀 Dan. Thank you for this video. Do you have an example which uses the apply() function? I want to create a new column based on a data transformation. Thank you!

  • @felixtechverse
    @felixtechverse 2 years ago

    How do I schedule a dask task? For example, how do I set a script to run every day at 10:00 with dask?

  • @2false637
    @2false637 2 years ago

    Saved my ass, man. Thanks!

  • @vegaalej
    @vegaalej 2 years ago

    Many thanks for this excellent video! It is really clear and helpful! I just have one question: I tried to run the notebook, and it ran pretty well after some minor updates. The last line is the only one I was not able to run:

    # never run out-of-memory while training
    model.fit_generator(generator=dask_data_generator(df_train), steps_per_epoch=100)

    It gives me an error message:

    InvalidArgumentError: Graph execution error:
    TypeError: `generator` yielded an element of shape (26119,) where an element of shape (None, None) was expected.
    Traceback (most recent call last):
    TypeError: `generator` yielded an element of shape (26119,) where an element of shape (None, None) was expected.
    [[{{node PyFunc}}]] [[IteratorGetNext]] [Op:__inference_train_function_506]

    Any recommendation on how I should modify it to make it run? Thanks, AG

  • @piotr780
    @piotr780 2 years ago

    very good ! thank you ! :)

  • @obamaengineer4806
    @obamaengineer4806 2 years ago

    fantastic video must say, keep going sir... you're really really great... teaching such a complex thing in such a short video, and so clearly too... thanks a lot again

  • @JimmyGelhaar
    @JimmyGelhaar 2 years ago

    Great job on this video, Dan! Datashader is pretty sweet!

  • @_XY_
    @_XY_ 2 years ago

    👏👏

  • @alikoko1
    @alikoko1 2 years ago

    Great video! Thank you man 👍

  • @淘宝买的会员
    @淘宝买的会员 2 years ago

    Thank you. You should come back and bring more content.

  • @pritommojumder367
    @pritommojumder367 2 years ago

    Nice demo. Thank you for sharing. Keep up the good work.

  • @aydoganavcoglu2675
    @aydoganavcoglu2675 2 years ago

    Many thanks for your very sophisticated step-by-step instructions! I would like to ask how we can get this kind of dataset, with more than a million points, to be able to use Datashader?

  • @vegaalej
    @vegaalej 2 years ago

    Great video and explanation! Thank you very much for it! It is really helpful! I tried to run the notebook, and it ran pretty well after some minor updates. I just had problems running the last part, "never run out-of-memory while training": it seems the generator or steps-per-epoch part is causing a problem I can't figure out how to solve. Any suggestion on how to fix the code? Thanks!

    InvalidArgumentError: Graph execution error:
    TypeError: `generator` yielded an element of shape (26119,) where an element of shape (None, None) was expected.
    Traceback (most recent call last):
    TypeError: `generator` yielded an element of shape (26119,) where an element of shape (None, None) was expected.

    • @AlejandroGarcia-ib4kb
      @AlejandroGarcia-ib4kb 2 years ago

      Interesting, I have the same problem in the last part of the notebook. It seems to be related to the IDE; it needs an update.

  • @krocodilnaohote1412
    @krocodilnaohote1412 2 years ago

    Man, great video, thank you!

  • @r.m10234
    @r.m10234 2 years ago

    Thanks Man!

  • @sripadvantmuri89
    @sripadvantmuri89 2 years ago

    Great explanations for beginners!! Thanks for this...

  • @marialuisaargaezsalcido4957
    @marialuisaargaezsalcido4957 2 years ago

    Hi, I am trying to run your exercise code, but an error appears: TypeError: dfs() got an unexpected keyword argument 'entities'.

    • @danbochman
      @danbochman 2 years ago

      Hi Maria, it has been 2 years, so they probably changed their dfs function arguments. Looking at the documentation for dfs (featuretools.alteryx.com/en/stable/generated/featuretools.dfs.html), it seems this function now expects an argument called entityset, which is described as: "entityset (EntitySet) - An already initialized entityset. Required if dataframes and relationships are not defined." EntitySet seems to be a new class: featuretools.alteryx.com/en/stable/generated/featuretools.EntitySet.html. Perhaps my guide is outdated in terms of syntax, but the concept should stay the same!

  • @summerxia7474
    @summerxia7474 2 years ago

    Very nice video!!!! Thank you so much!!! Please make more hands-on videos like this! You explain things very clearly and helpfully!!!!

  • @ГерманРыков-ъ6в
    @ГерманРыков-ъ6в 2 years ago

    amazing. Very interesting theme.

  • @samable9585
    @samable9585 2 years ago

    I wonder what the difference is between encoding and mapping. For example, if STATE_CD goes from 1 to 50, say, it is now numeric. Can it be used in AI learning without resorting to one hot encoding?

    • @danbochman
      @danbochman 2 years ago

      If you map states to numbers 1 to 50, it can be used in ML, but you have introduced an internal relationship that doesn't exist (state 1 appears more similar to state 2 and very far from state 50).

    • @samable9585
      @samable9585 2 years ago

      @danbochman thanks for the reply. understood
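
The point made in this thread about the artificial ordering can be checked numerically. What follows is a minimal sketch in plain Python, assuming 50 placeholder state codes (`S01` to `S50`, hypothetical names) and hypothetical helpers `ordinal_encode`, `one_hot_encode` and `euclidean`; it is not taken from the video's notebook.

```python
# Sketch only: compares distances under two encodings of a categorical column.
# The state codes and helper names below are hypothetical placeholders.


def ordinal_encode(categories):
    """Map each category to a single number 1..N (the "mapping" approach)."""
    return {cat: [float(i)] for i, cat in enumerate(categories, start=1)}


def one_hot_encode(categories):
    """Map each category to a vector with a single 1 at its own position."""
    n = len(categories)
    return {
        cat: [1.0 if j == i else 0.0 for j in range(n)]
        for i, cat in enumerate(categories)
    }


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


states = [f"S{i:02d}" for i in range(1, 51)]  # 50 placeholder state codes
ordinal = ordinal_encode(states)
one_hot = one_hot_encode(states)

# With the 1..50 mapping, "closeness" depends on the arbitrary numbering.
ordinal_near = euclidean(ordinal["S01"], ordinal["S02"])
ordinal_far = euclidean(ordinal["S01"], ordinal["S50"])

# With one-hot vectors, every pair of distinct states is equally far apart.
one_hot_near = euclidean(one_hot["S01"], one_hot["S02"])
one_hot_far = euclidean(one_hot["S01"], one_hot["S50"])

print(ordinal_near, ordinal_far)
print(one_hot_near, one_hot_far)
```

Under the integer mapping the first and last states end up 49 times farther apart than the first two, purely because of how they were numbered. Under one-hot encoding all distinct pairs are equidistant, at the cost of one column per category.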

  • @bodenseeboys
    @bodenseeboys 2 years ago

    Chapeau, well explained and healthy usage of memes!

  • @datadiggers_ru
    @datadiggers_ru 2 years ago

    Great intro. Thank you

  • @asd222treed
    @asd222treed 2 years ago

    Great video! Thank you for sharing. But I think there is some incorrect code in the machine learning with Dask part. There is no X in the code (model.add(..., input_dim=X.shape[1], ... ), and when I train with model.fit_generator, TensorFlow says model.fit_generator is deprecated, and finally an error is displayed: AttributeError: 'tuple' object has no attribute 'rank'

    • @danbochman
      @danbochman 2 years ago

      Hey! Whoops, I must've changed the variable name X to df_train and wasn't consistent in the process, it probably didn't pop a message to me because X was still in my notebook workspace. You can either change df_train to X or change X to df_train X <==> df_train. Just be consistent and it should work!

  • @gholamrezadar
    @gholamrezadar 2 years ago

    Great video with good examples. Loved the MiniNet part. Thank you.

  • @madhu1987ful
    @madhu1987ful 2 years ago

    Hey thanks for the awesome video and the explanation. I have a use case. I am trying to build a Deep learning tensorflow model for time series forecasting. For this I need to use multinode cluster for parallelization across all nodes. I have a single function which can take data for any 1 store and predict for that store. Likewise I need to do predictions for 2 lakh outlets. How can I use dask to parallelize this huge task across all nodes of my cluster? Can you please guide me. Thanks in advance.

    • @danbochman
      @danbochman 2 years ago

      Hi Madhu, sorry, wish I could help, but node cluster parallelization tasks are more dependent on the framework itself (e.g. Kubernetes) than on Dask. You have the dask.distributed module (distributed.dask.org/en/stable/), but handling the multi-worker infrastructure is where the real challenge lies...

  • @cristian-bull
    @cristian-bull 2 years ago

    I really appreciate the batch-on-the-fly example with keras.

  • @markp2381
    @markp2381 2 years ago

    Great content! One question, isn't it strange to use function in this form: function(do_something)(on_variable) instead of function(do_something(on_variable)) ?

    • @danbochman
      @danbochman 2 years ago

      Hey Mark! Thanks. I understand what you mean, but when a function returns a function this makes sense, as opposed to a function whose output is the input to the next function. delayed(fn) returns a new function (e.g. "delayed_fn"), and this new function is then called regularly: delayed_fn(x). So it's delayed(fn)(x). All decorators are functions which return callable functions. In this example they are used quite unnaturally because I wanted to keep both versions of the functions. Hope the explanation helped!
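
The `delayed(fn)(x)` pattern described in this reply can be illustrated with a toy stand-in. This is a minimal sketch, not Dask's implementation: the `delayed` defined here is a hypothetical helper that only returns a zero-argument callable, whereas Dask's own `delayed` builds a task graph whose result is obtained with `.compute()`. The functions `inc` and `double` are made up for the demo.

```python
# Sketch only: a toy `delayed` that shows why the call reads delayed(fn)(x).
# This is NOT Dask's implementation; it just postpones a single function call.


def delayed(fn):
    """Return a new function that records a call to `fn` instead of running it."""

    def delayed_fn(*args, **kwargs):
        # Nothing is computed here; a zero-argument task is returned for later.
        def task():
            return fn(*args, **kwargs)

        return task

    return delayed_fn


def inc(x):
    return x + 1


# Form 1: delayed(inc) is itself a function, and (10) calls that function.
lazy_inc = delayed(inc)(10)
# By contrast, delayed(inc(10)) would run inc immediately and wrap the number 11.
value = lazy_inc()  # the actual work happens only here


# Form 2: the same wrapper applied with decorator syntax.
@delayed
def double(x):
    return 2 * x


lazy_double = double(21)  # still nothing computed
print(value, lazy_double())
```

Calling the wrapper explicitly, as in form 1, keeps the original eager `inc` available alongside the lazy version, which matches the reason given in the reply for not using the decorator syntax.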

  • @guideland1
    @guideland1 2 years ago

    Really great Dask introduction and the explanation is so easy to understand. That was useful. Thank you!

  • @Единыймир.Переводыисубтитры

    Many thanks. Now I understand why the file was not read

  • @rashidsyed
    @rashidsyed 3 years ago

    Nice

  • @vitorruffo9431
    @vitorruffo9431 3 years ago

    Good work sir, your video has helped me to get started with Dask. Thank you very much.

  • @unathimatu
    @unathimatu 3 years ago

    This is really great

  • @DanielWeikert
    @DanielWeikert 3 years ago

    Great. I would also like to see more on DASK and Deep Learning. How exactly would this generator be used in pytorch? Instead of the DataLoader. Thanks for the video(s)

  • @chenzakaim3
    @chenzakaim3 3 years ago

    Is there something similar about machine learning?

  • @dwivedys
    @dwivedys 3 years ago

    Excellent!

  • @pizonshetu39
    @pizonshetu39 3 years ago

    Thank you so much for your explanation, I learned more in this one video than reading multiple articles where my mind felt bogged and bored each time I read a line but this is so digestible and easy to understand

  • @fish3485
    @fish3485 3 years ago

    Dan, I just found your videos. They’re great! Will you be making any more?

    • @danbochman
      @danbochman 3 years ago

      Hey Benjamin, glad you liked it! Unfortunately, I don't think I'll be making new videos soon... Just out of curiosity, what kind of videos/topics would you be interested in?

  • @priyabratasinha1478
    @priyabratasinha1478 3 years ago

    Thank you so much man... you saved a project... 🙂🙂❤❤🙏🙏