Data Science for Computational Drug Discovery using Python (Part 1)

  • Published: Jan 29, 2025

Comments

  • @DataProfessor
    @DataProfessor 4 years ago +39

    Did you find value in this End-to-end tutorial in Bioinformatics/Cheminformatics? If you would like more videos like this please give it a 👍Like and ❤️Subscribe to the channel. Please comment down below your thoughts and suggestions 👇

  • @FrancoCiminoPrado
    @FrancoCiminoPrado 4 years ago +16

    I'm just starting with Python. I'm an organic chemist looking to move from the wet lab to comp chem, so this is gold for me. Thank you very much.

    • @DataProfessor
      @DataProfessor 4 years ago +5

      Thanks Franco for the comment. I'm planning on making more of these data science for drug discovery videos.

    • @FrancoCiminoPrado
      @FrancoCiminoPrado 3 years ago +1

      @Kelrash31 Hi Alain, it's been slow but consistent. I've been working on QSAR and docking at the moment. I still haven't gotten that far into scripting for data management, but it's in my future plans.

    • @rahimakhatun4935
      @rahimakhatun4935 3 years ago

      @DataProfessor Hi, I am keen to learn QSAR and MD simulation for protein degraders. Do you recommend a particular blog or book for learning data science for drug design/optimization? Your lectures and explanations are fantastic; I'm learning all the pros and cons of CADD. Thank you very much.

  • @michaeloladunjoye5258
    @michaeloladunjoye5258 4 years ago +3

    I'm presently working on drug discovery with deep neural networks and I found this tutorial very helpful.

    • @DataProfessor
      @DataProfessor 4 years ago

      That's awesome, Michael! Speaking of drug discovery, I have several more videos covering the topic; you are more than welcome to check them out at bit.ly/dataprofessor-bioinformatics

  • @epicakku5381
    @epicakku5381 4 years ago +1

    I'm a student in India with a strong interest in bioinformatics, and I found you. Thank you so much! I'm currently studying to get into a college for a bioinformatics undergrad.

    • @DataProfessor
      @DataProfessor 4 years ago +1

      It's my pleasure, welcome to the channel and welcome to bioinformatics 😊

  • @shwetaredkar734
    @shwetaredkar734 4 years ago +2

    Just loving the content you make.

    • @DataProfessor
      @DataProfessor 4 years ago +1

      Thanks Shweta for the kind comment!

  • @marcofestu
    @marcofestu 4 years ago +3

    I was waiting for this one, thank u 😁

    • @DataProfessor
      @DataProfessor 4 years ago +1

      Thanks Marco for the comment! Glad to hear that!

    • @parobe6167
      @parobe6167 4 years ago

      @DataProfessor Great video! What books would you recommend? I am going to start my PhD in drug discovery. Also, if you have GitHub repos or Colab notebooks, that would be perfect for me. Thanks!

  • @RojinaPanta1
    @RojinaPanta1 5 months ago

    Are these descriptors any better than molecular fingerprints?

  • @josejonnyrodriguezfajardo4135
    @josejonnyrodriguezfajardo4135 4 years ago +1

    I've already subscribed to your channel. This is the first time I've found something like this after a very long time searching for this kind of video. I'm very pleased with the information in this video. Congratulations 👏🎊🎉 dear Data Professor. I want to become a data scientist in the field of drug discovery and design. Can you advise me where to start and which book I should read first from the list below? Thank you.

  • @traveldiaries347
    @traveldiaries347 4 years ago +3

    That's great. Kindly make a whole series on the drug discovery pipeline using ML/DL methods. Thank you.

    • @DataProfessor
      @DataProfessor 4 years ago +2

      Hi, thanks for the comment. This channel has you covered. Make sure to go through this Bioinformatics playlist of 17 videos (more to come), which covers theory and step-by-step practice to get you started on bioinformatics projects: bit.ly/dataprofessor-bioinformatics

    • @traveldiaries347
      @traveldiaries347 4 years ago +2

      @DataProfessor Thank you so much, Professor.

  • @sebastianjorgecastro2452
    @sebastianjorgecastro2452 4 years ago +1

    I would like to suggest a video on using RDKit for conformational search and energy minimization. I'm just starting my bioinformatics project and this video was really helpful! Thanks!

  • @rasianaik9084
    @rasianaik9084 3 years ago +2

    Awesome series! Could you please make a video on extracting important drug information, such as chemical structure, target proteins, and side effects, from different databases? Thank you.

  • @sandeepurandur7930
    @sandeepurandur7930 4 years ago +2

    Hey, I'm a pharmacy graduate and new to data science. We're working on solubility prediction, and your video seems useful for estimating the aqueous solubility of compounds. We have a few new compounds whose solubility needs to be estimated. If you could tell me how to use the above method for new compounds, it would be very helpful to us.

    • @DataProfessor
      @DataProfessor 4 years ago +4

      Hi, I've made a video on how to build a solubility prediction web app, you can check it out here ruclips.net/video/iZUH1qlgnys/видео.html
      A demo of this web app is also provided in the video description.

  • @vrschwrngsthrtkr22
    @vrschwrngsthrtkr22 4 years ago +3

    How can this be used to do DIY, at-home drug discovery? What I mean is: does this only have academic value, or can you apply the results to something you can easily get without a university degree or comparable credentials? As you might know, a small company started selling kits that allow you to genetically modify bacteria and frogs with CRISPR. I am looking for something along those lines.

    • @vrschwrngsthrtkr22
      @vrschwrngsthrtkr22 4 years ago

      So?

    • @DataProfessor
      @DataProfessor 4 years ago +2

      The computational model can definitely be built by anyone who follows the step-by-step tutorials. As for taking the discovered knowledge to the next step, you may need to collaborate with many people (chemists, biologists, FDA officials, clinical trial teams, etc.). Bringing a drug to market is a billion-dollar endeavor that involves many people and organizations.

    • @vrschwrngsthrtkr22
      @vrschwrngsthrtkr22 4 years ago +1

      @DataProfessor Not everyone lives in the United States. That being said, I conclude that this only has academic value.
      Stop and think for a moment about exceptions. Which experiments can you conduct that come as close as possible to real drug design without the need for paid chemists, other biologists, or clinical trials?

    • @DataProfessor
      @DataProfessor 4 years ago +1

      @vrschwrngsthrtkr22 Thanks for the discussion. I agree that it takes a lot of resources to bring a drug to market. Actually, much of the budget for drug discovery and development comes from big pharmaceutical companies, while academia accounts for a minor portion.

    • @vrschwrngsthrtkr22
      @vrschwrngsthrtkr22 4 years ago +2

      @DataProfessor Which experiments can you conduct that come as close as possible to real drug design without the need for paid chemists, other biologists, or clinical trials?

  • @keerthikonjety6257
    @keerthikonjety6257 4 years ago +1

    Precise and clear. Thank you so much!

    • @DataProfessor
      @DataProfessor 4 years ago

      Thanks for watching and for the kind words 😁

  • @gustavoespinoza7940
    @gustavoespinoza7940 2 years ago

    You can use the pandas apply function to simplify a lot of the computation involving the ratio of aromatic atoms to heavy atoms.
    If you define a function
    def foo(row):
        # compute the aromatic-to-heavy-atom ratio for a single row
        # row contains the fields for a given row in your pandas DataFrame
        ...
    then do
    df["aromatic_to_heavy"] = df.apply(foo, axis=1)
    I think with pandas it's best to use the built-in functions for iteration to save computational power.
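
The row-wise apply pattern described in this comment can be sketched as follows. This is only an illustration: the column names `aromatic_atoms` and `heavy_atoms` and the numbers in the table are made up, not taken from the video's dataset. Note that `axis=1` is what makes `apply` pass one row at a time to the function.

```python
import pandas as pd

# Made-up example table; the column names are assumptions for illustration.
df = pd.DataFrame({
    "aromatic_atoms": [6, 0, 10],
    "heavy_atoms": [7, 3, 12],
})

def aromatic_to_heavy(row):
    """Ratio of aromatic atoms to heavy atoms for a single row."""
    if row["heavy_atoms"] == 0:
        return 0.0  # avoid dividing by zero for an empty molecule
    return row["aromatic_atoms"] / row["heavy_atoms"]

# axis=1 hands each row (not each column) to the function.
df["aromatic_to_heavy"] = df.apply(aromatic_to_heavy, axis=1)

# The same ratio written as plain column arithmetic.
df["aromatic_to_heavy_vectorized"] = df["aromatic_atoms"] / df["heavy_atoms"]

print(df)
```

Row-wise `apply` still calls the Python function once per row, so for a simple ratio like this one the column arithmetic on the last line is usually the faster option. `apply` is most useful when the per-row logic is too involved to express as column operations.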

  • @negarmokhtari3411
    @negarmokhtari3411 4 years ago +1

    Can you help me find out how to count the number of atoms in a compound with RDKit? I want to use a 'non-carbon proportion' feature in my model!

    • @DataProfessor
      @DataProfessor 4 years ago

      Yes, you can use the .GetNumAtoms() function on the molecule object. More details provided here www.rdkit.org/docs/GettingStartedInPython.html
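
One possible way to turn the atom count into the 'non-carbon proportion' feature asked about above is sketched here. The function name `non_carbon_proportion` and the definition (non-carbon heavy atoms divided by all heavy atoms) are assumptions for illustration, not something defined in the video. The function only relies on the RDKit methods `GetAtoms()` and `GetAtomicNum()`, and the RDKit import is guarded so the example also runs where RDKit is not installed.

```python
def non_carbon_proportion(mol):
    """Fraction of heavy (non-hydrogen) atoms that are not carbon.

    `mol` is expected to be an RDKit molecule, or any object that
    exposes GetAtoms() returning atoms with GetAtomicNum().
    """
    heavy = [a.GetAtomicNum() for a in mol.GetAtoms() if a.GetAtomicNum() > 1]
    if not heavy:
        return 0.0  # no heavy atoms, so the ratio is defined as 0 here
    non_carbon = sum(1 for z in heavy if z != 6)  # 6 is carbon
    return non_carbon / len(heavy)

try:
    from rdkit import Chem
except ImportError:
    Chem = None  # RDKit is not installed; skip the demonstration below

if Chem is not None:
    mol = Chem.MolFromSmiles("CCO")  # ethanol
    print("atoms:", mol.GetNumAtoms())
    print("non-carbon proportion:", non_carbon_proportion(mol))
```

For a molecule parsed from SMILES, hydrogens are implicit by default, so `GetNumAtoms()` counts heavy atoms only. Ethanol has two carbons and one oxygen, which gives a non-carbon proportion of one third under the definition assumed here.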

  • @louisl7245
    @louisl7245 3 years ago +1

    Thanks. Your video makes for a great learning process.

  • @aayushividhoy5943
    @aayushividhoy5943 1 year ago

    I have completed my biomedical engineering degree and am currently working in clinical SAS. Will I be able to switch jobs into this domain?

  • @sametgumus1281
    @sametgumus1281 4 years ago +2

    Thank you, professor. Please share more about drug discovery.

    • @DataProfessor
      @DataProfessor 4 years ago +1

      Thanks Samet, my pleasure, please stay tuned by hitting the notification bell 😃

  • @michalisgeorgiou2886
    @michalisgeorgiou2886 4 years ago +4

    Thank you for your videos, they are amazing!! Could you give us some theoretical or practical tips on data science for bioinformatics? For example, which data preprocessing steps, feature selection steps, models, and validation methods are used most, and which bioinformatics problems can we apply them to?

    • @DataProfessor
      @DataProfessor 4 years ago +1

      Thanks Michalis for the suggestion! I’ll put this excellent idea into the to-do list for future videos.

  • @ropon-palaciosg.7760
    @ropon-palaciosg.7760 4 years ago +1

    I'm trying to predict FDA-approved drugs using pharmacophore modelling. Can I use a DNN method for this approach?

    • @DataProfessor
      @DataProfessor 4 years ago

      I think you can. The DNN is used to build the model; you'll have to decide which input you are using, e.g. SMILES, chemical structure images, descriptors, fingerprints, etc.

  • @Chimie-Universitaire
    @Chimie-Universitaire 3 years ago

    I have a doctorate in organic and macromolecular chemistry. I want to predict the solubility of polymers using Delaney's prediction method. Please, can you give me an idea of the steps I should take first?
    I want to run experiments and compare them with this.
    Can you help me?

  • @SuperShiva619
    @SuperShiva619 4 years ago +1

    Will other ensemble algorithms like AdaBoost and GB be used?

    • @DataProfessor
      @DataProfessor 4 years ago +1

      Do you mean for this dataset? Yes, you can also use those here. This tutorial reproduces the research published by Delaney, so it also uses linear regression to match the approach that he used.
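
For orientation, the general form of such a linear regression model can be written as below. This is a sketch under the assumption that the four descriptors are the ones commonly associated with Delaney's solubility model: the octanol-water partition coefficient (cLogP), molecular weight (MW), number of rotatable bonds (RB), and aromatic proportion (AP, aromatic atoms divided by heavy atoms). The coefficients are left symbolic because they are fitted to the data by least squares.

```latex
\log S = C_0 + C_1\,\mathrm{cLogP} + C_2\,\mathrm{MW} + C_3\,\mathrm{RB} + C_4\,\mathrm{AP}
```

Here $S$ is the aqueous solubility and $C_0, \dots, C_4$ are the regression coefficients. Swapping in an ensemble method such as AdaBoost or gradient boosting keeps the same descriptors as inputs and replaces only this linear form.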

    • @SuperShiva619
      @SuperShiva619 4 years ago

      @DataProfessor Thank you, professor, for the response.
      Could you also share some thoughts on how this model could help in the drug development process in the future?

  • @xjeffrey344
    @xjeffrey344 3 years ago +1

    Thanks, professor. It is a really good tutorial. Can I use this method to predict the solubility of a chemical in a liquid solution (or its lipid solubility)? If not, are there any suggestions or tools I can use for lipid solubility prediction? Thank you very much.

  • @stefanrucman5352
    @stefanrucman5352 4 years ago +1

    Amazing 👌🙌👌 insightful

  • @afolabiowoloye804
    @afolabiowoloye804 1 year ago

    @Data Professor, many thanks

  • @ikechukwumichael1383
    @ikechukwumichael1383 6 months ago +1

    Thank you

  • @liaanggraini8667
    @liaanggraini8667 4 years ago +2

    Hi prof, thank you for posting and sharing knowledge. My background is computer science, and since last year I have been interested in data science and AI-driven drug discovery, especially drug interactions. However, along the way I have had some difficulty understanding the biological data, processes, and terms. Could you give me some tips to thrive in this field? I really want this field to be my primary research topic in my master's degree.

    • @DataProfessor
      @DataProfessor 4 years ago

      Hi Lia, thanks for sharing your interest in computational drug discovery. I've written some review articles on the topic that may provide some introductory viewpoints to the field.
      www.tandfonline.com/doi/full/10.1517/17460441.2015.1016497
      www.researchgate.net/publication/338639486_Best_Practices_for_Constructing_Reproducible_QSAR_Models
      A more complete list is at www.researchgate.net/profile/Chanin_Nantasenamat/research

    • @liaanggraini8667
      @liaanggraini8667 4 years ago

      @DataProfessor Hi professor, thank you so much for this. I am sorry for the late reply. I am starting to follow your videos so I can understand both the coding and the biological data at the same time. Hopefully, we can collaborate on academic research in the future :). Keep spreading the knowledge, you are a great tutor.

  • @rasianaik9084
    @rasianaik9084 4 years ago

    Hello sir, how do I calculate pairwise drug similarity based on the chemical structure fingerprint corresponding to the 881 chemical substructures defined in the PubChem database?

    • @DataProfessor
      @DataProfessor 4 years ago

      Hi, the pairwise molecular similarity can be computed using the Tanimoto coefficient. I think RDKit lets you do that.
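
A minimal sketch of the Tanimoto coefficient for binary fingerprints, written in plain Python so it runs without any cheminformatics library. Each fingerprint is represented as the set of bit positions that are switched on. The drug names and bit positions below are hypothetical toy values, not real PubChem fingerprints. RDKit offers the same measure through `DataStructs.TanimotoSimilarity` for its own fingerprint objects.

```python
from itertools import combinations

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints; defined as 0 in this sketch
    return len(a & b) / len(union)

def pairwise_tanimoto(fingerprints):
    """Similarity for every unordered pair in a {name: on_bits} mapping."""
    return {
        (m, n): tanimoto(fingerprints[m], fingerprints[n])
        for m, n in combinations(fingerprints, 2)
    }

# Hypothetical toy fingerprints (positions of on-bits out of 881).
toy_fingerprints = {
    "drug_A": {1, 2, 3, 10},
    "drug_B": {2, 3, 10, 50},
    "drug_C": {700, 880},
}

for pair, score in pairwise_tanimoto(toy_fingerprints).items():
    print(pair, round(score, 3))
```

The coefficient ranges from 0 (no on-bits in common) to 1 (identical fingerprints), so the resulting pair scores can be arranged directly into a similarity matrix.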

    • @rasianaik9084
      @rasianaik9084 4 years ago

      @DataProfessor Thanks a lot. Is there a video of yours on that? I am new to this field and have to start from scratch. Any suggestions will be highly appreciated.

  • @zapy422
    @zapy422 4 years ago +1

    Very useful.
    Where to find good data for training?

    • @DataProfessor
      @DataProfessor 4 years ago +2

      Thanks for watching! There are larger datasets available on chemical databases such as ChEMBL, PubChem, BindingDB, etc. which can be used as external datasets to the dataset used in this video.

  • @waleedrashad822
    @waleedrashad822 2 years ago +1

    Perfect

  • @bikashpradhan5954
    @bikashpradhan5954 4 years ago +1

    Is a PhD necessary for becoming a data scientist in a field like biotechnology or bioinformatics?

    • @DataProfessor
      @DataProfessor 4 years ago +1

      The answer really depends on the type of work that you would like to do. A PhD is not necessary to become a data scientist working in biotechnology/bioinformatics. A PhD is necessary if you want to become a principal investigator and lead a research group (probably more applicable to academia, and maybe industry).

  • @satvikkg3059
    @satvikkg3059 4 years ago +2

    Can you please make a simple tutorial for GROMACS on Colab?

    • @DataProfessor
      @DataProfessor 4 years ago

      Satvik, coincidentally, it is in the making, I have already drafted a notebook but will film the video soon, please stay tuned. Please turn on the notifications so that you will be notified as soon as a new video comes out. Thanks for your suggestion!

  • @pcliang2693
    @pcliang2693 4 years ago +1

    Love love, nice course.

  • @miroslavanedyalkova5174
    @miroslavanedyalkova5174 4 years ago +1

    Could you share the notebook? Very nice tutorial.

    • @DataProfessor
      @DataProfessor 4 years ago

      Thanks Miroslava for the kind comment. The link to the notebook code for all videos in this channel is in the video descriptions of all videos. For this video, the link is github.com/dataprofessor/code/blob/master/python/cheminformatics_predicting_solubility.ipynb

  • @kashafnaz_
    @kashafnaz_ 3 years ago

    Awesome

  • @ernestbonat2440
    @ernestbonat2440 3 years ago +1

    Excellent videos by the Data Professor. Feel free to read the following blog post on Medium: “Apply Machine Learning Algorithms for Genomics Data Classification”. It will help you understand how to apply machine learning algorithms to genomic data classification. The post covers the latest ML/AI technologies applied to human genomic data classification today.

  • @aruchan9890
    @aruchan9890 4 years ago +1

    Could you please tell me how to get RDKit in a Python 2 Colab notebook?

    • @DataProfessor
      @DataProfessor 4 years ago

      Thanks for the comment. Owing to compatibility issues, the code in the header section of the notebook on the provided DataProfessor GitHub works best. Please refer to the code link in the video description.

  • @datascienceespanol869
    @datascienceespanol869 4 years ago +2

    Great exercise! I am also a data scientist, but my videos are in Spanish, in case anyone is interested! 😁😁

  • @JohnDoe-oo9ll
    @JohnDoe-oo9ll 3 years ago

    It's totally not my place to say, and I know the professor must have thought a lot about his "lisp", but I believe he can master the "s" sound if he focuses on how his teeth touch the tongue when pronouncing an open "eeeeeeeeeee" sound and slowly raises the tip of his tongue. AT SOME POINT STOP VIBRATING your throat (while making the "eee" sound) and just allow air to pass through the tube created by the tongue, and slowly raise the tip of your tongue to the roof of your mouth (it doesn't need to TOUCH the roof) without blocking the entire passageway for the air. ALSO, for him specifically, he might try to pull back the tongue (keeping contact with the same portion of the roof of the mouth) and use a more forward portion of the tip of the tongue. Air should NOT leave anywhere but from the front of the tongue.

  • @MrChristian331
    @MrChristian331 4 years ago +2

    I'm totally lost with the aromatic atoms part

    • @DataProfessor
      @DataProfessor 4 years ago +1

      Hi Kris, I have written a complementary article on Medium explaining this at link.medium.com/8OB3NXKwo9
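
The aromatic atoms part boils down to one ratio: the number of atoms that sit in an aromatic ring divided by the number of heavy (non-hydrogen) atoms. A minimal sketch follows, assuming that definition. The function name `aromatic_proportion` is made up for illustration and may not match the notebook's code. It only uses the RDKit atom methods `GetIsAromatic()` and `GetAtomicNum()`, and the RDKit import is guarded so the example also runs without RDKit installed.

```python
def aromatic_proportion(mol):
    """Aromatic atoms divided by heavy atoms for an RDKit-like molecule."""
    heavy = [a for a in mol.GetAtoms() if a.GetAtomicNum() > 1]
    if not heavy:
        return 0.0  # no heavy atoms, so the ratio is defined as 0 here
    aromatic = sum(1 for a in heavy if a.GetIsAromatic())
    return aromatic / len(heavy)

try:
    from rdkit import Chem
except ImportError:
    Chem = None  # RDKit is not installed; skip the demonstration below

if Chem is not None:
    for name, smiles in [("phenol", "c1ccccc1O"), ("ethanol", "CCO")]:
        mol = Chem.MolFromSmiles(smiles)
        print(name, aromatic_proportion(mol))
```

Phenol has six aromatic ring carbons out of seven heavy atoms, so its aromatic proportion is 6/7, while ethanol has no aromatic atoms and scores 0.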