Call for Participation in the Open Bioinformatics Research Project

  • Published: Oct 11, 2024
  • In this video, I share a novel bioinformatics dataset: a collection of bioactivity datasets that I compiled from the ChEMBL database (version 29). Specifically, there are 136 CSV files belonging to 136 variants of the beta-lactamase target protein. I also provide a high-level overview of the dataset, as well as my thoughts on some of the analyses that you can perform to contribute to this Open Bioinformatics Research Project.
    You can think of this as sort of like a Hacktoberfest! Let’s work on this and learn together!
    Contribute to this Open Bioinformatics Research Project
    👉 GitHub github.com/dat...
    👉 Kaggle www.kaggle.com...
    Prerequisite knowledge
    👉 An Introduction to Computational Drug Discovery • An Introduction to Com...
    👉 How to use PaDelPy to calculate molecular descriptors/fingerprints from SMILES notation • How to build machine l...
    👉 Playlist of Bioinformatics videos • Bioinformatics Project...
    Support my work:
    👪 Join as Channel Member:
    / @dataprofessor
    ✉️ Newsletter newsletter.data...
    📖 Join Medium to Read my Blogs / membership
    ☕ Buy me a coffee www.buymeacoff...
    Recommended Resources
    📚 Books kit.co/datapro...
    😎 Taro (Tech Career Mentorship) www.jointaro.c...
    📜 Google Data Analytics Professional Certificate imp.i384100.ne...
    🤔 Interview Query www.interviewq...
    🖥️ Stock photos, graphics and videos used on this channel 1.envato.marke...
    Subscribe:
    🌟 Coding Professor / @codingprofessor
    🌟 Data Professor www.youtube.co...
    Disclaimer:
    Recommended books and tools are affiliate links that give me a portion of sales at no cost to you, which will contribute to the improvement of this channel's content.
    #bioinformatics #machinelearning #research #dataprofessor
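As a starting point for working with the 136 CSV files described above, here is a minimal sketch of reading them into a single collection. The directory layout and the `source_file` key are assumptions made for illustration; the actual file and column names should be checked against the GitHub and Kaggle copies of the dataset.

```python
import csv
from pathlib import Path


def load_bioactivity_files(data_dir):
    """Read every CSV file in data_dir and return all rows as dictionaries.

    Each row is tagged with the name of the file it came from, so that rows
    from different target variants can still be told apart after merging.
    """
    rows = []
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                # "source_file" is a hypothetical key added for traceability.
                row["source_file"] = csv_path.stem
                rows.append(row)
    return rows
```

Calling `load_bioactivity_files("data")` (with `data` as a hypothetical folder holding the CSV files) returns a list of dictionaries with string values. Numeric columns need to be converted before analysis.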

Comments • 168

  • @DataProfessor
    @DataProfessor  3 years ago +10

    To share your progress or comments on various social media platforms about this Open Bioinformatics Research Project initiative, please use the tag #dataprofessor
    👉Twitter twitter.com/thedataprof
    👉LinkedIn www.linkedin.com/company/dataprofessor/
    🌟 Join as a Channel Member to support us:
    ruclips.net/channel/UCV8e2g4IWQqK71bbzGDEI4Qjoin
    🌟 Download Kite for FREE www.kite.com/get-kite/?

    • @OlehMezhenskyi
      @OlehMezhenskyi 3 years ago +1

      I just finished my notebook on Kaggle, hope it will be interesting to look at

    • @DataProfessor
      @DataProfessor  3 years ago

      @OlehMezhenskyi Awesome, thanks for your submission! :)

    • @twirlntaal
      @twirlntaal 3 years ago +1

      Is there any deadline for submissions?

  • @paulomenezes4867
    @paulomenezes4867 3 years ago +4

    Hello Data Professor.
    Very nice video!! I'm a biologist, finishing my PhD thesis in structural biochemistry. I'm studying a specific enzyme (isocitrate lyase), which plays a central role in the mobilization of lipid reserves in seeds during germination (metabolic pathway: the glyoxylate cycle), as a molecular target for new herbicide mechanisms of action. We are studying it in silico, in vivo and in vitro. I found your video very interesting, and I believe that interdisciplinary studies applying machine learning will be very promising in many fields of science in the near future. I will keep following the development of your project.
    Cordially, Paulo Menezes (Brazil).

    • @DataProfessor
      @DataProfessor  2 years ago +2

      Sounds like an interesting project. Yes, machine learning could definitely be used to provide insights from such data. Good luck with your research endeavor. :)

  • @JGTB95
    @JGTB95 3 years ago +2

    What a great channel! It really drives me to keep studying programming and statistics!

  • @annapurnasatti8650
    @annapurnasatti8650 3 years ago +5

    I am extremely delighted to participate and contribute my best. I am an experienced life science researcher and a beginner in the field of data science.

    • @DataProfessor
      @DataProfessor  3 years ago

      Glad to have you on board, welcome to the initiative.

  • @danielajbq
    @danielajbq 3 years ago +4

    I’m very excited about this! Can’t wait to contribute. I have my bachelor’s in biochemistry, and I’m currently doing a master’s in bioinformatics.

  • @gustavojuantorena
    @gustavojuantorena 3 years ago +6

    Awesome! I want to play with this dataset. I think the idea of analyzing it together is great 👍

  • @haldanesghost
    @haldanesghost 3 years ago +2

    I’m gonna take a bite out of this once I get out of work. Exciting stuff!

  • @ayushsingh5543
    @ayushsingh5543 3 years ago +7

    I would like to contribute to this project. I love the way you share your knowledge 🙌

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Awesome, thank you! Look forward to it! 😊

  • @024_soumeemukherjee2
    @024_soumeemukherjee2 3 years ago +3

    I would like to participate as well!! This looks like a great idea for a project!!!

  • @sameerdangol3243
    @sameerdangol3243 3 years ago +1

    I am really interested. Your video series on the ChEMBL dataset was the gateway for me to actually do hands-on bioinformatics work, which then led me to learn further programming. So I would definitely love to participate and contribute to the project.

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Wonderful! Great to have you on board as well!

  • @souldiezcamp2380
    @souldiezcamp2380 3 years ago +2

    I was one of those interested. Keep it up, bro, you are among the best people in the scientific field.

  • @Ibraheem_ElAnsari
    @Ibraheem_ElAnsari 3 years ago +2

    I'm quite interested! Bioinformatics is totally new to me though. I'll look into it, it seems interesting.

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Thanks Ibraheem, you’re most welcome to join the challenge! Aside from feature interpretation, which requires domain expertise in biology, everything else is a series of data processing/wrangling and model building steps.

  • @marydinchen7141
    @marydinchen7141 3 years ago +4

    I'd love to contribute to this. I'm currently a bachelor's student in computer science and have always had the goal of specializing in bioinformatics!!

    • @marydinchen7141
      @marydinchen7141 3 years ago +2

      I'll go through the prerequisites, hopefully!

    • @claudiaschmidt3708
      @claudiaschmidt3708 3 years ago

      There are bachelor's and master's degrees in bioinformatics... Why just specialize?

  • @wayne88ho
    @wayne88ho 3 years ago +1

    Can’t wait to participate!

  • @bastiencasini6883
    @bastiencasini6883 3 years ago

    Hi, super interesting. I am doing research on beta-lactams, lactamases and transpeptidases, and would love to contribute. Is there a Discord or other channel to organise and avoid doing things twice? :)

  • @rajarshimondal
    @rajarshimondal 3 years ago +2

    Wow. Very much interested.

  • @athena7793
    @athena7793 3 years ago +1

    Sounds fun. I recently submitted my PhD thesis in bioinformatics. I will have a look.

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Awesome, it's an open experiment. Hopefully we can learn together as a community and publish a paper in the process.

  • @dailyDesi_abhrant
    @dailyDesi_abhrant 3 years ago +3

    Let’s do this. I am currently pursuing a biomedical engineering degree and I am very interested in machine learning and AI. I will try my best to contribute. Is it fine if I use Kaggle and upload public notebooks there?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Yes, please upload public notebooks to the betalactamase Kaggle page (links in the repo).

  • @graviyt
    @graviyt 3 years ago +1

    This is awesome, thanks for sharing this project. I would like to contribute as well!

  • @chakladar1
    @chakladar1 3 years ago +2

    This is quite interesting. I recently completed my master's dissertation on building a QSAR machine learning classification model, and I would really like to contribute to this project.

    • @DataProfessor
      @DataProfessor  3 years ago +2

      Great, looking forward to your contribution. The easiest way is to contribute a notebook to Kaggle (links in video description).

  • @sampresman5128
    @sampresman5128 3 years ago +1

    Sounds amazing, I would love to join!

  • @simonli8510
    @simonli8510 3 years ago +4

    Question: When removing molecules based on the standard deviation of their pChEMBL values, should that be done per target protein? That is, if a molecule has std > 2 for target protein A but not for target protein B, should we remove only the rows for target protein A, because the assay data is still valid for target protein B?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Yes, you are correct, it is based on each unique target. Afterwards you can build models separately for each target.
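The per-target filtering discussed in this exchange can be sketched as follows. This is only an illustration under stated assumptions: the record keys `target`, `molecule` and `pchembl` are hypothetical names, and averaging the replicate values that pass the filter is one possible choice, not something the thread prescribes. The cutoff of 2 comes from the question above.

```python
from collections import defaultdict
from statistics import mean, stdev


def filter_by_pchembl_spread(records, max_std=2.0):
    """Return {(target, molecule): mean pChEMBL} for pairs with acceptable spread.

    The standard deviation is computed separately for every (target, molecule)
    pair, so a molecule can be dropped for one target and kept for another.
    """
    grouped = defaultdict(list)
    for rec in records:
        grouped[(rec["target"], rec["molecule"])].append(rec["pchembl"])

    kept = {}
    for key, values in grouped.items():
        # A single measurement has no spread, so it is treated as 0.0.
        spread = stdev(values) if len(values) > 1 else 0.0
        if spread <= max_std:
            kept[key] = mean(values)
    return kept


# Hypothetical measurements: molecule M1 is inconsistent on target A only.
example_records = [
    {"target": "A", "molecule": "M1", "pchembl": 4.0},
    {"target": "A", "molecule": "M1", "pchembl": 9.0},
    {"target": "B", "molecule": "M1", "pchembl": 6.0},
    {"target": "B", "molecule": "M1", "pchembl": 6.4},
]
example_kept = filter_by_pchembl_spread(example_records)
# ("A", "M1") is dropped because its spread exceeds 2; ("B", "M1") is kept.
```

The resulting dictionary can then be split by target to build one model per target, as suggested in the reply.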

  • @francescociulla
    @francescociulla 3 years ago +2

    I would love to participate! 🔥

  • @danielvo3750
    @danielvo3750 3 years ago +3

    Would you consider adding the "Hacktoberfest" topic to the repository? Maybe it could get more people interested in contributing!

    • @DataProfessor
      @DataProfessor  3 years ago +2

      Great suggestion! I’ve added the Hacktoberfest mention to the video description and keywords. Thanks Daniel!

  • @Rameez1230
    @Rameez1230 3 years ago +1

    Thanks a lot for sharing the dataset. I would like to participate in the project. You are truly an inspiration for us.

    • @DataProfessor
      @DataProfessor  3 years ago

      Thanks Rameez, welcome to the challenge!

  • @ShravanKumar147
    @ShravanKumar147 2 years ago +1

    This is a fantastic initiative.

  • @prathamprasoon2535
    @prathamprasoon2535 3 years ago +3

    This looks awesome, I'm in :)

    • @DataProfessor
      @DataProfessor  3 years ago

      Hey awesome Pratham, glad to have you!

    • @SlazeM7
      @SlazeM7 3 years ago

      Wow, I know you from Twitter. So this feels like a crossover episode.

  • @wolfpain5928
    @wolfpain5928 3 years ago +3

    This looks very interesting. I would like to participate in this project.

  • @mostafagafer8621
    @mostafagafer8621 2 years ago +1

    I work as a bioequivalence clinical research associate, and I am new to data science and data analytics. I started to build a data frame from my own data, which contains all the relevant pharmacokinetic parameters (Cmax, AUCs, T-half) and many chemical properties for each molecule that I scraped from DrugBank, together with the formula for each product. I would love to participate in this project. Although it is pretty advanced for me, I'll do my best to participate.
    Thanks for your videos, they give me hope in the ML field. You are the best.

    • @DataProfessor
      @DataProfessor  2 years ago +1

      Awesome, what you're doing is equally advanced. Looking forward to your contributions.

  • @m.fathurrahmanakidb.mazlan2448
    @m.fathurrahmanakidb.mazlan2448 3 years ago +1

    I would gladly like to try and participate in the research.

  • @SuperJg007
    @SuperJg007 3 years ago +1

    I wanna contribute too. It sounds like a very fun project.

  • @user-qy2wf2lt6v
    @user-qy2wf2lt6v 3 years ago +1

    I really love this channel and the way it covers its topics. What would it take for one to become involved academically or professionally in the field of bioinformatics?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Teaching, conducting research and publishing papers are some ways to contribute to the field of bioinformatics. This open project aims to do just that: contribute to scholarly research in bioinformatics.

    • @user-qy2wf2lt6v
      @user-qy2wf2lt6v 3 years ago +2

      @DataProfessor Thank you for your answer, Data Professor! I will follow your advice and content. I hope I am not too annoying for asking one more question.
      I already have a degree in Computer Science and a somewhat good foundation in Data Science, but my lack of knowledge in chemistry and biology seems like a very limiting factor. Would it be beneficial to pursue further education in a similar field if I am to follow a career path in bioinformatics?
      Thank you for your time!

  • @arushiverma7207
    @arushiverma7207 3 years ago +1

    Yes, I'm interested in this project 🙌

  • @biology_ki_chhatra
    @biology_ki_chhatra 3 years ago +1

    Sounds interesting!
    I would like to join

  • @mohiagahi2573
    @mohiagahi2573 3 years ago +1

    I'm interested to join 🙋🏻‍♀️

  • @dumisanindhlovu6415
    @dumisanindhlovu6415 3 years ago +2

    I would like to participate in this project. I am very much interested in soft computing science!

  • @Gagant
    @Gagant 3 years ago +2

    I would like to contribute to this project!! I’m a 4th-year biology undergrad, so I can def help with the paper writing aspect. I’ve also taken courses on bioinformatics.

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Awesome, glad to have you on board. More info on the biology aspect (probably on the model interpretation part) will be announced in the future once the model has been built.

  • @Mina-zw5xh
    @Mina-zw5xh 3 years ago +2

    I am very early in my MSc in precision medicine and am only beginning to gain experience in R/Python/Linux for data analysis. Will this process be well documented so that I can follow along and learn from it once my basic knowledge is stronger?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Thanks for the comment. It's a community effort, so we can probably all learn together in the open.

    • @Mina-zw5xh
      @Mina-zw5xh 3 years ago +2

      @DataProfessor Perfect, thank you

  • @ayanprasadmukherjee9292
    @ayanprasadmukherjee9292 3 years ago +1

    Wow! I am also interested!!

  • @saeedr7863
    @saeedr7863 3 years ago

    Hello professor, I tried to generate binary fingerprint data from 'canonical_smiles', but the descriptor calculation produced only 1407 observations. I think the actual data has 71973 observations. Where did I make a mistake? Thanks in advance.

  • @babjishaik5605
    @babjishaik5605 3 years ago +1

    I would like to contribute my best to this project and also learn more from it. I hope that with your help I can gain knowledge and also a good job 👍

    • @DataProfessor
      @DataProfessor  3 years ago

      Awesome, welcome to the challenge!

    • @babjishaik5605
      @babjishaik5605 3 years ago

      @DataProfessor Thank you, sir. Can you explain what work is allotted to me?
      I'll try to finish it.

  • @nedafiroz514
    @nedafiroz514 3 years ago +1

    I will be interested

  • @jordi.z327
    @jordi.z327 3 years ago +1

    Maybe it would be a good idea to make a Discord server where people can easily collaborate? It would also help to know what others are working on, so we don't reinvent the wheel multiple times within the same community (e.g. basic EDA, molecular simulation results, etc.).

    • @DataProfessor
      @DataProfessor  3 years ago

      Great suggestion. Will have to figure out how to make the Discord server.

  • @Rekhabohra19
    @Rekhabohra19 3 years ago +1

    Seems very interesting.. I would like to contribute.

  • @atirutboribalburephan6479
    @atirutboribalburephan6479 3 years ago +1

    Very interesting. I'm in!!

  • @soras2327
    @soras2327 3 years ago +1

    I am a graduate student in biotechnology. I am really interested in being a part of it.

  • @janeeshashashimini170
    @janeeshashashimini170 3 years ago +1

    I am interested to kick start this...

  • @gangotrisingh7057
    @gangotrisingh7057 3 years ago +1

    I'm a 4th-year BTech biotech student. I know the basics of bioinformatics and I would like to contribute to this project.

  • @wonwill519
    @wonwill519 3 years ago +1

    It sounds fantastic!!

  • @saeedr7863
    @saeedr7863 3 years ago

    Hello professor,
    Does the padeldescriptor function generate binary fingerprint data for the 'smi' file passed as mol_dir? Thank you

  • @vaishalipatil3617
    @vaishalipatil3617 9 months ago

    ❤❤

  • @albertmakhmudov
    @albertmakhmudov 3 years ago +1

    Great idea Prof!

  • @poonamvishwakarma694
    @poonamvishwakarma694 3 years ago +1

    I would like to participate in this research work!!

  • @rickharold7884
    @rickharold7884 3 years ago +1

    Super awesome

  • @williamguesdon400
    @williamguesdon400 3 years ago +1

    Awesome initiative!

  • @danielniels22
    @danielniels22 3 years ago +1

    Hello, I'm new to your channel. Are all of your bioinformatics projects made for everyone or just for bioinformatics people? Can those of us from outside the domain follow this?

    • @DataProfessor
      @DataProfessor  3 years ago

      Hi, the bioinformatics content on this channel is created for a general audience. I'll be linking prerequisite content in the video description.

  • @alejandroochoa9541
    @alejandroochoa9541 2 years ago +1

    Hi Prof, I'm an undergraduate student trying to learn through this project. I've been working on this, but after filtering, my model fails to learn. The train and test datasets behave strangely, so the model is not learning. I tried different architectures, but none of them worked. Any suggestions? I am pretty sure that the preprocessing is correct.

    • @DataProfessor
      @DataProfessor  2 years ago +1

      Hi Alejandro, have you (1) used the CSV file directly to build the model, or (2) computed the descriptors from the SMILES and used the generated features either to predict pIC50 values in a regression model (convert IC50 to pIC50 via the negative logarithm) or to predict active/inactive classes in a classification model (bin the IC50 values)? Please follow option 2. Hope this helps 😊
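The IC50 to pIC50 conversion and the active/inactive binning mentioned in this reply can be sketched as below. Two assumptions are made here that the thread does not state: the IC50 values are taken to be in nanomolar (so pIC50 = 9 - log10(IC50 in nM), which equals the negative log of the molar concentration), and the class cutoffs of pIC50 ≥ 6 for active and pIC50 ≤ 5 for inactive are illustrative defaults that should be adjusted to the project's own convention.

```python
import math


def ic50_nm_to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50, i.e. -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm).
    return 9.0 - math.log10(ic50_nm)


def label_activity(pic50, active_cutoff=6.0, inactive_cutoff=5.0):
    """Bin a pIC50 value into a class label using assumed cutoffs."""
    if pic50 >= active_cutoff:
        return "active"
    if pic50 <= inactive_cutoff:
        return "inactive"
    return "intermediate"
```

For regression, the pIC50 values serve as the target variable. For classification, the labels from `label_activity` do, and molecules labelled "intermediate" are often left out so the two classes are clearly separated.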

  • @charmytwala
    @charmytwala 3 years ago +1

    Great, I'm very interested and would like to be part of this project.

  • @yankoteixeira9138
    @yankoteixeira9138 3 years ago +1

    I'm currently doing my master's in bioinformatics, and I would love to contribute to this project.

    • @DataProfessor
      @DataProfessor  3 years ago

      You're very welcome, look forward to your submission! 😊

  • @adnaneaouidate3934
    @adnaneaouidate3934 3 years ago

    I know where to spend my weekend now :D Let's do some coding. Thank you for the idea, I will participate.

  • @ayeshak791
    @ayeshak791 3 years ago +1

    Hello, I would like to join as well!! I am a biological science student with some experience in machine learning. Let me know if I can contribute.

    • @DataProfessor
      @DataProfessor  3 years ago

      Yes, welcome to the initiative. To contribute you can perform analysis on the dataset and upload your notebook to Kaggle (links in the description).

  • @nadia_islam_1985
    @nadia_islam_1985 3 years ago +1

    Lovely! I would like to join if it’s not too late!

    • @DataProfessor
      @DataProfessor  3 years ago

      Yes, you can join, it has just started.

  • @internetdude9000
    @internetdude9000 2 years ago +1

    Best regards Data Professor, is this call still open?

    • @DataProfessor
      @DataProfessor  2 years ago

      Hi, yes, the open project is ongoing. I'll create more follow-up videos about this soon.

  • @semaatasever3802
    @semaatasever3802 2 years ago

    The number of molecules with PubChem molecular descriptors obtained from PaDEL (338) differs from the number of molecules in the molecule.smi file (64424). How can we solve this problem?

    • @DataProfessor
      @DataProfessor  2 years ago

      Hi, I suspect there may be errors in the SMILES notation that PaDEL cannot read properly. I recommend checking the log file from the calculation; error details are normally written there.
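Alongside the log file, one way to narrow down such a mismatch is to list which molecules from the input never made it into the descriptor output. The sketch below assumes that each line of the .smi file holds a SMILES string followed by a molecule name separated by whitespace, and that the names of the successfully processed molecules have already been read from the descriptor output; both are assumptions about the file layout that should be verified against the actual files.

```python
def find_missing_molecules(smi_lines, descriptor_names):
    """Return (name, smiles) pairs from the .smi input with no descriptor row.

    smi_lines: iterable of text lines in the assumed form "SMILES name".
    descriptor_names: iterable of molecule names found in the descriptor output.
    """
    computed = set(descriptor_names)
    missing = []
    for line in smi_lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and lines without a molecule name
        smiles, name = parts[0], parts[1]
        if name not in computed:
            missing.append((name, smiles))
    return missing
```

Inspecting the first few missing SMILES strings usually shows whether the problem is malformed notation, unusually large molecules, or a run that stopped early.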

  • @MM-yg2zj
    @MM-yg2zj 3 years ago

    I want to participate. Is it too late????

  • @Anil-Behera
    @Anil-Behera 3 years ago +1

    Dear Sir, I would love to participate.

  • @christianzanou4107
    @christianzanou4107 3 years ago +1

    I would like to contribute to your project.

  • @shubhamgajbhiye9679
    @shubhamgajbhiye9679 3 years ago +1

    I am also interested, please let me know how to join

    • @DataProfessor
      @DataProfessor  3 years ago

      Awesome, joining is simple, you can submit a Jupyter notebook with your analysis to the Kaggle dataset (links in the video description).

  • @sebastiancastro4126
    @sebastiancastro4126 2 years ago +1

    Hello! I'm a chemist. Am I still in time to join the project?

    • @DataProfessor
      @DataProfessor  2 years ago

      Yes, the project is currently ongoing. Participants are contributing on the Kaggle and GitHub pages of the project.

  • @ajwadakil6020
    @ajwadakil6020 2 years ago +1

    Is the project over, or can I still contribute to it?

  • @hitharthkadam4874
    @hitharthkadam4874 3 years ago +1

    Sir, what is the deadline for submitting the Jupyter notebook?

    • @DataProfessor
      @DataProfessor  3 years ago

      There’s no hard deadline, but look forward to them at your earliest convenience. It’s a community effort, let’s learn together.

    • @hitharthkadam4874
      @hitharthkadam4874 3 years ago

      @DataProfessor Perfect, thank you Sir 😀

  • @sagarhm2237
    @sagarhm2237 3 years ago +1

    Even I'm interested

  • @sangeethakannan7545
    @sangeethakannan7545 3 years ago +1

    Sir, I am interested, but I have very basic bioinformatics and Python knowledge. Can I take part in the project?

    • @DataProfessor
      @DataProfessor  3 years ago

      Yes, sure. You can use the dataset as practice data by applying some of the model-building approaches that I’ve shown in several other videos in the channel's Bioinformatics playlist.

  • @Sukoon_Shubh
    @Sukoon_Shubh 3 years ago +1

    Sir, please make a video on workplaces and companies hiring bioinformatics students, and comment if you know of any remote bioinformatics vacancies.

    • @DataProfessor
      @DataProfessor  3 years ago

      Thanks for the suggestion, will look into this.

  • @md.sharifulislam8296
    @md.sharifulislam8296 3 years ago +1

    I am interested

  • @irfanalghanikhalid2291
    @irfanalghanikhalid2291 3 years ago +1

    Let's see how my bioinformatics skills can be applied here :D

  • @OlehMezhenskyi
    @OlehMezhenskyi 3 years ago +1

    Are R notebooks accepted in this project?

  • @felicianaureenrosario5547
    @felicianaureenrosario5547 3 years ago +1

    Interested

  • @muhammadvickyastriahera3006
    @muhammadvickyastriahera3006 3 years ago

    Want to participate too. I'm a biology student on my 4th year and now looking for my thesis projects. 😇

  • @mashaelabdullah6758
    @mashaelabdullah6758 3 years ago +1

    I am from a computer science background and I am into data science. Can I participate?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Yes, you are welcome to join. To participate you can perform data analysis in a Jupyter notebook and do a PR on the betalactamase repo (links in the video description). Or you can share a Jupyter notebook to the betalactamase page on Kaggle (links in the video description).

  • @schakaravarthy6244
    @schakaravarthy6244 3 years ago +1

    Hi, I'm a bioinformatician and I'm interested in this.

  • @IffssIffss
    @IffssIffss 1 month ago

    I'm interested too

  • @ShravanKumar147
    @ShravanKumar147 2 years ago

    Do we have a slack channel?

  • @agniruudrrasinha7946
    @agniruudrrasinha7946 3 years ago +1

    I am also interested in computational drug discovery and as such want to help out

  • @marydinchen7141
    @marydinchen7141 3 years ago +2

    Does someone want to open a discord for everyone who's interested?

    • @DataProfessor
      @DataProfessor  3 years ago +1

      Thanks for the suggestion. Due to popular request, I'll be creating and setting up a Discord server soon.

  • @manjukasana5102
    @manjukasana5102 3 years ago +1

    I would like to participate, sir 🙏

  • @saikunde2010
    @saikunde2010 3 years ago +1

    Hello! Sir, I am interested in joining this project.

    • @DataProfessor
      @DataProfessor  3 years ago

      That's great, welcome to the initiative!

  • @semaatasever3802
    @semaatasever3802 3 years ago +1

    I would like to contribute.

    • @DataProfessor
      @DataProfessor  3 years ago

      Awesome, looking forward to your contribution, e.g. a Jupyter notebook on Kaggle or a Pull Request on GitHub to the betalactamase repo. A tweet about your contribution, tagging me @thedataprof on Twitter, would also be great.

  • @mriganka7331
    @mriganka7331 3 years ago +1

    Sir, I'm interested to participate.

  • @gayatrinavle2792
    @gayatrinavle2792 3 years ago +1

    I am also interested

  • @newview3874
    @newview3874 3 years ago +1

    Hi, I want to join this, please.

  • @shk5253
    @shk5253 3 years ago +1

    Wanna join

  • @sameerquazi2626
    @sameerquazi2626 3 years ago +1

    Heya, Let me know if you would like to get it published OA. We have some leftover funds and would love to contribute financially. Also I can carry out MD simulations and calculations using Schrodinger's Suite.

  • @ankitganeshpurkar
    @ankitganeshpurkar 3 years ago +1

    I am interested

  • @abultooshil3259
    @abultooshil3259 2 years ago

    May I start research with you with zero knowledge of bioinformatics? I have a degree in ECE. Kindly give me your email address.

  • @microbiologyshow8128
    @microbiologyshow8128 3 years ago

    I am interested

  • @satyajitchowdhury3350
    @satyajitchowdhury3350 3 years ago +1

    I would like to contribute.

    • @DataProfessor
      @DataProfessor  3 years ago

      Yes definitely, welcome to the challenge!