Bioinformatics in Python: DNA Toolkit. Part 1: Validating and counting nucleotides.

  • Published: Jan 14, 2025

Comments • 124

  • @JohnnyUtah13
    @JohnnyUtah13 4 years ago +41

    I spent two days figuring out how to count nucleotides by converting strings to lists and using an overly complicated list of if/else commands. You just showed me a far superior method in less than 10 minutes. I am already thankful and can't wait to watch the entire series.

    • @rebelScience
      @rebelScience  4 years ago +8

      Awesome! I know the feeling. We will dive deeper into more complex but more interesting material soon. It is important to understand Python fundamentals to be able to use it effectively. Make sure you watch Corey's video series. You will be surprised when we turn 10 lines of code into 2-3 Pythonic ones.

    • @rebelScience
      @rebelScience  4 years ago +5

      I just wanted to add that you can join our community chat and we will try helping you next time, so you save 2 days of figuring things out on your own.

  • @tiamat1628
    @tiamat1628 2 years ago +7

    I am an MD and I want to become a bioinformatician. I have zero experience in programming, and I found your video very easy to understand and digest.
    Thank you very much, you earned a new sub.

  • @matthewmarshall5730
    @matthewmarshall5730 1 year ago +1

    Thank you, I like the pace of the teaching and the relevant examples used for bioinformatics.

  • @rebelScience
    @rebelScience  5 years ago +5

    FYI: at 3:25 it is ASCII Table: t.ly/jyGG8. In ASCII table 'a' = 97 and 'A' = 65. So 97 != 65 and 'a' != 'A'
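The comparison described above can be checked directly in Python. This is only a minimal sketch: the variable names are illustrative and are not taken from the video.

```python
# Python compares strings by code point, so upper and lower case differ.
lower_code = ord("a")   # code point of lowercase 'a'
upper_code = ord("A")   # code point of uppercase 'A'

print(lower_code, upper_code)     # 97 65
print("a" == "A")                 # False

# Normalising the case before comparing makes the check case-insensitive.
print("acgt".upper() == "ACGT")   # True
```

Converting a sequence with `.upper()` before validating it is one common way to accept lowercase input.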

    • @broytingaravsol
      @broytingaravsol 3 years ago +1

      I'll go through all your work on bioinformatics in Python; I'm on both.

    • @felipeddds
      @felipeddds 4 months ago

      exactly.

  • @carosfine
    @carosfine 1 year ago

    jesus, i'm in love with this playlist. thank you so much

  • @gabrielevetrugno6089
    @gabrielevetrugno6089 5 years ago +4

    Amazing! Love to see and try new stuff about the topic 💪🏻

    • @rebelScience
      @rebelScience  5 years ago

      Thanks! We will cover some very exciting and interesting research in the future.

  • @GGLazyJJ
    @GGLazyJJ 4 years ago

    your recommendations are always the best part of your videos!

  • @boogywoogy2395
    @boogywoogy2395 4 years ago +7

    great content...really helpful and well explained

  • @amitrupani9898
    @amitrupani9898 5 years ago +1

    Thanks rebelCoder! Enjoyed learning from this lesson. Look forward to upcoming lessons!

    • @rebelScience
      @rebelScience  5 years ago +1

      Thank you for watching! I am glad you liked it. We are just getting started! We will cover some complex stuff after we cover all the basics.

    • @rebelScience
      @rebelScience  5 years ago +1

      And please feel free to comment and suggest things as I want to have an open and collaborative approach!

    • @amitrupani9898
      @amitrupani9898 5 years ago +1

      @@rebelScience Sure. As a bioinformatician, I often come across situations where I have to compare multiple files (sometimes hundreds of GB in size) based on genomic coordinates to create a new file or files.
      It would be nice to see something similar in one of the lessons. Also, methods of code optimization for quick comparisons of larger files would be great!
      :-)

    • @rebelScience
      @rebelScience  5 years ago +1

      @@amitrupani9898 Sounds interesting! I will be covering memory optimizations, speed optimizations and multithreaded approaches too. I plan to cover writing super fast routines in C++ or Rust and hooking into them from Python. It was hard to figure out where to start this series, and I decided to go with the basics first and build up. There is so much to cover...

    • @amitrupani9898
      @amitrupani9898 5 years ago +1

      @@rebelScience I think it's a great way to start (given that basic programming skills are a prerequisite). Looking forward to a great learning experience! :-)

  • @akanimohosutuk928
    @akanimohosutuk928 5 months ago +1

    Currently running all this code in a decentralised Cartesi VM for a side project. Thanks for these videos.

    • @rebelScience
      @rebelScience  5 months ago

      Sounds amazing. I know the Cartesi Blockchain project.

    • @akanimohosutuk928
      @akanimohosutuk928 5 months ago +1

      @@rebelScience I will share it with you when I am done next week.

  • @dylanneal8244
    @dylanneal8244 3 years ago +1

    So cool. Thanks for this video!

  • @matt-g-recovers
    @matt-g-recovers 3 years ago

    Agreed, Corey's videos are really good.
    Much of what I know of Python I got from him.

  • @williamcowan4936
    @williamcowan4936 3 years ago +2

    at 1:12 should we have downloaded something other than python and our IDE? or should we make those files/projects exactly as we see on the video?

    • @rebelScience
      @rebelScience  3 years ago +1

      Hi. We don't download anything in our videos. We create everything from scratch. I have a video on setting up the code editor also.

  • @felipepedro1678
    @felipepedro1678 2 years ago +1

    Great content!

  • @borispyakillya4777
    @borispyakillya4777 3 years ago +1

    Thank you a lot for the video!

  • @StephenAigbepue
    @StephenAigbepue 3 months ago

    Thanks for this beautiful tutorial. But please, how do I incorporate this into a Jupyter notebook?

  • @daniocionini7043
    @daniocionini7043 3 years ago

    really great! Thank you for that

  • @ivanviveros
    @ivanviveros 4 years ago +1

    Your videos are so good man! Thank you. Btw, which vs code theme is that? I love the color scheme!

    • @rebelScience
      @rebelScience  4 years ago

      Hey! Thank you. I configured my theme a long time ago and, interestingly enough, it has been changing on its own by becoming darker. I think the extensions I was using for my theme kept getting updated, and that is why it changed for me over time. I will try to figure out my config and share it with you, as a few other people were interested in this.

    • @wilku1039
      @wilku1039 3 years ago

      @@rebelScience hey, any updates on that? the theme looks really good, and i couldn't find any information about it from you

  • @maheshrani6609
    @maheshrani6609 10 months ago

    please share the gitlab link.

  • @TragoudistrosMPH
    @TragoudistrosMPH 4 years ago +1

    Hi, I noticed your DNAtoolkit file is not on your gitlab folder DNA Toolset. I clicked the history and found the file.
    I noticed that when I tried to import the file.
    Hopefully that helps (and I'm not foolishly misunderstanding anything haha)

    • @rebelScience
      @rebelScience  4 years ago +1

      Hey. We are not importing anything. DNA Toolkit is not a Python module. DNA Toolkit is a tool we write from scratch in Python.

    • @TragoudistrosMPH
      @TragoudistrosMPH 4 years ago

      @@rebelScience I see, DNA Toolkit is in the history, and you were importing it while writing it. I had never imported a file I was working on
      (from DNA Toolkit import *)
      I happened to be using jupyter notebook, so I overlooked the idea :P (I'm a biostatistician, so coding is a secondary skill lol)

    • @rebelScience
      @rebelScience  4 years ago +1

      If you follow every video from the first to the last, you should have a good idea of what we are doing. I am not sure how to import additional files into notebooks. Try searching for it on the internet.
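For notebook users, one possible approach is sketched below. It assumes the toolkit is saved as a plain `.py` file in a known folder; `import_local_module` is a hypothetical helper written for this illustration, not part of the series. Note that a Python module name cannot contain a space, so a statement such as `from DNA Toolkit import *` is a syntax error, while `from DNAToolkit import *` is valid when the file is named `DNAToolkit.py`.

```python
import importlib
import sys
from pathlib import Path


def import_local_module(module_name, folder):
    """Import <folder>/<module_name>.py by putting the folder on sys.path.

    Hypothetical helper for illustration only.
    """
    folder = str(Path(folder).resolve())
    if folder not in sys.path:
        sys.path.insert(0, folder)      # let Python search this folder first
    importlib.invalidate_caches()       # pick up files created after start-up
    return importlib.import_module(module_name)


# Possible use inside a notebook that sits next to DNAToolkit.py (assumption):
# dna = import_local_module("DNAToolkit", ".")
```

When the notebook and the `.py` file are in the same folder, a plain `import` usually works without any helper, because the notebook's working directory is normally already searched.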

    • @TragoudistrosMPH
      @TragoudistrosMPH 4 years ago

      @@rebelScience no worries! I was reporting back that I figured it out and that you were correct :)

  • @kaansimsek7986
    @kaansimsek7986 4 years ago +2

    hello i am a biomedical engineering student. I chose DNA analysis with python as my last thesis and can you help with software? I really need it. thank you.

  • @nazaninrahimirad7344
    @nazaninrahimirad7344 4 years ago +2

    I couldn't make out the code on screen. I think it would have been better to zoom in on your screen or to use a high-contrast theme.

  • @gowrang456
    @gowrang456 4 years ago +1

    Great content now I am able to understand how to apply python in bioinformatics. For the random joining of the nucleotide sequence does the nucleotide arrangement happen in a defined way or there is no pattern for the generation of nucleotide?

    • @rebelScience
      @rebelScience  4 years ago +1

      Hi! Well, randomness is exactly what it sounds like: random generation of characters. If we had what you call "a pattern" or "a defined way", that would not be randomness, right? We use it just for tests.
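A minimal sketch of such a test-sequence generator is shown here. The function name `random_dna` and the `seed` parameter are assumptions made for this example; the video may structure it differently.

```python
import random

NUCLEOTIDES = ["A", "C", "G", "T"]


def random_dna(length, seed=None):
    """Return a random DNA string; every position is drawn independently."""
    rng = random.Random(seed)   # private generator, so a seed gives repeatable tests
    return "".join([rng.choice(NUCLEOTIDES) for _ in range(length)])


seq_a = random_dna(20, seed=42)
seq_b = random_dna(20, seed=42)
print(seq_a)
print(seq_a == seq_b)   # True: same seed, same sequence
```

Each character is chosen without reference to its neighbours, so there is no built-in pattern; passing a seed only makes a particular random string reproducible.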

  • @zeination
    @zeination 4 years ago +1

    I'm a Computer Science student and I'm very interested in the field of bioinformatics.
    It's just that I'm lost about where I should start in order to catch up with your videos.
    Thank you so much

    • @rebelScience
      @rebelScience  4 years ago +1

      Hey. Join our chat and check out my last article as it is about your question.

    • @zeination
      @zeination 4 years ago +1

      @@rebelScience thank you so much!

    • @cognosagedev
      @cognosagedev 3 years ago

      @@rebelScience Please mention which article that is; I am in the same situation.

  • @HanhNguyen-ue6oq
    @HanhNguyen-ue6oq 3 years ago

    Really helpful! Thanks a lot

  • @diegoavendanohernandez9908
    @diegoavendanohernandez9908 3 years ago

    awesome content, great channel

  • @محمداحمدی-ر3ش6ج
    @محمداحمدی-ر3ش6ج 1 year ago

    thank you very much

  • @jaswanthchotu6068
    @jaswanthchotu6068 1 year ago

    Please post more useful information about bioinformatics

  • @lizixiao9316
    @lizixiao9316 3 years ago +1

    How is your cursor line highlighted?

    • @rebelScience
      @rebelScience  3 years ago

      It is just an extension for the code editor, called line highlighter. Try searching for it in the extensions library.

  • @apoorvwatsky
    @apoorvwatsky 4 years ago +3

    Amazing content! Looking forward. :)
    I'd prefer not to cast the Counter object to a dictionary, and to use it as it is. Whatever operations you can perform on dictionaries, you can perform on Counters too. They are mutable, fast and already come with out-of-the-box features like most_common etc.

    • @LynnWinx
      @LynnWinx 4 years ago +1

      Yeah, but each time you operate on them you risk changing the order (Counter.update disrespects the original order) or losing zeros (Counter.__add__ decides to remove keys when values reach zero). Furthermore, even though Counter dicts have implicit zeros for __getitem__, they break equality with dictionaries that have implicit zeros, so testing is a mess. The behavior of the Counter object is too chaotic for me: I want to rely on the promise of OrderedDict, I want equality to work, and I don't want the zeros to disappear for no reason. So I use {n:seq.count(n) for n in NUCLEOTIDES}.
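The trade-off discussed in this exchange can be seen in a short sketch. The `NUCLEOTIDES` list and the function name `count_nucleotides` are assumptions for this example; only the dictionary comprehension itself is quoted from the comment above.

```python
from collections import Counter

NUCLEOTIDES = ["A", "C", "G", "T"]


def count_nucleotides(seq):
    """Count each nucleotide in a fixed A, C, G, T order, keeping zeros."""
    return {n: seq.count(n) for n in NUCLEOTIDES}


seq = "GGATTA"   # contains no C

fixed = count_nucleotides(seq)
counted = Counter(seq)

print(fixed)           # {'A': 2, 'C': 0, 'G': 2, 'T': 2}
print(list(counted))   # ['G', 'A', 'T']  (order of first appearance, no 'C' key)

# Counter addition keeps only positive counts, so explicit zeros disappear.
summed = Counter({"A": 2, "C": 0}) + Counter()
print(summed)          # Counter({'A': 2})
```

The comprehension scans the sequence once per nucleotide, whereas `Counter` scans it once in total, so which one fits better depends on whether fixed keys or a single pass matters more.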

  • @MirjamGrebenc
    @MirjamGrebenc 4 years ago +1

    What is the software you use? I have been using jupyter but prefer the layout you have

    • @rebelScience
      @rebelScience  4 years ago +2

      Hey. I have a video on that. Search for Development Tools in my videos.

    • @MirjamGrebenc
      @MirjamGrebenc 4 years ago

      @@rebelScience thank you so much

  • @MehranKhan-he1lh
    @MehranKhan-he1lh 1 year ago

    Could you share a link for structuring the project or accessing it (the step at around 1:10 in your video)? Please.

    • @rebelScience
      @rebelScience  1 year ago

      "Link for structuring the project/access to the project"
      1. We are creating this project, so if you follow the video series, you will see how we are structuring it.
      2. If you are looking for a git repository for it, it is in the video description.

    • @MehranKhan-he1lh
      @MehranKhan-he1lh 1 year ago

      @@rebelScience Thanks

  • @1973vgc
    @1973vgc 4 years ago

    Thank you for this!!!

  • @TTy5361
    @TTy5361 3 years ago +1

    Super dumb question but what IDE are you using to run these python scripts?

    • @rebelScience
      @rebelScience  3 years ago

      I have a video on my channel, titled Development Tools. It has all the answers ;)

  • @tekomichael2667
    @tekomichael2667 3 years ago +1

    Thank you man it's by far great video from what I saw, though texts on the screen are very small & sometimes hard to read.

    • @rebelScience
      @rebelScience  3 years ago

      Thanks! Are you watching on a mobile device? I adjusted the font size and tested on small screens for the next videos, so it should be better.

    • @tekomichael2667
      @tekomichael2667 3 years ago

      @@rebelScience Thank you for responding so fast, you're a man of your word. Yes, you're right, I usually watch on a mobile device because I don't have a PC where I mostly stay and work for the time being.

  • @daltonham2821
    @daltonham2821 4 years ago +1

    What if you want to include N's which represents any of the four nucleotides?

    • @rebelScience
      @rebelScience  4 years ago +2

      In our case, we are working with standard Nucleotides for now as most of the raw data will be in this format. Adding a lot of other variants and logic would make our first lesson overcomplicated. This is a beginner level set of tutorials. We will be adding a lot of cool and complex stuff in our next series "Genome Toolkit", which will use "DNA Toolkit"! Stay tuned.
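For readers who do need `N`, one way to extend a validator is sketched below. This is an assumption-laden illustration, not the series' code: the `alphabet` parameter and the constant names are hypothetical, and `N` is used in its usual IUPAC sense of "any of the four bases".

```python
STANDARD = set("ACGT")
WITH_AMBIGUOUS = STANDARD | {"N"}   # N = any of A, C, G, T (IUPAC ambiguity code)


def validate_seq(seq, alphabet=STANDARD):
    """Return the upper-cased sequence if every character is allowed, else False."""
    seq = seq.upper()
    if all(nuc in alphabet for nuc in seq):
        return seq
    return False


print(validate_seq("ACGNT"))                   # False
print(validate_seq("ACGNT", WITH_AMBIGUOUS))   # ACGNT
```

Keeping the alphabet as a parameter leaves the strict four-letter check as the default while allowing wider alphabets where the data requires them.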

  • @divz2646
    @divz2646 2 years ago

    How do you create such functions, and have you downloaded the module?

    • @rebelScience
      @rebelScience  2 years ago

      No, we do not use any modules. We create the DNA Toolkit module from scratch in this series of videos.

  • @aysha.h5608
    @aysha.h5608 4 years ago +1

    did you use pycharm?

    • @rebelScience
      @rebelScience  4 years ago +1

      Yes, I have used PyCharm. It is a good code editor for Python. I use VSCode as I think it is the best one. Also, if you are writing in more than one programming language, which I do, VSCode is perfect. It supports any language, while PyCharm editor is strictly Python. The important point is, if you are familiar with the tool (PyCharm in your case?), and it does everything you need, just keep using it. Also, I have an article and a video about that here: rebelscience.club/2020/04/lets-set-up-a-code-editor-for-python-and-bioinformatics/

  • @is44ct37
    @is44ct37 8 months ago

    I get the error: no module named DNAToolkit. I tried installing the DNA toolkit with pip, and I thought it would work, but it is still giving me the same error. I copied the code exactly, as far as I could tell. Any thoughts?

  • @md.mahfuzurrahmanbhuyan9351
    @md.mahfuzurrahmanbhuyan9351 4 years ago

    What software are you using?

    • @rebelScience
      @rebelScience  4 years ago

      Hey, sorry what do you mean? Are you asking about the Code Editor I use? It was mentioned in the introduction video and I also have a Development Tools video where I show how to set it up.

  • @sujanmahmud1038
    @sujanmahmud1038 1 year ago +1

    what ide is this?

    • @rebelScience
      @rebelScience  1 year ago

      Hey! You should start with the Introduction video, where I explain what you need to work with this series of videos, including the code editor. I also have a video on how to set it up.

  • @Jonix-redhat
    @Jonix-redhat 4 years ago +1

    Thank you for the great video! I know this is a newbie question because I just started to learn bioinformatics with python (I'm a biomedicine master student), but anyway: why do you use "[" and "]" in join([random.choice(Nucleotides)... and not just "(" and ")"?

    • @rebelScience
      @rebelScience  4 years ago

      [random.choice("ACGT") for x in range(10)] is a list comprehension.
      We then pass it to the join method, and every method/function call uses (): join().
      Try this: test = [random.choice("ACGT") for x in range(10)]
      and this: test = random.choice("ACGT") for x in range(10)
      The second line is a syntax error on its own. Inside a call, though, the square brackets can be dropped, and Python then treats the argument as a generator expression rather than a list comprehension: seq = ''.join(random.choice("ACGT") for x in range(10))
      Both forms produce the same string. I still recommend [ ] for readability, because it makes it explicit that a list is being built.
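The two spellings can be compared side by side. This is a small sketch; seeding the generator is an addition made here so that both calls draw the same characters.

```python
import random

random.seed(1)
from_list = "".join([random.choice("ACGT") for _ in range(10)])   # list comprehension

random.seed(1)
from_gen = "".join(random.choice("ACGT") for _ in range(10))      # generator expression

print(from_list)
print(from_list == from_gen)   # True
```

`str.join` accepts any iterable of strings, which is why both a list and a generator expression work as its argument.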

    • @Jonix-redhat
      @Jonix-redhat 4 years ago

      @@rebelScience Ok! thanks a lot for the answer, I understand! looking forward to more good videos with bioinformatics! take care!

  • @cristianperalta5022
    @cristianperalta5022 5 years ago +6

    Hi! First of all, thank you very much for this kind of video; they are absolutely fascinating.
    I have a problem. I was following your instructions and then, suddenly, the code wouldn't work. The problem is something related to the module: from DNAToolkit import *. The error says the following: "ModuleNotFoundError: No module named 'DNAToolkit'".
    This is absolutely hilarious, because a few minutes before the code was working. I'm using Sublime Text, please help me.
    Thanks

    • @rebelScience
      @rebelScience  5 years ago +1

      Hey! Thanks! I enjoy sharing this information very much.
      About the error: it looks like a code editor (Sublime Text) problem or a file naming problem. It is hard to tell what it is without looking at logs. I would suggest joining our chat on Telegram or Matrix (links are in the video description) so you can share screenshots and output information.
      For now, make sure all of your files are named correctly (DNAToolkit.py or dnatoolkit.py).
      Try creating a new folder for the project, copy all the files, make sure the names are correct and try running the code again.
      Where it says "ModuleNotFoundError", does it say something about a temp file?
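The file-naming check suggested above can be automated with a small sketch. Both helper names are hypothetical and exist only for this illustration. The relevant fact is that the name in the `import` statement has to match the file name exactly, including case, so `from DNAToolkit import *` needs a file called `DNAToolkit.py` in a folder that Python searches.

```python
from pathlib import Path


def find_module_candidates(module_name, project_dir):
    """List .py files in project_dir whose name matches module_name, ignoring case."""
    wanted = module_name.lower()
    return sorted(
        p.name for p in Path(project_dir).glob("*.py") if p.stem.lower() == wanted
    )


def has_exact_module(module_name, project_dir):
    """True only if <module_name>.py exists with exactly that spelling."""
    return f"{module_name}.py" in find_module_candidates(module_name, project_dir)


# Possible use from the project folder (assumption: the script is run from there):
# print(find_module_candidates("DNAToolkit", "."))
# print(has_exact_module("DNAToolkit", "."))
```

If the candidate list is empty, the file is not in that folder at all; if it contains a differently spelled name, renaming the file or the import is the likely fix.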

    • @cristianperalta5022
      @cristianperalta5022 5 years ago

      @@rebelScience I've even deleted all the files and created it again, though, I realized that a file named "__pycache__" was created. It said nothing about "temp file".
      I might try VS Code as a Code Editor.

    • @not_him...1
      @not_him...1 2 years ago

      @@rebelScience thanks Sir, I have the same problem. I just couldn’t get to import the files from DNAtoolkit. I really don’t know what to do. I really enjoy your explanations and I’m sure I understand them, but I have a problem importing the tools to work on bioinformatics.

    • @not_him...1
      @not_him...1 2 years ago

      @@rebelScience I’ll be really glad if you can reply as soon as you’re able, I am eager to learn more but I can’t if I cannot practice myself.

    • @not_him...1
      @not_him...1 2 years ago

      @@rebelScience yeah, I just joined the platform on Telegram. I can't post questions there either. So please, your help is needed 🙏

  • @mariasira5808
    @mariasira5808 3 years ago

    Can somebody who has studied bioinformatics work in the laboratory, or is it more of a computer/coding job?

  • @et504383
    @et504383 4 years ago

    I wrote the Python code in Jupyter Notebook, but it does not work properly.

  • @nardineharrab
    @nardineharrab 9 months ago

    Hello, I have this problem: it gave me the output {'A': 16, 'C': 12, 'T': 6, 'G': 16} although I entered the dictionary in this order: {"A": 0, "C": 0, "G": 0, "T": 0}. Why does it switch the order of T and G in the output? Please help me because I'm stuck here.
    Thanks
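One possible cause, offered only as a guess: dictionaries guarantee insertion order from Python 3.7 onward, so an older interpreter may print the keys in a different order. Whatever the cause, the counts can be put back into a fixed order with a short sketch like this one (the function name is hypothetical).

```python
def in_nucleotide_order(counts, order="ACGT"):
    """Return a new dict with keys in the given order; missing keys get 0."""
    return {n: counts.get(n, 0) for n in order}


counts = {"A": 16, "C": 12, "T": 6, "G": 16}   # the output quoted above
print(in_nucleotide_order(counts))             # {'A': 16, 'C': 12, 'G': 16, 'T': 6}
```

The counts themselves are unaffected by key order, so the original output is still a correct result.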

  • @what_the_really
    @what_the_really 1 year ago

    I'm studying with your videos, but when I print the result, it shows a different result when I run it. If the random sequence starts with G, the dictionary's output also starts with G. Is that OK? If I use join with a list comprehension, how can I know which value belongs to A, G, C or T? If I want the dictionary to start with A, then C, G and T, how do I write the code?

    • @what_the_really
      @what_the_really 1 year ago

      Also, about print(' '.join([str(val) for key, val in result.items()])): should I use '' (without a blank) instead of ' ' (with a blank)? If I use '', the printed result is 20121721, not 20 12 17 21. I couldn't work out what is different from yours.
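A short sketch of the printing step may help here. The dictionary below simply reuses the numbers from the comment, and the fixed "ACGT" lookup order is an assumption about the desired output.

```python
result = {"A": 20, "C": 12, "G": 17, "T": 21}   # example counts from the comment

# The string before .join() is placed between the items.
spaced = " ".join(str(result[n]) for n in "ACGT")
packed = "".join(str(result[n]) for n in "ACGT")

print(spaced)   # 20 12 17 21
print(packed)   # 20121721
```

Looking the values up by key in a fixed order, instead of iterating over `result.items()`, also makes it clear that the four numbers stand for A, C, G and T in that order.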

  • @varisingermany
    @varisingermany 4 years ago

    I can't run the file, why though? I wrote everything like you did but can't run it.

    • @rebelScience
      @rebelScience  4 years ago +1

      Well, you would need to do two things: add more details of what OS/Editor/code runner/Plugins you use and how you are trying to run things, or join our chat in Telegram/Matrix and share some screenshots with above information.
      Have you set-up your code editor like we did here: rebelscience.club/2020/04/lets-set-up-a-code-editor-for-python-and-bioinformatics/

  • @mohammedahmedjalloh531
    @mohammedahmedjalloh531 4 years ago

    You didn’t give any instructions on how to install the toolkit which is difficult for some of us to even start

    • @rebelScience
      @rebelScience  4 years ago +1

      Hello. We are not installing any toolkit. We are developing it from scratch in plain Python in this series of videos. Did you watch my introduction video?

    • @mohammedahmedjalloh531
      @mohammedahmedjalloh531 4 years ago

      @@rebelScience Hello, thanks for the instant reply. I am a beginner in this course, both python and Bioinformatics... I just want to ask if I can use Pycharm as my code editor.

    • @rebelScience
      @rebelScience  4 years ago +1

      Yes. You can use whatever you want. Whatever is easier for you. I talk about that in the Introduction video.

  • @bhatwasim6741
    @bhatwasim6741 3 years ago

    Why is this code not running for me? I am copying it exactly.

  • @marzijahan5502
    @marzijahan5502 3 years ago

    Which environment are you writing in? It is not Python. Also, I have a problem with the space character: when I use spaces in Python I get errors!

    • @rebelScience
      @rebelScience  3 years ago

      Sorry, what do you mean by "it is not Python"? I have an Introduction video, and another one called Development Tools. You should watch those two videos to understand what environment I use and how to set it up.

    • @marzijahan5502
      @marzijahan5502 3 years ago

      @@rebelScience OK. Thank you.

    • @marzijahan5502
      @marzijahan5502 3 years ago

      @@rebelScience I could not find the Development Tools video among your videos! :(

    • @rebelScience
      @rebelScience  3 years ago

      I only have 27 videos ;) ruclips.net/video/81Eb_YXmV4g/видео.html

  • @irinalaivina8664
    @irinalaivina8664 5 years ago +1

    Very interesting! I like it!

  • @rajarshimondal
    @rajarshimondal 4 years ago +1

    Sir I just joined the telegram channel mentioned in the chat box. I'm a Biotechnology student and I want to learn bioinformatics. So I joined the telegram channel. I think without any reason I'm banned.

    • @rajarshimondal
      @rajarshimondal 4 years ago

      Please invite me back bro.🥺

    • @rebelScience
      @rebelScience  4 years ago

      Hey. When you join, Bot asks you a question you need to type an answer for in the chat. If you don't, Bot kicks you. This is a Spam protection. Try joining again and see what Bot asks you.

    • @rajarshimondal
      @rajarshimondal 4 years ago

      @@rebelScience ok thanks

    • @rajarshimondal
      @rajarshimondal 4 years ago

      It's saying chat is no longer accessible. Please unban me. I'm BABU GUDDU.

  • @ethan_bodybuilder
    @ethan_bodybuilder 4 years ago +1

    def validate_seq(seq):
        for nuc in seq:
            if nuc not in nucleotides:
                print("Invalid sequence. Only A, C, T, and G are accepted characters.\n")
                randomvschoice()
            return seq
    Hi rebelCoder,
    I have this coded for a more user friendly version. I get a strange result though.
    When I input a sequence with a mixture of correct and incorrect nucleotides it
    ignores this statement. But when I input only
    incorrect characters it works fine. I am not sure why; please help.

    • @Rossboe1
      @Rossboe1 4 years ago

      I have the same problem

    • @juanmaruizrobles5867
      @juanmaruizrobles5867 4 years ago +3

      @@Rossboe1 just check the indentation of the last line... I suggest it should be at the level of the "for", not the "if".
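The suggested fix can be sketched as follows. This version is an illustration under assumptions: the `NUCLEOTIDES` list is defined here for completeness, the poster's own `randomvschoice()` call is left out because its definition is not shown, and returning `False` for invalid input is a choice made for this example.

```python
NUCLEOTIDES = ["A", "C", "G", "T"]


def validate_seq(seq):
    """Return seq if every character is a valid nucleotide, otherwise False."""
    for nuc in seq:
        if nuc not in NUCLEOTIDES:
            print("Invalid sequence. Only A, C, T, and G are accepted characters.")
            return False
    # This line is reached only after every character has been checked.
    return seq


print(validate_seq("ACGT"))   # ACGT
print(validate_seq("AXGT"))   # prints the warning, then False
```

With `return seq` inside the loop, the function exits after inspecting only the first character, which matches the symptom described: mixed input that starts with a valid letter passes, while all-invalid input is caught.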

  • @alexandergapak4883
    @alexandergapak4883 4 years ago +2

    Good! Are there any Russians here?

  • @marzijahan5502
    @marzijahan5502 3 years ago

    I am a total beginner in Python. Should I memorize these functions?

    • @rebelScience
      @rebelScience  3 years ago

      Please watch the Introduction video. I explain everything in that video. Yes, you should be good with Python before watching these videos. Make sure you learn Python first. My Introduction video has links and suggestions.

  • @linuxuser1234
    @linuxuser1234 3 years ago +1

    What python software did you use

    • @rebelScience
      @rebelScience  3 years ago

      Hey! Sorry, I am not sure what you mean. Are you asking about what code editor I use to write Python code in?

    • @linuxuser1234
      @linuxuser1234 3 years ago +1

      @@rebelScience yes

    • @rebelScience
      @rebelScience  3 years ago

      I have a video about the code editor here: ruclips.net/video/81Eb_YXmV4g/видео.html

    • @linuxuser1234
      @linuxuser1234 3 years ago +1

      @@rebelScience thanks