I spent two days figuring out how to count nucleotides by converting strings to lists and using an overly complicated list of if/else commands. You just showed me a far superior method in less than 10 minutes. I am already thankful and can't wait to watch the entire series.
Awesome! I know the feeling. We will try to dive deeper into more complex but more interesting stuff soon. It is important to make sure you understand Python fundamentals to be able to use it effectively. Make sure you watch Corey's video series. You will be surprised when we make 2-3 Pythonic lines of code out of 10.
I just wanted to add that you can join our community chat and we will try helping you next time, so you save 2 days of figuring things out on your own.
I am an MD and I want to become a bioinformatician. I have zero experience in programming and I found your video very easy to understand and digest.
Thank you very much, you earned a new sub.
Glad it was helpful!
Thank you, I like the pace of the teaching and the relevant examples used for bioinformatics.
FYI: at 3:25 it is ASCII Table: t.ly/jyGG8. In ASCII table 'a' = 97 and 'A' = 65. So 97 != 65 and 'a' != 'A'
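A minimal sketch of the ASCII point above, using Python's built-in ord(). The upper() comparison at the end is an assumption about why case matters when validating a sequence; it is not taken from the video.

```python
# ord() returns the numeric code of a single character.
lower_a = ord("a")
upper_a = ord("A")
print(lower_a, upper_a)  # 97 65

# Different codes mean different characters, so the comparison is False.
same_letter = "a" == "A"
print(same_letter)  # False

# Normalising the case first makes the comparison succeed.
normalised_match = "acgt".upper() == "ACGT"
print(normalised_match)  # True
```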
i'll go through all ur work on bioinformatics in python, i'm on both
exactly.
jesus, i'm in love with this playlist. thank you so much
Glad you enjoy it!
Amazing! Love to see and try new stuff about the topic 💪🏻
Thanks! We will cover some very exciting and interesting research in the future.
your recommendations are always the best part of your videos!
great content...really helpful and well explained
Thanks rebelCoder! Enjoyed learning from this lesson. Look forward to upcoming lessons!
Thank you for watching! I am glad you liked it. We are just getting started! We will cover some complex stuff after we cover all the basics.
And please feel free to comment and suggest things as I want to have an open and collaborative approach!
@@rebelScience Sure, as a bioinformatician, I often come across situations where I have to compare multiple files (sometimes hundreds of GB in size) based on genomic coordinates to create new files.
Would be nice to see something similar in one of the lessons. Also, methods of code optimization for quick file comparisons for bigger size files would be great!
:-)
@@amitrupani9898 Sounds interesting! I will be covering memory optimizations, speed optimizations and multi-threaded approaches too. I plan to cover writing super fast routines in C++ or Rust and hooking into them from Python. It was hard to figure out where to start this series, and I decided to go with the basics first and build up. There is so much to cover...
@@rebelScience I think it's a great way to start (given basic programming skills are a prerequisite). Looking forward to a great learning experience! :-)
Currently running all this code in a decentralised Cartesi VM for a side project. Thanks for these videos.
Sounds amazing. I know the Cartesi Blockchain project.
@@rebelScience I will share with you when I am done next week
So cool. Thanks for this video!
Agreed, Corey's videos are really good.
Much of what I know of Python I got from him.
at 1:12 should we have downloaded something other than python and our IDE? or should we make those files/projects exactly as we see on the video?
Hi. We don't download anything in our videos. We create everything from scratch. I have a video on setting up the code editor also.
Great content!
Thank you!
Thank you a lot for the video!
Thanks for this beautiful tutorial. But please, how do I incorporate this in Jupyter Notebook?
really great! Thank you for that
Your videos are so good man! Thank you. Btw, which vs code theme is that? I love the color scheme!
Hey! Thank you. I configured my theme a long time ago and, interestingly enough, it has been changing on its own, becoming darker. I think the extensions I was using for my theme kept getting updated, and that is why it changed over time. I will try to figure out my config and share it with you, as a few other people were interested in this.
@@rebelScience hey, any updates on that? the theme looks really good, and i couldn't find any information about it from you
please share the gitlab link.
Hi, I noticed your DNAtoolkit file is not on your gitlab folder DNA Toolset. I clicked the history and found the file.
I noticed that when I tried to import the file.
Hopefully that helps (and I'm not foolishly misunderstanding anything haha)
Hey. We are not importing any third-party package. DNA Toolkit is not an installable Python module. DNA Toolkit is a tool we write from scratch in Python.
@@rebelScience I see, DNA Toolkit is in the history, and you were importing it while writing it. I had never imported a file I was working on
(from DNAToolkit import *)
I happened to be using Jupyter Notebook, so I overlooked the idea :P (I'm a biostatistician, so coding is a secondary skill lol)
If you follow every video from 1st to last you should have a good idea of what we are doing. I am not sure how to import additional files into notebooks. Try searching for it on the internet.
@@rebelScience no worries! I was reporting back that I figured it out and that you were correct :)
Hello, I am a biomedical engineering student. I chose DNA analysis with Python as my final thesis topic. Can you help with software? I really need it. Thank you.
I couldn't read the code clearly. I think it would have been better to zoom in on your screen or to use a high-contrast theme.
Great content, now I am able to understand how to apply Python in bioinformatics. For the random joining of the nucleotide sequence, does the nucleotide arrangement happen in a defined way, or is there no pattern to the generation of nucleotides?
Hi! Well, randomness is exactly what it is: random generation of characters. If we had what you call "a pattern" or "a defined way", that would not be randomness, right? We use that just for tests.
I'm a Computer Science student and I'm so interested in the field of bioinformatics.
It's just that I'm lost as to where I should start first to catch up with your videos.
Thank you so much
Hey. Join our chat and check out my last article as it is about your question.
@@rebelScience thank you so much!
@@rebelScience please mention which article, I am in the same situation.
Really helpful! Thanks a lot
awesome content, great channel
thank you very much
Please share more useful information about bioinformatics
How is your cursor line highlighted?
It is just an extension for the code editor, called line highlighter. Try searching for it in the extensions library.
Amazing content! Looking forward. :)
I'd prefer not to cast the Counter object to a dictionary, and to use it as is. Whatever operations you can perform on dictionaries, you can do them on Counters too. They are mutable, fast and already come with out-of-the-box features like most_common etc.
Yeah, but each time you operate on them you risk changing the order (Counter.update disrespects the original order) or losing zeros (Counter.__add__ decides to remove keys when values reach zero). Furthermore, even though Counter dicts have implicit zeros for __getitem__, they break equality with dictionaries that have explicit zeros, so testing is a mess. The behavior of the Counter object is too chaotic for me: I want to rely on the promise of OrderedDict, I want equality to work, and I don't want the zeros to disappear for no reason. So I use {n:seq.count(n) for n in NUCLEOTIDES}.
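A minimal sketch of the behaviour described in the reply above. It assumes NUCLEOTIDES is the string "ACGT" and uses a made-up sequence with no G or T; both are illustration-only choices, not taken from the video.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"  # assumed fixed order of the four bases
seq = "AACCC"         # made-up test sequence with no G and no T

# Dict comprehension: every base gets a key, in a fixed order, zeros included.
plain_counts = {n: seq.count(n) for n in NUCLEOTIDES}
print(plain_counts)  # {'A': 2, 'C': 3, 'G': 0, 'T': 0}

# Counter only stores what it has seen.
counter_counts = Counter(seq)
print(dict(counter_counts))  # {'A': 2, 'C': 3}

# Missing keys read as 0 through indexing...
implicit_zero = counter_counts["G"]
print(implicit_zero)  # 0

# ...but equality with a dict that holds explicit zeros fails.
equal_to_plain = counter_counts == plain_counts
print(equal_to_plain)  # False

# Counter addition drops keys whose count is not positive.
summed = Counter(plain_counts) + Counter()
print(dict(summed))  # {'A': 2, 'C': 3}
```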
What is the software you use? I have been using jupyter but prefer the layout you have
Hey. I have a video on that. Search for Development Tools in my videos.
@@rebelScience thank you so much
Link for structuring the project/access to the project (the step at around 1:10 in your video)? Please
"Link for structuring the project/access to the project"
1. We are creating this project, so if you follow the video series, you will see how we are structuring it.
2. If you are looking for a git repository for it, it is in the video description.
@@rebelScience Thanks
Thank you for this!!!
Super dumb question but what IDE are you using to run these python scripts?
I have a video on my channel, titled Development Tools. It has all the answers ;)
Thank you man, it's by far a great video from what I saw, though the text on the screen is very small & sometimes hard to read.
Thanks! Are you watching on a mobile device? I adjusted the size of the font and tested on small screens in the next videos, so it should be better.
@@rebelScience Thank you for responding so fast, you're a man of your word. Yes, you're right, usually I watch on a mobile device, coz I don't have a PC where I mostly stay & work for the time being.
What if you want to include N's, which represent any of the four nucleotides?
In our case, we are working with standard Nucleotides for now as most of the raw data will be in this format. Adding a lot of other variants and logic would make our first lesson overcomplicated. This is a beginner level set of tutorials. We will be adding a lot of cool and complex stuff in our next series "Genome Toolkit", which will use "DNA Toolkit"! Stay tuned.
How do you create such functions, and have you downloaded the module?
No, we do not download any modules. We create the DNA Toolkit module from scratch in this series of videos.
did you use pycharm?
Yes, I have used PyCharm. It is a good code editor for Python. I use VSCode as I think it is the best one. Also, if you are writing in more than one programming language, which I do, VSCode is perfect. It supports any language, while PyCharm editor is strictly Python. The important point is, if you are familiar with the tool (PyCharm in your case?), and it does everything you need, just keep using it. Also, I have an article and a video about that here: rebelscience.club/2020/04/lets-set-up-a-code-editor-for-python-and-bioinformatics/
I get the error: no module named DNAToolkit - I tried installing the DNA toolkit from PIP, and I thought it would work, but still giving me the same error. I copied the code, from what I could tell, exactly. Any thoughts?
What software are u using
Hey, sorry what do you mean? Are you asking about the Code Editor I use? It was mentioned in the introduction video and I also have a Development Tools video where I show how to set it up.
what ide is this?
Hey! You should start with the Introduction video, where I explain what you need to work with this series of videos, including the code editor. I also have a video on how to set it up.
Thank you for the great video! I know this is a newbie question because I just started to learn bioinformatics with python (I'm a biomedicine master student), but anyway: why do you use "[" and "]" in join([random.choice(Nucleotides)... and not just "(" and ")"?
[random.choice("ACGT") for x in range(10)] is a list comprehension.
Then we pass it to the join method, and all method/function calls use () - join().
Try this: test = [random.choice("ACGT") for x in range(10)]
and this: test = random.choice("ACGT") for x in range(10) (without any brackets this one is a syntax error on its own).
You can run it the way you suggested: seq = ''.join(random.choice("ACGT") for x in range(10)). Without the [ ] this is a generator expression rather than a list comprehension, and Python lets you drop the extra parentheses when it is the only argument of a call.
Both versions produce the same kind of string. I keep the [ ] because it makes it obvious at a glance that a list is being built.
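A minimal sketch of the two spellings discussed above. With square brackets Python builds a list; with parentheses only it builds a generator expression, and str.join accepts either. The fixed seed and the variable names are assumptions added for illustration only.

```python
import random

random.seed(0)  # fixed seed only so repeated runs behave the same

# Square brackets: a list comprehension, all ten letters are built up front.
as_list = [random.choice("ACGT") for _ in range(10)]
list_type = type(as_list).__name__
print(list_type)  # list

# Parentheses only: a generator expression, letters are produced on demand.
as_gen = (random.choice("ACGT") for _ in range(10))
gen_type = type(as_gen).__name__
print(gen_type)  # generator

# join() accepts either form; as the sole argument the generator needs
# no extra pair of parentheses.
seq_from_list = "".join([random.choice("ACGT") for _ in range(10)])
seq_from_gen = "".join(random.choice("ACGT") for _ in range(10))
print(len(seq_from_list), len(seq_from_gen))  # 10 10
```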
@@rebelScience Ok! thanks a lot for the answer, I understand! looking forward to more good videos with bioinformatics! take care!
Hi! First of all, thank you very much for this kind of video, they are absolutely fascinating.
I have a problem: I was following your instructions and then, suddenly, the code wouldn't work. The problem is related to the import: from DNAToolkit import *. The error says the following: "ModuleNotFoundError: No module named 'DNAToolkit'".
Something absolutely hilarious, because a few minutes before the code was working. I'm using Sublime Text, please help me.
Thanks
Hey! Thanks! I enjoy sharing this information very much.
About the error: it looks like a code editor (Sublime Text) problem or a file naming problem. Hard to tell what it is without looking at logs. I would suggest joining our chat on Telegram or Matrix (links are in the video description) so you can share screenshots and output information.
For now, make sure all of your files are named correctly: the file name has to match the import exactly, including upper and lower case (DNAToolkit.py for "from DNAToolkit import *", or dnatoolkit.py if you import dnatoolkit).
Try creating a new folder for the project, copy all files, make sure the names are correct and try running the code again.
Where it says "ModuleNotFoundError", does it say something about a temp file?
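A minimal sketch of why the file name and folder matter for this error. It assumes the toolkit lives in a file called DNAToolkit.py next to the script being run; the file content, the temporary folder and the can_import helper are hypothetical and only there to make the example self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical, tiny stand-in for the DNAToolkit.py written in the series.
TOOLKIT_SOURCE = (
    'NUCLEOTIDES = ["A", "C", "G", "T"]\n'
    "\n"
    "def validate_seq(seq):\n"
    "    return all(nuc in NUCLEOTIDES for nuc in seq.upper())\n"
)


def can_import(module_name):
    """Return True if module_name can be imported right now."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        # Forget the module so that each check starts from scratch.
        sys.modules.pop(module_name, None)


with tempfile.TemporaryDirectory() as project_dir:
    Path(project_dir, "DNAToolkit.py").write_text(TOOLKIT_SOURCE)

    # The folder is not on sys.path yet, so Python cannot see the file.
    found_before = can_import("DNAToolkit")

    # "python main.py" puts main.py's own folder first on sys.path;
    # inserting the folder by hand imitates running from inside it.
    sys.path.insert(0, project_dir)
    try:
        found_after = can_import("DNAToolkit")
    finally:
        sys.path.remove(project_dir)

print(found_before, found_after)
```

If an editor runs the script from a different working folder or from a temporary copy, the toolkit file is no longer beside it, which is one plausible way to end up with the same ModuleNotFoundError.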
@@rebelScience I've even deleted all the files and created them again, though I noticed that a folder named "__pycache__" was created. It said nothing about a "temp file".
I might try VS Code as a Code Editor.
@@rebelScience thanks Sir, I have the same problem. I just couldn’t get to import the files from DNAtoolkit. I really don’t know what to do. I really enjoy your explanations and I’m sure I understand them, but I have a problem importing the tools to work on bioinformatics.
@@rebelScience I’ll be really glad if you can reply as soon as you’re able, I am eager to learn more but I can’t if I cannot practice myself.
@@rebelScience yeah, I just joined the platform on Telegram. I can't post questions there either. So please, your help is needed 🙏
Can sb who has studied bioinformatics work in the laboratory , or is it more of a computer/coding job ?
I wrote the Python code in Jupyter Notebook, but it does not work well.
Hello, I have this problem: it gave me this output {'A': 16, 'C': 12, 'T': 6, 'G': 16} while I entered the dictionary in this order {"A": 0, "C": 0, "G": 0, "T": 0}. Why does it switch the T and G order in the output? Please help me cuz I'm stuck here
thanks
I'm studying with your videos, but when I print the result, it shows a different result from yours when I run it. If the random sequence starts with G, the dictionary's result also starts with G. Is that OK? If I use join with a list comprehension, how can I know which value is A or G or C or T? If I would like the dictionary to start from A, C, G and T, how do I write the code...?
Also... if I use "print(' '.join([str(val) for key, val in result.items()]))", the output has blanks between the values. Should I use '' (without a blank) instead of ' ' (with a blank)? But if I use '', the printed result shows 20121721, not 20 12 17 21... I couldn't find out what is different from yours.
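A minimal sketch of one way to get the counts in a fixed A, C, G, T order regardless of how the dictionary was filled. The counts are made up (the four numbers quoted above, with the keys deliberately out of order), and looking the values up by a fixed key string is an assumption, not necessarily how the video does it.

```python
# Made-up counts whose keys are deliberately not in A, C, G, T order.
counts = {"A": 20, "C": 12, "T": 21, "G": 17}

# Looking each key up in a fixed order makes the output independent of
# the order in which the dictionary was built.
spaced = " ".join(str(counts[n]) for n in "ACGT")
print(spaced)  # 20 12 17 21

# An empty separator glues the numbers together, which is why the
# result becomes unreadable without the blank.
glued = "".join(str(counts[n]) for n in "ACGT")
print(glued)  # 20121721
```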
i cant run the file why tho ? I wrote all the things like you but cant run it
Well, you would need to do two things: add more details of what OS/Editor/code runner/Plugins you use and how you are trying to run things, or join our chat in Telegram/Matrix and share some screenshots with above information.
Have you set up your code editor like we did here: rebelscience.club/2020/04/lets-set-up-a-code-editor-for-python-and-bioinformatics/
You didn’t give any instructions on how to install the toolkit which is difficult for some of us to even start
Hello. We are not installing any toolkit. We are developing it from scratch in plain Python in this series of videos. Did you watch my introduction video?
@@rebelScience Hello, thanks for the instant reply. I am a beginner in this course, in both Python and bioinformatics... I just want to ask if I can use PyCharm as my code editor.
Yes. You can use whatever you want. Whatever is easier for you. I talk about that in the Introduction video.
Why is this code not running for me... I'm copying it exactly
Which environment are you writing in? It is not Python. Also, I have a problem with the space character. When I use spaces in Python I get errors!
Sorry, what do you mean by it is not Python? I have an Introduction video, and another one called Development Tools. You should watch those two videos to understand what environment I use and how to set it up.
@@rebelScience Ok. TnQ
@@rebelScience I could not find the Development tools among yr videos! :(
I only have 27 videos ;) ruclips.net/video/81Eb_YXmV4g/видео.html
Very interesting! I like it!
Sir I just joined the telegram channel mentioned in the chat box. I'm a Biotechnology student and I want to learn bioinformatics. So I joined the telegram channel. I think without any reason I'm banned.
Please invite me back bro.🥺
Hey. When you join, the bot asks you a question, and you need to type the answer in the chat. If you don't, the bot kicks you. This is spam protection. Try joining again and see what the bot asks you.
@@rebelScience ok thanks
It's saying chat is no longer accessible. Please unban me. I'm BABU GUDDU.
def validate_seq(seq):
    for nuc in seq:
        if nuc not in nucleotides:
            print("Invalid sequence. Only A, C, T, and G are accepted characters.")
            randomvschoice()
        return seq
Hi rebelCoder,
I have this coded for a more user friendly version. I get a strange result though.
When I input a sequence with a mixture of correct and incorrect nucleotides it ignores this statement. But when I input only incorrect characters it works fine. I am not sure why, please help.
I have the same problem
@@Rossboe1 just check the indentation of the last line... I suggest it should be at the level of the "for", not the "if"
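A minimal sketch of the fix suggested above, with the return moved out of the loop so that every character is checked before the sequence is accepted. It assumes the nucleotide list holds the four letters; the commenter's own randomvschoice() helper is not shown in the thread, so it is left out. Returning False on failure and the upper() call are assumptions for illustration.

```python
NUCLEOTIDES = ["A", "C", "G", "T"]  # assumed content of the nucleotide list


def validate_seq(seq):
    """Return the upper-cased sequence if every character is valid, else False."""
    upper_seq = seq.upper()
    for nuc in upper_seq:
        if nuc not in NUCLEOTIDES:
            # Stop at the first invalid character.
            return False
    # Reached only after the loop has checked every character.
    return upper_seq


print(validate_seq("ACGT"))  # ACGT
print(validate_seq("ACXT"))  # False
print(validate_seq("XXXX"))  # False
```

With the return inside the loop, the function leaves after looking at the first character only, which matches the reported symptom: a mixed sequence that starts with a valid letter is accepted.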
Good! Any Russians here?
I am a total beginner in Python. Should I memorize these functions?
Please watch the Introduction video. I explain everything in that video. Yes, you should be good with Python before watching these videos. Make sure you learn Python first. My Introduction video has links and suggestions.
What python software did you use
Hey! Sorry, I am not sure what you mean. Are you asking about what code editor I use to write Python code in?
@@rebelScience yes
I have a video about the code editor here: ruclips.net/video/81Eb_YXmV4g/видео.html
@@rebelScience thanks