The best video I have seen on bioinformatics
This tutorial is brilliant, please create more!
I very much enjoyed this video. I like the fact that, by the end, I'm working with real data and doing something useful. Thanks!
Why not write it in the Python IDLE?
Thanks a lot, it was really helpful. You haven't put any other videos on this subject since 2013, though.
invalid syntax on the second quote of print "number of g's " + str(g)
I don't know if it's because he's using a Mac, but if you are using Windows like I am, make sure to use parentheses when you call the "print" function.
Ex: (EXACTLY LIKE THIS)
print("number of g's " + str(g))
Same here, thank you so much for the advice I am going to try it
I think the syntax changed in Python3
Why not use count() or regular expressions?
poor Python skills
No, poor programming skills.
I had a little trouble finding the correct nucleotide sequence. To save time, here is the reference number for the example in the video:
NCBI Reference Sequence: NG_031859.1
Now this doesn't work! :(
Please provide the exact link for the dataset download in the description.
So I have to create a folder first then create another folder to put the file inside of it?
Just a small question. Is this what bioinformaticians mostly do? Sequence genes, then use a programming language for analysis?
In a nutshell... YES. But in addition to analysis, they can use programming for drug discovery and therapeutics. They can use programming for predictive analytics, to see whether something will switch a gene on or off before administering it or experimenting with it, which saves time and money.
Are there any more advanced Python scripts to use for the analysis of sequencing data?
Which Python book would be a good reference? This is nice!
My problem so far is saving the file as a plain text file. My MacBook will not give me the option when I select the drop-down list.
Yes, mine too.
This is outstanding. I am hoping you can show more examples in Jupyter Notebook.
It would be more Pythonic to get rid of the nested loop and just use the built-in string method count():
for line in gene: g += line.count('g'); a += line.count('a'); c += line.count('c'); t += line.count('t');
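The suggestion above can be sketched as a small function. This is only an illustration: `count_bases` is a hypothetical name, the sample lines are made up rather than taken from the video's file, and it assumes the bases are already lowercase.

```python
def count_bases(lines):
    """Count a, c, g and t across an iterable of sequence lines using str.count()."""
    counts = {"a": 0, "c": 0, "g": 0, "t": 0}
    for line in lines:
        for base in counts:
            counts[base] += line.count(base)
    return counts


# Hypothetical sample data standing in for the lines of the sequence file.
gene = ["gattaca\n", "ccgg\n"]
counts = count_bases(gene)
print("number of g's " + str(counts["g"]))
```

Keeping the counts in a dictionary avoids four separate variables, but the four `+=` statements from the comment work just as well.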
Thanks a lot for a nice tutorial! But have you tried TextWrangler instead of TextEdit?
This is for Python 2.7.x, right? It doesn't work with my 3.3.x.
Noel Tanner,
Thanks for the reference sequence, I was having a hard time finding the correct nucleotide.
Really helpful!
I love Python!
Hello, Thanks for this video! I was wondering if we could use the difflib program to do comparative genomics for two different files and create a report of differences?
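For what it's worth, a rough sketch of what a difflib report could look like is below. It assumes the two files have already been read into lists of lines; `diff_report` and the file names are hypothetical. Note that difflib compares text line by line and does not do biological sequence alignment, so a single inserted base can make every following line show up as changed.

```python
import difflib


def diff_report(lines_a, lines_b, name_a="a.fasta", name_b="b.fasta"):
    """Return a unified diff of two lists of lines as a single string."""
    diff = difflib.unified_diff(
        lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
    )
    return "\n".join(diff)


# Made-up sequences standing in for the contents of two FASTA files.
seq_a = [">example A", "gattaca", "ccgg"]
seq_b = [">example B", "gattaca", "ccga"]
report = diff_report(seq_a, seq_b)
print(report)
```

Lines prefixed with `-` appear only in the first input and lines prefixed with `+` only in the second.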
Hi,
I wrote the same program in PyCharm.
I tried opening it in a Bash shell and I get told "not a directory". I switched directories to ensure I was in the right folder. Does anyone have suggestions?
thanks it is very good information
Very clear and informative - thanks! Do you mind if I post/share?
This was really informative and interesting!
Sir, I am using the Windows 7 operating system and Python, and instead of Coda I am using Sublime Text 2. I have followed everything up to the TERMINAL option, which is not there in Windows. Can you tell me the equivalent, so that I can finish the last step? Waiting for your reply, sir. Thank you.
The Windows command line, or now PowerShell. You will have to add Python to the PATH to run python from the command line.
What version of python did you use?
Hi! Thanks for the video!! However, can you please explain why you set the g, a, t and c at 0 in the beginning?
Thanks!
Because you have to initialize a variable to zero before you add a number to it (g += 1 means g = g + 1). The first time the code enters the 'if' for 'g', g is zero, so g = 0 + 1 = 1. If you don't initialize it, g has no value at all yet, so Python cannot compute g + 1 and stops with a NameError instead of giving you a result. Hope that helps :)
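In Python specifically, a missing initial value leads to an exception rather than a garbage number. A small sketch of both cases follows; the function names are hypothetical and "gattaca" is just sample input.

```python
def count_without_init():
    # g is never assigned before "g += 1", so Python raises UnboundLocalError
    # (a subclass of NameError) instead of using some leftover value.
    g += 1
    return g


def count_with_init(sequence):
    g = 0  # start from zero so that g += 1 has something to add to
    for char in sequence:
        if char == "g":
            g += 1
    return g


try:
    count_without_init()
except NameError as err:
    print("error: " + str(err))

print("number of g's " + str(count_with_init("gattaca")))
```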
Very cool! I need to learn Python ASAP!
Does anyone know the answer? If you take the FASTA file without the header, can you get rid of gene.readline()?
And if the counters compare against the upper-case strings A, C, T, G, can you get rid of line.lower()?
Thanks for any suggestions.
+willie ekaputra Yeah, that just skips the header line, so if the line is not there you don't need to skip it, but if you remove it then it's no longer a FASTA file. Either way, this is not how an advanced bioinformatician would solve this task. I think Blake is showing that you can make the string lower case. The sequence is usually upper case, so if you compare against upper-case letters you don't need to convert it and you don't need that line.
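One way to avoid depending on either detail is to skip any line that starts with ">" and normalize the case before counting. A minimal sketch, assuming the input is a list of lines; `sequence_lines` and the sample data are hypothetical:

```python
def sequence_lines(lines):
    """Yield lowercase sequence lines, skipping blank lines and '>' header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        yield line.lower()


# Made-up input with and without a FASTA header, in mixed case.
with_header = [">hypothetical header", "GATTACA", "ccgg"]
without_header = ["GATTACA", "ccgg"]
print(list(sequence_lines(with_header)))
print(list(sequence_lines(without_header)))
```

Both calls produce the same sequence lines, so the counting code after this step no longer cares whether the header was present or which case the file used.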
I have another question: can you then make this code a function, with def ...():, so that you can open ANY FASTA file saved on your PC and count its GC content?
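A possible shape for such a function is sketched below. It is not the code from the video: `gc_content` is a hypothetical name, and it assumes a plain-text FASTA file with a single record, treating GC content as (g + c) divided by the total number of a, c, g and t.

```python
def gc_content(path):
    """Return the fraction of g and c among the a/c/g/t bases in a FASTA file."""
    counts = {"a": 0, "c": 0, "g": 0, "t": 0}
    with open(path) as gene:
        for line in gene:
            if line.startswith(">"):
                continue  # skip the FASTA header line
            line = line.lower()
            for base in counts:
                counts[base] += line.count(base)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # empty file or header only
    return (counts["g"] + counts["c"]) / float(total)


# Usage (the file name is hypothetical):
# print("gc content: " + str(gc_content("my_sequence.fasta")))
```

Passing the path as an argument is what lets the same code run on any saved file.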
Your web page is down, can you let me download this. Your channel blocks it from being able to download.
Thanks, but I had a problem while running it. I used Windows Bash and I got this error:
print "number of g's " + str(g)
^
SyntaxError: invalid syntax
Even though I did the same thing that you did. Please help me.
+Suleyman Bozkurt
That may be because you are using Python 3, where print is a function and the syntax is print("number of g's " + str(g)) [notice the parentheses], whereas in Python 2 the syntax for print is as shown in the video [ print "number of g's " + str(g) ].
Hope it helped! :)
thnx boss. resolved my issues.
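For anyone hitting the same error, here is a minimal sketch of a form that runs on both versions. The value 42 is a made-up placeholder for the count computed in the video.

```python
g = 42  # hypothetical count, standing in for the value computed in the video
message = "number of g's " + str(g)
# A single parenthesized argument is a print() call on Python 3 and still a
# valid print statement on Python 2, where the parentheses only group the value.
print(message)
```

Running `python --version` in the terminal shows which version is installed.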
Awesome, my first Python program to find the GC content... I have a question: what is the GC content for? What does it tell me exactly? I did not understand that very well.
BTW I used this sequence: Rattus norvegicus BRCA1 mRNA, complete cds
gc content: 0.460014
Very informative. Thank you for providing this example.
I always get a syntax error in
If char == "g" :
Usually at (if) and at (g).
Please help me understand why.
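Quoting isn't the issue: in Python, 'g' and "g" are the same one-character string. Check that "if" is all lowercase, since Python keywords are case-sensitive.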
very straight forward tutorial, thanks
It was helpful. Thank you. Keep adding more.
Very informative!
Nice cool intro to bioinfo
Thank you it works very well
THIS IS AMAZING.
FANTASTIC! Thank you!
Good video, just wish it was more streamlined
dude why does this not work at all using windows
did you install python?
yes
Matt, saying it doesn't work "at all" isn't really a helpful comment.
9:00 variable*
Thank you so much for this walkthrough!
Very Helpful!
Thank You!
1.75x Speed would be really appreciated for this video :D
good work
Very Nice!
thanks so much i'll definitely be coming back
for beginners only
Why does this man never start with the code?
very nice
I love Python. Great tutorial!
Super inefficient code. Use the count() function, which is WAY faster!
I timed both ways on a file of 117k bases. His way used 0.02 sec. Using count() used 0.005 sec. Both are fast enough for me.
@@georgegrevera7000 The problem is scale. Had the gene sequences been much longer, the slower loop would have cost proportionally more time. I'm coming from a computer science background though, where efficiency is hammered into our heads due to scalability.
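The comparison is easy to repeat with the standard-library `timeit` module. The sketch below uses a synthetic sequence rather than the 117k-base file mentioned above, and the function names are hypothetical. Both approaches read every character once, so both scale linearly with sequence length; `count()` is faster mainly because its loop runs inside the interpreter's C code.

```python
import timeit


def count_with_loop(sequence):
    """Count g's one character at a time, as in the video."""
    g = 0
    for char in sequence:
        if char == "g":
            g += 1
    return g


def count_with_method(sequence):
    """Count g's with the built-in str.count() method."""
    return sequence.count("g")


sequence = "gattaca" * 20000  # synthetic 140,000-base sequence
assert count_with_loop(sequence) == count_with_method(sequence)

loop_time = timeit.timeit(lambda: count_with_loop(sequence), number=5)
method_time = timeit.timeit(lambda: count_with_method(sequence), number=5)
print("loop: %.4f s, count(): %.4f s" % (loop_time, method_time))
```

Actual timings depend on the machine and Python version, so none are quoted here.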
oh man, wow thanx
👏👏👏
Someone should re-do these videos in Windows.
awesome
poor video making quality
bevkuff