NOTE: RPKM values at 11:23 were calculated using scaling factor 10. RPKM values at 14:40 were calculated using scaling factor 1 million. Apologies for not using the same table as 11:23 for consistency.
Found your explanation very good. Can you clarify the new scaling factor of 1 million? I am not getting the value 6.66 for gene A, Technical Replicate 1, even when using a scaling factor of 1 million. Did you apply the scaling factor of 1 million to the entire table and not only to gene A and gene B?
Me too, I have the same question. @devraj1989 @Bioinformagician
She actually used scaling factor 10 here.
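For anyone trying to reproduce the two tables, here is a minimal Python sketch of how the scaling factor enters the RPKM formula. The function name `rpkm` and the numbers are hypothetical and are not the values from the video; the point is only that switching the scaling factor rescales every value by the same constant.

```python
def rpkm(read_count, gene_length_kb, total_reads, scale=1_000_000):
    """Reads per kilobase of gene per `scale` mapped reads.

    `scale` is 1 million in the standard definition; the note above says
    the 11:23 table used 10 instead.
    """
    return read_count * scale / (total_reads * gene_length_kb)

# Hypothetical numbers for illustration only (not taken from the video).
read_count = 20       # reads mapped to the gene
gene_length_kb = 2.0  # gene length in kilobases
total_reads = 100     # total mapped reads in the sample

rpkm_scale_10 = rpkm(read_count, gene_length_kb, total_reads, scale=10)
rpkm_scale_1m = rpkm(read_count, gene_length_kb, total_reads, scale=1_000_000)

print(rpkm_scale_10, rpkm_scale_1m)
```

With the same counts, the two results differ by a factor of 100,000, so the ranking of genes and samples is unchanged between the two tables.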
Very concise and easy to understand, especially for beginners. Thanks!
The best explanation that I ever have! Thanks a lot!!!
Very good explanation, perfect for beginners!
A perfect copy of another tutorial
ruclips.net/video/TTUrtCY2k-w/видео.html
thank you for your video. Your explanation is very easy to follow.
Amazing! I just thought you missed explaining the concepts of sequencing depth. thanks
I shall note it down to explain it the next time I am covering any concept that involves sequencing depth. Thanks for bringing it to my notice :)
This is very illuminating. Thank you!
You're awesome please keep doing this work, I'll support you from my end.
have been searching for this for a soooooooo long time... thank youuu sooooooo much
Thank you so much! It's very easy to understand!!!
fantastic .. Loved the explanation
Thank you for your videos. How can we calculate RPKM from counts using R?
You are great at explaining! Thank you a lot!!!
Why can't you use TPM values for differential gene expression analysis?
Amazing explanation! Thanks
I have a query. In the 2nd and 3rd lectures you took FPKM-normalized data and demonstrated gene expression between the samples, but in this video lecture you show that FPKM can't be used for differential gene expression analysis.
I am confused about how these fit together.
Crystal clear! Thanks.
Thanks for making this video and Very very good explanation!
Thank you sooooo much!! Super helpful! :)
Thank you very much for all your videos. It has helped me a lot to understand the analysis better since you explain it in a very didactic way. Please always continue with the channel.
I would like to ask about normalization. I have some RNA-seq data prepared with polyA selection and other total RNA-seq data. I would like to combine them to increase the sample size for some subtypes where I have few samples. Do you know of any method or normalization process that makes it possible to combine polyA and total RNA-seq data? I looked for this information in a lot of articles that work with multiple types of data, but they don't detail how they did it.
Thank you very much
very clear, thank you !
Nice video. Please, how did you arrive at the final RPKM value? Following your steps, we didn't get the value you used.
I am not able to calculate FPKM in R. I have the mean fragment length, raw counts annotated with gene symbols, and the gene lengths. Will you please help me with this? The problem is that the same gene has multiple transcripts and each transcript has a separate mean length value.
Thanks for the explanation, very useful.
What is the count matrix data we put into DESeq2? I'm confused by the term "raw counts". What are some common tools people use to get raw counts after mapping to the reference genome?
I have explained what raw counts are in this initial section of this video: ruclips.net/video/2RFYKTvCXHs/видео.html
I have explained how to get raw counts from aligned reads here: ruclips.net/video/lG11JjovJHE/видео.html
These videos should help clear your doubts.
Very nice explanation, ma'am. Thank you so much. Can you please clarify the difference between RPKM and FPKM? And is it possible to determine fold changes from FPKM data and compare control and test samples?
FPKM is analogous to RPKM and is used specifically in paired-end data.
You can calculate fold changes from FPKM by taking a ratio of FPKM from test/FPKM from control.
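A minimal Python sketch of the ratio described above, with made-up FPKM values. The function names, the numbers, and the pseudocount of 1 in the log2 version are assumptions for illustration (the pseudocount is a common way to avoid dividing by zero, not something stated in the reply). A plain ratio like this gives an effect size only, not a statistical test.

```python
import math

def fold_change(test_fpkm, control_fpkm):
    """Plain ratio: FPKM in test divided by FPKM in control."""
    return test_fpkm / control_fpkm

def log2_fold_change(test_fpkm, control_fpkm, pseudocount=1.0):
    """log2 ratio with an assumed pseudocount so zero FPKM does not break it."""
    return math.log2((test_fpkm + pseudocount) / (control_fpkm + pseudocount))

# Hypothetical FPKM values for one gene.
control = 3.0
test = 15.0

fc = fold_change(test, control)
lfc = log2_fold_change(test, control)

print(fc, lfc)
```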
Thank you so much for your explanation! If I want to visualize a gene's expression across samples, which value should I use? In theory, TPM is the best, right? But since I am only comparing this one gene, I do not need to account for gene length. The normalized counts from DESeq2 could also be used, right?
I had the same question. To compare genes between samples, you need to use normalized raw counts, and not TPM. In your case, even if it is only one gene, I think you still need to use normalized raw counts, because the other genes are still influencing your data. TPM is OK only if you want to compare gene expression within one sample (Kallisto is a good mapping tool that uses pseudoalignments to obtain TPM values).
In my case, for comparison between samples, I converted TPM to log2CPM (counts per million), filtered (removed genes with 0 expression) and normalized (through the TMM method using edgeR). Then I used those values for comparison by plotting a heatmap of the genes of interest.
Hope I could help 🙂
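The filtering and log2CPM steps mentioned above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it starts from raw counts rather than TPM, the sample names and counts are made up, the +1 offset inside the log is an assumed choice, and the TMM step done by edgeR is not reproduced here.

```python
import math

# Hypothetical raw counts per sample (names and values are made up).
samples = {
    "control": {"geneA": 10, "geneB": 0, "geneC": 90},
    "test": {"geneA": 30, "geneB": 0, "geneC": 70},
}

def counts_per_million(counts):
    """Scale one sample's counts so that they sum to one million."""
    library_size = sum(counts.values())
    return {gene: c * 1_000_000 / library_size for gene, c in counts.items()}

# Keep only genes with a non-zero count in at least one sample.
genes = list(samples["control"])
expressed = [g for g in genes if any(s[g] > 0 for s in samples.values())]

# log2(CPM + 1) for the retained genes; the +1 avoids log2(0).
log2_cpm = {}
for name, counts in samples.items():
    cpm = counts_per_million(counts)
    log2_cpm[name] = {g: math.log2(cpm[g] + 1) for g in expressed}

for name, values in log2_cpm.items():
    print(name, values)
```

The resulting per-sample values are the kind of matrix one would pass to a heatmap function.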
Thank you very much for such nice and detailed information. I am looking for information on batch normalization of RNA sequencing data; I have observed bias in mapping rate.
Hi
I am supposed to perform TPM normalisation of my counts matrix. Can I use the steps explained here as they are, or should I use a tool/package for this?
Is there any problem with applying TPM normalization to metagenomic paired-end sequencing data?
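For reference, the usual TPM steps (divide counts by gene length in kb, then rescale so each sample sums to one million) can be sketched in a few lines of Python. The function name `tpm`, the gene names, and the numbers are hypothetical; for real data an established package is probably the safer choice.

```python
def tpm(counts, lengths_kb):
    """Transcripts per million for one sample.

    counts:     {gene: raw read count}
    lengths_kb: {gene: gene length in kilobases}
    """
    # Step 1: reads per kilobase (corrects for gene length).
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    # Step 2: rescale so the values in the sample sum to one million.
    total_rpk = sum(rpk.values())
    return {gene: value * 1_000_000 / total_rpk for gene, value in rpk.items()}

# Hypothetical counts and gene lengths for a single sample.
counts = {"geneA": 10, "geneB": 30}
lengths_kb = {"geneA": 2.0, "geneB": 2.0}

tpm_values = tpm(counts, lengths_kb)
print(tpm_values)
```

Because the length correction happens before the per-million scaling, TPM values in every sample add up to the same total, which is the main difference from RPKM/FPKM.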
Thank you for this video !!
Many thanks for your videos. I have a question: if I want to use RNA-seq data downloaded from TCGA to train a model, can I use data normalized with one of these three methods? If not, please tell me what I should do.
Could you please make a video on how to get FPKM from DESeq2?
thank you..... you are impressive
Hi, at 11:23 you show the RPKM values table with gene lengths of 1.5 kb and 2 kb side by side. My question: a few seconds earlier you needed to find the value per 1 kb, so why are the gene lengths stated differently in the full RPKM values table at 11:23?
Do I need to do alignment before counting?
How do we get the gene length for each gene? Here it is directly taken as 1.5 kb and 2 kb.
Thanks a lot🙂👍
Thanks a lot. Please can you make a video on how to get gene lengths (width)? Sometimes the data don't include gene lengths, and I need them to calculate CPM/RPKM/TPM.
You can use biomart to get gene lengths. I have explained how to use biomart in one of my videos: ruclips.net/video/cWe359VnfaY/видео.html
@Bioinformagician thank you so much
Very good explanation for beginners, thank you!!! But why is the RPKM table at 11:23 different from the one at 14:40?
Thank you for bringing it to my notice. RPKM values at 11:23 were calculated using scaling factor 10. RPKM values at 14:40 were calculated using scaling factor 1 million. I should have used the same table as 11:23 for consistency. My bad. I'll leave a note in the description.
Thank you madam
If none of the three is suitable for gene expression analysis in DESeq2 and edgeR, which method should I use?
They have their own normalization methods. Please refer to DESeq2 videos on my channel where I explain that in detail.
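As a rough idea of what such a built-in method looks like, here is a Python sketch of the median-of-ratios size factor idea that DESeq2 is documented to use (edgeR uses TMM, which is not shown). This is not the package's code: the function name `size_factors`, the sample names, and the counts are made up for illustration.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: {sample: [raw count for gene 0, gene 1, ...]}
    Genes with a zero count in any sample are skipped, since their
    geometric mean across samples is zero.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = {}
    for i in range(n_genes):
        values = [counts[s][i] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means[i] = sum(math.log(v) for v in values) / len(values)

    # Size factor = median over genes of (count / geometric mean).
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][i]) - lgm for i, lgm in log_geo_means.items()]
        factors[s] = math.exp(median(log_ratios))
    return factors

# Hypothetical counts: sample2 was sequenced about twice as deeply as sample1.
raw = {
    "sample1": [10, 20, 30, 0],
    "sample2": [20, 40, 60, 5],
}

sf = size_factors(raw)
normalized = {s: [c / sf[s] for c in raw[s]] for s in raw}
print(sf)
```

Dividing each sample's raw counts by its size factor gives the normalized counts, which are then comparable between samples.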
perfect, thanks