My teacher just dropped a heat map coding project on us without explanation, before our exam! Thank you for this video, it makes the project seem less scary for tomorrow ❤
Very useful video, cleared up all my concepts, thank you
Very nice and to the point, thank you. Next time, when possible, please show an example: from GEO to R to heatmap.
This is really helpful, thank you!
Hi! Great video, thank you very much for your work, it helps a lot. Also, I have a beginner question I hope you can answer. You say that the gene expression values in the table (1:29) are the fold change of each gene in the sample. But fold change relative to what? In my understanding you obtain the fold change when comparing two conditions. Sorry if it is a silly question. And again, thank you very much for your videos!
Hi! Not at all, that's a great question. You can actually use heatmaps to just show gene expression data (as CPM, for example) or log2FC. Log2FC is usually the way to go as it is a more interpretable and standardized way to compare expression levels between conditions/samples (otherwise gene expression counts may vary a lot). Not sure what your case is, but if you have multiple samples, you would calculate the log2FC for each sample relative to a reference condition. Your reference condition might be the 'control', or if you have normal and tumour samples, you might use the average expression in normal samples as your reference. Or, if you are comparing liver vs kidney samples, you could have log2FC(kidney/liver) or log2FC(liver/kidney). It really depends on the question you want to answer, but basically log2FC compares one condition vs another.
Hope this makes sense!
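For anyone who wants to see the calculation spelled out, here is a minimal sketch of "log2FC for each sample relative to a reference condition" as described in the reply above. It is only an illustration under stated assumptions: the gene names, sample names and CPM values are made up, the reference is taken to be the mean of the normal samples, and the pseudocount of 1 is a common convention rather than something from the video.

```python
import math
from statistics import mean

# Hypothetical CPM values (made-up numbers): gene -> sample -> expression.
cpm = {
    "geneA": {"normal_1": 13.0, "normal_2": 17.0, "tumour_1": 63.0, "tumour_2": 31.0},
    "geneB": {"normal_1": 29.0, "normal_2": 33.0, "tumour_1": 7.0, "tumour_2": 15.0},
}

# Assumption: the reference condition is the average of the normal samples.
reference_samples = ["normal_1", "normal_2"]

# Assumption: a pseudocount of 1 is added so that zero counts do not give log2(0).
PSEUDOCOUNT = 1.0


def log2fc_vs_reference(expression, reference, pseudocount=PSEUDOCOUNT):
    """Return log2((sample + pseudocount) / (reference mean + pseudocount)) per gene and sample."""
    result = {}
    for gene, values in expression.items():
        ref_mean = mean(values[s] for s in reference)
        result[gene] = {
            sample: math.log2((value + pseudocount) / (ref_mean + pseudocount))
            for sample, value in values.items()
        }
    return result


log2fc = log2fc_vs_reference(cpm, reference_samples)

for gene, values in log2fc.items():
    print(gene, {sample: round(v, 2) for sample, v in values.items()})
```

Positive values mean the gene is higher in that sample than in the reference, negative values mean it is lower, and the reference samples themselves end up close to 0. Swapping the reference (for example liver instead of kidney) flips the sign, which is why the choice depends on the question being asked.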
@@biostatsquid Would you please make a video on how to calculate the log2FC for each sample? I had the same question as him.
Also, what causes unexpected groupings in the heatmap? Is it the presence of false positives or false negatives in the dataset?
Thank you
I agree, and your question wasn't answered. She describes the data in the table as a fold change (relative data), yet she is showing the separate absolute data points for experimental and control (the two data points you need to find the fold change), so I am confused as well. I think that was just a mistake.
It was a simple and great explanation.
Thank you so much!!
This video was very useful, thank you very much, I love it. I understand it perfectly now!
Nice!
Thank you!
Aaaand if you are more of a reader, here is the post where I explain how to interpret a heatmap for gene expression data: biostatsquid.com/heatmaps-simply-explained/
Amazing
I don't understand the rationale behind the specific clustering shown in the video. What is the basis for organizing the dendrograms in the manner shown versus some other way? As shown, to me it looks arbitrary. What am I missing?
Great video. But I have a question: how does the heatmap get that diagonal? Don't you usually have to do an operation on the data to get a heatmap like that? For my data I made the heatmap with and without Spearman correlation, and the diagonal only appears with the correlation; before that it just looks random.
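Hi, thanks for your comment! Yes, you can first compute pairwise correlations between samples, using for example Spearman correlation, and then plot that. Or you can get a similar grouping using the clustering arguments of pheatmap: set cluster_cols and/or cluster_rows to TRUE and choose a correlation-based distance. clustering_distance_rows / clustering_distance_cols = 'correlation' uses Pearson correlation; for Spearman you can pass a distance object instead, for example as.dist(1 - cor(mat, method = 'spearman')), where mat stands for your expression matrix. Note that clustering_method only sets the linkage used by hclust (e.g. 'complete'), not the correlation type. If you don't cluster your data you most likely won't see a pattern (the 'diagonal' you are referring to). Does this answer your question?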
@biostatsquid Yes it does, thank you so much for answering. I didn't know you could do Spearman through the heatmap's parameters. I use Python for my data analysis.
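Since Python came up, here is a rough sketch of the "compute pairwise correlations between samples first, then plot that" step from the reply above, using only the standard library. The sample names and values are invented for illustration, and the helper functions (average_ranks, pearson, spearman) are hypothetical names written for this example, not part of any plotting package. Spearman correlation is computed here as the Pearson correlation of the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant (non-zero variance)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values: one list per sample, same gene order in each.
samples = {
    "ctrl_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ctrl_2": [1.5, 2.5, 2.9, 4.8, 5.1],
    "treated_1": [5.0, 4.0, 3.0, 2.0, 1.0],
}

names = list(samples)
# Sample-by-sample correlation matrix: this is the table you would plot as a heatmap.
corr = {a: {b: spearman(samples[a], samples[b]) for b in names} for a in names}

for a in names:
    print(a, [round(corr[a][b], 2) for b in names])
```

The diagonal of this matrix is every sample compared with itself, so it is always 1, which is where the diagonal in a correlation heatmap comes from. Clustering then reorders rows and columns so that similar samples sit next to each other.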
Proteomics please
Hi! Thanks for your comment, great suggestion. I don't have that much experience analysing proteomics datasets, are there any specific topics in particular you would like me to cover?
nice explanation