Phylogenetic Analysis of ITS sequences in R

  • Published: Jan 4, 2025

Comments • 68

  • @shayan9882 3 years ago +2

    I can't believe how much easier this was in comparison to my attempts with the msa package, thank you !

  • @joydeepnag885 1 year ago

    Thank you very much!!! If only people kept it short and sweet such as this.. Kudos... :)

  • @marksanda9786 4 years ago +2

    Thank you Russ, this made my day.

  • @TomasDuqueAcosta 2 years ago

    Thanks for the detailed explanation. I was able to create a tree really easily with your video.

  • @ruddhidavidwans292 3 years ago +1

    Thank you so much for this detailed video. It helped me a lot with my analysis. It would be of great help if you could also show how to analyze publicly available RNA-seq data from NCBI GEO.

  • @SvaraMandira 6 months ago

    Thanks for the detailed explanation. How do you run the bootstrap?

    • @RJG_Ecology 6 months ago

      You can use the msa package and a few others for the Bayesian and ML analyses you would usually see in software like MEGA. Here's an RPubs post that goes through the basic process:
      rpubs.com/mvillalobos/L01_Phylogeny

  • @Hekateras 3 years ago +2

    Very helpful guide. Question: Why use neighbor-joining instead of something like Maximum Likelihood to build your tree?

    • @RJG_Ecology 3 years ago +1

      I was just using the defaults for the tutorial. The packages offer several other methods that can be applied. I'm not sure whether this one supports Bayesian inference, but I agree that either that or maximum likelihood would probably be the better choice!

    • @nobodyreally1634 2 years ago +1

      Does anyone know how to build the tree using Maximum Likelihood instead of neighbor-joining?
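
      On the maximum-likelihood question in this thread, a minimal sketch of one possible approach, assuming the phangorn package is installed. It uses the Laurasiatherian alignment bundled with phangorn as stand-in data (not the ITS data from the video), starts from a neighbor-joining tree, and fits a GTR model; the model choice here is only an illustration, not a recommendation.

```r
library(ape)
library(phangorn)

# Stand-in alignment shipped with phangorn (assumption: swap in your own phyDat object)
data(Laurasiatherian)
aln <- Laurasiatherian

# Neighbor-joining tree used only as the starting topology
dm <- dist.ml(aln)
start_tree <- NJ(dm)

# Likelihood of the starting tree under the default model
fit <- pml(start_tree, data = aln)

# Optimise model parameters, branch lengths and topology (NNI rearrangements)
fit_ml <- optim.pml(fit, model = "GTR", rearrangement = "NNI",
                    control = pml.control(trace = 0))

# The optimised tree is stored inside the fitted object
ml_tree <- fit_ml$tree
plot(ml_tree, cex = 0.6)
```

      Comparing logLik(fit) with logLik(fit_ml) shows how much the optimisation improved the fit over the starting tree.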

  • @drali87 1 year ago

    How do we define the node values?

  • @michellecheng6549 1 year ago

    Thanks!! How can we group sequences into different colors based on their taxonomic group?

    • @RJG_Ecology 1 year ago

      Can you elaborate a bit more on what you want to do?
      In the meantime here is the ggtree documentation, it might have what you're looking for.
      4va.github.io/biodatasci/r-ggtree.html
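
      For the colouring question in this thread, one simple option is base ape plotting rather than ggtree. This is a sketch under assumptions: the tree, the tip names (sp1 to sp4), the group names, and the colours are all made up for illustration.

```r
library(ape)

# Made-up tree for illustration only
tre <- read.tree(text = "((sp1:1,sp2:1):1,(sp3:1,sp4:1):1);")

# Hypothetical lookup: tip label -> taxonomic group
groups <- c(sp1 = "GroupA", sp2 = "GroupA", sp3 = "GroupB", sp4 = "GroupB")

# Hypothetical lookup: taxonomic group -> colour
palette_by_group <- c(GroupA = "steelblue", GroupB = "firebrick")

# One colour per tip, in the same order as tre$tip.label
tip_cols <- unname(palette_by_group[groups[tre$tip.label]])

plot(tre, tip.color = tip_cols)
```

      Indexing by tre$tip.label keeps the colours aligned with the tips even if the lookup vector is in a different order.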

  • @nailagulzar4328 2 years ago

    Hi. It was easy. Thanks.
    Can you please provide some information on how to do diversity estimation using phylogenetic trees? (I don't have a count matrix. I only have sequences from HIV patients.) What R package can do that? Is there a GUI tool that can do diversity estimation and a statistical test (t-test)?

    • @RJG_Ecology 2 years ago

      There are tons of phylogenetics R packages. For tree building and visualization I would say the phylotools, phytools, ape, and ggtree packages are the most helpful. The RevGadgets package has a mix of everything; see their paper here:
      besjournals.onlinelibrary.wiley.com/doi/epdf/10.1111/2041-210X.13750

  • @dr.ahmedelaswad5453 4 years ago +1

    Great job! How do you get the values for the nodes?

    • @RJG_Ecology 4 years ago +1

      Hey Ahmed, you can find the node numbers within the phylo object (which I named "tre" in the tutorial) by using the function nodepath(). In this case you would run nodepath(tre), which lists the path of node numbers from the root to each tip: the root node first (the entire tree), then the secondary node where the first branches split off (in this case my three secondary nodes are 17, 19, and 20), and so on.

    • @dr.ahmedelaswad5453 4 years ago

      @RJG_Ecology Thank you very much, Russ.
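
      To make the nodepath() answer in this thread concrete, here is a small sketch using ape on a made-up four-tip tree (not the tree from the video). It prints the root-to-tip paths and draws the node numbers on the plot.

```r
library(ape)

# Small made-up tree purely for illustration
tre <- read.tree(text = "((A:1,B:1):1,(C:1,D:1):1);")

# One vector per tip: node numbers from the root down to that tip
paths <- nodepath(tre)
print(paths)

# In a phylo object tips are numbered 1..Ntip and the root is Ntip + 1
root <- Ntip(tre) + 1

# Show the same node numbers on the plotted tree
plot(tre)
nodelabels()
```

      Every path starts with the root number, so the second element of each path identifies which first-level clade a tip belongs to.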

  • @MrAraxon 4 years ago

    Great job!
    I am developing an algorithm in R to create phylogenetic trees and calculate values that interest me, such as homoplasy, CI, RI, etc. In 2019 I used a function called "matord", but I can't find it anymore. Specifically, I needed it to calculate two matrices for CI and RI.
    Is there any way to find out more about this function?
    The packages I used to build the phylogenetic trees and calculate homoplasy and distance are: phangorn, ape, ade4, graphics, and seqinr.
    Nicely explained! Thank you very much!

    • @RJG_Ecology 4 years ago

      Hey Nic, the function matord doesn't ring any bells for me... Do you know specifically what package it was from, or what the function does? If the purpose is, as the name suggests, to order a matrix, there are simple ways to do that in R, depending on how you're trying to order the values.
      There seems to be a custom object named "matord" within a function of the clusterSeq package, but that's about all I could find:
      rdrr.io/bioc/clusterSeq/src/R/associatePosteriors.R

    • @RJG_Ecology 4 years ago

      Also, there's this custom function
      gist.github.com/pedroj/1872314

    • @MrAraxon 4 years ago

      @RJG_Ecology I created this algorithm to test the relation between distance and homoplasy. The general idea is to look for the most central strain in a given group of strains. This is the strain that minimizes the average distance within a square distance matrix. Once the most central strain has been found, the other strains are sorted in order of increasing distance. By adding one strain at a time, an increasing number of strains comes into play. At each addition, the homoplasy and the average distance of the strains from the most central strain are calculated and plotted. This procedure allows careful consideration of the trend in homoplasy and distance, as well as the Rescaled Index.

    • @RJG_Ecology 4 years ago

      @MrAraxon Not sure if you've seen this package yet, but maybe it has some helpful functionality?
      www.ncbi.nlm.nih.gov/pmc/articles/PMC6412054/

  • @ticklishpineapples 2 years ago

    Do you have any suggestions for renaming the tip labels from GenBank accession numbers to genus names?

    • @RJG_Ecology 2 years ago

      Hi Pamela,
      Yes actually!
      The taxonomizr package (see tutorial here: cran.r-project.org/web/packages/taxonomizr/readme/README.html) has a two-step process for this purpose: the function "accessionToTaxa" converts accession numbers to taxonomic IDs, and then "getTaxonomy" converts taxonomic IDs to taxonomy. They have examples of how to do so in the link.
      Let me know if you run into any issues!
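
      Once accession numbers have been mapped to genus names by whatever means, swapping the tip labels is a lookup on the phylo object. The sketch below assumes such a mapping already exists; the accession-style labels (ACC0001 and so on) and genus names are placeholders, not real GenBank records.

```r
library(ape)

# Made-up tree whose tips carry placeholder accession-style labels
tre <- read.tree(text = "((ACC0001:1,ACC0002:1):1,ACC0003:2);")

# Hypothetical lookup table: accession -> genus
lookup <- c(ACC0001 = "GenusA", ACC0002 = "GenusB", ACC0003 = "GenusC")

# Translate labels in tip order and fail early if any accession is missing
new_labels <- lookup[tre$tip.label]
stopifnot(!anyNA(new_labels))

tre$tip.label <- unname(new_labels)
plot(tre)
```

      If several sequences share a genus, the new labels will repeat; wrapping them in make.unique() is one way to keep them distinct.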

  • @vasilikiskiada2332 4 years ago

    Nicely explained! Thank you

  • @margauxk952 4 years ago

    Great video! Do you have any recommendations of packages or code for MLST analysis in R?

    • @RJG_Ecology 4 years ago

      Hey Margaux, yes, there are two packages that are used for MLST in R:
      1) MLSTar
      bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2887-1
      github.com/iferres/MLSTar
      and
      2) STRAIN
      bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2887-1
      I'm not very familiar with them, but there are also
      3) mlstverse
      github.com/ymatsumoto/mlstverse
      and
      4) StrainR
      github.com/jbisanz/StrainR
      This blog may be helpful too:
      www.r-bloggers.com/2017/01/descriptive-analysis-of-mlst-data-for-mrsa/

  • @mariachalsev9219 2 years ago

    I keep getting this error :
    in f(p.profile[, anchors[2, n - 1]:anchors[1, n], drop = FALSE], :
    Alignment larger (16,317,302,694) than the maximum allowable size (2,147,483,647)
    Could you help me understand why? I've already tried DECIPHER in two different versions: 2.20.0 and 2.22.0

    • @RJG_Ecology 2 years ago

      Which line of code gives you this error, and with what data?

  • @siffo10 4 years ago

    Really good stuff. Is there an attr()-like function that will let me pull the geographic location of each sequence? Is that included in the metadata?

    • @siffo10 3 years ago

      @Joe Partington-Smith Geographic location.

  • @jimmychurchward890 2 years ago

    Such a big help thank you!!

  • @andreacassarino1342 4 years ago +1

    Nice job

  • @vasilikiskiada2332 4 years ago

    Hello, I would like to add bootstrap values in my tree. Any idea how to do that? Thank you.

    • @RJG_Ecology 4 years ago

      Hey Vasilik, yes!
      So bootstrap values need to be appended to the phylo object itself as node labels, and then drawn in ggtree with a node-label layer such as geom_nodelab(). The top answer on this Stack Overflow question addresses this in detail, with coded examples you can adapt:
      stackoverflow.com/questions/22749634/how-to-append-bootstrapped-values-of-clusters-tree-nodes-in-newick-format-in

    • @RJG_Ecology 4 years ago

      Check this person's response too:
      www.researchgate.net/post/SOLVED_How_do_you_export_bootstrap_node_support_in_Rs_ape_package

    • @vasilikiskiada2332 4 years ago

      @RJG_Ecology Thank you very much. I may have found a solution by calculating bootstrap values with boot.phylo() and assigning them to the phylo object with full_join(), but I will also take a look at the page you are suggesting.

    • @Andi-mg2eh 3 years ago

      @vasilikiskiada2332 Would you mind sharing your solution?
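
      The commenter's own solution was not posted. As a rough illustration of the boot.phylo() route mentioned in this thread, the sketch below uses only ape and its bundled woodmouse alignment as stand-in data. It stores the bootstrap counts as node labels and draws them with base plotting rather than ggtree; the neighbor-joining tree builder and B = 100 replicates are arbitrary choices for the example.

```r
library(ape)

# Stand-in DNAbin alignment bundled with ape (assumption: replace with your own)
data(woodmouse)

# The same tree-building function is used for the original and resampled data
build_tree <- function(x) nj(dist.dna(x))
tre <- build_tree(woodmouse)

# Bootstrap: number of replicates (out of B) that recover each internal node
set.seed(1)
bs <- boot.phylo(tre, woodmouse, build_tree, B = 100, quiet = TRUE)

# Store support values on the tree so they travel with it (e.g. via write.tree)
tre$node.label <- bs

plot(tre, cex = 0.7)
nodelabels(tre$node.label, cex = 0.6, frame = "none", adj = c(1.2, -0.3))
```

      Because the values live in tre$node.label, they are also written out when the tree is saved in Newick format.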

  • @Sunny-China3 3 years ago

    Hi sir, this was a really informative R function. Can I apply this function to tree data?

    • @RJG_Ecology 3 years ago

      By tree data, do you mean ".tre" files? You can just read those in with the read.tree function and combine them with merge_tree if they have common variables.

    • @Sunny-China3 3 years ago

      @RJG_Ecology Thank you for the reply, sir. Actually, my teacher told me to calculate phylogenetic diversity from tree data he sent me. I can use R, but it has left me more confused: I have been trying since last week and have not found a way to do it. If you could guide me on phylogenetic diversity, it would be much appreciated. Thank you, sir.
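
      As a starting point for the phylogenetic diversity question in this thread, here is a minimal sketch using only ape. It treats phylogenetic diversity as the total branch length of the subtree connecting a set of tips, which is one common simplification (it ignores the path back to the root of the full tree). The tree and the sampled tip set are made up; a real tree would be loaded with read.tree() from a file instead.

```r
library(ape)

# Made-up tree with branch lengths; a real one would come from read.tree("<your file>")
tre <- read.tree(text = "((A:1,B:2):1,(C:3,D:4):2);")

# Diversity of the whole tree: sum of all branch lengths
total_pd <- sum(tre$edge.length)

# Diversity of a hypothetical sample containing only some of the tips
sampled <- c("A", "B", "C")
sub_tre <- keep.tip(tre, sampled)
subset_pd <- sum(sub_tre$edge.length)

cat("Total branch length:", total_pd, "\n")
cat("Branch length spanning the sample:", subset_pd, "\n")
```

      Dedicated packages such as picante work from a community matrix and can include the root path, so their numbers may differ from this simplified version.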

  • @abubakarbashir7951 4 years ago

    Nice job, keep it up.

  • @Gayensubrata89 3 years ago

    I am getting this error: Error in gray(valgris[numclass]) : invalid gray level, must be in [0,1]. How can I solve that?

    • @RJG_Ecology 3 years ago

      What lines of code are you running when you get the error?

    • @Gayensubrata89 3 years ago

      @RJG_Ecology temp

    • @RJG_Ecology 3 years ago

      @Gayensubrata89 It looks like the ade4 "table.paint" function has been updated and the "cleg" argument removed. Just remove that argument and it should work, i.e.
      table.paint(temp, clabel.row=.4, clabel.col=.4)+
      scale_color_viridis()

    • @Gayensubrata89 3 years ago

      @RJG_Ecology No, it is not working. I am still getting the error:
      Error in gray(valgris[numclass]) : invalid gray level, must be in [0,1].

    • @RJG_Ecology 3 years ago

      @Gayensubrata89 Please show me the code you ran to get that error.

  • @DG-xg8vg 4 years ago

    Good job thanks for sharing!

  • @uguremre3287 2 years ago

    I got this error: could not find function "OrientNucleotides". Could you please help me, guys?

    • @RJG_Ecology 2 years ago

      Hey Ugur, the reason for this error is that you have not loaded the package that provides the function. In this case, the package is "DECIPHER". Make sure you have installed DECIPHER using:
      if (!requireNamespace("BiocManager", quietly = TRUE))
      install.packages("BiocManager")
      BiocManager::install("DECIPHER")
      and then open it using:
      library(DECIPHER)

    • @uguremre3287 2 years ago

      @RJG_Ecology Thank you for replying, Russ. But now I get a new error like this:
      in f(p.profile[, anchors[2, n - 1]:anchors[1, n], drop = FALSE], :
      Alignment larger (9,174,518,227) than the maximum allowable size (2,147,483,647).
      How can I fix it?

    • @RJG_Ecology 2 years ago

      @uguremre3287 The maximum allowable size for alignments with DECIPHER's AlignSeqs() is 2,147,483,647. Anything larger will need a different approach, such as FindSynteny() followed by AlignSynteny().

    • @uguremre3287 2 years ago

      @RJG_Ecology I tried to run AlignSynteny(), but I couldn't figure it out :(

    • @uguremre3287 2 years ago

      Error in AlignSynteny(apricot) :
      synteny must be an object of class 'Synteny'

  • @judithestherbairdlujano1493 4 years ago

    Thanks for the video, Russ! Does your Udemy course include how to run a phylogenetic analysis using the maximum likelihood method?

    • @RJG_Ecology 4 years ago +1

      It does not, but I think I may add this in the near future. If you're already a student of the course, you can add a question regarding this on the message board and I would be happy to post some code to walk you through it.

  • @ramshaazhar7338 3 years ago

    Can you please share this code?

    • @RJG_Ecology 3 years ago

      The link to the code and data is already in the description.

  • @archimedemulega7086 3 years ago

    Thanks!

  • @alexisjose7515 4 years ago

    Excellent!

  • @gembarry8280 4 years ago

    Your video is good; however, the poor visual quality makes it difficult to follow the R commands.

  • @agricultureenginner8852 3 years ago

    Thanks for sharing... nice job, but the link is not working (github.com/RussellGrayxd/Phylogenetics). Where can I find the code for RStudio?

    • @RJG_Ecology 3 years ago

      The link is working fine on my end. Check your browser and firewall settings; it could also be your connection. Can you access GitHub itself? github.com/