Genes and geography -- a bioinformatics project

  • Published: Aug 25, 2024

Comments • 62

  • @Telemed911 2 years ago +31

    She is a great teacher of bioinformatics!!! - This from a retired professor of computational medicine and bioinformatics at Michigan...

  • @lilacspring2556 2 years ago +9

    I literally have an assignment on this that I have to work on today, you're a godsend!

  • @alessandrogagliardi7470 1 year ago +4

    I just started my master's degree in Computational Biology and these videos are kind of inspiring! Coming from an undergrad in Biotechnology, I have a lot of work to do and I hope I can reach good Bioinformatics skills in the next two years! Thank you again for the content

    • @francescosilvestro2092 1 year ago

      Where are you taking the master's degree? I'm a medical biotech student at Federico II, Naples, who has requested a traineeship in bioinformatics (translational genomics). As an autodidact, I'm learning PCA and multivariate analysis.

    • @alessandrogagliardi7470 1 year ago +1

      @francescosilvestro2092 I'm taking the master's degree in Trento. It has very respectable research groups in the field.

  • @AbhinavSrivastava-xe7xi 11 months ago +2

    I'm a computer scientist, very fun to watch these. Will try it out

  • @Kitsune152 1 year ago +1

    I'm not a biologist, just here for the really cool bioinformatics videos you do! Thanks

  • @santiagomedina8585 2 years ago +5

    Wow, this was a really awesome video!! Especially for me, doing my PhD in pop-gene. Looking forward to more like this.

  • @patricioperez1985 2 years ago +1

    Thanks Maria, your content is really great!

  • @furkanmtorun 2 years ago

    Thanks a lot for the great video! I look forward to seeing more such content!

  • @alejandrogonzalesdezavala6930 2 years ago

    This was so satisfying to watch!

  • @balqeesmansour6692 1 year ago +2

    Very informative video, thanks a lot. Could you explain how you got the number of SNPs?
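
One common way to get such a number is to count the data records in the VCF, skipping the `#` header lines. The sketch below is only an illustration on an invented three-record VCF, not necessarily how the count in the video was obtained; the function name `count_variant_records` is hypothetical.

```python
import io

def count_variant_records(handle):
    """Count VCF data lines and, among them, records that look like SNPs."""
    total = 0
    snps = 0
    for line in handle:
        # Header and metadata lines start with '#'; skip them and blank lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4].split(",")
        total += 1
        # Treated as a SNP here if REF and every ALT allele are single bases.
        if len(ref) == 1 and all(len(a) == 1 and a in "ACGT" for a in alts):
            snps += 1
    return total, snps

# Invented toy VCF content, for illustration only.
example_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\trs1\tA\tG\t.\tPASS\t.\n"
    "1\t200\trs2\tAT\tA\t.\tPASS\t.\n"
    "1\t300\trs3\tC\tT,G\t.\tPASS\t.\n"
)
total_records, snp_records = count_variant_records(example_vcf)
print(total_records, snp_records)
```

For a compressed file, the same function can be given a handle from `gzip.open(path, "rt")`.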

  • @yasintopcu4042 2 years ago

    thanks! looking forward to seeing more

  • @latinadna 2 years ago

    thank you so much! super comprehensive

  • @BrenaCedraz 2 years ago

    God bless you, anyway you already are a goddess! Thank youuu

  • @leandronascimento5552 2 years ago

    Nice!! Thank you so much!

  • @angezoclanclounon1751 2 years ago

    Thanks a lot for this nice video.

  • @felipenunezvillena2141 1 year ago

    Dear Maria. Thanks for this video, I think it was very insightful for biologists like me on how we can control RNA-seq data based on subject genotype (i.e., when that info is not available through the metadata). After seeing the video, I was wondering why there isn't much research on applying dimensionality reduction techniques to Whole Exome Sequencing (WES) data. Wouldn't it also be interesting to try to stratify gene expression profiles based on potential disease-causing variants? I would love to hear your opinion on this subject. Cheers
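
For readers wondering what dimensionality reduction on genotype data looks like in practice, here is a minimal PCA sketch using only NumPy. It assumes a matrix of alternate-allele counts (0, 1, 2) with samples as rows and variants as columns; the toy matrix and the function name `pca_scores` are invented for illustration, and this is not the notebook from the video.

```python
import numpy as np

def pca_scores(genotypes, n_components=2):
    """Project samples onto the top principal components via SVD.

    genotypes: (n_samples, n_variants) array of alternate-allele counts.
    Returns the sample scores and the fraction of variance per component.
    """
    X = np.asarray(genotypes, dtype=float)
    X = X - X.mean(axis=0)  # center each variant column
    # Rows of Vt are the principal axes; U * S gives the sample scores.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Invented toy data: the first three samples resemble each other,
# and so do the last three.
toy = np.array([
    [0, 0, 2, 2, 0],
    [0, 1, 2, 2, 0],
    [0, 0, 2, 1, 0],
    [2, 2, 0, 0, 1],
    [2, 1, 0, 0, 2],
    [2, 2, 0, 1, 1],
])
scores, explained = pca_scores(toy)
print(scores[:, 0])  # first component separates the two groups by sign
print(explained)
```

The same function would accept exome-derived genotype counts, although real data would typically need filtering (for example, removing rare or missing variants) before the projection is meaningful.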

  • @nextgengenomics 2 years ago

    Very cool!

  • @frankr2007 7 months ago

    The file was too big for my VirtualBox Linux, any advice?

  • @user-rt3ms8vm5x 8 months ago

    How do I get a bioinformatics title for my final thesis?

  • @onatovonatovic526 2 years ago

    Thank you so much!

  • @beeryya 2 years ago

    Great video.

  • @indologyandindianhistory673 2 years ago +1

    Hi Maria! Great video. I have my own data in VCF format; is there a way I could plot it together with the rest of the data you've shown here? I look forward to any guidance or tips on how to do that.

    • @OMGenomics 2 years ago +1

      You could do the same, just swapping out the VCF for your own. Then in the Colab you could load them both and pd.concat them. Check the pandas documentation for more details.
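
A minimal sketch of the `pd.concat` step mentioned in this reply. It assumes each dataset has already been loaded into a DataFrame with samples as rows and variant IDs as columns; that layout, the sample names, and the variant IDs are all invented here and may differ from the Colab.

```python
import pandas as pd

# Hypothetical genotype tables (alternate-allele counts per sample).
reference = pd.DataFrame(
    {"rs1": [0, 1, 2], "rs2": [2, 2, 0], "rs3": [1, 0, 0]},
    index=["ref_sample_A", "ref_sample_B", "ref_sample_C"],
)
own = pd.DataFrame(
    {"rs1": [1], "rs3": [2], "rs9": [0]},
    index=["my_sample"],
)

# Stack the samples; join="inner" keeps only variants present in both tables,
# so no column ends up filled with NaN for one of the datasets.
combined = pd.concat([reference, own], axis=0, join="inner")
print(combined)
```

Using `join="inner"` is a design choice: the default outer join would keep every variant and fill the gaps with NaN, which most downstream plotting or PCA code would then have to handle.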

    • @indologyandindianhistory673 2 years ago

      @OMGenomics thanks Maria! I'll try it over the weekend. Will get back to you if I face any issues :)

  • @franciscoromogaray3076 7 months ago

    How long should it take to download? It's been a reaaaally long time and it's still loading

  • @praveenrathore315 2 years ago

    Hi Ma'am, this is a very important topic

  • @louisvalois3863 2 years ago +1

    Sorry, I'm an amateur researcher and I study and compare ancient samples and populations. I mainly use GEDMATCH and Mytrueancestry. Maybe you can tell me what data format the MTA uses in its database? Full BAM files downloaded from archives or their minified version?
    Because very strange results usually appear when comparing archaic and recent samples.
    Sorry if I asked a stupid question. I just want to get an answer to whether simple TXT file-based gene samples are suitable for scientific testing.
    The point is that I found the downloadable WGS database of Hungarian medieval rulers and I also want to perform higher-level tests and analyses with BAM files.

    • @OMGenomics 2 years ago +1

      I’m not actually familiar with MTA or its data format, but I just googled it, and it looks like it takes data from various services. Does that include 23andMe and/or Ancestry? In that case those would be SNP data, so you wouldn’t have full BAM files because there are no sequencing reads but rather just the SNP genotypes. You can get back and forth between these and a VCF by converting SNP rs IDs to their genomic locations, though I don’t know what tool to use for this off the top of my head…
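
To make the idea of converting rs IDs to genomic locations concrete, here is a small sketch. Everything in it is hypothetical: the lookup table, the rs IDs, the coordinates, and the function name `to_vcf_like_rows` are invented, and a real conversion would need a proper reference resource for positions and alleles rather than a hand-written dictionary.

```python
# Hypothetical lookup: rs ID -> (chromosome, position, REF allele, ALT allele).
# All values are invented for illustration.
rsid_lookup = {
    "rs0001": ("1", 1000, "A", "G"),
    "rs0002": ("2", 2000, "C", "T"),
}

# Hypothetical array-style genotype calls: rs ID -> the two observed alleles.
array_calls = {"rs0001": "AG", "rs0002": "TT", "rs0003": "CC"}

def to_vcf_like_rows(calls, lookup):
    """Turn rs ID genotype calls into (CHROM, POS, ID, REF, ALT, GT) tuples."""
    rows = []
    for rsid, alleles in calls.items():
        if rsid not in lookup:
            continue  # no known location for this rs ID, so it cannot be placed
        chrom, pos, ref, alt = lookup[rsid]
        if any(a not in (ref, alt) for a in alleles):
            continue  # allele matches neither REF nor ALT (e.g. a no-call)
        # Encode each allele as 0 (REF) or 1 (ALT), as in a VCF GT field.
        indices = sorted(0 if a == ref else 1 for a in alleles)
        rows.append((chrom, pos, rsid, ref, alt, "/".join(map(str, indices))))
    return sorted(rows, key=lambda row: (row[0], row[1]))

vcf_like = to_vcf_like_rows(array_calls, rsid_lookup)
for row in vcf_like:
    print("\t".join(map(str, row)))
```

The sketch silently drops anything it cannot place, which is the simplest policy; a real pipeline would likely log those cases and also deal with strand and genome-build differences.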

    • @louisvalois3863 2 years ago

      @OMGenomics Thank you very much for your reply, I really appreciate it. This matches what I guessed so far.
      In short, it is about the fact that, depending on the subscription, the MTA makes a certain number of archaic samples available to its subscribers. The maximum is 700 samples. Then I upload my 23andMe or FTDNA or MyHeritage raw data. And then I can compare myself to these 700 ancient people.
      But the problem is that with some people I can match up to 7 segments and 240 centimorgans, which I think is impossible with a person who lived 800 years ago. It's like being a first cousin of a person who lived 25 generations ago.
      Since I am not an IT specialist, I only assume that this contradiction is caused by the different data formats. So I think the matches seen in the MTA are not true
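
The intuition in this comment can be checked with simple arithmetic. The sketch below assumes that a parent and child share roughly 3,400 cM (a round figure used here as an assumption) and that the expected amount shared with a direct ancestor halves with each further generation. It is an expectation only: real inheritance happens in discrete segments, so the actual value for a distant ancestor is usually zero or one small segment.

```python
# Assumed round figure for parent-child sharing, in centimorgans.
PARENT_CHILD_CM = 3400.0

def expected_shared_cm(generations_back):
    """Expected cM shared with a direct ancestor this many generations back."""
    return PARENT_CHILD_CM / 2 ** (generations_back - 1)

for generations in (1, 2, 5, 10, 25):
    print(generations, round(expected_shared_cm(generations), 6))
```

Under these assumptions the expectation at 25 generations is a tiny fraction of a centimorgan, far below the 240 cM reported in the comment.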

    • @louisvalois3863 2 years ago

      @OMGenomics Or, for example, what you say was confirmed a few days ago, when King Béla III's mitochondrial DNA was given T2b2b1. It stayed that way for a couple of days until it was updated to H1b, which it actually was. So this company is really working with data that lacks essential genetic information

    • @OMGenomics 2 years ago +1

      Interesting! I asked the hive mind on Twitter, so I hope my extended network includes enough ancient DNA experts to help check your concerns.

    • @louisvalois3863 2 years ago

      @OMGenomics Thank you very much, it's very cool, I will be very interested in expert opinions

  • @samifawcett4246 2 years ago +1

    nice.

  • @zahraazkiar7209 2 months ago

    Hey, I can't open the 1000 Genomes VCF link that was provided! It says it can't connect.

    • @OMGenomics 2 months ago

      Hey! I just checked and it was working for me. Can you include the exact command you ran?

  • @MrKasshiff 2 years ago

    What software are you using for taking notes and writing Python scripts?

    • @OMGenomics 2 years ago +1

      VS Code; the longer name is Visual Studio Code

  • @alessiailas4929 2 years ago

    I got lost at the 2 min mark, because the link doesn't work for me :( do you know how I can fix that? it just gives me a blank page

    • @OMGenomics 2 years ago

      Which link? Btw everything you need is on the GitHub repo I linked in the description.

    • @kevinalexis9886 2 years ago

      You can download the VCF files directly from your Bash terminal. You'll just need to type it in manually as shown here at 3:30.
      Also, if you visit her repo you'll see she shared the commands there as well.

  • @lilacspring2556 2 years ago +1

    Would be helpful if the video was broken up into parts so we can click on the bit of the video we're actually interested in

    • @OMGenomics 2 years ago +6

      Yeah, I didn't have time to do that before, but I just finished adding those time points now. Enjoy!

    • @lilacspring2556 2 years ago

      @OMGenomics thanks so much!

  • @zhengyu2763 2 years ago

    👍👍👍👍

  • @elvisnnaemeka6722 2 years ago

    Please be my mentor.

  • @robertb2664 2 years ago

    What if your VCF contains variants where some samples have ./. genotypes (no calls)? The code you posted does not appear to work for this type of data. Any suggestions? Thanks

    • @OMGenomics 2 years ago

      Ah yes, handling missing data. You can assume they are 0/0 or exclude those loci or the samples entirely, depending on the consequences. If it’s only a minority of loci, excluding them might be best. Assuming 0/0 can be a good solution when they’re scattered across most loci and most samples.
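
The two options in this reply, assuming 0/0 or excluding affected loci, can be sketched as follows. The toy genotypes, variant names, and the helper `gt_to_alt_count` are invented for illustration and are not taken from the code posted with the video.

```python
def gt_to_alt_count(gt):
    """Number of non-reference alleles in a GT string, or None for a no-call."""
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None  # covers ./. as well as half-calls such as ./1
    return sum(1 for allele in alleles if allele != "0")

# Invented toy data: one list of per-sample GT strings for each locus.
loci = {
    "var1": ["0/0", "0/1", "1/1"],
    "var2": ["./.", "0/1", "0/0"],
    "var3": ["1|1", "0|0", "./."],
}

# Option 1: treat every no-call as homozygous reference (0/0).
assume_ref = {
    name: [gt_to_alt_count(gt) or 0 for gt in gts] for name, gts in loci.items()
}

# Option 2: keep only loci where every sample has a call.
complete_only = {
    name: [gt_to_alt_count(gt) for gt in gts]
    for name, gts in loci.items()
    if all(gt_to_alt_count(gt) is not None for gt in gts)
}

print(assume_ref)
print(complete_only)
```

Which option is appropriate depends on how the no-calls are distributed, as the reply notes; dropping whole samples would be a third variation on the same idea.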

    • @robertb2664 2 years ago

      @OMGenomics thanks, great video

  • @saharmosallam3449 2 years ago

    Hello, thanks for this interesting video. I want to learn bioinformatics; can I find any help here, my friends?

    • @islamsalah4314 1 year ago

      Yes, watch Maria's videos in order: 1) What is bioinformatics, 2) Getting started in bioinformatics, 3) Five steps ...

  • @aewe4239 2 years ago

    It would be awesome if you could exactly copy what you did in R into Python.

    • @OMGenomics 2 years ago

      What do you mean? Which thing I did in R?

    • @aewe4239 2 years ago

      @OMGenomics OMG thank you so much for your reply. I would like to tell you that I am a big fan of your OMGenomics show. I watched all of your R videos and the one called Plotting in R for Biologists is really helpful for beginners. If you have time I would appreciate it if you could teach us plotting in Python for biologists. I would also personally like to ask if you could release a video on how to deal with batch-effect correction in genomics data analysis. Thanks!!

    • @austinkunch710 2 years ago

      @aewe4239 w3schools has good intro Python stuff

    • @frangarcia1699 2 years ago

      @aewe4239 she is working in Python the whole time in this video.