How to build a protein structure prediction app in Python using ESMFold and Streamlit
- Published: Oct 1, 2024
- In this video, we'll build a web app for predicting the protein structure in Python. Briefly, ESMFold is used as the protein structure prediction engine while Streamlit is used as the web framework. This app will be built in a little over 60 lines of code.
🐙 Code github.com/dat...
🕹️ Demo app esmfold.stream...
App built using the Streamlit app starter kit
📖 Blog blog.streamlit...
📦 App template github.com/str...
Support my work:
👪 Join as Channel Member:
/ @dataprofessor
✉️ Newsletter newsletter.data...
📖 Join Medium to Read my Blogs / membership
☕ Buy me a coffee www.buymeacoff...
Recommended Resources
📚 Books kit.co/datapro...
😎 Taro (Tech Career Mentorship) www.jointaro.c...
📜 Google Data Analytics Professional Certificate click.linksyne...
🤔 Interview Query www.interviewq...
🖥️ Stock photos, graphics and videos used on this channel 1.envato.marke...
Subscribe:
🌟 Coding Professor / @codingprofessor
🌟 Data Professor www.youtube.co...
Disclaimer:
Recommended books and tools are affiliate links that give me a portion of sales at no cost to you, which contributes to improving this channel's content.
#datascience #machinelearning #dataprofessor
The API referenced is returning an authentication error. What could an alternative be?
Please do more Bioinformatics videos, your playlist is awesome!
Hi Sir. I'm watching your videos and they are very good, informative videos. Please guide me in my PhD work. My research is related to the prediction of anticancer peptides using deep learning. I want to extract features through natural language processing techniques. Please guide me on how to extract them via NLP. Thanks Sir
Hi, you need to get embeddings from a language model such as BERT, using its tokenizer:
# tokenizer, model, len_seq_limit and config.device are assumed to be defined already
sequence_w_spaces = ' '.join(list(sequence))
encoded_input = tokenizer(
    sequence_w_spaces,
    truncation=True,
    max_length=len_seq_limit,
    padding='max_length',
    return_tensors='pt').to(config.device)
output = model(**encoded_input)
# hidden state of the first token (the [CLS] token for BERT-style models) of the first sequence
output_hidden = output['last_hidden_state'][:, 0][0].detach().cpu().numpy()
output_hidden holds your embeddings.
@@ЛидияШишина-у5ч yes
It will help me a lot on my bioinformatics journey, thank you
Thank you for this great job. Definitely I'm going to use this in my next streamlit app
This is so interesting. Hope you can make a tutorial video on how to design new molecules based on deep learning. Thanks sir
Hello sir, thanks for the very informative video. I tried to use your app with a protein sequence from UniProt (P34998 · CRFR1_HUMAN), but it showed no result. Is there any restriction on the number of amino acids? Thank you
Hi, I think there's a limit of 400 amino acids
@@DataProfessor adding ", max_chars=(400)" to the code for the textarea could be helpful then. Besides that, great application
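To make the length limit discussed in this thread concrete, here is a minimal sketch of a pre-submission check in plain Python. The 400-residue cap is taken from the reply above and is not verified against the prediction service; the helper names `clean_sequence` and `check_sequence` are hypothetical and not part of the original app.

```python
# Hypothetical helpers; the 400-residue cap comes from the comment thread
# and should be treated as an assumption, not a documented limit.
MAX_RESIDUES = 400
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # 20 standard amino acids


def clean_sequence(raw: str) -> str:
    """Drop FASTA header lines and all whitespace, then upper-case the residues."""
    lines = [ln.strip() for ln in raw.splitlines() if not ln.strip().startswith(">")]
    return "".join("".join(lines).split()).upper()


def check_sequence(raw: str, max_residues: int = MAX_RESIDUES) -> tuple[bool, str]:
    """Return (True, cleaned_sequence) or (False, human-readable reason)."""
    seq = clean_sequence(raw)
    if not seq:
        return False, "No sequence provided."
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        return False, f"Unexpected characters: {''.join(bad)}"
    if len(seq) > max_residues:
        return False, f"Sequence has {len(seq)} residues; the limit is {max_residues}."
    return True, seq


ok, result = check_sequence(">example\nmkt ayi\n")
print(ok, result)
```

In a Streamlit app, the cap could also be set on the widget itself with `st.text_area(..., max_chars=400)`, as suggested above. A check like this one is still useful because `max_chars` counts every typed character rather than residues, and it can show the user a reason instead of an empty result.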
Hi, can you please make a video on fine-tuning protein large language models and getting sequence embeddings? Thank you for making all these videos, they are really helpful
How do I get streamlit? There is an error when I type in the code into Jupyter, it does not recognize "streamlit"...
Could you please make a video on developing a protein mutation prediction server like PolyPhen-2, MutationAssessor, or SNAP2?
Thanks, sir. I watched most of your videos. They helped a lot.
I would like to know how we can model interactions between drugs and proteins using ML/DL.
Typically that is done via molecular docking, but it can also be done via ML/DL. That requires representing proteins and drugs in a quantitative/qualitative form (or as images, graphs, etc.) that can then be used as input for training.
@@DataProfessor Great video, I’d really appreciate a video where you demonstrate this as well
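As an illustration of the reply above about turning proteins into a quantitative form, here is a minimal sketch of one of the simplest protein descriptors, amino acid composition. It is an assumed example and not the method used in the video; the function name `amino_acid_composition` is hypothetical, and the drug side would need its own descriptors, which are not shown here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard amino acids, fixed feature order


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid, in AMINO_ACIDS order."""
    counts = Counter(sequence.upper())
    # Only standard residues count toward the total; other symbols are ignored.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


features = amino_acid_composition("ACDA")
print(len(features), features[:3])
```

The result is a fixed-length vector of 20 numbers regardless of sequence length, which is what most classical ML models expect as input.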
Do you know of a way to add constraints to the prediction? Let's say I did crosslinking mass spec, so I know some residues need to be close. Can I add that to the prediction?
This is awesome. Thanks for sharing it.
Really want to make a protein ligand docking app like AutoDock Vina. Would you do that next?
Currently, the website is not working for query protein sequences.
Your videos are great. Can you make a video on how to get started with Snowpark (Snowflake's Python API)?
Thank you for sharing, Sir.
Excellent! Please select a known druggable enzyme and add an analysis function to estimate the Kd for a candidate set of pharmacophores.
Can we predict protein structure using graph data structures...any code available?
Thanks for sharing. Did you use random amino acid sequences or a known sequence?
Very valuable content!
Glad it was helpful!
@@DataProfessor Hi sir, I am a chapter lead with Omdena and we are doing a project using bioactivity data for breast cancer. I was wondering if you could be a guest speaker 🔊 in a couple of weeks.
@@DataProfessor It is based on the course you conducted here on YouTube a year ago.
@@Quantumvp Could you email me with further details?
Nice one dude, thanks for this!
Keep up the awesome work!!! I am surprised that you haven't researched "promo sm".