
Part A (From Pranam)

  1. Sign up for HuggingFace (we will be using PepMLM: https://huggingface.co/ChatterjeeLab/PepMLM-650M)
  2. Find the amino acid sequence for SOD1 in UniProt (ID: P00441). When mutated, this protein can cause amyotrophic lateral sclerosis (ALS). In fact, the A4V mutation (alanine to valine at position 4) causes the most aggressive form of ALS, so make that change in the sequence. Note that A4V uses the numbering that omits the initiator methionine, so the changed residue is the fifth one in the UniProt sequence.

Sequence for SOD1 A4V:

MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ
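The substitution can also be done programmatically instead of by hand. The following is a minimal sketch, not part of the assignment: `apply_mutation` and its `offset` parameter are hypothetical names, and `offset=1` encodes the assumption that the A4V numbering skips the initiator methionine. The wild-type sequence is not downloaded here; it is reconstructed by reverting the mutation in the sequence above, which doubles as a sanity check.

```python
import re

# SOD1 A4V sequence exactly as written above (initiator Met included).
SOD1_A4V = (
    "MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)


def apply_mutation(sequence, mutation, offset=0):
    """Apply a point mutation such as 'A4V' to a one-letter protein sequence.

    offset is the number of leading residues that the mutation's numbering
    does not count (assumed to be 1 for SOD1 A4V: the initiator Met).
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Cannot parse mutation {mutation!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1 + offset  # convert to a 0-based string index
    if not 0 <= index < len(sequence):
        raise ValueError(f"Position {position} is outside the sequence")
    if sequence[index] != old:
        raise ValueError(
            f"Expected {old} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + new + sequence[index + 1:]


# Revert the mutation to get the wild type, then re-apply A4V.
wild_type = apply_mutation(SOD1_A4V, "V4A", offset=1)
mutant = apply_mutation(wild_type, "A4V", offset=1)
print(wild_type[:10], "->", mutant[:10])
```

The residue check raises an error if the numbering assumption is wrong, which is safer than silently editing the wrong position.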

  3. Enter your mutated SOD1 sequence into the PepMLM inference API and generate 4 peptides of length 12 amino acids. (Step 5 takes a while, so you can also just pick 1 or 2 peptides.)

image.png

Binder        Pseudo Perplexity
WRYPAAAAAHKE  6.9570456
WRVPAAAARWKX  8.47874258
WHYGPTAVEHKX  11.9239867
WRYGAAGAAWKK  7.52798743
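To decide which peptides to carry forward, the table can be sorted by pseudo-perplexity. This is a small sketch under two assumptions: that a lower pseudo-perplexity means the model is more confident in the binder, and that X stands for an unknown residue (the usual one-letter convention). The variable names are illustrative only.

```python
# Scores copied from the table above.
scores = {
    "WRYPAAAAAHKE": 6.9570456,
    "WRVPAAAARWKX": 8.47874258,
    "WHYGPTAVEHKX": 11.9239867,
    "WRYGAAGAAWKK": 7.52798743,
}

# The 20 standard amino acids in one-letter code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

# Sort from lowest (assumed best) to highest pseudo-perplexity.
ranked = sorted(scores.items(), key=lambda item: item[1])

for peptide, ppl in ranked:
    nonstandard = sorted(set(peptide) - STANDARD)
    note = f"  (non-standard residues: {', '.join(nonstandard)})" if nonstandard else ""
    print(f"{peptide}  {ppl:.3f}{note}")
```

Flagging peptides that contain X is useful because downstream tools may not handle ambiguous residues the same way.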
  4. Add this known SOD1-binding peptide to your list: FLYRWLPSRRGG [from -https://genesdev.csh

    Peptide 1 WRYPAAAAAHKE
    Peptide 2 WRVPAAAARWKX
    Peptide 3 WHYGPTAVEHKX
    Peptide 4 WRYGAAGAAWKK
    Peptide 5 FLYRWLPSRRGG
  5. Go to AlphaFold-Multimer (https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb). This is similar to what you did for homework last week, but for a protein-peptide complex.

    1. Set model_type: alphafold2_multimer_v3 (this model has been shown to recapitulate peptide-protein binding accurately: https://www.frontiersin.org/articles/10.3389/fbinf.2022.959160/full).
    2. Add your query sequence in the form SOD1Sequence:PeptideSequence.
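The query strings for all five peptides can be generated in one go and pasted into the notebook one at a time. This is a minimal sketch; the dictionary names are illustrative, and it only builds the colon-separated strings without checking whether the notebook accepts peptides containing X.

```python
# Mutated SOD1 sequence from Part A.
SOD1_A4V = (
    "MATKVVCVLKGDGPVQGIINFEQKESNGPVKVWGSIKGLTEGLHGFHVHEFGDNTAGCTSAGPHFNPLSRKHGGPKDEERHVGDLGNVTADKDGVADVSIEDSVISLSGDHCIIGRTLVVHEKADDLGKGGNEESTKTGNAGSRLACGVIGIAQ"
)

# Four PepMLM peptides plus the known SOD1-binding peptide.
peptides = {
    "Peptide 1": "WRYPAAAAAHKE",
    "Peptide 2": "WRVPAAAARWKX",
    "Peptide 3": "WHYGPTAVEHKX",
    "Peptide 4": "WRYGAAGAAWKK",
    "Peptide 5": "FLYRWLPSRRGG",
}

# A colon separates the two chains of the complex in the query sequence.
queries = {name: f"{SOD1_A4V}:{peptide}" for name, peptide in peptides.items()}

for name, query in queries.items():
    print(name)
    print(query)
    print()
```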

Peptide 1: WRYPAAAAAHKE

Predictions:

image.png

image.png

image.png

image.png

image.png

3D structure:

image.png

Plots:

image.png

image.png