
💍 Macrocycle Design Lab: Engineering Cyclic Peptides
Objective: Explore how synth-pdb generates random cyclic peptides and visualize the "closure" of the macrocycle via physics-based minimization.
💊 Why Macrocycles?
Macrocycles (cyclic peptides) are the "Goldilocks" of drug discovery. They are larger than small molecules but smaller than proteins, allowing them to bind to difficult targets while remaining stable in the body. They offer three key advantages over linear peptides:
- Improve Stability: Protect the peptide from degradation by proteases in the body.
- Increase Binding Affinity: Reduce the "entropy penalty" of binding by pre-shaping the peptide to match its target.
- Cross Membranes: Many cyclic peptides (like Cyclosporine A) can enter cells more easily than linear ones.
Training AI models (such as AlphaFold 3) on cyclic peptides is difficult because cyclic peptides are rare in the Protein Data Bank (PDB). synth-pdb lets you generate millions of "Correctly Closed" macrocycles to train more robust models.
Two well-known examples:
- Cyclosporine A: A famous immunosuppressant that is an 11-mer cyclic peptide.
- Oxytocin: The "love hormone" is a 9-mer peptide whose ring is closed by a disulfide bridge rather than a head-to-tail bond.
🏗️ The Engineering Challenge
How do you "close the ring"? In this lab, we use Forcefield Minimization to pull the N-terminus and C-terminus together into a physically realistic bond.
# @title Setup & Installation { display-mode: "form" }
import os
import sys
from pathlib import Path

# Ensure the local synth_pdb source code is prioritized if running from the repo
try:
    current_path = Path(".").resolve()
    repo_root = current_path.parent.parent
    if (repo_root / "synth_pdb").exists():
        if str(repo_root) not in sys.path:
            sys.path.insert(0, str(repo_root))
            print(f"📌 Added local library to path: {repo_root}")
except Exception:
    pass

if 'google.colab' in str(get_ipython()):
    if not os.path.exists("installed.marker"):
        print("Running on Google Colab. Installing dependencies...")
        get_ipython().run_line_magic('pip', 'install synth-pdb py3Dmol')
        with open("installed.marker", "w") as f:
            f.write("done")
        print("🔄 Installation complete. KERNEL RESTARTING AUTOMATICALLY...")
        print("⚠️ Please wait 10 seconds, then Run All Cells again.")
        # Kill the kernel so Colab restarts it with the new packages available
        os.kill(os.getpid(), 9)
    else:
        print("✅ Dependencies Ready.")
else:
    import synth_pdb
    print(f"✅ Running locally. Using synth-pdb version: {synth_pdb.__version__} from {synth_pdb.__file__}")

import numpy as np
import py3Dmol
from synth_pdb.generator import generate_pdb_content

def center_pdb(pdb_str):
    """Shift all ATOM coordinates so the bounding-box midpoint sits at the origin."""
    lines = pdb_str.splitlines()
    coords = []
    for line in lines:
        if line.startswith("ATOM"):
            coords.append([float(line[30:38]), float(line[38:46]), float(line[46:54])])
    if not coords:
        return pdb_str
    coords = np.array(coords)
    # Midpoint of the bounding box (not the mean of the atom positions)
    center = (coords.min(axis=0) + coords.max(axis=0)) / 2
    new_lines = []
    for line in lines:
        if line.startswith("ATOM"):
            x, y, z = float(line[30:38]) - center[0], float(line[38:46]) - center[1], float(line[46:54]) - center[2]
            # Rewrite only the coordinate columns (31-54) and keep the rest of the record
            new_lines.append(line[:30] + f"{x:>8.3f}{y:>8.3f}{z:>8.3f}" + line[54:])
        else:
            new_lines.append(line)
    return "\n".join(new_lines)

print("Libraries Loaded.")
1. Generating a Macrocycle
We use the cyclic=True flag to signal the generator to produce a head-to-tail bond. However, simply placing atoms in space isn't enough—the N-terminus and C-terminus might be far apart.
To solve this, we use Physics-Based Minimization (minimize_energy=True) powered by OpenMM. This pulls the termini together into a physically plausible bond.
sequence = "TRP-SER-GLY-VAL-VAL-ASN-GLY-SER" # A random 8-mer
print("Generating Linear Control...")
linear_pdb = generate_pdb_content(sequence_str=sequence, cyclic=False, minimize_energy=True)
print("Generating Cyclic Macrocycle (Minimized)...")
cyclic_pdb = generate_pdb_content(sequence_str=sequence, cyclic=True, minimize_energy=True)
print("Generation Complete.")
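One way to check whether minimization actually closed the ring is to measure the distance between the backbone N of the first residue and the backbone C of the last residue; a peptide bond is roughly 1.33 Å long. The helper below is a minimal sketch, not part of synth-pdb: `closure_distance` is a hypothetical name, and it assumes a single chain written as fixed-column ATOM records.

```python
import math

def closure_distance(pdb_str):
    """Distance in Å between N of the first residue and C of the last residue."""
    backbone = {}  # (residue number, atom name) -> (x, y, z)
    for line in pdb_str.splitlines():
        if not line.startswith("ATOM"):
            continue
        atom_name = line[12:16].strip()
        if atom_name not in ("N", "C"):
            continue
        res_num = int(line[22:26])
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        backbone[(res_num, atom_name)] = xyz
    res_nums = sorted({num for num, _ in backbone})
    if not res_nums:
        raise ValueError("no backbone N or C atoms found")
    n_first = backbone[(res_nums[0], "N")]
    c_last = backbone[(res_nums[-1], "C")]
    return math.dist(n_first, c_last)
```

Calling `closure_distance(linear_pdb)` and `closure_distance(cyclic_pdb)` side by side gives a quick numeric comparison of the two structures generated above.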
2. Visual Comparison: Linear vs. Cyclic
Let's visualize both structures and compare their global topology. In the cyclic version, you should see a continuous loop where the first and last residues are bonded, while the linear peptide remains a flexible string.
def view_structures(pdb1, title1, pdb2, title2):
    view = py3Dmol.view(width=800, height=400, linked=False, viewergrid=(1, 2))
    view.setBackgroundColor("#fdfdfd")
    # Centering proteins for a tighter view
    pdb1 = center_pdb(pdb1)
    pdb2 = center_pdb(pdb2)
    # Model 1
    view.addModel(pdb1, 'pdb', viewer=(0, 0))
    view.setStyle({'stick': {'radius': 0.15}, 'cartoon': {'color': 'spectrum'}}, viewer=(0, 0))
    view.addLabel(title1, {'position': {'x': 0, 'y': 20, 'z': 0}, 'backgroundColor': 'white', 'fontColor':'black'}, viewer=(0, 0))
    # Model 2
    view.addModel(pdb2, 'pdb', viewer=(0, 1))
    view.setStyle({'stick': {'radius': 0.15}, 'cartoon': {'color': 'spectrum'}}, viewer=(0, 1))
    view.addLabel(title2, {'position': {'x': 0, 'y': 20, 'z': 0}, 'backgroundColor': 'white', 'fontColor':'black'}, viewer=(0, 1))
    view.zoomTo()
    view.center()
    view.zoom(1.2)
    view.show()

view_structures(linear_pdb, "Linear Peptide", cyclic_pdb, "Cyclic Macrocycle")
3. Atomic Breakdown: The Closure Bond
Look at the CONECT records at the end of the PDB. This is how software knows the ring is closed.
In a linear peptide, the N-terminus carries extra hydrogens (or a capping group) and the C-terminus carries a terminal oxygen (OXT). In a cyclic peptide, these are replaced by a standard peptide bond (C-N).
Notice in the PDB output of the cyclic peptide that residue 1 is bonded to residue 8.
print("--- Cyclic PDB Footer (CONECT records for loop closure) ---")
lines = cyclic_pdb.splitlines()
conect_lines = [line for line in lines if line.startswith("CONECT")]
# Show only the last three CONECT records
for line in conect_lines[-3:]:
    print(line)
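Reading CONECT records by eye gets tedious for larger peptides. The sketch below pairs each CONECT serial number with its ATOM record and reports only the bonds that join the first residue to the last one. The function names are hypothetical helpers, not synth-pdb API, and the parsing assumes the standard PDB layout in which CONECT serials occupy 5-character fields starting at column 7.

```python
def parse_conect(pdb_str):
    """Return a set of bonded atom-serial pairs from CONECT records."""
    bonds = set()
    for line in pdb_str.splitlines():
        if not line.startswith("CONECT"):
            continue
        # Serial numbers sit in 5-character fields starting at column 7.
        fields = [line[i:i + 5].strip() for i in range(6, len(line), 5)]
        serials = [int(f) for f in fields if f]
        for target in serials[1:]:
            bonds.add(frozenset((serials[0], target)))
    return bonds

def atom_residues(pdb_str):
    """Map atom serial -> (residue number, atom name) from ATOM records."""
    table = {}
    for line in pdb_str.splitlines():
        if line.startswith("ATOM"):
            table[int(line[6:11])] = (int(line[22:26]), line[12:16].strip())
    return table

def ring_closing_bonds(pdb_str):
    """List bonds that connect the first residue to the last residue."""
    table = atom_residues(pdb_str)
    if not table:
        return []
    res_nums = [num for num, _ in table.values()]
    first, last = min(res_nums), max(res_nums)
    if first == last:
        return []  # a single residue cannot form a head-to-tail bond
    closing = []
    for bond in parse_conect(pdb_str):
        ends = [table[s] for s in bond if s in table]
        if len(ends) == 2 and {ends[0][0], ends[1][0]} == {first, last}:
            closing.append(tuple(sorted(ends)))
    return closing
```

For a peptide of three or more residues, a non-empty result from `ring_closing_bonds(cyclic_pdb)` would indicate that the file records a bond between the two termini; for a dipeptide the ordinary peptide bond would match as well, so the check is only meaningful on longer chains.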
4. Scaling the Lab: Random Macrocycle Libraries
We can use this to generate a library of diverse macrocycles for ML datasets. By enabling minimization (--minimize; minimize_energy=True in the code below), we ensure every structure is a geometrically valid "negative" or "positive" sample for a design model.
# Generate 3 random macrocycles
results = []
for i in range(3):
    print(f"Generating Macrocycle {i+1}...")
    p = generate_pdb_content(length=7, cyclic=True, minimize_energy=True)
    results.append(p)
print("✅ Generated library of 3 random macrocycles.")
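Random generation does not by itself guarantee that every entry in the library is distinct. A simple way to check is to read each sequence back from the CA atoms and compare. For head-to-tail rings, rotations of the same sequence describe the same macrocycle, so the sketch below compares a canonical rotation. All three function names are hypothetical helpers, and the parsing assumes standard fixed-column ATOM records with one CA per residue.

```python
def sequence_from_pdb(pdb_str):
    """Return residue names in file order, read from CA atoms."""
    residues = []
    for line in pdb_str.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residues.append(line[17:20].strip())
    return "-".join(residues)

def canonical_cyclic(sequence):
    """Pick the lexicographically smallest rotation of a cyclic sequence."""
    parts = sequence.split("-")
    rotations = ["-".join(parts[i:] + parts[:i]) for i in range(len(parts))]
    return min(rotations)

def unique_macrocycles(pdb_list):
    """Keep the first structure seen for each distinct ring sequence."""
    seen = {}
    for pdb_str in pdb_list:
        key = canonical_cyclic(sequence_from_pdb(pdb_str))
        seen.setdefault(key, pdb_str)
    return seen
```

Comparing `len(unique_macrocycles(results))` with `len(results)` shows how many distinct rings the library contains. This treats the ring as directed, so a reversed sequence still counts as a different macrocycle.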
🏆 Next Steps
- Try creating a D-amino acid macrocycle by adding D- to your sequence (e.g., D-ALA-D-VAL). How does the chiral inversion affect the ring shape? 🧪💍
- Try changing the length or adding the --refine-clashes flag to see how it affects the density of the cyclic loop! 🚀
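To confirm that a D-residue really came out mirror-inverted, one option is the signed "chiral volume" at each Cα: the triple product of the CA→N, CA→C and CA→CB vectors. With this atom ordering the value is positive (about 2.5 Å³) for L-amino acids and negative for D-amino acids. The sketch below is a hypothetical helper, not synth-pdb API; it assumes standard atom names (N, CA, C, CB) in fixed-column ATOM records and skips residues without a CB, such as glycine.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def ca_chiral_volumes(pdb_str):
    """Map residue number -> signed volume at CA (positive expected for L)."""
    atoms = {}  # residue number -> {atom name: (x, y, z)}
    for line in pdb_str.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name in ("N", "CA", "C", "CB"):
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.setdefault(int(line[22:26]), {})[name] = xyz
    volumes = {}
    for res_num, a in atoms.items():
        if len(a) < 4:
            continue  # no CB (e.g. glycine) or incomplete backbone
        n = _sub(a["N"], a["CA"])
        c = _sub(a["C"], a["CA"])
        cb = _sub(a["CB"], a["CA"])
        volumes[res_num] = _dot(n, _cross(c, cb))
    return volumes
```

Residues with a negative value in `ca_chiral_volumes(pdb_str)` are the candidates for D-configuration.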