Advanced NMR Observables: J-Couplings, NOEs, and RDCs¶

In this tutorial, we will dive deeper into three crucial NMR observables that provide distinct structural and dynamic information:

J-Couplings: Scalar couplings that depend on backbone dihedral angles.
NOEs: Nuclear Overhauser Effects providing short-range distance restraints.
RDCs: Residual Dipolar Couplings yielding long-range orientational information relative to an alignment tensor.

A new final section demonstrates how ensemble averaging changes the NOE picture when protein flexibility is taken into account.

1. Setup Environment & Load Structure¶

First, we'll install synth-nmr and load the Ubiquitin structure (1D3Z) using biotite.

In [ ]:

Copied!





!pip install -q synth-nmr biotite matplotlib

!wget -q https://files.rcsb.org/download/1D3Z.pdb -O ubiquitin.pdb

import biotite.structure.io as strucio
import matplotlib.pyplot as plt
import numpy as np

plt.rcParams.update(
    {"figure.dpi": 110, "font.size": 11, "axes.spines.top": False, "axes.spines.right": False}
)

# 1D3Z has 10 NMR conformers; use model 1 for single-structure analyses
stack = strucio.load_structure("ubiquitin.pdb")
structure = stack[0]
print(f"Loaded 1D3Z: {len(stack)} conformers — using model 1 for static analyses.")
!pip install -q synth-nmr biotite matplotlib

!wget -q https://files.rcsb.org/download/1D3Z.pdb -O ubiquitin.pdb

import biotite.structure.io as strucio
import matplotlib.pyplot as plt
import numpy as np

plt.rcParams.update(
    {"figure.dpi": 110, "font.size": 11, "axes.spines.top": False, "axes.spines.right": False}
)

# 1D3Z has 10 NMR conformers; use model 1 for single-structure analyses
stack = strucio.load_structure("ubiquitin.pdb")
structure = stack[0]
print(f"Loaded 1D3Z: {len(stack)} conformers — using model 1 for static analyses.")

2. J-Couplings: Probing Dihedral Angles¶

Scalar couplings ($^3J$) are mediated through chemical bonds. The $^3J_{HN-H\alpha}$ coupling strongly correlates with the backbone $\phi$ dihedral angle via the Karplus equation. By predicting these couplings, we can validate the secondary structure of our models. Let's calculate them for our Ubiquitin structure.

In [ ]:

Copied!





from synth_nmr.j_coupling import calculate_hn_ha_coupling

j_couplings = calculate_hn_ha_coupling(structure)

chain_a_couplings = j_couplings.get("A", {})
residues = sorted(chain_a_couplings.keys())
j_vals = [chain_a_couplings[r] for r in residues]

# Colour by secondary structure context (values around 4 Hz = helix; ~9 Hz = sheet)
j_colors = ["#e74c3c" if v < 5.5 else "#2980b9" if v > 7.5 else "#7f8c8d" for v in j_vals]

plt.figure(figsize=(11, 4))
plt.bar(residues, j_vals, color=j_colors, alpha=0.84, width=0.9)
plt.axhline(5.5, color="#e74c3c", ls="--", lw=1, alpha=0.5, label="\u03b1-helix threshold (~4 Hz)")
plt.axhline(7.5, color="#2980b9", ls="--", lw=1, alpha=0.5, label="\u03b2-sheet threshold (~9 Hz)")
plt.xlabel("Residue Number", fontsize=12)
plt.ylabel("\u00b3J(HN-H\u03b1) Coupling (Hz)", fontsize=12)
plt.title(
    "Predicted Scalar Couplings across Ubiquitin Backbone\n"
    "Red = helix-like, Blue = sheet-like, Grey = intermediate",
    fontsize=12,
)
plt.legend(fontsize=10)
plt.grid(axis="y", linestyle="--", alpha=0.4)
plt.tight_layout()
plt.show()
from synth_nmr.j_coupling import calculate_hn_ha_coupling

j_couplings = calculate_hn_ha_coupling(structure)

chain_a_couplings = j_couplings.get("A", {})
residues = sorted(chain_a_couplings.keys())
j_vals = [chain_a_couplings[r] for r in residues]

# Colour by secondary structure context (values around 4 Hz = helix; ~9 Hz = sheet)
j_colors = ["#e74c3c" if v < 5.5 else "#2980b9" if v > 7.5 else "#7f8c8d" for v in j_vals]

plt.figure(figsize=(11, 4))
plt.bar(residues, j_vals, color=j_colors, alpha=0.84, width=0.9)
plt.axhline(5.5, color="#e74c3c", ls="--", lw=1, alpha=0.5, label="\u03b1-helix threshold (~4 Hz)")
plt.axhline(7.5, color="#2980b9", ls="--", lw=1, alpha=0.5, label="\u03b2-sheet threshold (~9 Hz)")
plt.xlabel("Residue Number", fontsize=12)
plt.ylabel("\u00b3J(HN-H\u03b1) Coupling (Hz)", fontsize=12)
plt.title(
    "Predicted Scalar Couplings across Ubiquitin Backbone\n"
    "Red = helix-like, Blue = sheet-like, Grey = intermediate",
    fontsize=12,
)
plt.legend(fontsize=10)
plt.grid(axis="y", linestyle="--", alpha=0.4)
plt.tight_layout()
plt.show()

Values around 4 Hz typically indicate $\alpha$-helical regions, whereas larger values (~9 Hz) often correspond to $\beta$-sheets.

3. NOEs: Short-Range Distance Restraints¶

Nuclear Overhauser Effects (NOEs) provide distance information between protons. The intensity of an NOE signal is proportional to $r^{-6}$, meaning it drops off extremely rapidly with distance. The two panels below illustrate a subtle but important physical point:

Left: The number of proton pairs in each distance bin increases with distance (geometry: a larger shell has more volume). Most restraints come from pairs near the 5 Å cutoff simply because there are more of them at larger radii.
Right: The total NOE signal power ($\Sigma\, r^{-6}$) in each bin decreases steeply with distance. The closest contacts dominate the spectrum even though they are rarer, because $r^{-6}$ amplifies short distances enormously.

In [ ]:

Copied!





from synth_nmr.nmr import calculate_synthetic_noes

# calculate_synthetic_noes returns a list of dicts, each with keys:
# 'index_1', 'atom_name_1', 'res_name_1', 'index_2', 'atom_name_2',
# 'res_name_2', 'distance', 'upper_limit'
noes = calculate_synthetic_noes(structure, cutoff=5.0, buffer=0.5, exclude_intra_residue=True)

print(f"Generated {len(noes)} inter-residue synthetic NOE distance restraints.\n")

distances = np.array([n["distance"] for n in noes])
signal_r6 = distances**-6.0  # NOE intensity proportional to r^{-6}

bins = np.linspace(distances.min(), distances.max(), 31)
bin_centers = 0.5 * (bins[:-1] + bins[1:])

counts, _ = np.histogram(distances, bins=bins)
signal_sum, _ = np.histogram(distances, bins=bins, weights=signal_r6)

fig, axes = plt.subplots(1, 2, figsize=(13, 4.5))

# Left: pair count per distance bin (increases with distance — geometric effect)
ax = axes[0]
ax.bar(
    bin_centers,
    counts,
    width=(bins[1] - bins[0]) * 0.9,
    color="#16a085",
    alpha=0.82,
    edgecolor="white",
    lw=0.3,
)
ax.set_xlabel("Distance (\u00c5)", fontsize=12)
ax.set_ylabel("Number of NOE Pairs", fontsize=12)
ax.set_title(
    "Pair Count per Distance Bin\nIncreases with distance (geometric shell volume \u221d r\u00b2)",
    fontsize=11,
)
ax.grid(axis="y", ls="--", alpha=0.4)

# Right: cumulative r^{-6} signal power (decreases steeply — the real physics)
ax2 = axes[1]
ax2.bar(
    bin_centers,
    signal_sum,
    width=(bins[1] - bins[0]) * 0.9,
    color="#c0392b",
    alpha=0.82,
    edgecolor="white",
    lw=0.3,
)
ax2.set_xlabel("Distance (\u00c5)", fontsize=12)
ax2.set_ylabel("Cumulative NOE Signal Power  (\u03a3 r\u207b\u2076)", fontsize=12)
ax2.set_title(
    "NOE Signal Power per Distance Bin\n"
    "Steep falloff: short contacts completely dominate  (r\u207b\u2076 effect)",
    fontsize=11,
)
ax2.grid(axis="y", ls="--", alpha=0.4)

fig.suptitle(
    "The r\u207b\u2076 Effect: More Pairs at Long Range, but Short Contacts Drive the Signal",
    fontsize=13,
    fontweight="bold",
    y=1.01,
)
plt.tight_layout()
plt.show()

print("Sample NOE Restraints:")
for noe in noes[:5]:
    print(
        f"  {noe['res_name_1']}{noe['index_1']} {noe['atom_name_1']} <--> "
        f"{noe['res_name_2']}{noe['index_2']} {noe['atom_name_2']}: "
        f"Distance {noe['distance']:.2f} \u00c5 (Upper Limit: {noe['upper_limit']:.2f} \u00c5)"
    )
from synth_nmr.nmr import calculate_synthetic_noes

# calculate_synthetic_noes returns a list of dicts, each with keys:
# 'index_1', 'atom_name_1', 'res_name_1', 'index_2', 'atom_name_2',
# 'res_name_2', 'distance', 'upper_limit'
noes = calculate_synthetic_noes(structure, cutoff=5.0, buffer=0.5, exclude_intra_residue=True)

print(f"Generated {len(noes)} inter-residue synthetic NOE distance restraints.\n")

distances = np.array([n["distance"] for n in noes])
signal_r6 = distances**-6.0  # NOE intensity proportional to r^{-6}

bins = np.linspace(distances.min(), distances.max(), 31)
bin_centers = 0.5 * (bins[:-1] + bins[1:])

counts, _ = np.histogram(distances, bins=bins)
signal_sum, _ = np.histogram(distances, bins=bins, weights=signal_r6)

fig, axes = plt.subplots(1, 2, figsize=(13, 4.5))

# Left: pair count per distance bin (increases with distance — geometric effect)
ax = axes[0]
ax.bar(
    bin_centers,
    counts,
    width=(bins[1] - bins[0]) * 0.9,
    color="#16a085",
    alpha=0.82,
    edgecolor="white",
    lw=0.3,
)
ax.set_xlabel("Distance (\u00c5)", fontsize=12)
ax.set_ylabel("Number of NOE Pairs", fontsize=12)
ax.set_title(
    "Pair Count per Distance Bin\nIncreases with distance (geometric shell volume \u221d r\u00b2)",
    fontsize=11,
)
ax.grid(axis="y", ls="--", alpha=0.4)

# Right: cumulative r^{-6} signal power (decreases steeply — the real physics)
ax2 = axes[1]
ax2.bar(
    bin_centers,
    signal_sum,
    width=(bins[1] - bins[0]) * 0.9,
    color="#c0392b",
    alpha=0.82,
    edgecolor="white",
    lw=0.3,
)
ax2.set_xlabel("Distance (\u00c5)", fontsize=12)
ax2.set_ylabel("Cumulative NOE Signal Power  (\u03a3 r\u207b\u2076)", fontsize=12)
ax2.set_title(
    "NOE Signal Power per Distance Bin\n"
    "Steep falloff: short contacts completely dominate  (r\u207b\u2076 effect)",
    fontsize=11,
)
ax2.grid(axis="y", ls="--", alpha=0.4)

fig.suptitle(
    "The r\u207b\u2076 Effect: More Pairs at Long Range, but Short Contacts Drive the Signal",
    fontsize=13,
    fontweight="bold",
    y=1.01,
)
plt.tight_layout()
plt.show()

print("Sample NOE Restraints:")
for noe in noes[:5]:
    print(
        f"  {noe['res_name_1']}{noe['index_1']} {noe['atom_name_1']} <--> "
        f"{noe['res_name_2']}{noe['index_2']} {noe['atom_name_2']}: "
        f"Distance {noe['distance']:.2f} \u00c5 (Upper Limit: {noe['upper_limit']:.2f} \u00c5)"
    )

4. RDCs: Long-Range Orientational Restraints¶

Residual Dipolar Couplings (RDCs) measure the angle of a bond vector (like N-H) relative to a global alignment tensor. While NOEs give short-range distances, RDCs provide global orientational information, which makes them highly complementary.

We use two parameters to define the alignment tensor's magnitude and asymmetry:

Da: The axial component of the alignment tensor.
R: The rhombicity, defining deviation from axial symmetry.

In [ ]:

Copied!





from synth_nmr.rdc import calculate_rdcs

Da = 10.0
rhombicity = 0.2

rdcs = calculate_rdcs(structure, Da=Da, R=rhombicity)

res_rdc = sorted(rdcs.keys())
rdc_vals = [rdcs[r] for r in res_rdc]

plt.figure(figsize=(11, 4))
plt.bar(
    res_rdc,
    rdc_vals,
    color=["#e74c3c" if v > 0 else "#2980b9" for v in rdc_vals],
    alpha=0.82,
    width=0.9,
)
plt.axhline(0, color="black", lw=0.8, ls="-", alpha=0.4)
plt.xlabel("Residue Number", fontsize=12)
plt.ylabel("D_NH (Hz)", fontsize=12)
plt.title(
    f"Predicted N-H RDCs across Ubiquitin (Da = {Da} Hz, R = {rhombicity})\n"
    "Red = positive coupling, Blue = negative coupling",
    fontsize=12,
)
plt.grid(axis="y", ls="--", alpha=0.4)
plt.tight_layout()
plt.show()

print(f"Calculated {len(rdcs)} backbone N-H RDCs.")
from synth_nmr.rdc import calculate_rdcs

Da = 10.0
rhombicity = 0.2

rdcs = calculate_rdcs(structure, Da=Da, R=rhombicity)

res_rdc = sorted(rdcs.keys())
rdc_vals = [rdcs[r] for r in res_rdc]

plt.figure(figsize=(11, 4))
plt.bar(
    res_rdc,
    rdc_vals,
    color=["#e74c3c" if v > 0 else "#2980b9" for v in rdc_vals],
    alpha=0.82,
    width=0.9,
)
plt.axhline(0, color="black", lw=0.8, ls="-", alpha=0.4)
plt.xlabel("Residue Number", fontsize=12)
plt.ylabel("D_NH (Hz)", fontsize=12)
plt.title(
    f"Predicted N-H RDCs across Ubiquitin (Da = {Da} Hz, R = {rhombicity})\n"
    "Red = positive coupling, Blue = negative coupling",
    fontsize=12,
)
plt.grid(axis="y", ls="--", alpha=0.4)
plt.tight_layout()
plt.show()

print(f"Calculated {len(rdcs)} backbone N-H RDCs.")

5. Ensemble Effects on NOEs: When Flexibility Changes the Picture¶

So far we have used a single static model (conformer 1 of 1D3Z). But Ubiquitin in solution is not static — it samples all 10 conformers and countless intermediate geometries. The NOE signal you observe in a NOESY spectrum is the cumulative average over all conformations:

$$r_\text{eff} = \langle r^{-6} \rangle^{-1/6}$$

Because of the $r^{-6}$ weighting, transient short contacts dominate. A pair of protons that is briefly at 2 Å in one conformer but at 5 Å in nine others will show a strong NOE — closer than the simple geometric mean (3.9 Å) would suggest.

The cells below compare the 10-conformer ensemble NOE distances with the single-model distances, revealing which residues are most affected by flexibility.

In [ ]:

Copied!





from synth_nmr.trajectory import (
    ensemble_average_noes,
    load_trajectory,
)

# Load all 10 conformers into a TrajectoryEnsemble
frames = [stack[i] for i in range(len(stack))]
noe_ensemble = load_trajectory(frames)
print(f"Loaded {len(noe_ensemble)}-conformer TrajectoryEnsemble for NOE analysis.")

# Build per-frame flat dicts: {(res_i, atom_i, res_j, atom_j): distance}
# Note: calculate_synthetic_noes returns a LIST of dicts (one per atom pair),
# not a nested dict — we build the flat dict by iterating that list.
print("Calculating NOEs for each conformer...")
per_frame_noes = []
for frame in noe_ensemble:
    raw = calculate_synthetic_noes(frame, cutoff=5.5, exclude_intra_residue=True)
    flat = {}
    for entry in raw:
        key = (entry["index_1"], entry["atom_name_1"], entry["index_2"], entry["atom_name_2"])
        flat[key] = entry["distance"]
    per_frame_noes.append(flat)

# r⁻⁶-averaged effective distances across the ensemble
eff_noes = ensemble_average_noes(per_frame_noes)

# Arithmetic mean distances (physically wrong for NOEs, shown for comparison)
r_arith = {}
for pair in eff_noes:
    vals = [d[pair] for d in per_frame_noes if pair in d]
    if len(vals) == len(noe_ensemble):
        r_arith[pair] = float(np.mean(vals))

common = [p for p in r_arith if p in eff_noes]
x = [r_arith[p] for p in common]  # arithmetic mean  (overestimates)
y = [eff_noes[p] for p in common]  # r⁻⁶ average   (physically correct)
gap = [a - e for a, e in zip(x, y)]

print(f"{len(common)} NOE pairs present in all {len(noe_ensemble)} conformers.")
from synth_nmr.trajectory import (
    ensemble_average_noes,
    load_trajectory,
)

# Load all 10 conformers into a TrajectoryEnsemble
frames = [stack[i] for i in range(len(stack))]
noe_ensemble = load_trajectory(frames)
print(f"Loaded {len(noe_ensemble)}-conformer TrajectoryEnsemble for NOE analysis.")

# Build per-frame flat dicts: {(res_i, atom_i, res_j, atom_j): distance}
# Note: calculate_synthetic_noes returns a LIST of dicts (one per atom pair),
# not a nested dict — we build the flat dict by iterating that list.
print("Calculating NOEs for each conformer...")
per_frame_noes = []
for frame in noe_ensemble:
    raw = calculate_synthetic_noes(frame, cutoff=5.5, exclude_intra_residue=True)
    flat = {}
    for entry in raw:
        key = (entry["index_1"], entry["atom_name_1"], entry["index_2"], entry["atom_name_2"])
        flat[key] = entry["distance"]
    per_frame_noes.append(flat)

# r⁻⁶-averaged effective distances across the ensemble
eff_noes = ensemble_average_noes(per_frame_noes)

# Arithmetic mean distances (physically wrong for NOEs, shown for comparison)
r_arith = {}
for pair in eff_noes:
    vals = [d[pair] for d in per_frame_noes if pair in d]
    if len(vals) == len(noe_ensemble):
        r_arith[pair] = float(np.mean(vals))

common = [p for p in r_arith if p in eff_noes]
x = [r_arith[p] for p in common]  # arithmetic mean  (overestimates)
y = [eff_noes[p] for p in common]  # r⁻⁶ average   (physically correct)
gap = [a - e for a, e in zip(x, y)]

print(f"{len(common)} NOE pairs present in all {len(noe_ensemble)} conformers.")

In [ ]:

Copied!





fig, axes = plt.subplots(1, 2, figsize=(13, 5))

# Left: scatter r_arith vs r_eff — all points at or below the diagonal
ax = axes[0]
sc = ax.scatter(x, y, c=gap, cmap="YlOrRd", s=18, alpha=0.65, edgecolors="none", vmin=0, zorder=3)
cb = plt.colorbar(sc, ax=ax, label="r_arith \u2212 r_eff  (\u00c5)")
hi = max(*x, *y) + 0.1
ax.plot([1.5, hi], [1.5, hi], "k--", lw=1.2, alpha=0.5, label="r_eff = r_arith (no difference)")
ax.set_xlabel(
    "Arithmetic Mean Distance  (\u00c5)\n(simple average \u2014 ignores r\u207b\u2076 physics)",
    fontsize=11,
)
ax.set_ylabel(
    "r\u207b\u2076-Averaged Distance r_eff  (\u00c5)\n(physically correct for NOEs)", fontsize=11
)
ax.set_title("NOE Averaging Error:\nAll points lie BELOW or ON the diagonal", fontsize=12)
ax.legend(fontsize=10)

# Right: per-residue maximum averaging error
# Keys are (res_i, atom_i, res_j, atom_j) — extract residue numbers
ax2 = axes[1]
res_gap = {}
for (ri, _ai, rj, _aj), g in zip(common, gap):
    res_gap[ri] = max(res_gap.get(ri, 0.0), g)
    res_gap[rj] = max(res_gap.get(rj, 0.0), g)

res_sorted = sorted(res_gap.keys())
max_gap = [res_gap[r] for r in res_sorted]
ax2.bar(res_sorted, max_gap, color="#c0392b", alpha=0.78, width=0.9)
ax2.set_xlabel("Residue Number", fontsize=12)
ax2.set_ylabel("Max. distance overestimate  (\u00c5)", fontsize=12)
ax2.set_title(
    "Per-Residue NOE Error from Simple Averaging\n"
    "Peaks = residues most affected by conformational flexibility",
    fontsize=12,
)
ax2.grid(axis="y", ls="--", alpha=0.35)

plt.tight_layout()
plt.show()

n_sig = sum(1 for g in gap if g > 0.1)
print(f"{n_sig} of {len(common)} NOE pairs: arithmetic mean overestimates by > 0.1 \u00c5")
worst_res = max(res_gap, key=res_gap.get)
print(f"Worst residue: {worst_res}  (max overestimate {res_gap[worst_res]:.3f} \u00c5)")
fig, axes = plt.subplots(1, 2, figsize=(13, 5))

# Left: scatter r_arith vs r_eff — all points at or below the diagonal
ax = axes[0]
sc = ax.scatter(x, y, c=gap, cmap="YlOrRd", s=18, alpha=0.65, edgecolors="none", vmin=0, zorder=3)
cb = plt.colorbar(sc, ax=ax, label="r_arith \u2212 r_eff  (\u00c5)")
hi = max(*x, *y) + 0.1
ax.plot([1.5, hi], [1.5, hi], "k--", lw=1.2, alpha=0.5, label="r_eff = r_arith (no difference)")
ax.set_xlabel(
    "Arithmetic Mean Distance  (\u00c5)\n(simple average \u2014 ignores r\u207b\u2076 physics)",
    fontsize=11,
)
ax.set_ylabel(
    "r\u207b\u2076-Averaged Distance r_eff  (\u00c5)\n(physically correct for NOEs)", fontsize=11
)
ax.set_title("NOE Averaging Error:\nAll points lie BELOW or ON the diagonal", fontsize=12)
ax.legend(fontsize=10)

# Right: per-residue maximum averaging error
# Keys are (res_i, atom_i, res_j, atom_j) — extract residue numbers
ax2 = axes[1]
res_gap = {}
for (ri, _ai, rj, _aj), g in zip(common, gap):
    res_gap[ri] = max(res_gap.get(ri, 0.0), g)
    res_gap[rj] = max(res_gap.get(rj, 0.0), g)

res_sorted = sorted(res_gap.keys())
max_gap = [res_gap[r] for r in res_sorted]
ax2.bar(res_sorted, max_gap, color="#c0392b", alpha=0.78, width=0.9)
ax2.set_xlabel("Residue Number", fontsize=12)
ax2.set_ylabel("Max. distance overestimate  (\u00c5)", fontsize=12)
ax2.set_title(
    "Per-Residue NOE Error from Simple Averaging\n"
    "Peaks = residues most affected by conformational flexibility",
    fontsize=12,
)
ax2.grid(axis="y", ls="--", alpha=0.35)

plt.tight_layout()
plt.show()

n_sig = sum(1 for g in gap if g > 0.1)
print(f"{n_sig} of {len(common)} NOE pairs: arithmetic mean overestimates by > 0.1 \u00c5")
worst_res = max(res_gap, key=res_gap.get)
print(f"Worst residue: {worst_res}  (max overestimate {res_gap[worst_res]:.3f} \u00c5)")

Summary¶

By combining Chemical Shifts (secondary structure), Relaxation Rates (dynamics), J-Couplings (dihedrals), NOEs (distances), and RDCs (orientations), synth-nmr allows you to construct a comprehensive profile of how your theoretical 3D model would behave in an NMR magnet!

The NOE section above highlights two crucial lessons:

Pair count vs. signal power: more pairs exist at larger distances (geometry), but the $r^{-6}$ weighting means the spectrum is dominated by the closest contacts.
Ensemble averaging matters: using the physically correct $r^{-6}$ average instead of a simple geometric mean is essential when the protein is flexible.