quality: Publication Visualization & Structure Quality

The synth_pdb.quality sub-package provides two complementary capabilities:

  1. Publication-ready plots: journal-standard figures for chemical shift correlations, Ramachandran analysis, and SAXS profiles.
  2. Structure quality classification: GNN-based and feature-based classifiers for scoring structural plausibility.

Optional Dependencies

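The plotting functions require matplotlib. The correlation plot additionally requires scipy. If matplotlib is absent, the plotting functions return None; if scipy is absent, the correlation plot logs an error and returns None. Neither case raises, so scripts keep running in environments where the optional plotting dependencies are not installed.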

pip install synth-pdb[viz]   # installs matplotlib + scipy
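
Because a missing dependency shows up as a None return value rather than an exception, a script may want to check for the optional packages up front. The following is a minimal sketch using only the standard library; the helper name missing_viz_dependencies is hypothetical and not part of synth_pdb.

```python
import importlib.util


def missing_viz_dependencies(names=("matplotlib", "scipy")):
    """Return the subset of `names` that cannot be imported.

    Hypothetical helper for illustration; not part of synth_pdb.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_viz_dependencies()
if missing:
    print("Plots will be skipped; missing:", ", ".join(missing))
```

The same idea applies after the call: checking `if fig is None` before using a returned figure avoids an AttributeError later in the script.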

Plotting Functions

Import directly from the sub-package:

from synth_pdb.quality import (
    apply_publication_style,
    save_publication_figure,
    plot_chemical_shift_correlation,
    plot_ramachandran_publication,
    plot_saxs_publication,
)

Or from the full module path:

from synth_pdb.quality.plots import plot_ramachandran_publication

apply_publication_style()

Sets journal-standard matplotlib rcParams globally: sans-serif fonts (Arial/Helvetica, falling back to DejaVu Sans), a 300 DPI save default, and consistent label and tick font sizes.

Side-effect scope

This modifies global rcParams. It intentionally does not set savefig.format; the format is always passed explicitly via save_publication_figure to avoid silently changing the output format of other figures in the same process.

save_publication_figure(fig, path, transparent=False)

Saves a figure at 300 DPI with bbox_inches="tight". The format is derived from the file extension (defaults to PDF when no extension is given).

save_publication_figure(fig, "output/ca_correlation.pdf")
save_publication_figure(fig, "output/figure1")        # → saves as .pdf
save_publication_figure(fig, "output/thumbnail.png")  # → saves as .png
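
The extension rule can be reproduced without matplotlib, which is handy for predicting the file name a call will produce. This sketch mirrors the logic described above; resolve_figure_path is a hypothetical name used only for illustration.

```python
import os


def resolve_figure_path(path: str) -> tuple[str, str]:
    """Return (final_path, format) following the extension rule above.

    Hypothetical helper for illustration; not part of synth_pdb.
    """
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    if not ext:
        # No extension: fall back to PDF and append it to the path.
        return path + ".pdf", "pdf"
    return path, ext


for candidate in ("output/ca_correlation.pdf", "output/figure1", "output/thumbnail.PNG"):
    print(candidate, "->", resolve_figure_path(candidate))
```

Note that the format is lower-cased for matplotlib while the path itself is left as written.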

plot_chemical_shift_correlation

fig = plot_chemical_shift_correlation(
    exp_data,           # dict[int, dict[str, float]]  residue → {atom: ppm}
    syn_data,           # dict[int, dict[str, float]]  (same structure)
    atom_type="CA",     # which nucleus to plot
    title=None,         # optional override; auto-generates R value in title
    output_path=None,   # if set, saves the figure
)

Generates a scatter plot of experimental vs synthetic shifts with a diagonal reference line. Annotates Pearson R and RMSD automatically.
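
To make the annotated statistics concrete, here is a dependency-free sketch of the alignment and the two numbers. It is an illustrative stand-in, not the library's implementation (which uses numpy and scipy); the function name and the shift values are made up for the example.

```python
import math


def aligned_shift_stats(exp_data, syn_data, atom_type="CA"):
    """Pair shifts by residue number and return (n, pearson_r, rmsd).

    Illustrative stand-in for the statistics the plot annotates;
    not the library's implementation.
    """
    # Keep only residues present in both datasets that carry this atom type.
    pairs = [
        (exp_data[res][atom_type], syn_data[res][atom_type])
        for res in sorted(exp_data.keys() & syn_data.keys())
        if atom_type in exp_data[res] and atom_type in syn_data[res]
    ]
    if len(pairs) < 2:
        raise ValueError("need at least two shared residues")

    xs, ys = zip(*pairs)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    pearson_r = cov / math.sqrt(var_x * var_y)
    rmsd = math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))
    return len(pairs), pearson_r, rmsd


# Made-up shifts (ppm) purely for illustration.
exp = {1: {"CA": 58.0}, 2: {"CA": 60.0}, 3: {"CA": 62.0, "CB": 30.0}, 4: {"CA": 55.0}}
syn = {1: {"CA": 58.5}, 2: {"CA": 60.5}, 3: {"CA": 62.5}, 5: {"CA": 50.0}}

n, r, rmsd = aligned_shift_stats(exp, syn)
print(f"n={n}  R={r:.3f}  RMSD={rmsd:.2f} ppm")
```

Residues 4 and 5 appear in only one dataset and are dropped. The constant 0.5 ppm offset leaves the correlation perfect while still contributing to the RMSD, which is why the plot reports both numbers.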

plot_ramachandran_publication

fig = plot_ramachandran_publication(
    phi,            # np.ndarray of φ angles in degrees
    psi,            # np.ndarray of ψ angles in degrees
    title="Ramachandran Plot",
    output_path=None,
)
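
Both arrays are expected in degrees. Tools such as biotite's dihedral_backbone return radians, with NaN where an angle is undefined at chain ends, so a preparation step is usually needed first. A plain-Python sketch of that step follows; prepare_dihedrals is a hypothetical helper, not part of synth_pdb.

```python
import math


def prepare_dihedrals(phi_rad, psi_rad):
    """Drop residues with an undefined angle and convert radians to degrees.

    Hypothetical helper for illustration; not part of synth_pdb.
    """
    return [
        (math.degrees(phi), math.degrees(psi))
        for phi, psi in zip(phi_rad, psi_rad)
        if not (math.isnan(phi) or math.isnan(psi))
    ]


# First residue has no phi, last residue has no psi.
phi_rad = [math.nan, -math.pi / 3, -2 * math.pi / 3]
psi_rad = [-math.pi / 4, -math.pi / 4, math.nan]
print(prepare_dihedrals(phi_rad, psi_rad))
```

With numpy arrays the same filtering is a boolean mask followed by np.degrees.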

Approximate region shading

The α-helical (blue) and β-strand (red) shaded regions are simplified rectangular approximations. They are not the probability-density contours from MolProbity or the Richardson Top8000 dataset. Use them for quick visual reference only; do not cite them as quantitative Ramachandran statistics.
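
To show how coarse the shaded regions are, the sketch below labels a (φ, ψ) pair using the same rectangle boundaries that appear in the plots.py source listing. The function is hypothetical and meant only as a visual-reference aid, with the same caveat as the shading itself.

```python
def approximate_region(phi: float, psi: float) -> str:
    """Label a (phi, psi) pair in degrees using the plot's shaded rectangles.

    Boundaries are read off the Rectangle patches in plots.py; the function
    itself is hypothetical and not part of synth_pdb.
    """
    # Alpha rectangle: anchored at (-180, -120), 150 wide, 180 tall.
    if -180 <= phi <= -30 and -120 <= psi <= 60:
        return "alpha"
    # Beta rectangles: (-180, 60) 135 x 120, plus the wraparound strip
    # at (-180, -180) 135 x 40.
    if -180 <= phi <= -45 and (60 <= psi <= 180 or -180 <= psi <= -140):
        return "beta"
    return "other"


for angles in [(-60, -45), (-120, 130), (-120, -170), (60, 45)]:
    print(angles, approximate_region(*angles))
```

A point on the shared ψ = 60 edge is reported as "alpha" here simply because that test runs first; the rectangles carry no probabilistic meaning.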

plot_saxs_publication

fig = plot_saxs_publication(
    q,              # np.ndarray: scattering vector (Å⁻¹)
    intensity,      # np.ndarray: I(q)
    rg=None,        # float | None: annotates Rg on the plot when provided
    output_path=None,
)

Plots I(q) on a log-linear scale, with I(q) on a logarithmic axis and q on a linear axis (standard for SAXS publications).


Full API Reference

plots

Publication-Quality Visualization Module for synth-pdb.

This module provides standardized plotting functions designed for academic journals (e.g., Nature, Journal of Molecular Biology, JACS). It focuses on high-DPI export, consistent typography, and scientifically accurate scales.

Functions

apply_publication_style()

Apply global matplotlib settings for publication-ready figures.

Source code in synth_pdb/quality/plots.py
def apply_publication_style() -> None:
    """Apply global matplotlib settings for publication-ready figures."""
    if not HAS_MATPLOTLIB or plt is None:
        return

    # Use a clean, professional style as a base
    plt.style.use("seaborn-v0_8-whitegrid")

    # Customize for academic journals.
    # NOTE: savefig.format is intentionally NOT set here; doing so would change
    # the default save format for every figure in the caller's process.  The
    # format is instead passed explicitly in save_publication_figure().
    params = {
        "font.family": "sans-serif",
        "font.sans-serif": ["Arial", "Helvetica", "DejaVu Sans"],
        "axes.labelsize": 10,
        "axes.titlesize": 11,
        "xtick.labelsize": 9,
        "ytick.labelsize": 9,
        "legend.fontsize": 9,
        "figure.titlesize": 12,
        "axes.linewidth": 1.0,
        "grid.alpha": 0.3,
        "savefig.dpi": 300,
        "savefig.bbox": "tight",
    }
    mpl.rcParams.update(params)

save_publication_figure(fig, path, transparent=False)

Save a figure with journal-standard defaults.

The output format is derived from the file extension (defaults to PDF when the path has no extension). The format is passed explicitly to fig.savefig so that this function never relies on - or mutates - the global savefig.format rcParam.

Source code in synth_pdb/quality/plots.py
def save_publication_figure(fig: Any, path: str, transparent: bool = False) -> None:
    """Save a figure with journal-standard defaults.

    The output format is derived from the file extension (defaults to PDF
    when the path has no extension).  The format is passed explicitly to
    ``fig.savefig`` so that this function never relies on - or mutates -
    the global ``savefig.format`` rcParam.
    """
    ext = os.path.splitext(path)[1].lower().lstrip(".")
    if not ext:
        path += ".pdf"
        ext = "pdf"

    fig.savefig(path, format=ext, dpi=300, transparent=transparent, bbox_inches="tight")
    logger.info(f"Publication figure saved to {path}")

plot_chemical_shift_correlation(exp_data, syn_data, atom_type='CA', title=None, output_path=None)

Generate a high-fidelity correlation plot for chemical shifts.

Source code in synth_pdb/quality/plots.py
def plot_chemical_shift_correlation(
    exp_data: dict[int, dict[str, float]],
    syn_data: dict[int, dict[str, float]],
    atom_type: str = "CA",
    title: str | None = None,
    output_path: str | None = None,
) -> Any:
    """Generate a high-fidelity correlation plot for chemical shifts."""
    if not HAS_MATPLOTLIB or plt is None:
        return None

    apply_publication_style()

    # Data alignment
    common_res = sorted(set(exp_data.keys()) & set(syn_data.keys()))
    x_list = []
    y_list = []
    for r in common_res:
        if atom_type in exp_data[r] and atom_type in syn_data[r]:
            x_list.append(exp_data[r][atom_type])
            y_list.append(syn_data[r][atom_type])

    x = np.array(x_list)
    y = np.array(y_list)

    if len(x) < 2:
        logger.warning(f"Insufficient data for {atom_type} correlation plot.")
        return None

    # Calculate statistics (requires scipy, checked at import time)
    if not HAS_SCIPY:
        logger.error("scipy is required for correlation plots but is not installed.")
        return None

    r_val, _ = _pearsonr(x, y)
    rmsd = np.sqrt(np.mean((x - y) ** 2))

    fig, ax = plt.subplots(figsize=(4.5, 4))

    # Use a professional color (Teal for synth-pdb)
    ax.scatter(x, y, s=25, alpha=0.6, edgecolors="none", color="#008080", label=f"n={len(x)}")

    # Diagonal line - build axis limits as a tuple (required by matplotlib's type signature)
    all_data = np.concatenate([x, y])
    padding = (float(all_data.max()) - float(all_data.min())) * 0.05
    lims: tuple[float, float] = (float(all_data.min()) - padding, float(all_data.max()) + padding)
    ax.plot(list(lims), list(lims), "k--", alpha=0.4, linewidth=1, zorder=0)

    ax.set_aspect("equal")
    ax.set_xlim(lims)
    ax.set_ylim(lims)

    ax.set_xlabel(f"Experimental {atom_type} Shift (ppm)")
    ax.set_ylabel(f"Synthetic {atom_type} Shift (ppm)")

    if title:
        ax.set_title(title)
    else:
        ax.set_title(f"{atom_type} Correlation ($R = {r_val:.3f}$)")

    # Stats annotation
    stats_text = f"RMSD: {rmsd:.2f} ppm\nPearson R: {r_val:.3f}"
    ax.text(
        0.05,
        0.95,
        stats_text,
        transform=ax.transAxes,
        verticalalignment="top",
        fontsize=9,
        bbox={"boxstyle": "round", "facecolor": "white", "alpha": 0.8, "edgecolor": "none"},
    )

    plt.tight_layout()
    if output_path:
        save_publication_figure(fig, output_path)

    return fig

plot_ramachandran_publication(phi, psi, title='Ramachandran Plot', output_path=None)

Generate a publication-quality Ramachandran plot with favored regions.

Source code in synth_pdb/quality/plots.py
def plot_ramachandran_publication(
    phi: np.ndarray,
    psi: np.ndarray,
    title: str = "Ramachandran Plot",
    output_path: str | None = None,
) -> Any:
    """Generate a publication-quality Ramachandran plot with favored regions."""
    if not HAS_MATPLOTLIB or plt is None:
        return None

    apply_publication_style()
    fig, ax = plt.subplots(figsize=(4.5, 4.5))

    # Approximate favored-region shading for general (non-Gly, non-Pro) residues.
    # IMPORTANT: These rectangles are simplified heuristic boundaries, NOT the
    # probability-density contours from MolProbity or the Richardson Top8000 dataset.
    # They are suitable for quick visual reference but should not be cited as
    # quantitative Ramachandran statistics in a publication.
    # Alpha-helical region (approximate)
    ax.add_patch(plt.Rectangle((-180, -120), 150, 180, color="blue", alpha=0.08, zorder=0))
    # Beta-strand region (approximate)
    ax.add_patch(plt.Rectangle((-180, 60), 135, 120, color="red", alpha=0.08, zorder=0))
    # Beta wraparound (lower-left quadrant)
    ax.add_patch(plt.Rectangle((-180, -180), 135, 40, color="red", alpha=0.08, zorder=0))

    # Scatter points
    ax.scatter(phi, psi, s=20, alpha=0.7, edgecolors="black", linewidth=0.5, color="#e74c3c")

    ax.set_xlim(-180, 180)
    ax.set_ylim(-180, 180)
    ax.set_xticks([-180, -90, 0, 90, 180])
    ax.set_yticks([-180, -90, 0, 90, 180])

    ax.axhline(0, color="black", linewidth=0.8, alpha=0.4)
    ax.axvline(0, color="black", linewidth=0.8, alpha=0.4)

    ax.set_xlabel(r"$\phi$ (degrees)")
    ax.set_ylabel(r"$\psi$ (degrees)")
    ax.set_title(title)

    ax.grid(True, linestyle="--", alpha=0.3)

    plt.tight_layout()
    if output_path:
        save_publication_figure(fig, output_path)

    return fig

plot_saxs_publication(q, intensity, rg=None, output_path=None)

Standardized SAXS I(q) vs q plot for papers.

Source code in synth_pdb/quality/plots.py
def plot_saxs_publication(
    q: np.ndarray,
    intensity: np.ndarray,
    rg: float | None = None,
    output_path: str | None = None,
) -> Any:
    """Standardized SAXS I(q) vs q plot for papers."""
    if not HAS_MATPLOTLIB or plt is None:
        return None

    apply_publication_style()
    fig, ax = plt.subplots(figsize=(5, 4))

    # Use log scale for Y as is standard
    ax.semilogy(q, intensity, color="#2c3e50", linewidth=1.5, label="Synthetic $I(q)$")

    ax.set_xlabel(r"$q$ ($\AA^{-1}$)")
    ax.set_ylabel(r"$\log I(q)$")
    ax.set_title("Scattering Profile")

    if rg is not None:
        ax.text(
            0.95,
            0.95,
            rf"$R_g = {rg:.2f} \AA$",
            transform=ax.transAxes,
            verticalalignment="top",
            horizontalalignment="right",
            bbox={"boxstyle": "round", "facecolor": "white", "alpha": 0.8, "edgecolor": "none"},
        )

    ax.grid(True, which="both", linestyle="--", alpha=0.3)

    plt.tight_layout()
    if output_path:
        save_publication_figure(fig, output_path)

    return fig

Complete Workflow Example

import os

import numpy as np
import biotite.structure as struc
import biotite.structure.io.pdb as pdb_io

from synth_pdb.bmrb_api import BMRBAPI
from synth_pdb.chemical_shifts import predict_chemical_shifts
from synth_pdb.saxs import calculate_saxs_profile, calculate_radius_of_gyration
from synth_pdb.quality import (
    plot_chemical_shift_correlation,
    plot_ramachandran_publication,
    plot_saxs_publication,
)

# save_publication_figure does not create missing directories
os.makedirs("figures", exist_ok=True)

# Load structure
structure = pdb_io.PDBFile.read("examples/1D3Z.pdb").get_structure(model=1)

# Chemical shift correlation
exp_shifts = BMRBAPI.fetch_chemical_shifts("6457")
syn_shifts_full = predict_chemical_shifts(structure)
syn_shifts = list(syn_shifts_full.values())[0]   # first chain

fig = plot_chemical_shift_correlation(
    exp_shifts, syn_shifts, atom_type="CA",
    output_path="figures/ca_correlation.pdf",
)

# Ramachandran plot
phi, psi, _ = struc.dihedral_backbone(structure)
mask = ~np.isnan(phi) & ~np.isnan(psi)
plot_ramachandran_publication(
    np.degrees(phi[mask]), np.degrees(psi[mask]),
    output_path="figures/ramachandran.pdf",
)

# SAXS profile
q, intensity = calculate_saxs_profile(structure, q_max=0.3)
rg = calculate_radius_of_gyration(structure)
plot_saxs_publication(q, intensity, rg=rg, output_path="figures/saxs.pdf")