SHIFTX2 Integration
SHIFTX2 is a state-of-the-art chemical shift prediction tool that combines a
hybrid machine-learning / empirical physics approach to predict backbone and
side-chain NMR chemical shifts from 3D protein coordinates. When installed,
synth-nmr uses it automatically as the primary prediction engine, falling
back to its built-in SPARTA+ empirical model when SHIFTX2 is unavailable.
Why SHIFTX2? The Accuracy Hierarchy
Chemical shift prediction from structure is a hard problem. Three broad approaches exist, in increasing order of accuracy:
| Method | Basis | Typical Cα RMSD | Used in synth-nmr |
|---|---|---|---|
| Random coil + secondary structure offsets | Amino acid type + Ramachandran bin | ~2–3 ppm | Built-in fallback (SPARTA+ style) |
| Database / fragment mining | Nearest-neighbour search in known PDB+BMRB pairs | ~0.9–1.1 ppm | SPARTA+ (Han et al. 2011) |
| SHIFTX2 | Hybrid ML + physics (ensemble of random forests, SVMs, ANNs) | ~0.44 ppm Cα | Primary (auto-detected) |
The ~2× improvement in Cα RMSD between fragment-mining and SHIFTX2 is not merely academic. For structure validation, a 0.5 ppm artefact can mask or manufacture secondary chemical shift signatures. SHIFTX2's per-atom models also cover side-chain nuclei (Hβ, Cγ, etc.) that simpler methods typically approximate or omit entirely.
What makes SHIFTX2 different?
SHIFTX2 (Han et al., 2011) trains separate machine-learning models for each heavy-atom + proton type (HA, CA, CB, C', N, HN …) using a curated reference database of proteins with both known high-resolution structures and experimentally assigned chemical shifts (the BMRB). Each model is a committee of learners trained on descriptor features that encode:
- Backbone torsion angles (φ, ψ, χ₁, χ₂)
- Hydrogen bond geometry (donor/acceptor distance and angle)
- Ring current effects (explicit geometry, not just intensity factors)
- Solvent accessibility (SASA)
- Neighbouring residue identity (i±1, i±2)
This is qualitatively richer than the SPARTA+ implementation in synth-nmr, which
applies per-secondary-structure mean offsets and a point-dipole ring current model.
Automatic Detection and Fallback
predict_chemical_shifts() in synth-nmr follows this logic every time it is called:
┌─────────────────────────────────────────────────┐
│ predict_chemical_shifts(structure) │
│ │
│ 1. Locate 'shiftx2.py' in: │
│ a. System PATH │
│ b. SHIFTX2_DIR environment variable │
│ c. Typical locations (~/shiftx2/, /opt/...) │
│ │
│ 2. If found → run SHIFTX2 subprocess │
│ └─ non-empty output? → return it ✓ │
│ └─ empty / crash? → ⚠ log warning │
│ │
│ 3. Fallback: predict_empirical_shifts() │
│ (SPARTA+ offsets + ring current model) │
└─────────────────────────────────────────────────┘
This design means:
- Zero configuration required — if SHIFTX2 is not installed, everything still works, just with ~2× higher shift RMSDs.
- No silent failure — a
logging.WARNINGis emitted whenever the fallback fires, so the user is never unaware of which engine is running. - Crash-safe — if SHIFTX2 exits with a non-zero return code, the exception is caught and logged; the empirical model takes over.
Installing SHIFTX2
SHIFTX2 is free for non-commercial use. Two installation routes:
Option 1 — Direct download (recommended)
- Go to http://www.shiftx2.ca/download.html
- Download the version matching your OS (Linux/macOS).
- Unpack the archive and make the script executable:
- Configure the location (choose one):
# A. Add to PATH in your shell profile
export PATH="/path/to/shiftx2_dir:$PATH"
# B. OR set the SHIFTX2_DIR environment variable
export SHIFTX2_DIR="/path/to/shiftx2_dir"
# C. OR move it to a typical location that synth-nmr searches automatically
mkdir -p ~/shiftx2
mv shiftx2.py ~/shiftx2/
- Verify installation:
Option 2 — SBGrid (academic institutions)
If your institution is an SBGrid subscriber:
After installation, shiftx2.py will be available in the SBGrid PATH automatically.
Verifying the Integration
from synth_nmr.chemical_shifts import ShiftX2Predictor
predictor = ShiftX2Predictor()
print("SHIFTX2 available:", predictor.is_available())
print("Using executable at:", predictor.executable)
To verify end-to-end:
import logging
import biotite.structure.io as strucio
from synth_nmr import predict_chemical_shifts
logging.basicConfig(level=logging.INFO) # so you can see which engine is used
structure = strucio.load_structure("protein.pdb")
shifts = predict_chemical_shifts(structure)
The log output will say either:
or:
WARNING synth_nmr.chemical_shifts: SHIFTX2 executable not found. Falling back to empirical SPARTA+ model.
To use SHIFTX2, ensure it is in your PATH or set the SHIFTX2_DIR environment variable.
Using SHIFTX2 Directly
You can also call the ShiftX2Predictor class directly if you want the raw SHIFTX2
output without the automatic fallback:
from synth_nmr.chemical_shifts import ShiftX2Predictor
import biotite.structure.io as strucio
structure = strucio.load_structure("protein.pdb")
predictor = ShiftX2Predictor() # default: looks for 'shiftx2.py' in PATH
# predictor = ShiftX2Predictor(executable="/opt/shiftx2/shiftx2.py") # custom path
if predictor.is_available():
shifts = predictor.predict(structure)
# shifts: {"A": {1: {"CA": 52.3, "N": 122.1, ...}, 2: {...}, ...}}
else:
print("SHIFTX2 not installed — see docs for setup instructions")
The returned dictionary format is identical to predict_empirical_shifts(), so all
downstream analysis functions (calculate_csi(), ensemble_average_shifts(), etc.)
work unchanged.
Custom Executable Path
If shiftx2.py is not on your PATH or in a standard location, you can set the SHIFTX2_DIR environment variable to the directory containing shiftx2.py.
Alternatively, you can pass the path explicitly in code:
How synth-nmr Calls SHIFTX2
Under the hood, ShiftX2Predictor.predict() does exactly what you would do manually:
- Writes a temporary PDB from the biotite
AtomArrayto atempfile.TemporaryDirectory. - Runs
shiftx2.py -i input.pdbas a subprocess. - Parses the
.csCSV output file that SHIFTX2 writes next to the input file. - Returns a
{chain: {res_id: {atom: shift}}}dict and deletes the temp directory.
The subprocess output (stdout/stderr) is captured; a non-zero exit code raises
RuntimeError and triggers the fallback in predict_chemical_shifts().
Automated Test Coverage
The integration is tested without requiring SHIFTX2 to be installed, using pytest-mock
to simulate every branch:
| Test | What is verified |
|---|---|
test_shiftx2_is_available_mocked |
is_available() returns True when shutil.which finds the executable |
test_shiftx2_predict_not_available |
predict() raises RuntimeError when executable is missing |
test_shiftx2_predict_subprocess_error |
predict() raises RuntimeError on non-zero subprocess exit |
test_shiftx2_parse_output_missing_file |
_parse_output() raises FileNotFoundError for absent output |
test_shiftx2_parse_output_bad_format |
Malformed CSV lines are silently skipped |
test_predict_chemical_shifts_shiftx2_success |
predict_chemical_shifts() returns SHIFTX2 result when available |
test_predict_chemical_shifts_shiftx2_empty |
Falls back to SPARTA+ when SHIFTX2 returns empty dict |
test_predict_chemical_shifts_shiftx2_exception |
Falls back to SPARTA+ when predict() raises any exception |
test_shiftx2_predictor_predict_success |
Full subprocess + output parsing with realistic mocked CSV |
All tests pass without SHIFTX2 installed and run in CI on every pull request.
Reference
Han, B., Liu, Y., Ginzinger, S., & Wishart, D. (2011).
SHIFTX2: significantly improved protein chemical shift prediction.
Journal of Biomolecular NMR, 50(1), 43–57.
doi:10.1007/s10858-011-9478-4