Skip to content

2d nmr plotting refactor#45

Draft
jkshenton wants to merge 18 commits intomasterfrom
2d-nmr-refactor
Draft

2d nmr plotting refactor#45
jkshenton wants to merge 18 commits intomasterfrom
2d-nmr-refactor

Conversation

@jkshenton
Copy link
Copy Markdown
Member

This pull request introduces a major overhaul to the NMR plotting CLI and its utilities, focusing on improving clarity, flexibility, and export functionality. The most important changes are the renaming and expansion of options for weighting peaks, enhanced heatmap and contour controls, improved documentation and defaults, and the addition of 2D contour data export features.

CLI Option Improvements and Renaming:

  • The --scale-marker-by option has been replaced by --weight-by, which now controls both heatmap/contour intensity and (optionally) marker size. The new option supports more metrics and is backward-compatible with the old name. Marker scaling is now controlled separately by --scale-markers/--fixed-markers. [1] [2] [3] [4] [5]
  • CLI argument names and defaults have been clarified and updated, such as --plot-type (was --plot_type), --rcut defaulting to None instead of 10, and improved help text for marker color, broadening, and legend display. [1] [2] [3] [4]

Heatmap and Contour Controls:

  • New options for controlling heatmap and contour rendering: --heatmap-levels, --contour-range, and improved help text for contour levels and linewidth. The default colormap is now 'bone_r'. [1] [2] [3] [4]
  • Heatmap and contour rendering are more customizable, with clearer defaults and descriptions, and new options are integrated into both CLI and plotting logic. [1] [2]

Export Functionality:

  • Added options to export 2D contour data in multiple formats (.spe, .sim, .npz, .csv, .json) via --export-file and --export-format. Larmor frequencies for axes can be specified for correct unit conversion and JSON export. [1] [2] [3] [4]
  • Export logic is implemented in the plotting script, inferring formats from file extensions and supporting multiple output files.

Documentation and Backward Compatibility:

  • CLI cookbook examples and documentation are updated to use new option names and reflect the improved behavior. [1] [2]
  • Legacy functions such as find_XHn_groups are now thin wrappers re-exporting new implementations, preserving backward compatibility.

Codebase Cleanup and Refactoring:

  • Imports are reorganized for clarity and reduced redundancy. [1] [2]
  • Obsolete variables and unused code are removed or replaced with clearer alternatives. [1] [2] [3]

Kane Shenton added 6 commits February 19, 2026 13:58
NMRData2D / utils (soprano/calculate/nmr/):
- Fix lorentzian(): actual code was identical to gaussian,
- Fix merge_peaks(): used in-place label assignment, mutating input Peak2D
  objects; replaced with dataclasses.replace()
- Fix merge_peaks(): when duplicate keys collide, accumulate multiplicity
  (sum) instead of silently dropping the incoming peak's count; removes
  the redundant unique_peaks list, iterating peak_map.values() instead
- Fix process_pairs()/filter_pairs_by_distance(): returned pair-expanded
  index arrays were overwriting the unique-element idx_x/idx_y
- Remove unreachable second `elif 'custom'` branch in get_correlation_strengths()
- Fix pairs_el_idx length checks to use len(self.pairs) not len(self.elements)
- Add force_recompute parameter to get_peaks()
- Fix _get_contour_data_for_backend(): ax.get_xlim()/ylim() was called
  before any data was plotted, returning matplotlib's default (0, 1);
  now reads directly from plot_settings.xlim/ylim
- Add _resolve_levels() helper and PlotSettings.contour_range field
  (default 10–100%, matching ssNake convention); wire through both
  matplotlib and Plotly backends
- Add Plotly backend (PlotlyBackend) as an alternative to matplotlib,
  producing interactive HTML figures via plotly.graph_objects

Degeneracy / multiplicity (soprano/calculate/nmr/utils.py, nmr.py):
- Add Peak2D.multiplicity: int = 1 field to carry per-peak degeneracy
- Add multiplicities parameter to generate_peaks(); populates
  multiplicity = multiplicities[idx_x] * multiplicities[idx_y] for each
  peak, so CH₃ (multiplicity=3) contributes 3× vs a single-site peak
- generate_contour_map() weights heatmap contribution by
  peak.correlation_strength * peak.multiplicity; xlims/ylims parameters
  removed — grid extent is always derived from peak positions, keeping
  display limits independent of grid computation
- Increase Lorentzian grid padding from 5× to 50× FWHM; at 5× the
  residual at the grid edge is ~1%, at 50× it is ~0.01%
- NMRData2D.get_peaks() extracts the 'multiplicity' array from the
  merged Atoms object (set by merge_tagged_sites) and passes it through
- NMRPlot2D marker sizes now use correlation_strength * multiplicity,
  so markers and heatmap are driven by the same quantity
- PlotSettings.scale_markers: bool = True; when False, all markers are
  drawn at max_marker_size regardless of weight

plotnmr CLI (soprano/scripts/nmr_plot.py, cli_utils.py):
- Replace wildcard import with explicit property imports
- Remove dead variables (combine_rule, sortby, sort_order)
- Fix fig/ax tuple unpack to handle Plotly path (returns None for ax)
- Add KeyError guard for missing reference species in 1D path
- Fix --plot_type → --plot-type (Click convention)
- Remove invalid -cmap multi-character short flag
- Fix rcut default 10 → None (was silently dropping long-range pairs)
- Fix marker_color default 'C0' → None (was preventing auto-colouring)
- Fix typo plot_showconnecors → plot_showconnectors
- Add --contour-range and --heatmap-levels CLI options
- Fix colormap default bone → bone_r
- Replace --scale-marker-by (single option doing two things) with:
  --weight-by (metric: fixed/distance/inversedistance/dipolar/jcoupling,
               affects heatmap intensity and optionally marker size)
  --scale-markers/--fixed-markers (bool: whether marker size tracks weight)

Refactor (soprano/nmr/extract.py):
- Move nmr_extract_multi, nmr_extract_atoms, build_nmr_df, get_ms_summary,
  get_efg_summary, label_atoms, tag_functional_groups, merge_tagged_sites,
  check_equivalent_sites_ms and NMR_COLUMN_ALIASES out of scripts/nmr.py
  into the new soprano.nmr.extract module so they can be used outside the CLI
- Fix tag_functional_groups/label_atoms to accept a logger parameter instead
  of depending on the module-level CLI logger
- Replace `from soprano.properties.nmr import *` with explicit imports
- Keep backwards-compatible re-exports in scripts/nmr.py
…options

User requested option to weight the peaks by the dipolar coupling squared
Add tutorials/tutorial_data/alanine.magres, a magres-abinitio v1.0 file
containing the crystal structure of alanine (H28C12N4O8 unit cell) with
magnetic shielding (ms) and electric field gradient (efg) tensors for all
atoms, for use in NMR tutorials.
Kane Shenton added 12 commits March 25, 2026 11:33
Computes the dipolar RSS for a selected set of atoms (sel_i) against
a specified set of neighbours (sel_j), including periodic images within
a cutoff. sel_j can optionally be expanded to symmetry-equivalent sites
('symmetry') or CIF-label-equivalent sites ('cif_labels').

- Requires explicit sel_i and sel_j (ValueError if either is None)
- expand_j='periodic_images' (default): use sel_j as-is
- expand_j='symmetry': expand via UniqueSites tags
- expand_j='cif_labels': expand to atoms sharing the same CIF label
- Emits a warning with neighbour info when no neighbours are within cutoff
- Exports added to soprano/properties/nmr/__init__.py
- Tests added to tests/nmr_tests.py
- Add `dipolar_rss` to `--weight-by` choices in PLOT_SPECIFIC_OPTIONS
- Add `--rss-cutoff` (float, default 5.0 Å) and `--rss-expand-j`
  (periodic_images | symmetry | cif_labels) click options
- Thread both parameters through `plotnmr()` into `NMRData2D`

Refactor DipolarRSSByAtom expand_j logic:
- Extract `_expand_sel_j_by_equiv()` module-level helper that delegates
  both 'symmetry' and 'cif_labels' expansion to `UniqueSites.get()`,
  removing ~25 lines of duplicated branching in `extract()`

Tests:
- Add `TestDQSQSpectrum` (6 tests, ethanol): verifies DQ/SQ peak
  positions, roof pairs, y-axis label, and self-pair removal
- Add `TestDQSQWithCIFLabels` (4 tests, EDIZUM, Z=4): verifies that
  cif_labels expansion gives identical RSS for all equivalent H site
  pairs and that intra-group coverage exceeds periodic_images only
…port

DipolarRSSByAtom / _expand_sel_j_by_equiv
- Simplify DipolarRSSByAtom.extract() to delegate sel_j expansion to the
  shared _expand_sel_j_by_equiv helper, removing ~30 lines of duplicated code
- Fix override_cif logic: symmetry mode now passes override_cif=True to
  UniqueSites so spglib grouping is used unconditionally, even when CIF labels
  are present; cif_labels mode continues to use the labels as before

NMRData2D — dipolar_rss correlation-strength metric
- Add dipolar_rss entry to MARKER_INFO (label "Dipolar RSS", unit kHz)
- Add rss_cutoff, rss_expand_j, reduce, symprec, and atoms_full parameters
- When reduce=True: call label_atoms → store atoms_full → call nmr_extract_atoms
  to collapse to the asymmetric unit; pairs index into the reduced atoms and
  RSS values are mapped back to atoms_full via CIF labels so that cif_labels/
  symmetry expansion finds all Z copies of each site
- get_correlation_strengths: implement dipolar_rss branch; when atoms_full is
  set and expand_j != 'periodic_images', map reduced-atom labels to the first
  matching index in atoms_full and call DipolarRSSByAtom there; warn when
  expand_j != 'periodic_images' but atoms_full is missing

nmr_plot.py — consistent reduce interface
- Pass reduce=reduce to NMRData2D so the CLI --reduce/--no-reduce flag and the
  Python API are equivalent
- When reduce=True, reload raw (unmerged) atoms from disk via
  reload_as_molecular_crystal before passing to NMRData2D, so atoms_full
  contains the full unit cell for RSS expansion

Docs
- Expand cli-cookbook.md with DQ/SQ dipolar_rss examples covering
  periodic_images, cif_labels, symmetry, heatmap/contour, and dynamic groups

Tests
- Fix test_peak_strength_matches_direct_rss: convert expected Hz → kHz
- Add test_symmetry_and_cif_labels_agree to TestDQSQWithCIFLabels (EDIZUM)
- Add TestNMRData2DReduce: 6 tests covering reduce=True + cif_labels
  (atom counts, pair validity, RSS correctness, reduce=True/False agreement,
  peak-count ordering)
- Add TestNMRData2DSymmetryExpand: 6 tests for expand_j='symmetry' on EDIZUM
  (Z=4, has CIF labels) and ethanol (Z=1, no CIF labels)
- Add TestPlotNMRCLI: 6 end-to-end CLI tests via Click's CliRunner covering
  basic 2D, --no-reduce, periodic_images, cif_labels, symmetry, and both
  reduce modes
- Add tests/test_data/alanine_manual_labels.magres test fixture
* extract plotting configuration and constants into a dedicated config module with validation
* separate 2D data preparation, plotting orchestration, and backend implementations
* move contour export I/O into a dedicated export module and keep compatibility delegation
* preserve backward-compatible imports via re-exports
* expose contour export helper at package level
* add focused export tests and CLI export integration coverage
- add grid_max setting to PlotSettings with validation

- scale generated contour/heatmap grid so max(Z) matches user target

- thread grid_max through NMRData2D.get_contour_data and export API

- add CLI option --grid-max and wire into plotnmr plotting and exports

- document usage in CLI cookbook with ssNake-oriented examples

- add export and CLI tests to verify scaling behavior end-to-end
- introduce PlotSettings.intensity_range as shared default window

- add optional contour_range and heatmap_range overrides with fallback

- wire new options through backends and plotnmr CLI

- add CLI flags --intensity-range and --heatmap-range

- update CLI cookbook and add PlotSettings range behavior test
When reduce=True, user-supplied `pairs` indices referred to the full
cell but were resolved against the reduced (asymmetric-unit) atoms,
silently producing wrong results or crashes.

Now NMRData2D automatically remaps full-cell pair indices to
reduced-cell indices using the mapping from nmr_extract_atoms, and
emits a warning showing the old → new mapping.

Also:
- Add return_index_map parameter to nmr_extract_atoms for retrieving
  the full-cell → reduced-cell index mapping
- Re-export Peak2D from soprano.calculate.nmr
- Fix unicode escape sequences in MARKER_INFO config
- Remove duplicate MPL_TO_PLOTLY constants from nmr.py (already in config.py)
- Add tests for index mapping and pairs remapping with reduce=True
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant