ramanlib.core

Core container primitives for RamanLib.

This module defines GroupedSpectralContainer, a thin wrapper around a pandas.DataFrame whose required "spectrum" column holds one ramanspy Spectrum object per row. All other columns are free-form metadata (strings, numbers, categories, etc.) that stay aligned to each spectrum.

The class provides a small, opinionated surface for:
  • Safe construction from lists of spectra plus metadata rows.

  • Conversion to a ramanspy SpectralContainer when axes match.

  • Simple grouped reductions (e.g., group-wise mean spectra and optional stats).

  • Convenience plotting hooks that defer to ramanlib.plot.

Notes

The design goal is to make typical dataset manipulations ergonomic while keeping the full power of pandas.DataFrame available via the .df attribute. For transformations beyond the light helpers here, operate directly on GroupedSpectralContainer.df and rebuild a container with GroupedSpectralContainer.from_dataframe().
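A minimal sketch of that round trip, using a plain pandas table with placeholder objects in place of ramanspy spectra so that it runs without RamanLib or RamanSPy installed. The metadata columns "sample" and "label" are illustrative, and the final rebuild step is shown only as a comment.

```python
import pandas as pd

# Placeholder objects stand in for ramanspy Spectrum instances.
table = pd.DataFrame(
    {
        "spectrum": [object(), object(), object()],
        "sample": ["a", "a", "b"],
        "label": ["control", "treated", "control"],
    }
)

# Ordinary pandas filtering keeps each spectrum on the same row as its metadata.
controls = table[table["label"] == "control"].reset_index(drop=True)

# With RamanLib available, the filtered table would then be wrapped again:
#   subset = GroupedSpectralContainer.from_dataframe(controls)
```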

Classes

GroupedSpectralContainer(spectral_list, metadata)

A table of Raman spectra with aligned metadata.

class ramanlib.core.GroupedSpectralContainer(spectral_list, metadata)[source]

Bases: object

A table of Raman spectra with aligned metadata.

Each row contains a ramanspy Spectrum in the "spectrum" column, plus arbitrary metadata columns (e.g., "sample", "region", "label"). The container exposes a minimal API; for advanced operations, use the underlying pandas.DataFrame via df.

Parameters:
  • spectral_list (list of ramanspy.Spectrum) – One spectrum per row.

  • metadata (list of dict) – One metadata mapping per spectrum. Each dict’s keys become columns in the backing DataFrame. The length must match spectral_list.

df

Backing table with column "spectrum" and zero or more metadata columns.

Type:

pandas.DataFrame

Raises:
  • TypeError – If any element of spectral_list is not a ramanspy Spectrum.

  • ValueError – If spectral_list and metadata lengths differ.
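
The two checks above can be sketched as follows. This is an illustration of the documented contract, not the library's implementation: FakeSpectrum and build_table are hypothetical names, and the stand-in class replaces ramanspy.Spectrum so the snippet runs on its own.

```python
from dataclasses import dataclass

import pandas as pd


@dataclass
class FakeSpectrum:
    """Hypothetical stand-in for ramanspy.Spectrum."""

    intensities: list


def build_table(spectral_list, metadata, spectrum_type=FakeSpectrum):
    # Reject anything that is not a spectrum before building the table.
    for i, item in enumerate(spectral_list):
        if not isinstance(item, spectrum_type):
            raise TypeError(f"item {i} is {type(item).__name__}, not a spectrum")
    # One metadata mapping is required per spectrum.
    if len(spectral_list) != len(metadata):
        raise ValueError(
            f"{len(spectral_list)} spectra but {len(metadata)} metadata rows"
        )
    # Dict keys become columns; the spectra go into the "spectrum" column.
    df = pd.DataFrame(list(metadata))
    df.insert(0, "spectrum", list(spectral_list))
    return df


table = build_table(
    [FakeSpectrum([1.0, 2.0]), FakeSpectrum([3.0, 4.0])],
    [{"sample": "a"}, {"sample": "b"}],
)
```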

See also

GroupedSpectralContainer.from_dataframe

Build from an existing DataFrame.

GroupedSpectralContainer.to_spectral_container

Convert to rp.SpectralContainer.

GroupedSpectralContainer.mean

Group-wise mean spectra.

GroupedSpectralContainer.plot_mean

Plot group means with CIs.

GroupedSpectralContainer.plot_random

Plot random spectra per group.

apply_pipeline(pipeline)[source]

Apply a RamanSPy processing pipeline to each spectrum.

Parameters:

pipeline (object) – Any object exposing an .apply(Spectrum) -> Spectrum method (e.g., a ramanspy pipeline).

Returns:

A new container with transformed spectra and the same metadata.

Return type:

GroupedSpectralContainer

Notes

The operation is row-wise and does not mutate the original container.
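
A small sketch of the duck-typed, non-mutating pattern described here. ScaleBy and apply_rowwise are hypothetical names, and plain lists stand in for spectra; the only property borrowed from the description is that the pipeline exposes an .apply(spectrum) method.

```python
class ScaleBy:
    """Hypothetical pipeline: any object with an .apply(spectrum) method works."""

    def __init__(self, factor):
        self.factor = factor

    def apply(self, spectrum):
        # Return a new object rather than editing the input in place.
        return [value * self.factor for value in spectrum]


def apply_rowwise(rows, pipeline):
    # Build new rows; the input list and its dicts are left untouched.
    return [{**row, "spectrum": pipeline.apply(row["spectrum"])} for row in rows]


rows = [
    {"spectrum": [1.0, 2.0], "sample": "a"},
    {"spectrum": [3.0, 4.0], "sample": "b"},
]
scaled = apply_rowwise(rows, ScaleBy(2.0))
```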

copy()[source]

Return a deep copy of the container.

Returns:

A new container whose df is a copy of the original.

Return type:

GroupedSpectralContainer
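
Whether the spectrum objects themselves are duplicated is not stated here. As a general pandas caveat, DataFrame.copy(deep=True) copies the table but not the Python objects stored in an object column. The sketch below demonstrates this with a hypothetical Payload class standing in for a spectrum.

```python
import pandas as pd


class Payload:
    """Hypothetical stand-in for a spectrum object."""

    def __init__(self, intensities):
        self.intensities = intensities


df = pd.DataFrame({"spectrum": [Payload([1.0, 2.0])], "sample": ["a"]})
copied = df.copy(deep=True)

# The table itself is independent: editing metadata in the copy
# leaves the original untouched.
copied.loc[0, "sample"] = "b"

# The object stored in the cell is the same Python object in both tables.
shared = copied.loc[0, "spectrum"] is df.loc[0, "spectrum"]
```

If fully independent spectra are needed, copying each object explicitly (for example with copy.deepcopy) before modifying it in place is the safer route.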

classmethod from_dataframe(df)[source]

Build a container from an existing DataFrame.

This method validates that a 'spectrum' column exists and that each entry is a ramanspy Spectrum. All other columns are treated as metadata and preserved.

Parameters:

df (pandas.DataFrame) – Input table with a 'spectrum' column of ramanspy Spectrum objects and any number of metadata columns.

Returns:

A new container referencing a copy of df’s contents.

Return type:

GroupedSpectralContainer

Raises:
  • ValueError – If the DataFrame lacks a 'spectrum' column.

  • TypeError – If any value in df['spectrum'] is not a ramanspy Spectrum.
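
The validation described above might look roughly like this. It is a sketch under stated assumptions: validate_table and FakeSpectrum are hypothetical, and the stand-in class replaces ramanspy.Spectrum so the example is self-contained.

```python
import pandas as pd


class FakeSpectrum:
    """Hypothetical stand-in for ramanspy.Spectrum."""


def validate_table(df, spectrum_type=FakeSpectrum):
    # The 'spectrum' column is mandatory.
    if "spectrum" not in df.columns:
        raise ValueError("DataFrame has no 'spectrum' column")
    # Every entry in that column must be a spectrum object.
    bad = [
        idx
        for idx, value in zip(df.index, df["spectrum"])
        if not isinstance(value, spectrum_type)
    ]
    if bad:
        raise TypeError(f"non-spectrum values at index {bad}")
    # Work on a copy so later edits to the caller's table do not leak in.
    return df.copy()


good = pd.DataFrame({"spectrum": [FakeSpectrum(), FakeSpectrum()], "sample": ["a", "b"]})
validated = validate_table(good)
```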

mean(by=None, include_stats=False, ddof=1)[source]

Compute mean spectra per group.

Groups the rows by the key(s) in by (or treats the whole table as a single group when by=None), computes the mean spectrum per group, and returns a new container with one row per group. Optionally adds group-level statistics.

Parameters:
  • by (str or list of str or None, optional) – Column name(s) to group by, passed to pandas.DataFrame.groupby(). If None (default), all rows belong to a single group named "all".

  • include_stats (bool, optional) – If True, append the following columns to the result: 'n' (group size), 'var_vector' and 'std_vector' (per-wavenumber variance and standard deviation). Default is False.

  • ddof (int, optional) – Delta degrees of freedom used for variance/std (as in numpy.var()); ddof=1 gives sample variance. Default is 1.

Returns:

A new container where each row holds the group’s mean ramanspy Spectrum in 'spectrum' and the group key(s) as metadata columns.

Return type:

GroupedSpectralContainer

Notes

The mean is computed via ramanspy.SpectralContainer.mean(). The variance/standard deviation vectors, when requested, are aligned to the spectrum’s spectral axis and computed over the stacked intensities.
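
To make the role of ddof concrete, the following NumPy sketch computes the same per-group quantities on a small hypothetical intensity matrix. group_stats is an illustrative helper, not part of RamanLib, and plain arrays replace the spectrum objects.

```python
import numpy as np

# Hypothetical data: four spectra on a shared 3-point axis, two groups.
intensities = np.array(
    [
        [1.0, 2.0, 3.0],
        [3.0, 4.0, 5.0],
        [10.0, 10.0, 10.0],
        [14.0, 12.0, 10.0],
    ]
)
groups = np.array(["a", "a", "b", "b"])


def group_stats(intensities, groups, ddof=1):
    out = {}
    for key in np.unique(groups):
        stacked = intensities[groups == key]  # shape (n, n_points)
        out[str(key)] = {
            "n": stacked.shape[0],
            "mean": stacked.mean(axis=0),
            # Per-wavenumber statistics: reduce over the rows of the group.
            "var_vector": stacked.var(axis=0, ddof=ddof),
            "std_vector": stacked.std(axis=0, ddof=ddof),
        }
    return out


stats = group_stats(intensities, groups)
```

With ddof=1 the divisor is n - 1, so a group containing a single spectrum yields NaN in NumPy; passing ddof=0 gives the population variance instead.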

Examples

>>> means = gsc.mean(by=["sample", "region"], include_stats=True)
>>> list(means.df.columns)
['spectrum', 'sample', 'region', 'n', 'var_vector', 'std_vector']

plot_mean(by=None, interval=None, plot_type='separate', ci_z=1.96, **kwargs)[source]

Plot mean spectra per group.

This is a thin wrapper around ramanlib.plot.mean_per_group().

Parameters:
  • by (str or list of str or None, optional) – Grouping key(s). See GroupedSpectralContainer.mean().

  • interval (tuple of (float, float) or None, optional) – Optional spectral axis range (min, max) to display.

  • plot_type ({"single", "separate", "stacked", "single stacked"}, optional) – Plot style. "separate" draws one subplot per group; "single" overlays all groups; "stacked" creates separate plots stacked vertically; "single stacked" overlays spectra in a single plot with vertical offsets. Default is "separate".

  • ci_z (float, optional) – Z-score for confidence intervals (e.g., 1.96 ≈ 95% CI). Default is 1.96.

  • **kwargs – Forwarded to the underlying plotting function/matplotlib.

Returns:

Axes object(s) produced by the plotting backend (RamanSPy).

Return type:

matplotlib.axes.Axes or numpy.ndarray
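
How ramanlib.plot.mean_per_group() turns ci_z into a band is not documented on this page. A common convention, assumed in the sketch below, is mean ± z times the standard error of the mean; the inclusive cropping to interval is likewise an assumption. mean_with_band and crop are hypothetical helpers.

```python
import numpy as np


def mean_with_band(intensities, ci_z=1.96, ddof=1):
    # Assumed convention: band = mean +/- z * (std / sqrt(n)).
    n = intensities.shape[0]
    mean = intensities.mean(axis=0)
    sem = intensities.std(axis=0, ddof=ddof) / np.sqrt(n)
    return mean, mean - ci_z * sem, mean + ci_z * sem


def crop(axis, values, interval=None):
    # Keep only the points whose axis value lies inside (min, max), inclusive.
    if interval is None:
        return axis, values
    low, high = interval
    mask = (axis >= low) & (axis <= high)
    return axis[mask], values[..., mask]


axis = np.array([400.0, 800.0, 1200.0, 1600.0])
intensities = np.array([[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 5.0, 4.0]])

mean, lower, upper = mean_with_band(intensities)
cropped_axis, cropped_mean = crop(axis, mean, interval=(500.0, 1300.0))
```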

See also

ramanlib.plot.mean_per_group

Implementation of the plotting logic.

plot_random(by=None, n_samples=3, plot_type='single', seed=None, **kwargs)[source]

Plot a random sample of spectra per group.

This is a thin wrapper around ramanlib.plot.random_per_group().

Parameters:
  • by (str or list of str or None, optional) – Grouping key(s). If None, sample from all rows.

  • n_samples (int, optional) – Number of spectra to sample per group. Default is 3.

  • plot_type ({"single", "separate", "stacked", "single stacked"}, optional) – Plot style. "separate" draws one subplot per group; "single" overlays all groups; "stacked" creates separate plots stacked vertically; "single stacked" overlays spectra in a single plot with vertical offsets. Default is "single".

  • seed (int or None, optional) – Random seed for reproducibility. Default is None.

  • **kwargs – Forwarded to the underlying plotting function/matplotlib.

Returns:

Axes object(s) produced by the plotting backend.

Return type:

matplotlib.axes.Axes or numpy.ndarray
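
The effect of seed can be illustrated with a small NumPy sketch of per-group sampling. sample_indices_per_group is a hypothetical helper; capping the sample at the group size is an assumption, and the actual function may handle small groups differently.

```python
import numpy as np


def sample_indices_per_group(groups, n_samples=3, seed=None):
    # A fixed seed makes the draw repeatable; None draws fresh entropy.
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    picked = {}
    for key in np.unique(groups):
        members = np.flatnonzero(groups == key)
        # Assumption: small groups contribute all the rows they have.
        k = min(n_samples, members.size)
        picked[str(key)] = np.sort(rng.choice(members, size=k, replace=False))
    return picked


groups = ["a", "a", "a", "a", "b", "b"]
first = sample_indices_per_group(groups, n_samples=3, seed=0)
second = sample_indices_per_group(groups, n_samples=3, seed=0)
```

Calling the helper twice with the same seed returns the same row indices, which is the behaviour one would expect from passing the same seed to plot_random().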

See also

ramanlib.plot.random_per_group

Implementation of the plotting logic.

to_spectral_container()[source]

Convert to a ramanspy SpectralContainer.

All spectra must share an identical spectral axis. The spectra are stacked in their current row order.

Returns:

A spectral container built by stacking the row spectra.

Return type:

ramanspy.SpectralContainer

Raises:

ValueError – If any spectrum has a spectral axis different from the first row’s.
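
The axis check and stacking step can be sketched with NumPy alone. stack_if_aligned is a hypothetical helper operating on plain arrays rather than ramanspy objects, and the use of exact equality is an assumption drawn from the word "identical" above.

```python
import numpy as np


def stack_if_aligned(axes, intensities):
    # Compare every axis against the first row's axis.
    reference = np.asarray(axes[0])
    for i, axis in enumerate(axes[1:], start=1):
        if not np.array_equal(reference, np.asarray(axis)):
            raise ValueError(f"row {i} has a different spectral axis than row 0")
    # Rows are stacked in their current order: shape (n_rows, n_points).
    return reference, np.stack([np.asarray(v) for v in intensities])


axis = np.array([400.0, 800.0, 1200.0])
reference, stacked = stack_if_aligned(
    [axis, axis.copy()],
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
)
```

Axes that differ only by floating-point noise fail an exact comparison; in that case, interpolating the spectra onto a common axis before conversion is one option.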