ramanlib.calc
Computations over grouped Raman spectra.
This module provides analysis helpers that operate on a
ramanlib.core.GroupedSpectralContainer (GSC) and return results designed
to feed directly into plotting functions in ramanlib.plot.
Notes
All grouping semantics mirror pandas.DataFrame.groupby(). Unless stated
otherwise, group labels are rendered from the grouping keys (tuples become
comma-separated strings).
Examples
Select outliers and plot them:
results = outliers_per_group(gsc, metric=rp.metrics.euclidean, by="sample", n_spectra=3)
ramanlib.plot.outliers_per_group(gsc, results)
Functions
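mean_correlation_per_group(gsc, by) | Pearson correlation matrix between per-group mean spectra. |
mean_difference(group1_stats, group2_stats, ci_z=1.96) | Compute difference of group mean spectra and a normal-approximation CI band. |
outliers_per_group(gsc, metric, by=None, n_spectra=3, highest=True) | Select per-group outlier indices according to a pairwise metric vs. the group mean. |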
- ramanlib.calc.mean_correlation_per_group(gsc, by)
Pearson correlation matrix between per-group mean spectra.
- Parameters:
gsc (GroupedSpectralContainer) – Input container.
by (str) – Column name to group by when computing the means.
- Returns:
Square correlation matrix (index and columns are group labels) computed from the stacked intensity vectors of each group’s mean spectrum.
- Return type:
pandas.DataFrame
See also
ramanlib.plot.mean_correlation_per_group – Heatmap visualization of the returned matrix.
ramanlib.core.GroupedSpectralContainer.mean – Computes per-group mean spectra.
Notes
Groups are ordered according to the key order in groupby(by). The matrix is computed by forming a DataFrame whose columns are the intensity vectors of each group’s mean spectrum and calling pandas.DataFrame.corr() with method="pearson".
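The computation described in these notes can be sketched with pandas alone. This is an illustrative sketch, not the library's implementation: the group labels and intensity values are made up, and plain NumPy arrays stand in for the mean spectra that a GroupedSpectralContainer would supply.

```python
import numpy as np
import pandas as pd

# Hypothetical per-group mean spectra sampled on a shared 5-point spectral axis.
# In ramanlib these would come from the container's per-group means.
mean_spectra = {
    "control": np.array([1.0, 2.0, 3.0, 4.0, 5.0]),
    "treated": np.array([2.0, 4.0, 6.0, 8.0, 10.0]),  # scaled copy of "control"
    "blank": np.array([5.0, 4.0, 3.0, 2.0, 1.0]),     # reversed "control"
}

# One column per group; each row is one point on the spectral axis.
columns = pd.DataFrame(mean_spectra)

# Square matrix whose index and columns are the group labels.
corr = columns.corr(method="pearson")
print(corr.round(3))
```

Because the columns keep the insertion order of the dictionary, the rows and columns of the result appear in the same order as the groups were supplied.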
- ramanlib.calc.mean_difference(group1_stats, group2_stats, ci_z=1.96)
Compute difference of group mean spectra and a normal-approximation CI band.
- Parameters:
group1_stats (GroupedSpectralContainer) – Container with exactly one row representing group 1’s statistics, as produced by ramanlib.core.GroupedSpectralContainer.mean() with include_stats=True. Must contain the columns "spectrum", "n", "var_vector", and "std_vector".
group2_stats (GroupedSpectralContainer) – Same format and requirements as group1_stats, for group 2.
ci_z (float, optional) – Z-score for a two-sided normal CI (e.g., 1.96 ≈ 95%). Default 1.96.
- Returns:
rp.Spectrum – The difference spectrum (group1_mean - group2_mean) with the same spectral axis as the inputs.
numpy.ndarray – One-dimensional nonnegative array giving the half-width of the symmetric CI band at each wavenumber, computed as ci_z * sqrt(var1/n1 + var2/n2).
- Raises:
ValueError – If a stats container is missing required columns or does not contain exactly one row.
See also
ramanlib.plot.mean_difference – Plot the difference spectrum with its CI band.
ramanlib.core.GroupedSpectralContainer.mean – Produces the required stats columns when include_stats=True.
Notes
This uses the usual normal approximation for a difference of means with independent groups:
Var(diff) = Var(mean1) + Var(mean2) = var1/n1 + var2/n2.
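As a rough illustration of the formula above, the following sketch computes the difference and the CI half-width from plain arrays. The function name mean_difference_sketch and the numbers are hypothetical. It assumes the variance inputs are per-wavenumber variances of the individual spectra in each group, and it does not touch the container or the column validation described under Raises.

```python
import numpy as np

def mean_difference_sketch(mean1, var1, n1, mean2, var2, n2, ci_z=1.96):
    """Return (difference, CI half-width) for each point on the spectral axis."""
    mean1 = np.asarray(mean1, dtype=float)
    var1 = np.asarray(var1, dtype=float)
    mean2 = np.asarray(mean2, dtype=float)
    var2 = np.asarray(var2, dtype=float)

    diff = mean1 - mean2
    # Var(diff) = var1/n1 + var2/n2 for independent groups.
    half_width = ci_z * np.sqrt(var1 / n1 + var2 / n2)
    return diff, half_width

# Made-up statistics for two groups on a 3-point spectral axis.
diff, half_width = mean_difference_sketch(
    mean1=[10.0, 12.0, 9.0], var1=[4.0, 1.0, 0.25], n1=4,
    mean2=[8.0, 12.5, 9.0], var2=[9.0, 1.0, 0.75], n2=9,
)

# The symmetric band around the difference spectrum.
lower, upper = diff - half_width, diff + half_width
```

A point where the band between lower and upper excludes zero indicates a difference that the normal approximation would treat as significant at the chosen ci_z.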
- ramanlib.calc.outliers_per_group(gsc, metric, by=None, n_spectra=3, highest=True)
Select per-group outlier indices according to a pairwise metric vs. the group mean.
For each group (or the entire container if by is None), compute the mean spectrum, score each row’s spectrum against that mean using metric, and return the indices of the top/bottom n_spectra according to the scores. Also returns the group’s mean ramanspy.Spectrum.
- Parameters:
gsc (GroupedSpectralContainer) – Input container with a 'spectrum' column of ramanspy.Spectrum.
metric (callable) – Pairwise metric with signature metric(spec_a: rp.Spectrum, spec_b: rp.Spectrum) -> float. Typical choices are in ramanspy.metrics (e.g., MAE, MSE).
by (str or list[str] or callable or None, optional) – Grouping key(s) passed to pandas.DataFrame.groupby(). If None, all rows are treated as one group labeled "all".
n_spectra (int, optional) – Number of spectra to select per group (clipped to the group size). Default 3.
highest (bool, optional) – If True (default), select the largest metric values; if False, select the smallest.
- Returns:
Mapping { group_label: ([row_indices_into_gsc_df], mean_spectrum) }. Indices are global row indices into gsc.df.
- Return type:
See also
ramanlib.plot.outliers_per_group – Plot the selected spectra per group and overlay the mean.
Notes
The mean spectrum is computed via ramanspy.SpectralContainer.mean() after stacking the group’s spectra.
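The selection logic for a single group can be sketched in plain Python. This is only a sketch under simplifying assumptions: spectra are lists of floats rather than ramanspy.Spectrum objects, the names select_outliers and mean_abs_error are hypothetical, and the returned indices are positions within the group rather than global row indices into gsc.df.

```python
def mean_abs_error(a, b):
    """Hypothetical stand-in for a pairwise metric such as MAE."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def select_outliers(spectra, metric, n_spectra=3, highest=True):
    """Score each spectrum against the group mean and pick the extremes."""
    # Point-wise mean over all spectra in the group.
    mean = [sum(points) / len(spectra) for points in zip(*spectra)]
    scores = [metric(spectrum, mean) for spectrum in spectra]
    # Largest scores first when highest=True, smallest first otherwise.
    order = sorted(range(len(spectra)), key=scores.__getitem__, reverse=highest)
    # Slicing clips n_spectra to the group size.
    return order[:n_spectra], mean

# Made-up group: three identical spectra and one that sits far from them.
group = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0]]

top_indices, group_mean = select_outliers(group, mean_abs_error, n_spectra=1)
closest_indices, _ = select_outliers(group, mean_abs_error, n_spectra=2, highest=False)
all_indices, _ = select_outliers(group, mean_abs_error, n_spectra=10)
```

With highest=True the spectrum farthest from the mean is selected first; with highest=False the same ranking is read from the other end, which picks the most typical spectra instead.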