Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions docs/changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
```

Added:
- [PumpProbePattern.pumped_pulses_ratios()][extra.components.PumpProbePattern.pumped_pulses_ratios] determining the ratio of pumped pulses per train (!317).
- [TOFResponse][extra.recipes.TOFResponse] to estimate, deconvolve and denoise the instrumental response in eTOFs (!304).
- [VSLight][extra.recipes.VSLight] to calibrate and deconvolve eTOFs from a continuous monochromator scan (!304).
- [CookieboxCalibration][extra.recipes.CookieboxCalibration] to calibrate data from eTOFs after taking a calibration run (!284).
Expand Down
62 changes: 62 additions & 0 deletions src/extra/components/pulses.py
Original file line number Diff line number Diff line change
Expand Up @@ -1507,6 +1507,59 @@
else:
raise ValueError(f"{field=!r} parameter was not 'fel'/'ppl'/None")

def pumped_pulses_ratios(self, ppl_only_value=np.nan, labelled=True):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

labelled is never used. I don't have a strong preference whether we implement it or remove it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted. I realized at the end all other methods carry it, but forgot the implementation...

"""Determine ratio of pumped pulses per train.

Args:
ppl_only_value (float, optional): Value for trains only
containing PPL pulses, same value as for pulses without
any pulses (np.nan) by default.
labelled (bool, optional): Whether a labelled pandas Series
(default) or unlabelled numpy array is returned.

Returns:
(pandas.Series or numpy.ndarray): Number of pulses per
train, indexed by train ID if labelled is True.
"""

pids = self.pulse_ids(copy=False)

try:
fel_count = pids[:, :, True, :].groupby('trainId').count()
except KeyError:
fel_count = pd.Series([])

Check warning on line 1530 in src/extra/components/pulses.py

View check run for this annotation

Codecov / codecov/patch

src/extra/components/pulses.py#L1529-L1530

Added lines #L1529 - L1530 were not covered by tests

try:
pumped_count = pids[:, :, True, True].groupby('trainId').count()
except KeyError:
pumped_count = pd.Series([])

Check warning on line 1535 in src/extra/components/pulses.py

View check run for this annotation

Codecov / codecov/patch

src/extra/components/pulses.py#L1534-L1535

Added lines #L1534 - L1535 were not covered by tests

# Compute the ratio for trains with at least one pumped pulse.
ratios = pumped_count / fel_count.loc[pumped_count.index]
Copy link
Copy Markdown
Contributor

@fadybishara fadybishara Apr 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow the logic here, if pumped_count and fel_count have different trains, how do you know that fel_count will have more trains? If not, wouldn't this throw an error?

It's likely I misunderstood something but if not, perhaps a simple solution would be to do an explicit inner merge like

joint_count = pd.concat([fel_count, pumped_count], keys=['fel', 'ppl'], axis=1, join='inner'
ratios = joint_counts.ppl.div(joint_counts.fel)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind, I get it -- all the trains have FEL pulses but not necessarily PPL pulses. Nevertheless, just because it should not happen doesn't mean it cannot happen, no?

(Also, a better way to do what I suggested is with np.intersect1d on the train IDs -- but probably none of what I suggested is necessary.)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite - this line is not about FEL vs PPL pulses, but FEL vs FEL+PPL pulses. The set of trains with pumped pulses is a (not necessarily proper) subset of the set of trains with FEL pulses. It becomes clear when comparing line 1528 and line 1533, the latter makes a stricter indexing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, right, 1528 vs. 1533 is what I was referring to. [:, :, True, True] is necessarily a subset (not sure what you mean by "proper") of [:, :, True, :] so there is no problem here.


# Extend the series to all expected trains, filling with NaN.
ratios = self._extend_all_trains(ratios, np.nan)

# Set those trains with all unpumped pulses to 0.0.
fel_only_index = fel_count.index.difference(pumped_count.index)
ratios.loc[fel_only_index] = 0.0

# Set those trains with no FEL pulses to the desired fill value.
if ppl_only_value is not np.nan:
# If one only cares for the index labels of a groupby,
# pd.SeriesGroupBy.count() is indeed faster than
# pd.SeriesGroupBy.groups, likely due additional objects
# created by the latter.
try:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me to be more complicated than necessary, why not do the following?

try:
    ppl_only_index = pids[:, :, False, True].groupby('trainId').count()
except KeyError:
    ppl_only_index = pd.Series([])

ppl_only_index = pids[:, :, :, True].groupby('trainId') \
.count().index.difference(fel_count.index)
except KeyError:
pass

Check warning on line 1557 in src/extra/components/pulses.py

View check run for this annotation

Codecov / codecov/patch

src/extra/components/pulses.py#L1556-L1557

Added lines #L1556 - L1557 were not covered by tests
else:
ratios.loc[ppl_only_index] = ppl_only_value

return ratios if labelled else ratios.to_numpy()


class DldPulses(PulsePattern):
"""An interface to pulses from DLD reconstruction.
Expand Down Expand Up @@ -1629,3 +1682,12 @@
warn("Use triggers() instead of get_triggers()",
DeprecationWarning, stacklevel=2)
return self.triggers(*args, **kwargs)

@wraps(PumpProbePulses.pumped_pulses_ratios)
def pumped_pulses_ratios(self, *args, **kwargs):
pids = self.pulse_ids(copy=False)

Check warning on line 1688 in src/extra/components/pulses.py

View check run for this annotation

Codecov / codecov/patch

src/extra/components/pulses.py#L1688

Added line #L1688 was not covered by tests

if 'ppl' not in pids.index.names:
raise ValueError('only available with PPL information')

Check warning on line 1691 in src/extra/components/pulses.py

View check run for this annotation

Codecov / codecov/patch

src/extra/components/pulses.py#L1690-L1691

Added lines #L1690 - L1691 were not covered by tests

return PumpProbePulses.pumped_pulses_ratios(self, *args, **kwargs)

Check warning on line 1693 in src/extra/components/pulses.py

View check run for this annotation

Codecov / codecov/patch

src/extra/components/pulses.py#L1693

Added line #L1693 was not covered by tests
25 changes: 25 additions & 0 deletions tests/test_components_pulses.py
Original file line number Diff line number Diff line change
Expand Up @@ -740,3 +740,28 @@ def test_pump_probe_specials(mock_spb_aux_run, mock_sqs_remi_run):
names=('trainId', 'pulseIndex', 'fel', 'ppl')))

assert not pulses.is_constant_pattern()

# Test pumped pulses ratios.
pulses = PumpProbePulses(mock_sqs_remi_run[10:], pulse_offset=0)
pulses._get_train_ids = lambda: [1000, 1001, 1002, 1003, 1004]
pulses._pulse_ids = pd.Series(
[300, 310, 300, 300, 300, 310],
index=pd.MultiIndex.from_tuples([
(1000, 0, True, True),
(1000, 0, True, False),
(1001, 0, True, False),
(1002, 0, False, True),
(1003, 0, True, True),
(1003, 0, True, True),
], names=['trainId', 'pulseIndex', 'fel', 'ppl']))
Comment on lines +746 to +756
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I think mocking out bits of the innards of a class like this to test its public API is annoyingly brittle, and it's better to make some suitable input. I won't hold the PR up over it, though. Maybe we need better facilities for making mock data.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I agree, though I found it acceptable for PulsePattern given the explicit caching mechanism.

As you guessed correctly, I wanted to avoid having to creating different mock devices with the current interface to test a very particular scenario. Ideally we would continue to be able to test without Maxwell, and we also cannot rely on runs sticking around forever unless we copy them to a defined place. Maybe some serialized data in the repository that is unpacked into EXDF?


ratios = pulses.pumped_pulses_ratios()
assert np.all(ratios.index == [1000, 1001, 1002, 1003, 1004])
assert np.all(ratios.loc[[1000, 1001, 1003]] == [0.5, 0.0, 1.0])
assert np.all(ratios.loc[[1002, 1004]].isna())

ratios = pulses.pumped_pulses_ratios(np.inf)
assert np.all(ratios.index == [1000, 1001, 1002, 1003, 1004])
assert np.all(
ratios.loc[[1000, 1001, 1002, 1003]] == [0.5, 0.0, np.inf, 1.0])
assert np.all(ratios.loc[[1004]].isna())