Add long alarm messages to PVs from IOCs #153

Open
Monarda wants to merge 5 commits into epics-base:master from ISISNeutronMuon:user_alarmmsg

Monarda commented Feb 20, 2026

The development work in this PR was undertaken by @alraco8444 of Mobiis working at our request.

This PR enables long alarm messages in the alarm.message field of IOC PVs created by pvxs as a module. The logic in the PR is straightforward: if the alarm status of the PV is not NO_ALARM, the message field is set to the value in the PV's Q:alarm:msg (if set). Otherwise it is blank or the amsg, as appropriate. This means that the alarm message for a PV may be set in the db file using info(Q:alarm:msg, "...").
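The selection rule described above can be sketched as a small stand-alone function. This is only an illustration, not the code in the PR: the function name and parameters are hypothetical, and it assumes that "blank or the amsg as appropriate" means falling back to whatever AMSG text the record holds.

```cpp
#include <string>

// Hypothetical helper (not the PR's code): picks the text for alarm.message.
//   status  : record alarm status, where 0 means NO_ALARM
//   userMsg : text from info(Q:alarm:msg, "..."), empty if the tag is absent
//   amsg    : the record's AMSG text, empty if none was set
std::string selectAlarmMessage(unsigned status,
                               const std::string& userMsg,
                               const std::string& amsg)
{
    // In alarm with a configured long message: the long message is used.
    if (status != 0 && !userMsg.empty())
        return userMsg;
    // Otherwise fall back to AMSG, which may itself be empty (blank message).
    return amsg;
}
```

With no alarm active the long message is never shown, which matches the blank line in the pvmonitor output for the value 5.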

An example using a field from an example IOC is shown here:

$ ./pvget demo:ai1
demo:ai1 2026-02-20 13:22:22.427  7 MINOR DEVICE A long error message as an example of what is possible

$ ./pvmonitor demo:ai1
demo:ai1 2026-02-20 13:21:16.945  0 MAJOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:08.658  1 MAJOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:09.658  2 MAJOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:09.943  3 MINOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:10.943  4 MINOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:11.944  5
demo:ai1 2026-02-20 13:21:12.943  6 MINOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:13.943  7 MINOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:14.943  8 MAJOR DEVICE A long error message as an example of what is possible
demo:ai1 2026-02-20 13:21:15.944  9 MAJOR DEVICE A long error message as an example of what is possible

A few practical notes:

  • Unit tests haven't been included, but we'll be happy to develop them if the PR seems acceptable.
  • The changes are all guarded by a macro, USER_ALARM_MSG, out of an abundance of caution. However, existing IOC behaviour should not change unless Q:alarm:msg is set.

Member

anjohnson left a comment

This PR seems to ignore the EPICS "theory of alarms" philosophy that was explained in this tech-talk message and its reply a year ago.

The Core Developers will want to discuss this. Adding comments with more information about your use-case and how you plan to use this API might help us understand the change in context, and let us provide alternative approaches that you could use instead if we do decide to reject the PR.

Comment thread ioc/iocsource.cpp Outdated
}

if((info.type==MappingInfo::Scalar || info.type==MappingInfo::Meta) && (change & (UpdateType::Value | UpdateType::Alarm))) {
#if 1//USER_ALARM_MSG
Member

An #if 1 was left here by mistake, so this block doesn't depend on the macro.

Author

This should be fixed now.

Comment thread ioc/iocsource.cpp Outdated
#if DBR_AMSG
if((options & DBR_AMSG) && meta.amsg[0]) {
#if USER_ALARM_MSG
node["alarm.message"] = meta.status ? meta.amsg : "";
Member

Is this addition of the conditional a fix/workaround for a bug in Base versions before 7.0.9? We merged PR#566 in that release to fix an IOC bug where the recGblResetAlarms() routine didn't clear the AMSG field after an alarm went away.

Author

Yes, we tested against 7.0.9 and this seems unnecessary in 7.0.10. I'll revert.

Author

Looking at this section further, perhaps something like this would make more sense:

        std::string alarm_msg;
#if DBR_AMSG
        if((options & DBR_AMSG) && meta.amsg[0]) {
            alarm_msg = meta.amsg;
        } else
#endif
        {
            alarm_msg = meta.status && stsmsg ? stsmsg : "";
        }
#if USER_ALARM_MSG
        if(!info.alarmMsg.empty()) {
            auto usermsg_with_alarmmsg = info.alarmMsg;
            if (!alarm_msg.empty()) {
                usermsg_with_alarmmsg += " (" + alarm_msg + ")";
            } 
            alarm_msg = meta.status ? usermsg_with_alarmmsg : alarm_msg;
        }
#endif
        node["alarm.message"] = alarm_msg;
        }

I can see that this superficially works and produces results like this:

demo:ai1 2026-02-22 10:16:07.181  2 MAJOR DEVICE A long error message as an example of what is possible (LOLO)
demo:ai1 2026-02-22 10:16:08.181  3 MINOR DEVICE A long error message as an example of what is possible (LOW)
demo:ai1 2026-02-22 10:16:09.181  4 MINOR DEVICE A long error message as an example of what is possible (LOW)
demo:ai1 2026-02-22 10:16:10.181  5
demo:ai1 2026-02-22 10:16:11.181  6 MINOR DEVICE A long error message as an example of what is possible (HIGH)
demo:ai1 2026-02-22 10:16:12.181  7 MINOR DEVICE A long error message as an example of what is possible (HIGH)
demo:ai1 2026-02-22 10:16:13.181  8 MAJOR DEVICE A long error message as an example of what is possible (HIHI)

(Note the status or amsg text is now preserved and appended at the end of the long message.)

This code smells a bit, as I've changed the type used to set node["alarm.message"] from const char* to std::string, which needs further investigation to ensure I've not introduced a memory leak.

However, once tidied up, it may represent a middle ground between preserving the engineering notifications while adding the possibility of longer alarm messages.
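For reference, the combination proposed in this comment can be modelled as a stand-alone function that compiles without pvxs. The function and parameter names are illustrative only; in the PR the inputs come from meta.status, info.alarmMsg and the AMSG or status text.

```cpp
#include <string>

// Stand-alone model of the combination proposed above (names are illustrative).
//   status         : alarm status, where 0 means NO_ALARM
//   userMsg        : the long message from the info tag, empty if not set
//   engineeringMsg : the AMSG or status text such as "LOLO", possibly empty
std::string combineAlarmMessage(unsigned status,
                                const std::string& userMsg,
                                const std::string& engineeringMsg)
{
    // No alarm, or no long message configured: keep the engineering text as is.
    if (status == 0 || userMsg.empty())
        return engineeringMsg;
    // Long message only, when there is no engineering text to append.
    if (engineeringMsg.empty())
        return userMsg;
    // Long message followed by the engineering text in parentheses.
    return userMsg + " (" + engineeringMsg + ")";
}
```

Every value in this sketch is a std::string, so each buffer is owned and released automatically. Whether assigning a std::string to node["alarm.message"] behaves the same way depends on pvxs and is not shown here.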

Author

Monarda commented Feb 22, 2026

Apologies, I'd forgotten the discussion of the EPICS alarm philosophy in your tech-talk message. That's entirely my fault.

We have been using the longer alarm messages for a few years now and they integrate well with our tools (primarily CS-Studio / Phoebus and its stack).
image

image

The upper image displays a channel derived from our old control system and created as a pvAccess / Normative Type PV using pvapy. The old control system is also the source of the text of the long alarm message. The lower image displays an alarm message from an IOC using pva2pva. We have tools to convert configuration files from the old control system to IOC db files so it would be relatively straightforward for us to port the long alarm messages from the old control system to new as we update these IOCs from pva2pva to pvxs.

Here's a snippet from a Phoebus Alarm Table:
image
It's evident which alarms derive from IOCs and which from our old control system. (As an aside, I have to look up what STATE means almost every time I encounter it!)

Note that, as you may be able to see from the image above, we haven't been updating PV names to our new naming convention even as we port PVs from the old control system to IOCs. Our experience is that managing PV name changes across so many linked systems (screens, alarms, archiver, save-and-restore, etc.) is too much, given that managing the much smaller number of PVs that change type (e.g. double to int, double to enum) as part of the migration to IOCs is already a headache. My sense is that other sites rely more on their PV names to indicate the system from which an alarm derives, and even the nature of the alarm?

Although we are getting ready to migrate to the Phoebus alarm handler, our operators are currently still using the alarm handler from the old control system. This is the primary place, in practice, where supporting long alarm messages would make our lives easier. But we will eventually migrate away from this requirement.

As an example of something we don't currently do, but which Phoebus supports, it is possible to embed the alarm message directly into an HMI:
image
The left-hand indicator is controlled by the alarm severity of the PV (pva://EXAMPLE:PV/alarm/severity), the right-hand text field displays its alarm message (pva://EXAMPLE:PV/alarm/message).

None of these features require an additional high level alarm service, though they could potentially be overridden or supplemented by such a service.

I believe that the Normative Type alarm structure could allow sites to choose which alarm philosophy they wish to follow. EPICS can continue to recommend using ancillary / high level alarm services for interpretation of alarms while not precluding the use of longer alarm messages at the IOC level. You may wish to express that recommendation, even to people who don't read the documentation, by leaving in the compiler flag requirement in this PR.

An example of combining engineering and user-facing alarm messages might be:

demo:ai1 2026-02-22 10:16:09.181  4 MINOR DEVICE A long error message as an example of what is possible (LOW)

or

demo:ai1 2026-02-22 10:16:09.181  4 MINOR DEVICE LOW - A long error message as an example of what is possible

I'll confess part of the motivation for this is that we would like to move away from the "unconventional" pseudo-IOCs we have implemented in Python as we migrate hardware to OPC-UA. Our existing unconventional IOCs allow PLCs to set alarm limits, severities, and the long alarm messages of their associated PVs directly and dynamically. We've learned how to allow dynamic alarm limits in IOCs. We don't plan to support dynamic severities but we would like to continue to support long alarm messages. Dynamic long alarm messages are a challenge for another time!

@mdavidsaver
Member

Overall, I do not think that QSRV is the correct place to implement an "override" to the alarm message string. The alarm message is meant to provide context to the active alarm status/severity, potentially one of several conditions selected based on severity. Allowing a blanket override seems to me to confuse this.

I think that the alarm.message should only be set from AMSG, which in turn should only be set via recGblSetSevrMsg(), which implements the severity maximization algorithm.
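The severity maximization idea can be illustrated with a simplified model. This is a sketch only: it does not call recGblSetSevrMsg(), and the PendingAlarm type and raiseAlarm() function are hypothetical names. It assumes the usual EPICS severity ordering and that a new condition replaces the pending one only when its severity is strictly higher.

```cpp
#include <string>

// Simplified stand-in for a record's pending alarm state (hypothetical type).
struct PendingAlarm {
    int severity = 0;        // 0 = NO_ALARM, 1 = MINOR, 2 = MAJOR, 3 = INVALID
    std::string message;     // text that accompanies the winning condition
};

// Raise the pending alarm only if the new severity is strictly higher.
// Returns true when the pending state was replaced.
bool raiseAlarm(PendingAlarm& pending, int severity, const std::string& message)
{
    if (severity <= pending.severity)
        return false;        // an equal or more severe condition is already pending
    pending.severity = severity;
    pending.message = message;
    return true;
}
```

Under this model the message always belongs to the most severe condition seen during processing, which is why a single blanket override string sits awkwardly alongside it.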

It seems to me that the described usages would be better served by:

  1. Using an alarm manager service
  2. Placing this text describing a PV in the DESC field (appears to PVA clients as display.description)
  3. Potentially modifying record definitions to add message string fields to complement some of the severity selection fields.

Attention has already been drawn to Phoebus alarm manager.

From the examples given, and the proposed usage of a single string per record, I have the impression that the message would not vary based on severity? This sounds more like DESC, which is meant to allow a database designer to give application specific context for each record.

While it would be a significant project, when thinking in terms of recGblSetSevrMsg() it occurs to me that we could discuss the wisdom of adding configurable messages to go along with the cases where a record allows an alarm severity to be configured, e.g. biRecord has ZSV. Maybe add ZMSG?

record(mbbi, "some:status") {
    field(ZRST , "oops")
    field(ZRSV, "MINOR")
    field(ZRMG, "Something badish happens") # new?
    field(ONST, "oh no")
    field(ONSV, "MAJOR")
    field(ONMG, "Run!") # new?
    ...
}

record(ai, "some:temp") {
    field(HIGH, "400")
    field(HSV , "MINOR")
    field(HMG , "Getting a bit warm here") # new?
    field(HIHI, "451")
    field(HHSV, "MAJOR")
    field(HHMG, "Do you smell something burning?") # new?
}
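
As a rough sketch of how per-limit messages like those in the ai example might be selected, the following stand-alone code pairs each limit with a severity and a message. Everything here is hypothetical: the Limit and AlarmResult types and checkHighLimits() are invented for illustration, the message fields do not exist in EPICS Base, and hysteresis and the low limits are left out.

```cpp
#include <string>

// Hypothetical configuration mirroring the ai example above: each limit
// carries a severity and a message.
struct Limit {
    double threshold;
    int severity;            // 1 = MINOR, 2 = MAJOR
    std::string message;
};

struct AlarmResult {
    int severity = 0;        // 0 = NO_ALARM
    std::string message;     // empty when no limit is exceeded
};

// Check the higher limit first so that the most severe matching condition
// supplies both the severity and the message.
AlarmResult checkHighLimits(double value, const Limit& hihi, const Limit& high)
{
    AlarmResult result;
    if (value >= hihi.threshold) {
        result.severity = hihi.severity;
        result.message = hihi.message;
    } else if (value >= high.threshold) {
        result.severity = high.severity;
        result.message = high.message;
    }
    return result;
}
```

In this arrangement the message varies with the condition that raised the alarm, rather than being a single string per record.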

3 participants