Alarm management is an old topic that has probably been addressed thousands of times over the years. The holy grail of SCADA system alarm management is a system that pinpoints the exact cause of a problem no matter how much hell has broken loose on the plant floor. The problem is that when one thing goes wrong, lots of other things tend to go wrong, and the operator is left looking for a needle in a flaming haystack. There has to be a better way!
Why Alarm Management Strategies Fail
A successful alarm notification system presents real information to operators in a timely manner. The goal is to help the operator prevent or minimize system downtime. In our experience, alarming systems fall short of this goal in two major ways: generating meaningless alarms and failing to provide operational context for alarms. Meaningless alarms are a topic of their own, but they can be eliminated fairly easily with well-designed code. Operational context is a much more complex issue.
All SCADA and HMI software packages provide the ability to give alarms or groups of alarms a priority, and at least one system allows hierarchical classification of alarms. The problem with all of these systems is that they oversimplify the dependency problem. Plant equipment and processes do not fall neatly into a handful of priorities or a single hierarchy. There are many reasons why an alarm might be relevant in one context and irrelevant in another. A few examples of the type of information that must be considered for each alarm are:
- Equipment or loop involved
- Position of that equipment or loop in the larger process
- Physical location of the loop
- Data acquisition chain (I/O, processor, network)
- Power distribution chain (fuses, power supplies, etc.)
For example, suppose a fuse blows that supplies power to a PLC input card. The operator will probably see an alarm for every input, an alarm for a faulted card, and then a bunch of process alarms caused by the PLC thinking loops have just gone crazy. Equipment might shut down, cascading into a bunch of other alarms, and soon the operator is staring at a sea of red on the alarm screen wondering just one thing: “Which one needs to be fixed to get the plant running again?” A smart alarming system would only notify the operator that the fuse blew.
A Unique Approach
To solve this problem for a client with tens of thousands of alarms spread across hundreds of sites, we decided to tackle the dependency problem. Our approach was to create metadata for each alarm in the database that contained specific information about each of the above categories. We then created a tool in the SCADA interface itself (Inductive Automation Ignition in this case) that allowed the operators to define parent-child relationships, or “rule sets,” on the fly.
A script runs periodically and evaluates the metadata associated with each active alarm against the rule sets configured by the operator. If a higher “level” alarm is active in any of the associated data categories, the system suppresses the display of the alarm being evaluated and only shows the parent. This allows the creation of complex rule sets, so the operations team can continuously improve the system’s ability to display the most relevant alarm information over time.
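The evaluation pass described above might look roughly like the following. This is a minimal sketch in plain Python, not the script we deployed: the `Alarm` class, the numeric `level` field (higher meaning closer to the root cause), the category names, and the alarm names are all assumptions made for illustration, and a rule set is reduced to a simple list of metadata categories the operator has chosen to compare.

```python
from dataclasses import dataclass, field


@dataclass
class Alarm:
    """Hypothetical record for one active alarm."""
    name: str
    level: int  # assumption: a higher number means a more upstream (parent) alarm
    metadata: dict = field(default_factory=dict)  # category -> value


def visible_alarms(active, rule_categories):
    """Return the active alarms that no higher-level alarm suppresses.

    An alarm is suppressed when another active alarm has a higher level
    and shares a metadata value in at least one category of the rule set.
    """
    shown = []
    for alarm in active:
        suppressed = any(
            other is not alarm
            and other.level > alarm.level
            and any(
                cat in alarm.metadata
                and other.metadata.get(cat) == alarm.metadata[cat]
                for cat in rule_categories
            )
            for other in active
        )
        if not suppressed:
            shown.append(alarm)
    return shown


# Illustrative data loosely based on the blown-fuse scenario.
active = [
    Alarm("Fuse F12 blown", 3, {"power": "F12"}),
    Alarm("Input card 3 fault", 2, {"power": "F12", "io": "Card 3"}),
    Alarm("Flow loop bad input", 1, {"power": "F12", "io": "Card 3"}),
    Alarm("Tank 7 high level", 1, {"power": "F20", "io": "Card 9"}),
]

shown_names = [a.name for a in visible_alarms(active, ["power", "io"])]
print(shown_names)
```

With this data, the card fault and the loop alarm are hidden behind the fuse alarm because they share its power source, while the unrelated tank alarm stays visible. In a real system the pass would run on a timer against the alarm database, and the rule sets would carry more detail than a list of category names.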
To summarize, the metadata for each alarm determines the immediate context in which that alarm should be considered valid. For example, a process alarm might be configured as being in Plant 1 and fed by Breaker A45. The system would know to only display this alarm if the operator is interested in Plant 1 alarms and Breaker A45 is not in alarm.
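The Plant 1 and Breaker A45 example can be written as a single check. The sketch below is an illustration under assumptions: the dictionary keys `area` and `fed_by`, the function name `should_display`, and the pump alarm itself are hypothetical and not taken from the client system.

```python
def should_display(alarm, areas_of_interest, alarms_in_fault):
    """Show an alarm only if it is in scope and its power feed is healthy.

    alarm             -- dict with hypothetical keys "area" and "fed_by"
    areas_of_interest -- set of areas the operator wants to see
    alarms_in_fault   -- set of upstream devices currently in alarm
    """
    in_scope = alarm["area"] in areas_of_interest
    feed_healthy = alarm["fed_by"] not in alarms_in_fault
    return in_scope and feed_healthy


# Hypothetical process alarm located in Plant 1 and fed by Breaker A45.
pump_alarm = {"name": "Pump low flow", "area": "Plant 1", "fed_by": "Breaker A45"}

normal_view = should_display(pump_alarm, {"Plant 1"}, set())
breaker_tripped = should_display(pump_alarm, {"Plant 1"}, {"Breaker A45"})
other_plant = should_display(pump_alarm, {"Plant 2"}, set())
```

Here `normal_view` is true, while `breaker_tripped` and `other_plant` are both false: the alarm is hidden either because its breaker is the real problem or because the operator is not watching that plant.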
Display and Notification
Determining the most important alarms is definitely most of the battle, but notifying the operator on screen or through remote notification tools is also an important and necessary step. Read our white paper on SCADA UI design for some valuable tips on maximizing operator awareness through effective screen design.