Network Monitor Examples

Introduction

The RediGate provides a Network Monitor option (beginning June 2018), allowing various actions to be taken based on changes in network conditions. See the RediGate Configuration Manual for general instructions on configuring the Network Monitor object.

This page details several examples of using the Network Monitor for different applications.

Redundancy Application

Primary/Secondary Route Handling (ping)

In this application, a RediGate might have two networks for redundancy (such as a satellite connection via an Ethernet port and a cellular modem connection).

  • The primary path is Ethernet/satellite, but if that fails, the network should switch over to cellular (ppp0).
  • When switching from Ethernet to cellular, or cellular to Ethernet, some static routes need to be added and/or removed.
  • The switch from primary→secondary is triggered by loss of pings to a server on the primary network.
  • The switch from secondary→primary is triggered by enough consecutive successful pings over the primary network.
  • The SCADA host needs to know whether the RediGate is using its primary or secondary path, which is reported via an RTDB register.

This example requires two Network Monitor configuration objects:

  1. The first NetMon instance does a ping (only when on Path 0, primary) and checks for a number of failed pings on ALL Ping Addresses as the criteria for switching to Path 1.
    The ACTION is one or more 'route' commands that are necessary when on the secondary network.
  2. The second NetMon instance does a ping (only when on Path 1, secondary) and checks for a number of consecutive successful pings to ANY of the Ping Addresses as the criteria for switching back to Path 0.
    The ACTION is, again, one or more 'route' commands that are necessary when on the primary network.

Instance 1

Instance 2

 NOTES:

  • The Ping action (Ping Fail or Ping Good) sends one ping every monitoring interval to each of the Ping Addresses.
  • For PING FAIL, the Criteria must be satisfied for all addresses (three consecutive ping failures on all three addresses) before the route will be switched to secondary (Path 1).
  • For PING GOOD ANY, the Criteria must be satisfied for any address (ten consecutive ping successes on at least one address) before the route will be switched back to primary (Path 0).
  • The Action Text contains the Linux command line or script to be executed in each case. In this example, it is shown as a few "route add" or "route del" commands.
  • The Monitor Register (40001, 40002) contains the ping counts for the monitoring condition - max number of successful or failed pings.
  • The Action Register (note, register 40003 in both cases) contains either a 0 or 1, depending on which route has been selected. This is important, because both processes run independently, but they both should be controlling the same register. Register 40003 can be reported to SCADA to indicate which route is active.
  • The Ping Addresses (one or more) should be chosen in this case as devices that should normally be reachable over the primary network (Path 0).
  • The Interface or Register ("eth0") may be optional, depending on the routing that is configured. If specified, it will ping on the configured Linux interface. If absent, it will ping over whatever is the best network route, but in any case it should be set up to determine the presence or absence of the primary network.
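
The switching rules in the notes above can be sketched as a small simulation. This is only an illustration of the logic, not the RediGate implementation: the class name PingFailover, the action labels, and the way ping results are passed in are all assumptions, and no real pings or 'route' commands are issued. The thresholds (3 failures on ALL addresses, 10 successes on ANY address) are the ones used in this example.

```python
from dataclasses import dataclass, field

PRIMARY, SECONDARY = 0, 1  # values held in the Action Register (40003)


@dataclass
class PingFailover:
    """Hypothetical model of the two ping-based NetMon instances."""
    addresses: list
    fail_threshold: int = 3    # PING FAIL: consecutive failures on ALL addresses
    good_threshold: int = 10   # PING GOOD ANY: consecutive successes on ANY address
    path: int = PRIMARY
    actions: list = field(default_factory=list)

    def __post_init__(self):
        self._reset_counts()

    def _reset_counts(self):
        self.fails = dict.fromkeys(self.addresses, 0)
        self.goods = dict.fromkeys(self.addresses, 0)

    def _switch(self, new_path):
        # Stand-in for the Action Text ('route add' / 'route del' commands).
        self.actions.append("to-secondary" if new_path == SECONDARY else "to-primary")
        self.path = new_path
        self._reset_counts()

    def poll(self, results):
        """One monitoring interval; results maps address -> True if the ping succeeded."""
        if self.path == PRIMARY:
            # Instance 1 runs only while on Path 0.
            for addr in self.addresses:
                self.fails[addr] = 0 if results[addr] else self.fails[addr] + 1
            if all(n >= self.fail_threshold for n in self.fails.values()):
                self._switch(SECONDARY)
        else:
            # Instance 2 runs only while on Path 1.
            for addr in self.addresses:
                self.goods[addr] = self.goods[addr] + 1 if results[addr] else 0
            if any(n >= self.good_threshold for n in self.goods.values()):
                self._switch(PRIMARY)
        return self.path
```

In this sketch, a single reachable address is enough to keep the route on Path 0, while one reachable address with ten consecutive successes is enough to return from Path 1.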


Primary/Secondary Route Handling (register)

As in the previous application, a RediGate might have two networks for redundancy (such as a satellite connection via an Ethernet port and a cellular modem connection). In this case, the route is switchable by writing a 0 or 1 to a control register.

This application (two additional instances of NetMon configuration objects) might be combined with the previous application to allow pings to primarily control the routing, but also allow SCADA to override the setting by writing to a register.

  1. The first NetMon instance checks the route status register (40003; only when on Path 0).
    If the value has been changed to 1, then run SCRIPT to switch to Path 1.
  2. The second NetMon instance checks the route status (40003; only when on Path 1).
    If the value has been changed to 0, then run SCRIPT to switch back to Path 0.

Instance 1

Instance 2

 NOTES:

  • These two NetMon instances and the two in the previous example all use the same register, 40003, as the Action Register; and these two use it as the register that is written to by SCADA. This allows one register to be the control point for determining which route is active, and to change it manually.
  • All four NetMon instances run independently of each other. Each of these two instances checks register 40003 only while the RediGate believes it is on Path 0 (or Path 1). If the value in the register differs from the value expected for that path, it indicates that something else has written to the register to change the route manually.
  • However, because the two Ping instances are still running, the NetMon instances in the previous example will eventually override the manual setting written by SCADA and switch back to the primary (or secondary) route, if the pings fail or succeed as indicated.
  • To make this more deterministic, so that SCADA has final control over manually setting the primary/secondary path, changes and/or additional configuration would be needed.
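
The register-based override described in these notes can be sketched as follows. This is a simplified model under stated assumptions: the class RegisterOverride, the attribute active_path (standing in for whatever internal state NetMon uses to know which path it is on), and the script labels are hypothetical and are not taken from the RediGate configuration.

```python
class RegisterOverride:
    """Hypothetical model of the two register-checking NetMon instances."""

    def __init__(self):
        self.register_40003 = 0   # shared Action Register, also writable by SCADA
        self.active_path = 0      # path the RediGate believes it is using
        self.scripts_run = []     # stand-in for the SCRIPT actions

    def scada_write(self, value):
        # SCADA requests a route change by writing 0 or 1.
        self.register_40003 = value

    def poll(self):
        """One monitoring interval for both register instances."""
        if self.active_path == 0 and self.register_40003 == 1:
            # Instance 1: on Path 0 but the register says 1, so switch.
            self.scripts_run.append("switch-to-path-1")
            self.active_path = 1
        elif self.active_path == 1 and self.register_40003 == 0:
            # Instance 2: on Path 1 but the register says 0, so switch back.
            self.scripts_run.append("switch-to-path-0")
            self.active_path = 0
        return self.active_path
```

The sketch only acts when the register disagrees with the believed path, which is why repeated polls after a switch do not re-run the script.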

Network Status Monitoring

Check whether TCP port is connected

In this example, the RediGate can check whether or not a TCP socket (such as MQTT or Terminal Client) is currently established. If not, it might indicate a problem with the Ethernet connection, and we can try to restart the Ethernet networking to attempt a recovery.

This example requires two Network Monitor configuration objects:

  1. The first NetMon instance checks the established connections on a numbered port (e.g. 1883). If the number of connections is greater than VALUE, it runs a SCRIPT (just a Linux 'sleep' command), which increments a heartbeat counter in register 40012. The increasing value represents a good condition.
  2. The second NetMon instance checks the heartbeat counter in Channel/RTU, register 40012. If the register stops increasing, then restart the Ethernet ports.
    This instance needs to run at a MONITOR Interval that is longer than the interval of the first instance, to ensure the count will have increased normally.
    (To restart just a single Ethernet port instead of all ports, the Instance 2 Action could be a SCRIPT, with Action Text:
    ifdown eth0; sleep 1; ifup eth0   #(use the correct eth_ port number)
    )
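
For readers who want to see what "established connections on a numbered port" means at the Linux level, here is a rough sketch that counts them from text in the /proc/net/tcp format. It is not necessarily how the Network Monitor performs the check; the function name count_established and the sample rows are made up for illustration. It relies on the /proc/net/tcp layout, where ports are hexadecimal and state 01 means ESTABLISHED, and it matches the port on either the local or the remote side.

```python
def count_established(proc_net_tcp: str, port: int) -> int:
    """Count ESTABLISHED sockets whose local or remote port equals `port`."""
    count = 0
    for line in proc_net_tcp.splitlines()[1:]:   # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        if fields[3] == "01" and port in (local_port, remote_port):
            count += 1
    return count


# Made-up sample: one listener on port 22, one established and one
# TIME_WAIT connection to remote port 1883 (0x075B).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0A01A8C0:C350 6401A8C0:075B 01 00000000:00000000 00:00000000 00000000     0        0 1002
   2: 0A01A8C0:C351 6401A8C0:075B 06 00000000:00000000 00:00000000 00000000     0        0 0
"""

established = count_established(SAMPLE, 1883)
```

On a live Linux system the same function could be given the contents of /proc/net/tcp, and the result compared against the configured VALUE.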

Instance 1

Instance 2

(Note that this example could have combined the condition of ESTABLISHED ports with the Restart of Ethernet ports in a single NetMon instance. But that could cause the Ethernet network to restart even when there is a natural reconnection of the TCP socket. By using an increasing heartbeat counter in Instance 1, and configuring Instance 2 to read the counter several times slower than Instance 1, it will avoid premature restarts of the network.)
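
The heartbeat arrangement can be illustrated with a short simulation. It is a sketch under assumptions: the names HeartbeatWatch and simulate, and the 3:1 ratio between the two monitoring intervals, are chosen for illustration only, and the "restart" is just recorded rather than performed.

```python
class HeartbeatWatch:
    """Hypothetical Instance 2 check: report when the counter has not changed."""

    def __init__(self):
        self.last_seen = None

    def stalled(self, counter):
        # The first reading only establishes a baseline.
        unchanged = self.last_seen is not None and counter == self.last_seen
        self.last_seen = counter
        return unchanged


def simulate(connected_by_tick, slow_every=3):
    """Return the ticks at which Instance 2 would restart the Ethernet ports.

    connected_by_tick: one bool per Instance 1 interval, True when the
    number of established connections exceeds VALUE.
    slow_every: Instance 2 runs once per this many Instance 1 intervals.
    """
    register_40012 = 0            # heartbeat counter
    watch = HeartbeatWatch()
    restarts = []
    for tick, connected in enumerate(connected_by_tick, start=1):
        if connected:
            register_40012 += 1   # Instance 1 action increments the counter
        if tick % slow_every == 0 and watch.stalled(register_40012):
            restarts.append(tick)
    return restarts


brief_drop = simulate([True, True, False, True, True, True])
long_drop = simulate([True, True, True] + [False] * 6)
```

In this model a single missed check in brief_drop triggers no restart, because the counter has still advanced by the time Instance 2 reads it, whereas the sustained outage in long_drop is detected at each slow check.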

Also note that a third NetMon instance could be added, which looks at the register value of 40012 (heartbeat counter) at a much less frequent interval, for instance 10 minutes or an hour. If the heartbeat counter doesn't increase by more than a small amount over that interval, this indicates that the Ethernet restarts (NetMon instance 2) aren't working, and the Action could be to reboot the RediGate as a more drastic effort to recover communication, if possible.