When designing environments, we always think about the high availability of the different components. Two is one, one is none! So when designing a ShareFile environment, I want at least two StorageZone Controllers (SZC) for every StorageZone. Because most Citrix environments already contain a NetScaler, my preferred method is to use it for load balancing the StorageZone Controllers as well. Nowadays the Citrix NetScaler has some nice built-in wizards to assist you in deploying the ShareFile configuration.
Two is one, one is none. Load balance the StorageZone Controllers!
The wizard, however, uses the “tcp-default” monitor to check the service state. To decide whether the destination is up, the NetScaler appliance establishes a 3-way handshake with the monitor destination and then closes the connection. Although this is what the wizard configures and is considered a best practice, I believe it is a weak spot in my HA setup. I have seen situations where the SZC server was running, but the ShareFile services were not responding correctly. A simple TCP monitor will not notice this!
The NetScaler ShareFile wizard uses the default tcp monitor
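To confirm which monitor the wizard bound in your own setup, commands along these lines could be run on the NetScaler CLI. This is only a sketch: <serviceGroupName> is a placeholder, since the wizard generates its own names, and the exact output layout may differ per firmware version.
show serviceGroup <serviceGroupName>
show lb monitor tcp-default
The first command should list the monitor bound to the group and the state of each member. The second shows the settings of the default TCP monitor.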
As soon as I stop ShareFile on the StorageZone Controller, the outage is noticed by the ShareFile control plane. The StorageZone status changes from healthy to warning and the internal SZC server is reported as not reachable.
So somehow it must be possible to do an advanced health check on the SZC servers. After checking with ShareFile support, I was advised to use an HTTP-ECV monitor that checks the /heartbeat.aspx URL. A healthy StorageZone Controller responds with “***ONLINE***”.
Creating a new HTTP-ECV monitor can be done from the command line or the GUI. Don’t forget to bind the newly created HTTP-ECV monitor to the ShareFile Services or Service Groups!
Command line syntax:
add lb monitor SZC-Heartbeat HTTP-ECV -send "GET /heartbeat.aspx" -recv "***ONLINE***" -LRTM ENABLED -interval 30 -resptimeout 5 -successRetries 2 -secure YES
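Binding the monitor from the command line could look like the sketch below. <serviceGroupName> and <serviceName> are placeholders for the names in your own configuration (the wizard generates its own names, so check yours first), and only the line that matches how your SZC servers are defined is needed.
bind serviceGroup <serviceGroupName> -monitorName SZC-Heartbeat
bind service <serviceName> -monitorName SZC-Heartbeat
show lb monitor SZC-Heartbeat
The show command is only there to verify that the monitor exists with the intended settings after binding.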
By GUI:
In my opinion, using an HTTP-ECV monitor for the internal StorageZone Controllers is a more reliable method to determine the health status. It checks the actual response returned by the StorageZone Controller service itself, not only whether the OS still accepts a TCP connection.