Overview
When a logical group of two or more Nasuni Edge Appliances (NEAs) is deployed, DNS is one way to offer clients active-standby resiliency. The same mechanism can also be used during planned maintenance.
The intent of this guide is to ensure that, during a planned maintenance event, Nasuni Edge Appliance snapshot and sync remain consistent, so that the risk of conflicts and other file integrity issues is minimized.
The duration of a planned minimally disruptive upgrade (MDU) partly depends on how long it takes to drain active connections and perform the final synchronization of the volume.
Desired Outcomes of Procedure
The following are the desired outcomes of the procedure:
Windows Clients
Completely idle or disconnected Windows clients should automatically begin operations using the updated DNS record information. When they begin to perform I/O under failover, normal behavior is expected.
Connected clients with intermittent activity (for example, an open Explorer window on the Nasuni share) might be briefly affected during the client reset, but Windows has been observed to recover automatically.
Clients with read or write operations “in-flight” are expected to continue under the previous DNS record until they are interrupted. The Windows CIFS client should eventually pick up the change made in DNS and negotiate a new connection to the standby NEA using the original namespace.
Higher-level application recovery behavior can vary, depending on the application's architecture and the type of SMB operations taking place. User actions might be needed to remedy the interruption.
Non-Windows Clients
Linux: Depending on the Linux distribution and CIFS client, results range from switching automatically to requiring a manual reconnect.
macOS: macOS 12 has been shown to stay attached to the primary NEA address. After a timeout of 30 seconds, user intervention to reconnect from Finder was needed to restore the connection.
Nasuni Global Locking
The act of disconnecting clients on a given NEA also releases any file locks that a client has placed. The release of the lock also propagates to Nasuni’s Global Locking service.
The outcome from forcefully releasing a lock varies, depending on the workflow.
For example, an MS Office document can continue to be edited and eventually be saved to either the original NEA or the secondary NEA during the MDU. However, this creates a time window in which the document can be re-opened by another user, risking a conflict or data corruption.
Important: Nasuni’s Global Locking technology can make it difficult to restart an interrupted new write operation. Global Locking places an exclusive lock on files that are being created. Therefore, an interrupted write leaves behind a file artifact and a lock, and that lock blocks any retry of the save until it clears.
Nasuni Global File Acceleration
When Global File Acceleration (GFA) is employed, snapshot scheduling falls under the control of the Nasuni GFA Manager rather than a user-defined process. The snapshot frequency determined by the GFA Manager might not align with what the maintenance window requires. If administrator control of snapshot creation is necessary, disable GFA first, perform the snapshots from the NMC as described in the steps, then re-enable GFA after the maintenance event has concluded.
Requirements for Procedure
The following are required for performing the procedure:
One or more shared volumes between NEAs participating in an active-standby capacity.
An established DNS namespace with an active-standby approach for the group or zone of Edge Appliances, which CIFS clients are referencing as their SMB server.
DNS capabilities for allowing a relatively quick failover, including the following:
At minimum, an address (A) or alias (CNAME) record with a relatively low TTL, deemed acceptable for clients to learn of a record update during the MDU.
A variety of cloud providers and DNS software vendors offer the capability to update DNS automatically during outage scenarios.
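The effect of the TTL requirement on the maintenance window can be estimated with simple arithmetic. The following is a minimal sketch, not part of any Nasuni tooling: the function names and the drain budget are hypothetical, and it assumes that resolvers and clients honor the record TTL. A resolver that cached the old record just before the change may keep serving it for up to one full TTL, so the TTL needs to fit inside the time allowed for clients to move.

```python
from datetime import datetime, timedelta, timezone


def latest_stale_answer(record_changed_at: datetime, ttl_seconds: int) -> datetime:
    """Latest time a cache that honors the TTL could still serve the old record.

    Worst case: the old record was cached immediately before the change,
    so it stays cached for one full TTL after the change.
    """
    if ttl_seconds < 0:
        raise ValueError("ttl_seconds must not be negative")
    return record_changed_at + timedelta(seconds=ttl_seconds)


def ttl_fits_window(ttl_seconds: int, drain_budget_seconds: int) -> bool:
    """True if the TTL is no longer than the time budgeted for clients to move.

    drain_budget_seconds is a hypothetical, site-specific value chosen by
    the administrator; it is not a Nasuni setting.
    """
    return ttl_seconds <= drain_budget_seconds


if __name__ == "__main__":
    changed = datetime(2024, 1, 1, 2, 0, 0, tzinfo=timezone.utc)
    print(latest_stale_answer(changed, 300).isoformat())
    print(ttl_fits_window(ttl_seconds=300, drain_budget_seconds=600))
```

Clients that already hold an open SMB connection do not re-resolve the name until that connection is interrupted, so this estimate covers only DNS caching, not connection draining.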
Overview of Failover and Update Procedure
Optional: Consider first performing the update on the standby NEAs.
Perform a snapshot on the active NEA, or ensure that a recent snapshot has completed there, and confirm that it has synced to the standby NEA.
On the active NEA, make the shares Read Only.
Making the active NEA shares Read Only prevents clients that attempt to re-home to the primary NEA, after having their connection reset, from being able to write new data.
Modify the DNS namespace record to begin moving clients to the standby NEA.
Reset CIFS clients on the previously active NEA.
Perform a final snapshot and sync on the active NEA volumes.
Complete the active NEA update.
Overview of Failback after Update
Revert the primary NEA shares to Read/Write.
Perform a snapshot on the standby NEA, or ensure that a recent snapshot has completed there, and confirm that it has synced to the active NEA.
Modify the DNS namespace record to begin moving clients back to the active NEA.
Reset CIFS clients on the standby NEA.
Perform a final snapshot and sync on the standby NEA volumes.
Failover and Update Procedure
Optional: Update on the standby NEAs. Follow the steps of the update procedure in the Edge Appliance Administration Guide.
Perform a snapshot on the active NEA, or ensure that a recent snapshot has completed there and has synced to the standby NEA, by following these steps:
On the NMC, click Volumes.
Expand the relevant Volumes to view the time of the Last Snapshot.
Manually perform a new snapshot as needed.
On the active NEA, make the shares Read Only, by following these steps:
From the list of Volumes, select the Volumes shared between the active-standby NEAs.
Click Shares.
Select shares to edit on the active NEA, then click Edit.
Important: Ensure that only shares on the NEA targeted for client disconnection and the MDU are selected.
Check the Read Only box.
Click Update Share.
Modify the DNS namespace record to begin moving clients to the standby NEA.
Using the DNS service where the original NEA group namespace resides, edit the appropriate record to start referring clients to the standby NEA.
Important: Verify that an appropriately low TTL is already in place. If the TTL was not already low, lowering it during this process is not sufficient, because client machines keep the previously cached record until the old TTL expires.
Reset the CIFS clients on the previously active NEA, by following these steps:
On the NMC, click Filers.
Click CIFS Clients.
Select Reset All Clients.
Select the active NEA and then click Reset All Clients.
Repeat step 2 to perform a final snapshot and sync on the active NEA volumes.
Complete the active NEA update procedure in the Edge Appliance Administration Guide.
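After the DNS namespace record is modified in the procedure above, it can be useful to confirm from a client machine that the namespace now resolves to the standby NEA before resetting CIFS clients. The following is a minimal sketch under stated assumptions: the hostname and addresses are placeholders, the function names are hypothetical, and only IPv4 is checked. It uses the local resolver of the machine on which it runs, so it reflects that machine's DNS cache only, not the state of every client.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Resolve a hostname through the local resolver and return its IPv4 addresses."""
    # Port 445 (SMB) is passed only to satisfy getaddrinfo; no connection is made.
    infos = socket.getaddrinfo(hostname, 445, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return {info[4][0] for info in infos}


def points_at_standby(
    hostname: str,
    standby_ips: Iterable[str],
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> bool:
    """True only if the name resolves and every resolved address is a standby address."""
    resolved = resolver(hostname)
    return bool(resolved) and resolved <= set(standby_ips)


if __name__ == "__main__":
    # Placeholder values: substitute the real namespace and standby NEA address.
    namespace = "files.example.com"
    standby = ["192.0.2.20"]
    try:
        print(points_at_standby(namespace, standby))
    except socket.gaierror as exc:
        print(f"Could not resolve {namespace}: {exc}")
```

The resolver is passed in as a parameter so that the comparison logic can be exercised without network access. A result of False may simply mean that the old record is still cached locally and its TTL has not yet expired.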