Overview
A snapshot is a complete picture of the files and folders in your file system at a specific point in time. Snapshots include new data or data that has changed since the last snapshot. Snapshots offer data protection by enabling you to recover deleted files or directories, or to restore an entire file system. After a snapshot has been taken and is sent to object storage, it is not possible to modify that snapshot.
By default, all snapshots are retained forever in object storage. However, for compliance purposes or your own best practices, you can specify to have already-deleted data from older snapshots permanently removed from object storage. Snapshot retention and removal is the process for retaining wanted data in object storage and removing unwanted data from object storage. This phrase is usually shortened to “Snapshot Retention”.
Note: In this article we use the terms “delete” and “remove” specifically to mean the following:
Delete: Refers to the delete action on files and directories by users and their applications.
Remove: Refers to the action taken by the Snapshot Retention process to permanently remove eligible data from object storage.
Important: When data is removed by Snapshot Retention, it is permanently removed from object storage and cannot be recovered. A POSSIBLE EXCEPTION is when “soft delete” is configured on the object storage platform. In that case, it might be possible for Nasuni Support to recover data that has been removed. For more details, see “Soft delete” policies.
To learn more about how Snapshot Retention performs the removal, see How data removal works within Snapshot Retention.
Eligible Data for Snapshot Retention Removal
The Retention Policy of the volume defines when data can be eligible for removal by Snapshot Retention. For example, a retention policy of 1 year for the volume means that files with the following features are eligible for removal by Snapshot Retention:
Created BEFORE 1 year ago.
Deleted BEFORE 1 year ago.
Given a 1-year retention policy and regularly occurring daily snapshots by the volume-owning Nasuni Edge Appliance (NEA), here are some general examples:
A file created 2 years ago and deleted 53 weeks ago. Eligible for removal, because it was deleted more than 1 year ago and therefore does not appear in any snapshot taken within the past year.
A file created 1.5 years ago and never changed thereafter, but not deleted. Not eligible, because the file has not been deleted.
A file deleted 9 months ago. Not eligible. Although it was deleted 9 months ago, it still existed within the customer’s retention period of 1 year.
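The three examples above can be expressed as a small check. This is only an illustrative sketch, not the product's implementation: the function name, its parameters, and the use of a fixed 365-day period are assumptions made for the example, and the check ignores the boundary-snapshot timing that determines when removal actually happens.

```python
from datetime import datetime, timedelta
from typing import Optional


def is_eligible_for_removal(
    created_at: datetime,
    deleted_at: Optional[datetime],
    now: datetime,
    retention: timedelta,
) -> bool:
    """Hypothetical helper: True if a file was created AND deleted before the cutoff."""
    if deleted_at is None:
        return False  # never deleted, so it is still part of the file system
    cutoff = now - retention
    return created_at < cutoff and deleted_at < cutoff


now = datetime(2025, 1, 1)
one_year = timedelta(days=365)  # assumption: a plain 365-day year for illustration

# Created 2 years ago, deleted 53 weeks ago.
example_1 = is_eligible_for_removal(
    now - timedelta(days=730), now - timedelta(weeks=53), now, one_year
)
# Created 1.5 years ago, never deleted.
example_2 = is_eligible_for_removal(now - timedelta(days=548), None, now, one_year)
# Created 2 years ago, deleted 9 months ago.
example_3 = is_eligible_for_removal(
    now - timedelta(days=730), now - timedelta(days=270), now, one_year
)

print(example_1, example_2, example_3)  # True False False
```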
Note: The removal of eligible data by Snapshot Retention is very unlikely to occur exactly on the date/time after the retention period has expired. The removal process is impacted by when snapshots occur and, in particular, a concept called boundary snapshots, which play a critical role in the timing and operation of Snapshot Retention. We cover boundary snapshots next.
About Boundary Snapshots
Recall that a snapshot version holds a complete view of the data as of that point in time. “Boundary” snapshots are a designated subset of all snapshot versions. They are snapshots that have been marked as boundaries in time. The goal is for these boundary snapshots to occur at roughly regular intervals. The purpose of boundary snapshots is to prevent the Snapshot Retention process from removing any data that was deleted more recently than the customer’s desired data retention period, just in case the customer might still want to restore it at some point in the future. Think of the Retention Period as the amount of time during which the customer might wish to still be able to restore deleted data.
During Snapshot Retention processing, the newest existing boundary snapshot that is older than the volume’s defined retention period serves as the “stop sign” for the processing. Snapshot Retention removes any data that is eligible for removal from all snapshots older than, but not including, this boundary. Thus, all data that was still represented in the file system in this boundary snapshot, or in later snapshots, is protected from removal during the Snapshot Retention processing.
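As a rough illustration of the “stop sign” rule, the sketch below picks the newest boundary snapshot that is older than the retention period and lists the versions older than it. The `Snapshot` type, the function names, and the sample dates are hypothetical and exist only for this example; the actual process is internal to the product.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional


class Snapshot(NamedTuple):
    version: int
    taken_at: datetime
    is_boundary: bool


def find_stop_boundary(
    snapshots: List[Snapshot], now: datetime, retention: timedelta
) -> Optional[Snapshot]:
    """Newest boundary snapshot that is older than the retention period, if any."""
    cutoff = now - retention
    older = [s for s in snapshots if s.is_boundary and s.taken_at < cutoff]
    return max(older, key=lambda s: s.taken_at, default=None)


def versions_to_process(
    snapshots: List[Snapshot], now: datetime, retention: timedelta
) -> List[int]:
    """Versions older than, but not including, the stop boundary."""
    stop = find_stop_boundary(snapshots, now, retention)
    if stop is None:
        return []  # no boundary is old enough yet, so nothing is processed
    return sorted(s.version for s in snapshots if s.version < stop.version)


# Made-up sample data: boundaries at versions 2, 4, and 6.
snapshots = [
    Snapshot(1, datetime(2023, 12, 15), False),
    Snapshot(2, datetime(2024, 1, 1), True),
    Snapshot(3, datetime(2024, 1, 15), False),
    Snapshot(4, datetime(2024, 1, 31), True),
    Snapshot(5, datetime(2024, 2, 15), False),
    Snapshot(6, datetime(2024, 3, 1), True),
]
now = datetime(2025, 2, 10)
retention = timedelta(days=365)

stop = find_stop_boundary(snapshots, now, retention)
print(stop.version)                                    # 4
print(versions_to_process(snapshots, now, retention))  # [1, 2, 3]
```

In this sample, version 4 and everything after it stay protected, even though version 5 was also taken roughly a year before the run.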
How Boundary Snapshots Are Created
The customer determines:
On which NEAs snapshots occur for their volumes.
When snapshots occur for their volumes.
For Snapshot Retention to operate as expected, it is critical that snapshots are being actively taken by the volume-owning NEA.
Note: Snapshots must be occurring for Snapshot Retention processing to run. The snapshot process on the volume-owning NEA actually initiates the Snapshot Retention processing.
Boundary snapshots are created in one of 2 ways, depending upon the volume’s configuration: using the Set Number of Snapshots method or using the Snapshots within a Range method.
Boundary Snapshots created using the Set Number of Snapshots method
With the Set Number of Snapshots method, the customer chooses to keep a certain number, X, of the most recent snapshots in object storage. A boundary snapshot is designated every ((X/2) + 1) snapshots. For example, suppose that a customer chooses to keep the latest 10 snapshots (X = 10). A boundary snapshot is then designated every (10/2 + 1) = 6 snapshots.
Note: The Set Number of Snapshots method can only be used on local volumes, not on volumes shared from another NEA.
Boundary Snapshots created using the Snapshots within a Range method
With the Snapshots within a Range method, the customer chooses to keep snapshots in object storage for roughly the Retention Period set on the volume. The time interval between boundary snapshots depends on the Retention Period, and is calculated in two steps:
First, the preliminary time interval between boundary snapshots is (Retention Period in Days / 5). For example, for a Retention Period of 1 year, the preliminary time interval would be 365/5 = 73 days between boundary snapshots.
Second, the time interval will be at least 1 day and at most 30 days. In our example, the time interval between boundary snapshots would be changed from 73 days to 30 days.
Thus, the time interval between boundary snapshots is always between 1 and 30 days, as long as snapshots are occurring often enough to meet the criteria.
Important: For the purposes of time-based Snapshot Retention:
• One day is defined as 86,400 seconds.
• One month is defined as 2,629,743 seconds (30.43 days).
• One year is defined as 31,556,926 seconds (12 months or 365.24 days).
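The two-step interval calculation, combined with the time definitions above, can be sketched as follows. The function and constant names are hypothetical, and performing the arithmetic in seconds is an assumption of this example; the article states the rule in days.

```python
# Time definitions quoted from this article.
SECONDS_PER_UNIT = {"day": 86_400, "month": 2_629_743, "year": 31_556_926}

MIN_INTERVAL = 1 * SECONDS_PER_UNIT["day"]   # lower bound: 1 day
MAX_INTERVAL = 30 * SECONDS_PER_UNIT["day"]  # upper bound: 30 days


def retention_seconds(value: int, unit: str) -> int:
    """Convert a Retention Period such as (1, 'year') to seconds."""
    return value * SECONDS_PER_UNIT[unit]


def boundary_interval_seconds(retention: int) -> float:
    """Step 1: divide the retention period by 5. Step 2: clamp to 1..30 days."""
    preliminary = retention / 5
    return min(max(preliminary, MIN_INTERVAL), MAX_INTERVAL)


one_year = boundary_interval_seconds(retention_seconds(1, "year"))
three_days = boundary_interval_seconds(retention_seconds(3, "day"))

print(one_year / SECONDS_PER_UNIT["day"])    # 30.0 (73 days clamped down to 30)
print(three_days / SECONDS_PER_UNIT["day"])  # 1.0 (0.6 days clamped up to 1)
```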
Note: In older versions of the product (before 9.14), the time interval between boundary snapshots was equal to the retention period. However, the maximum time interval between boundary snapshots was 6 months.
For example, if the customer set the retention period for the volume to 1 year, then new boundary snapshots would be created roughly every 6 months.
Regardless of when and how any particular boundary snapshot is marked, the Snapshot Retention process uses the boundary snapshot in the same way.
How Snapshot Retention Works
We have seen how data becomes eligible for removal by Snapshot Retention and how some snapshots are marked as boundaries. This section shows how Snapshot Retention uses the volume’s Retention Period and boundary snapshots to identify and remove eligible data during a Snapshot Retention run.
When the customer configures the retention policy for a volume, the next snapshot taken is marked as the first boundary snapshot. The Snapshot Retention process is initiated as part of each snapshot taken on the volume-owning NEA.
Note: The Snapshot Retention process only runs on the Edge Appliance that owns the volume and can only run on one volume at a time. If there are multiple volumes owned by an Edge Appliance, the process runs sequentially on each volume with each volume’s own Snapshot Retention policy.
Important: Every Edge Appliance connected to a volume must be online and synced before the Snapshot Retention process can run. Otherwise, data might be removed that other Edge Appliances connected to that volume might need.
Here are two examples of snapshot boundary creation in conjunction with a run of Snapshot Retention.
Example: Set Number of Snapshots method – Keep 10 Latest Snapshots
Suppose that a customer chooses to keep the latest 10 snapshots. A boundary snapshot is designated every ((X/2) + 1) snapshots, where X is the number of snapshots to keep. For X = 10, a boundary snapshot is designated every 6th snapshot.
Suppose that we create a Snapshot Retention policy to keep the latest 10 snapshots, when the volume already has 45 snapshot versions.
Version 46 is immediately designated as the first boundary snapshot.
Version 52 is the 6th snapshot version after the previous boundary, so snapshot version 52 is designated as a boundary snapshot. No versions are removed because there are only 6 versions between now and the first (oldest) boundary snapshot. That is less than the 10 versions of snapshots that the retention policy requires.
Version 56 is the 10th version after the oldest boundary snapshot version 46. At this point, snapshot versions are removed up to but not including boundary snapshot version 46 (namely, snapshot versions 1-45). Versions 46-56 are retained. Any data deleted by users or applications in snapshot versions 1-45 is removed from object storage.
Version 58 is the 6th snapshot version after the previous boundary snapshot, so version 58 is designated as a boundary snapshot. No snapshot versions are removed because removing up to boundary snapshot 52 would only leave 6 snapshot versions, not the minimum of 10 snapshot versions required by the retention policy.
Version 62 is the 10th version after the oldest remaining boundary snapshot version 52. Snapshot versions are removed up to but not including version 52 (namely, snapshot versions 46-51). Versions 52-62 are retained.
Any data deleted by users or applications in snapshot versions 1-51 is removed from object storage.
Version 64 is the 6th snapshot version after the previous boundary snapshot, so snapshot version 64 is designated as a boundary. No snapshot versions are removed because removing up to boundary snapshot 58 would only leave 6 versions remaining, not the minimum of 10 required by the retention policy.
Version 68 is the 10th version after boundary snapshot version 58. Snapshot versions are removed up to but not including snapshot version 58 (namely, versions 52-57). Versions 58-68 are retained.
Any data deleted by users or applications in snapshot versions 1-57 is removed from object storage.
The pattern continues from there.
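The walk-through above can be reproduced with a short simulation. This is a simplified model written only to match the example, not the product's actual logic: the function name is hypothetical, and using integer division for X/2 is an assumption (the article only shows an even X).

```python
from typing import List, Tuple


def simulate_set_number(
    existing_versions: int, keep: int, last_version: int
) -> Tuple[List[int], List[Tuple[int, int, int]]]:
    """Return (boundary versions, removals as (at_version, first, last))."""
    interval = keep // 2 + 1  # boundary every ((X/2) + 1) snapshots
    boundaries: List[int] = []
    removals: List[Tuple[int, int, int]] = []
    oldest_retained = 1

    for version in range(existing_versions + 1, last_version + 1):
        # The first snapshot after the policy is set is a boundary, as is
        # every `interval`-th snapshot after the previous boundary.
        if not boundaries or version - boundaries[-1] == interval:
            boundaries.append(version)

        # Newest boundary that still leaves at least `keep` versions after it.
        stops = [b for b in boundaries if version - b >= keep]
        if stops and stops[-1] > oldest_retained:
            stop = stops[-1]
            removals.append((version, oldest_retained, stop - 1))
            oldest_retained = stop

    return boundaries, removals


boundaries, removals = simulate_set_number(existing_versions=45, keep=10, last_version=68)
print(boundaries)  # [46, 52, 58, 64]
print(removals)    # [(56, 1, 45), (62, 46, 51), (68, 52, 57)]
```

Each removal tuple reads as: at this version, versions first through last are removed, up to but not including the boundary that follows them.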
Example: Snapshots within a Range method
In this example, Snapshot Retention was enabled on January 1, 2024, with a Retention Period of 1 year. This means that no data is eligible for removal until January 1, 2025, which is 1 year after Snapshot Retention was enabled. The preliminary time interval between boundary snapshots is (365 / 5) = 73 days, which exceeds the maximum of 30 days. Therefore, the time interval between boundary snapshots for this volume is 30 days.
A time-based boundary snapshot is placed on January 1, 2024.
A time boundary snapshot is placed 30 days later, on January 31, 2024.
On September 1, 2024, Snapshot Retention has been enabled for only 8 months, so no data is eligible for removal.
On January 1, 2025, some data might now be eligible for removal. Check for deleted data that has not changed in more than 1 year.
This table gives this example in detail:
Date | Snapshot version | Boundary placed? | Snapshot versions eligible for removal | Comment |
1/1/2020 | 1 | No: Snapshot Retention not yet enabled. | None | First snapshot for volume. |
1/1/2024 | 10000 | Yes. Latest snapshot when Snapshot Retention enabled. | None, retention not yet expired. | Snapshot Retention just enabled. Boundary snapshot target: every 30 days. |
1/31/2024 | 11000 | Yes. 30 days boundary snapshot interval. | None, retention not yet expired. | |
3/1/2024 | 12000 | Yes. 30 days boundary snapshot interval. | None, retention not yet expired. | |
3/31/2024 | 13000 | Yes. 30 days boundary snapshot interval. | None, retention not yet expired. | |
4/30/2024 | 14000 | Yes. 30 days boundary snapshot interval. | None, retention not yet expired. | |
... | ... | Pattern continues with new boundary snapshot marked every 30 days. | None, retention not yet expired. | |
12/25/2024 | 21000 | Yes. 30 days boundary snapshot interval. | None, retention not yet expired. | |
1/24/2025 | 22000 | Yes. 30 days boundary snapshot interval. | 1-9999. Leaves boundary snapshot version 10000. | 1 year since boundary snapshot version 10000 (1/1/2024). |
2/23/2025 | 23000 | Yes. 30 days boundary snapshot interval. | 10000-10999. Leaves boundary snapshot version 11000. | 1 year since boundary snapshot version 11000 (1/31/2024). |
Configuring Snapshot Retention
You can create a Snapshot Retention policy using the NMC (Volumes --> Snapshot Retention). The Snapshot Retention process runs on the volume-owning Edge Appliance only.
The Snapshot Retention process can potentially impact user activity performance on the Nasuni Edge Appliance; therefore, by default, the Snapshot Retention process is given a lower priority than other Nasuni Edge Appliance processes. To explore alternatives that reduce the impact on user activity performance or that improve the speed of the Snapshot Retention process, contact Nasuni Technical Support.
Tip: Snapshots cannot be removed if snapshots are disabled and Global File Acceleration is disabled. Snapshots must be occurring for Snapshot Retention to run.
When configuring Snapshot Retention, the following options are available:
Volume: You select the local “owned” volume for which to define Snapshot Retention parameters.
Snapshots to retain: You select which snapshots to retain, from the following options:
All snapshots: (This is the default setting.) Retains all snapshots indefinitely. Snapshot Retention does not remove any snapshots or data. If you require removing older snapshots for compliance or other reasons, do not select this option.
Set number of snapshots: (This option is not available if the selected volume has Remote Access enabled.) You specify the number of the most recent snapshots to retain, from 1 to 1 billion (1,000,000,000).
For example, if you choose to keep 100 snapshots, then the 100 most recent snapshots are retained, and the rest are eligible for removal by Snapshot Retention.
Snapshots within a range: Enter the number of Years, Months, or Days for which you want to retain snapshots.
For example, if you choose to keep two months’ worth of snapshots, then snapshots that were taken before then are removed automatically. See How Snapshot Retention Works for details on data removal eligibility.
Changing the Snapshot Retention retention time
Changing from longer retention time to shorter retention time
When changing from a longer retention time (such as 1 year) to a shorter retention time (such as 1 month), snapshots that are between 1 month and 1 year become eligible for removal after 1 month. This presumes that the 1-year retention was defined and in operation for 1 or more years before changing to the 1-month retention. If the 1-year retention was defined some N number of months fewer than 1 year before the change to 1 month retention, then only the period between 1 month and N months ago becomes eligible for removal after 1 month.
Changing from shorter retention time to longer retention time
Similarly, when changing from a shorter retention time (such as 1 month) to a longer retention time (such as 1 year), snapshots that are between 1 month and 1 year become eligible for removal when the first boundary after that change becomes older than 1 year.
How data removal works within Snapshot Retention
Snapshot Retention runs in several phases.
1. In the first phase, the Snapshot Retention process scans all the versions that are being retained, based on the Snapshot Retention policy, and builds a list of all objects to keep. This is known as the “keep list”. After building this “keep list”, the process hides all versions that are set to be removed.
2. In the second phase, the Snapshot Retention process scans all of these hidden versions and removes from object storage any objects that are not on the “keep list”.
3. Finally, the hidden versions are removed from the system.
Using the “keep” list
When the Snapshot Retention process begins, it creates a “keep” list, based on your configuration for Snapshot Retention. For example, if you have specified to retain a certain number of the most recent snapshots, the object storage objects comprising that snapshot would be part of the “keep” list. Similarly, if you have specified to retain snapshots more recent than a specified date, those object storage objects would also be part of the “keep” list. The “keep” list is made of all the object storage objects belonging to the snapshots that need to be retained.
Building the “keep” list requires bringing the metadata objects for all versions being kept into the cache of the Edge Appliance that owns the volume. This should use very little bandwidth due to the small size of the metadata objects.
Next, the Snapshot Retention process starts going through the snapshot versions eligible for removal. For example, these items might include older versions of files that have been updated, or files that have been deleted from the current filesystem and are no longer in any versions set to be retained.
The Snapshot Retention process compares the snapshot versions eligible for removal to the “keep” list. If an object is on the “keep” list, it is no longer a candidate for removal.
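Conceptually, the comparison against the “keep” list is a set difference. The sketch below illustrates that idea under a simplifying assumption: each snapshot version is modeled as a plain set of object keys. The function names and sample keys are hypothetical, and the real process works from metadata objects rather than in-memory sets.

```python
from typing import Dict, Set


def build_keep_list(retained_versions: Dict[int, Set[str]]) -> Set[str]:
    """Union of every object referenced by any version being retained."""
    keep: Set[str] = set()
    for objects in retained_versions.values():
        keep |= objects
    return keep


def removal_candidates(
    hidden_versions: Dict[int, Set[str]], keep_list: Set[str]
) -> Set[str]:
    """Objects referenced only by versions being removed."""
    candidates: Set[str] = set()
    for objects in hidden_versions.values():
        candidates |= objects - keep_list
    return candidates


# Made-up sample: object "c" is shared by an old and a retained version.
hidden = {1: {"a", "b"}, 2: {"b", "c"}}
retained = {3: {"c", "d"}}

keep_list = build_keep_list(retained)
candidates = removal_candidates(hidden, keep_list)
print(sorted(keep_list))   # ['c', 'd']
print(sorted(candidates))  # ['a', 'b']
```

Object "c" survives because a retained version still references it, even though it also appears in a version that is being removed.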
Removal process
After the Snapshot Retention process establishes a candidate object for removal, it begins the removal process. For each candidate object, the Snapshot Retention process makes API calls to your object storage provider, directing the object storage provider to remove the specified object.
When the Snapshot Retention process receives a success response from your object storage provider that the specified object has been successfully removed, the Snapshot Retention process records that object as removed. For an exception to this, see “Soft delete” policies.
If the Snapshot Retention process receives an indication that your object storage provider did not successfully remove the specified object, it considers the removal process to have failed. A notification is created in your NMC.
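The request-and-confirm pattern described above might look like the following sketch. The `delete_object` callable stands in for whatever call your object storage provider exposes and is purely hypothetical, as are the other names. Whether the real process stops at the first failure or continues is not stated in this article; this sketch continues and reports failures at the end.

```python
from typing import Callable, Iterable, List, Tuple


def remove_candidates(
    candidates: Iterable[str], delete_object: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Ask the store to remove each candidate; return (removed, failed) keys."""
    removed: List[str] = []
    failed: List[str] = []
    for key in candidates:
        try:
            ok = delete_object(key)  # hypothetical: True means the store confirmed removal
        except Exception:
            ok = False  # treat any error as an unsuccessful removal
        (removed if ok else failed).append(key)
    return removed, failed


# Stand-in for an object store, used only to exercise the sketch.
fake_store = {"a": b"...", "b": b"..."}


def fake_delete(key: str) -> bool:
    return fake_store.pop(key, None) is not None


removed, failed = remove_candidates(["a", "b", "missing"], fake_delete)
print(removed)  # ['a', 'b']
print(failed)   # ['missing']
```

An object is recorded as removed only after the store confirms success; a non-empty failure list corresponds to the case in which a notification would be raised.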
Note that, depending on how many versions need to be reviewed in each phase, and the number of objects in each version, the Snapshot Retention process can take a long time.
“Soft delete” policies
If you have enabled a “soft delete” policy with your object storage provider, then the object storage objects removed by Snapshot Retention are not actually deleted immediately. These objects continue to exist in the object store marked for deletion for the amount of time that you have configured in your soft delete policy (typically, 30 days).
Nasuni strongly recommends that customers enable a “soft delete” policy to guard against inadvertent or malicious data deletion directly from the object store bucket or container.
However, from Nasuni’s point of view, this data that has been removed through our Snapshot Retention process no longer exists and cannot be recovered by ordinary means.
The exception is that, if “soft delete” is configured on the object storage platform, it might be possible for Nasuni Support to recover data that has been removed.