Nasuni Edge Appliance Eviction Algorithm


Understanding the Eviction Algorithm

This section describes which protected data is kept in the cache and which protected data is evicted when space is needed.

As users access files in the UniFS™ filesystem, the date and time of the most recent access to each file is updated in the database. The eviction algorithm is mostly a simple “least recently used” (LRU) check across all volumes, using this access time.

Across all volumes, all evictable files are sorted by last access time, oldest first, and then by size within each hour. The oldest files are evicted first. If files are the same age (that is, last accessed within the same hour), larger files are evicted before smaller files of the same age.

To be evictable, a file cannot be open and must already be protected in the cloud.

Here are some examples:

  • Files last accessed a month ago appear in the eviction list before files last accessed a week ago, because the month-ago files are older.

  • Within any given hour of the list, the files are also sorted by size. So bigger files that were last accessed between 1 PM and 2 PM on 1/1/2024 are evicted before smaller files that were also last accessed between 1 PM and 2 PM on the same date.

  • As another example, a 2 KB file last accessed between 1 PM and 2 PM on 1/1/2024 is evicted before a 2 GB file last accessed between 3 PM and 4 PM on 1/1/2024, because the 2 KB file is slightly older than the 2 GB file.
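The ordering in these examples can be sketched as a two-part sort key. This is only an illustration, not the appliance's actual implementation: the `CachedFile` record and its field names are hypothetical, and it assumes that "within the same hour" means the same clock hour, as in the 1 PM to 2 PM examples above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CachedFile:
    # Hypothetical record; field names are illustrative only.
    name: str
    size_bytes: int
    last_access: datetime
    is_open: bool = False
    is_protected: bool = True


def eviction_order(files):
    """Return evictable files, first to be evicted first."""
    # A file is evictable only if it is closed and already protected in the cloud.
    evictable = [f for f in files if not f.is_open and f.is_protected]
    # Primary key: last access truncated to the clock hour (oldest first).
    # Secondary key: size, largest first, within the same hour.
    return sorted(
        evictable,
        key=lambda f: (
            f.last_access.replace(minute=0, second=0, microsecond=0),
            -f.size_bytes,
        ),
    )


files = [
    CachedFile("small_1pm.txt", 2 * 1024, datetime(2024, 1, 1, 13, 40)),
    CachedFile("large_3pm.bin", 2 * 1024**3, datetime(2024, 1, 1, 15, 5)),
    CachedFile("large_1pm.bin", 5 * 1024**2, datetime(2024, 1, 1, 13, 10)),
    CachedFile("open_old.doc", 1024, datetime(2023, 12, 1, 9, 0), is_open=True),
]
order = [f.name for f in eviction_order(files)]
print(order)  # ['large_1pm.bin', 'small_1pm.txt', 'large_3pm.bin']
```

The open file is skipped even though it is the oldest, and the 2 KB file from the 1 PM hour still precedes the 2 GB file from the 3 PM hour, because size only breaks ties within an hour.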

Some definitions:

  • Eviction Target: Eviction continues until the cache usage is less than or equal to the eviction target. By default, the eviction target is 70 percent, meaning that eviction continues until the cache usage is less than or equal to 70 percent.

  • Eviction Threshold: Eviction begins when the cache usage is greater than the eviction threshold. By default, the eviction threshold is 85 percent, meaning that eviction begins when the cache usage is greater than 85 percent.

The Edge Appliance evicts using the method described above in the following order until the eviction target (default: 70 percent) is met:

  • Manifests.

  • Snapshot files: files in the cache from previous versions.

  • Snapshot directories: namely, previous metadata.

  • Current files: current version of files in the cache.

  • Current directories: namely, current metadata.

If cache usage exceeds the eviction threshold (default: 85 percent), data is evicted until cache usage is at or below the eviction target (default: 70 percent).
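The interaction between the threshold, the target, and the category order can be sketched as follows. This is a simplified model under stated assumptions: the category keys, the `run_eviction` function, and the idea of expressing each category's evictable data as a percentage of cache size are all hypothetical, and the real appliance evicts individual files rather than percentages.

```python
EVICTION_THRESHOLD = 85.0  # default: eviction begins above this usage
EVICTION_TARGET = 70.0     # default: eviction continues until usage is at or below this

# Category order from the list above; the key names are illustrative.
EVICTION_ORDER = [
    "manifests",
    "snapshot_files",
    "snapshot_directories",
    "current_files",
    "current_directories",
]


def run_eviction(usage_pct, evictable_pct_by_category):
    """Return (final usage, [(category, percent freed), ...])."""
    freed = []
    if usage_pct <= EVICTION_THRESHOLD:
        return usage_pct, freed  # usage must exceed the threshold to start
    for category in EVICTION_ORDER:
        if usage_pct <= EVICTION_TARGET:
            break  # target met; later categories are left in the cache
        available = evictable_pct_by_category.get(category, 0.0)
        take = min(available, usage_pct - EVICTION_TARGET)
        if take > 0:
            usage_pct -= take
            freed.append((category, take))
    return usage_pct, freed


evictable = {
    "manifests": 1.0,
    "snapshot_files": 10.0,
    "snapshot_directories": 2.0,
    "current_files": 30.0,
    "current_directories": 5.0,
}
final_usage, freed = run_eviction(88.0, evictable)
print(final_usage)  # 70.0
print(freed)
```

In this example, current directories are never touched because the target is reached partway through current files. A cache at 80 percent is above the target but below the threshold, so `run_eviction(80.0, evictable)` frees nothing.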

Notifications occur when eviction starts, when eviction is complete, and if performing the eviction did not reduce cache usage to or below the eviction target.

Protecting unprotected data before eviction from cache

Regardless of the cache reserve setting, if the cache data is unprotected, the Edge Appliance cannot evict the data. The Edge Appliance must protect the unprotected cache data in the cloud first, and then evict the data and its metadata from the cache. This prevents data loss.

Note: If processing of unprotected metadata in the metadata push phase is failing (see Snapshot Processing), you might need to wait until the metadata push phase completes. This can take hours, depending on how much data has been ingested.

Tip: To verify that a snapshot has been completed (both data phase and metadata phase), see Appendix: Verifying Snapshots.

The Edge Appliance evicts data as fast as it can, but eviction is not faster than data ingestion.

Also, after data eviction is complete, metadata eviction for the evicted data proceeds. This process is slower than data eviction because metadata consists of many small files that must be processed. If metadata eviction is still in progress, it can delay the start of the next data eviction.

Monitoring cache utilization is especially important during data ingestion. If cache utilization exceeds 85 percent, you can pause data ingestion until cache utilization is reduced. This can help processing to proceed more efficiently.

In addition, consider changing the cache reserve setting before performing data ingestion. The cache reserve setting is the percentage of local cache space reserved for new, incoming data.

For example, suppose that you want to ingest 1 TB of data and that your cache is 5 TB. You therefore want at least 20 percent (1 TB / 5 TB) of the cache to be available for data ingestion. Since the eviction threshold is 85 percent, the maximum practical level of utilization is 65 percent (85 - 20) to prevent eviction from running while data is being ingested. If current cache usage is above this target value of 65 percent, use the following formula to calculate the value of Cache Reserve you need to set. In the formula, “target” is this target utilization (65 in this example), not the eviction target, and all values are percentages.

    Cache Reserve = 100 - min((target + 15), (current cache usage - 1))

Therefore, if current cache usage were 70 percent, you would set Cache Reserve to
100 - min(65 + 15, 70 - 1) = 100 - 69 = 31.

Or, if current cache usage were 82 percent, you would set Cache Reserve to
100 - min(65 + 15, 82 - 1) = 100 - 80 = 20.

Eviction would then begin, and you could then wait for the “Eviction complete” notification.

When cache utilization is at or below 65 percent, you could then reset the Cache Reserve setting to Automatic and begin data ingestion again.

This ensures that eviction does not need to run during data ingestion.
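The Cache Reserve calculation above can be checked with a few lines of Python. This is a minimal sketch of the arithmetic only: the function names `target_usage` and `cache_reserve` are made up for illustration and do not correspond to any Nasuni API, and the constant 15 is taken directly from the formula above.

```python
def target_usage(ingest_tb, cache_tb, eviction_threshold_pct=85):
    """Highest starting utilization (percent) that leaves room for the ingest."""
    return eviction_threshold_pct - 100 * ingest_tb / cache_tb


def cache_reserve(current_usage_pct, target_pct):
    """Cache Reserve = 100 - min((target + 15), (current cache usage - 1))."""
    return 100 - min(target_pct + 15, current_usage_pct - 1)


target = target_usage(ingest_tb=1, cache_tb=5)  # 85 - 20 = 65
print(target)
print(cache_reserve(70, target))  # 100 - 69 = 31
print(cache_reserve(82, target))  # 100 - 80 = 20
```

Both worked examples from the text are reproduced: a current usage of 70 percent gives a Cache Reserve of 31, and a current usage of 82 percent gives 20.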

Appendix: Verifying Snapshots

A snapshot is a complete picture of the files and directories in your file system at a specific point in time. Snapshots are either manually initiated, or automatically performed as part of a Snapshot Schedule that you specify.

The snapshot process includes saving both the data and the associated metadata to cloud object storage. For this reason, a snapshot consists of both a data phase (sometimes called “phase 1”) and a metadata phase (sometimes called “phase 2”). To be sure that data is protected in the cloud, both phases of each snapshot must complete successfully. Only then can you be certain that no unprotected data remains in the cache.

Various procedures, including the recovery of an Edge Appliance, require you to perform a snapshot, and to then verify that the snapshot has completed successfully. This ensures that no unprotected data remains in the cache.

This section describes how to verify that a snapshot has completed successfully.

Verifying that a snapshot completed successfully

To verify that a snapshot has completed successfully, follow these steps:

  1. Log in to the NMC.

  2. Click the bell-shaped Notifications icon at the top right.

  3. Click View all Notifications. The Notifications page appears.

  4. In the Filter text box, type “snapshot”, then click Apply Filter.
    The list is limited to notifications that include the word “snapshot”.

  5. For the most recent snapshot, find the “Snapshot started” notification for your Edge Appliance and for your volume that contains the label “Data”.
    For that notification, find the corresponding “Snapshot completed” notification for the same Edge Appliance, volume, and version number.
    This verifies that the data phase of this snapshot completed.

  6. Similarly, for the most recent snapshot, find the “Snapshot started” notification for your Edge Appliance and for your volume that contains the label “Metadata”.
    For that notification, find the corresponding “Snapshot completed” notification for the same Edge Appliance, volume, and version number.
    This verifies that the metadata phase of this snapshot completed.
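The matching logic in the last two steps can be sketched in code. This is purely illustrative: the `Notification` record and its fields are hypothetical stand-ins for what you read on the Notifications page, and no NMC API or export format is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    # Hypothetical record transcribed from the Notifications page.
    appliance: str
    volume: str
    version: int
    phase: str  # "Data" or "Metadata"
    event: str  # "started" or "completed"


def snapshot_verified(notifications, appliance, volume):
    """True if the most recent snapshot started and completed both phases."""
    relevant = [
        n for n in notifications
        if n.appliance == appliance and n.volume == volume
    ]
    started = [n for n in relevant if n.event == "started"]
    if not started:
        return False
    latest = max(n.version for n in started)  # most recent snapshot version
    seen = {(n.phase, n.event) for n in relevant if n.version == latest}
    # Both phases need a "started" and a matching "completed" notification.
    return all(
        (phase, event) in seen
        for phase in ("Data", "Metadata")
        for event in ("started", "completed")
    )


notes = [
    Notification("edge1", "vol1", 42, "Data", "started"),
    Notification("edge1", "vol1", 42, "Data", "completed"),
    Notification("edge1", "vol1", 42, "Metadata", "started"),
]
print(snapshot_verified(notes, "edge1", "vol1"))  # False: metadata phase not completed

notes.append(Notification("edge1", "vol1", 42, "Metadata", "completed"))
print(snapshot_verified(notes, "edge1", "vol1"))  # True
```

The first call returns False because a completed data phase alone does not mean the snapshot has completed.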

Unprotected Files list

The Unprotected Files list on the Edge Appliance UI or the NMC is not sufficient verification that a snapshot has completed. The files in the Unprotected Files list are not yet protected, so any snapshots containing any of those files have not completed.

However, even if the Unprotected Files list has no files in it, that does not mean that all snapshots have completed. It could be, for example, that the data phase of a snapshot has completed, but that the metadata phase has not completed.

“New Data in Cache (not yet protected)” chart

The “New Data in Cache (not yet protected)” chart on the Edge Appliance UI is not sufficient verification that a snapshot has completed. The data shown in the “New Data in Cache (not yet protected)” chart is not yet protected, so any snapshots containing any of that data have not completed.

However, even if the “New Data in Cache (not yet protected)” chart shows no data, that does not mean that all snapshots have completed. It could be, for example, that the data phase of a snapshot has completed, but that the metadata phase has not completed.

Copyright © 2010-2024 Nasuni Corporation. All rights reserved.