Data Migration Best Practices


Overview

This document presents recommendations for migrating data to Nasuni. It includes commonly used tools and strategies for various migration scenarios.

Introduction

This document contains information regarding the migration of unstructured data from existing customer storage systems to Nasuni. It includes information about the tools that can be used, preparation work to be performed, and migration strategies for common scenarios. The document assumes a working knowledge of the Nasuni Edge Appliance and Nasuni Management Console (NMC).

Migrating data from the original source to management in the Nasuni system is straightforward but not as simple as copy and paste. An analogy might be provisioning a public library. You do not simply acquire books and put them on the shelves. Each incoming book must be registered, cataloged according to some system, characterized by size and usage, allocated space in coordination with other books, and finally stored in such a way that it is safe and easy to access.

The same is true with data migration. Certain processing is necessary for each file that is migrated. This document is intended to guide you in the considerations necessary for the fastest, and safest, data migration process.

Note: While many customers successfully perform data migrations on their own, some data migration projects can be complicated, either because of the nature of the migrated data or because of how the Edge Appliances are configured or used. In these situations, we recommend contacting Nasuni Professional Services for assistance. Our team has extensive experience with many different kinds of deployments. 

Tips for successful data migration

Here are some of the most important aspects of data migration to keep in mind.

Important: Monitor the cache carefully during data migration. Ensure that the cache does not approach being full.

General overview

  • Using a separate Edge Appliance is recommended for data migration. In particular, DO NOT use a production Edge Appliance for data migration, because this can impact user access to data.

  • Here are some sample workflows:

  • If migrating data to systems running version 9.9 or earlier, for larger caches (larger than 1 TB): Set the Reserved Space Percentage to Automatic. Ingest data until the cache is 85 percent full, then pause ingestion until the cache returns to 70 percent full. Resume ingestion.
    This is not necessary if migrating to version 9.10 or later systems.

  • If migrating data to systems running version 9.9 or earlier, for smaller caches (1 TB or smaller): Free up cache space by setting a high Reserved Space Percentage. Start migration using a normal snapshot schedule. When the cache reaches 50 percent full, slow or pause the migration.
    This is not necessary if migrating to version 9.10 or later systems.

  • Nasuni has the most experience with Robocopy. Other tools are available, but Nasuni cannot provide the same level of support for them. For example, rsync is probably the most popular non-Microsoft tool for NFS; FastCopy and Beyond Compare are also alternatives. Most of this document also applies to non-Robocopy tools.

  • Make sure that you understand the tool and the options you use if you are not relying on Nasuni. If you need assistance using your tools, contact Nasuni Support or Professional Services.

  • When creating volumes, if you only require Windows (SMB) support, NTFS Exclusive mode is recommended. If you need NFS or FTP support, Nasuni recommends NTFS Compatible mode.
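For systems running version 9.9 or earlier, the high-water/low-water ingestion cycle described above can be sketched as a simple decision helper. The 85 and 70 percent thresholds come from the guidance above; the function name and how you obtain cache utilization (for example, from your monitoring setup or the NMC API) are hypothetical.

```python
# Sketch of the pre-9.10 ingestion cycle: ingest until the cache is
# 85 percent full, pause until eviction brings it back to 70 percent,
# then resume. Threshold values are from the guidance above; how you
# read cache utilization is deployment-specific.

HIGH_WATER = 0.85  # pause ingestion at 85% cache utilization
LOW_WATER = 0.70   # resume once the cache drains to 70%

def next_action(cache_utilization: float, currently_paused: bool) -> str:
    """Decide whether to ingest, pause, or stay paused."""
    if currently_paused:
        return "resume" if cache_utilization <= LOW_WATER else "stay-paused"
    return "pause" if cache_utilization >= HIGH_WATER else "ingest"
```

For example, at 90 percent utilization with ingestion running, the helper returns "pause"; once paused, it keeps returning "stay-paused" until utilization drops to 70 percent.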

Before performing data migration

  • Before migrating data, perform cleanup, re-ACL, and organizational change operations.

  • The cache size must be larger than the largest file that you plan to migrate.

  • The COW (copy-on-write, or snapshot) disk should be at least 1/4 of the size of the cache disk or 250 GiB, whichever is smaller. Do not attempt data ingestion during snapshots unless the data size is less than the COW size; otherwise, you risk COW overflow.

  • For cache sizing, VMDK files require their configured (provisioned) size, not their currently utilized size.

  • For large migrations, disable Global File Lock (GFL) on the target directory.

Note: You can enable and disable Global File Lock using the NMC API. For details, see Nasuni Labs.

  • Remove an Edge Appliance from Global File Acceleration (GFA) to avoid impacting other sites' operations. You can use the GFA schedule to remove an Edge Appliance from GFA during a specific period of time.

Note: To snap the new data, you must return the Edge Appliance to GFA consideration (during non-working hours). For more information, refer to Global File Acceleration (GFA).
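The COW sizing rule above can be expressed as a one-line helper. The function name is illustrative; the rule itself (at least 1/4 of the cache disk or 250 GiB, whichever is smaller) is from the guidance above.

```python
# COW (snapshot) disk sizing rule: the minimum recommended size is
# 1/4 of the cache disk or 250 GiB, whichever is smaller.

def minimum_cow_gib(cache_gib: float) -> float:
    """Minimum recommended COW disk size in GiB for a given cache size."""
    return min(cache_gib / 4, 250.0)
```

For a 400 GiB cache this yields 100 GiB; for a 2000 GiB cache, the 250 GiB ceiling applies.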

During data migration

  • Throttle copies of SMB data to the Edge Appliance to approximately 80 percent of the available bandwidth to the cloud. Some tools, such as Robocopy, make it possible to throttle the copy job.

  • Small files suffer most from the overhead of file creation, file cataloging, and network latency. Raw throughput is a good measure of copy speed for large files, but because small files incur more per-file overhead, your effective transfer speed might be significantly lower than the pipe throughput.
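The effect of per-file overhead can be illustrated with simple arithmetic. The overhead figure below is an assumed example value, not a measured Nasuni number; the point is only that the same bytes spread across many files take longer to copy.

```python
# Rough illustration of why small files lower effective copy speed:
# each file pays a fixed per-file overhead (creation, cataloging,
# latency) on top of its transfer time. The 0.01 s overhead figure
# is an assumed example, not a measured value.

def copy_time_seconds(total_bytes, file_count, link_bytes_per_sec,
                      per_file_overhead_sec=0.01):
    transfer = total_bytes / link_bytes_per_sec  # time to move the bytes
    overhead = file_count * per_file_overhead_sec  # per-file cost
    return transfer + overhead

# 1 GiB as a single file vs. 1,024 files of 1 MiB each, on a
# 100 MB/s link: same bytes, noticeably different totals.
one_file = copy_time_seconds(2**30, 1, 100_000_000)
many_files = copy_time_seconds(2**30, 1024, 100_000_000)
```

With these assumptions, the many-file case adds roughly ten seconds of pure overhead to an otherwise identical transfer.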

Commonly Used Tools

Several tools can be used to perform file migration operations. The most commonly used tool is the Robocopy utility, written by Microsoft and included in all recent versions of the Windows operating system.

Each tool has its strengths and weaknesses. Careful thought should be given to the customer's requirements and environment before selecting the appropriate tool.

Nasuni has a great deal of experience with Robocopy and can provide the most in-depth guidance around its usage.

Robocopy

Robocopy is an industry-standard tool provided by Microsoft for migrating Windows File Data. When used correctly, it can efficiently move large amounts of data in a reasonable amount of time. This command-line utility copies files from storage directly attached to the server it is run on or from an SMB share that the Windows computer connects to.

Strengths

Robocopy is a powerful utility that offers excellent flexibility for defining what data is copied and how it is copied. Its powerful file selection options can be used to create specific filter rules that can be advantageous when attempting to accomplish migration goals, such as ensuring that the most recently modified data is cache-resident after the migration. Later versions of Robocopy (included since Windows Server 2008 R2 and Windows 7) support the creation of multi-threaded copy jobs that can greatly speed up the rate of data migration.

Considerations

* Be aware that Robocopy does not suspend copy operations while a Nasuni Edge Appliance runs a snapshot.

* Robocopy does not suspend copy operations when the cache of the Nasuni Edge Appliance nears capacity.

* Robocopy is a command-line utility that does not provide a native GUI for configuration. (There are additional third-party utilities that do provide a front-end GUI for Robocopy.)

* Robocopy does not provide scheduling capabilities for migration jobs.

* Robocopy only runs on Microsoft Windows operating systems. If the source file server is not running Windows, an additional server might be necessary for the migration process.

Robocopy details

Robocopy offers many different options for customizing the copy process. For complete information on Robocopy, see robocopy.

A good starting point for configuring a Robocopy job is a general-purpose command such as:

robocopy <source> <dest> /COPY:DATSO /DCOPY:T /V /MIR /TEE /R:0 /W:0 /E /FFT /XF .DS_Store /MT:20 /NDL /NFL /NP /log+:<name-of-log-file-on-another-drive.log>

This command contains several switches that you should customize for your scenario. These switches provide the following functionality:

  • /COPY - specifies what aspects of a file to copy.

    • D: copies the data contents of the file

    • A: copies the DOS attributes of a file

    • T: copies the timestamps of a file

    • S: copies the security information of a file, namely the NTFS access control list

    • O: copies the owner information of a file

Important: Do not use the /COPYALL switch, or the /COPY switch with the U (auditing) option (namely, /COPY:U). Nasuni does not support NTFS Auditing information. Instead, Nasuni recommends using the /COPY:DATSO switch to copy all of the relevant data without the NTFS Auditing information.

  • /DCOPY - specifies what aspects of a directory to copy.

    • T: timestamps of a directory. Usually useful to include.

    • A: attributes of a directory. Usually not necessary.

    • D: copies the data of a directory. Usually not necessary.

  • /V - verbose output; shows all skipped files

  • /MIR - mirrors a directory tree, including all permissions. Note that mirroring also deletes destination files that no longer exist at the source.

  • /TEE - writes status output to console as well as log. Omit to increase performance.

  • /R:0 – number of retries if a file is in use. It is faster and more efficient to have no retries, and instead pick the file up on a subsequent robocopy job.
    If you want to include retries, a value of 3 to 10 retries is recommended.

  • /W:0 – seconds to wait between retries if a file is in use. It is faster and more efficient to have no retries, and instead pick the file up on a subsequent robocopy job.
    If you want to include retries, a value of 10 seconds is recommended.

  • /E - copies subdirectories, including empty ones; does not overwrite existing destination directory security.

  • /FFT - assumes FAT file times (2-second granularity); useful when copying between different file systems.

  • /XF .DS_Store – exclude the file .DS_Store. Additional files can be added to the list; supports wildcards.

  • /MT:20 – multiple threads. You can adjust this depending on the available client and network resources. Remove this on subsequent delta copies because it slows delta copies.

  • /NDL - specifies to not log directory names. It is faster to not log directory names. Do not use this if you specifically need to log directory names.

  • /NFL - specifies to not log file names. It is faster to not log file names. Do not use this if you specifically need to log file names.

  • /NP - specifies not to log the progress of the copying operation (the number of files or directories copied so far). It is faster not to log progress. Do not use this if you specifically need to log progress.

  • /log+: - writes specified results to a specified log file. Ideally, the log file is not on the destination of the copy operation. The “+” indicates that results should be appended to an existing log file.
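If you script migration jobs, the general-purpose command above can be assembled programmatically, which makes it easier to keep switch sets consistent across many jobs. The helper name is illustrative; the source, destination, and log-file paths remain placeholders you must fill in for your environment.

```python
# Assemble the general-purpose Robocopy command shown above as an
# argument list (suitable for subprocess.run on Windows). The switch
# set mirrors the recommended starting point in this document.

def build_robocopy_command(source, dest, log_file):
    switches = [
        "/COPY:DATSO", "/DCOPY:T", "/V", "/MIR", "/TEE",
        "/R:0", "/W:0", "/E", "/FFT", "/XF", ".DS_Store",
        "/MT:20", "/NDL", "/NFL", "/NP", f"/log+:{log_file}",
    ]
    return ["robocopy", source, dest] + switches
```

Passing an argument list (rather than one shell string) avoids quoting problems with paths that contain spaces.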

Other examples

Here are some other examples of using robocopy.

Basic copying of data

See the example in Robocopy details:

robocopy <source> <dest> /COPY:DATSO /DCOPY:T /V /MIR /TEE /R:0 /W:0 /E /FFT /XF .DS_Store /MT:20 /NDL /NFL /NP /log+:<name-of-log-file-on-another-drive.log>

Important: Do not use the /COPYALL switch, or the /COPY switch with the U (auditing) option (namely, /COPY:U). Nasuni does not support NTFS Auditing information. Instead, Nasuni recommends using the /COPY:DATSO switch to copy all of the relevant data without the NTFS Auditing information.

Copy data, but do not copy NTFS security or ownership

This command copies data but does not copy each file's NTFS security or ownership information. This enables the copied files to start fresh without, for example, extraneous ACLs that might have accumulated in the past.

robocopy <source> <dest> /COPY:DAT /DCOPY:T /V /MIR /TEE /R:0 /W:0 /E /FFT /XF .DS_Store /MT:20 /NDL /NFL /NP /log+:<name-of-log-file-on-another-drive.log>

Specific switches for this purpose include the following:

  • /COPY - does not include the S (security) or O (ownership) values.

Important: Do not use the /COPYALL switch, or the /COPY switch with the U (auditing) option (namely, /COPY:U). Nasuni does not support NTFS Auditing information. Instead, Nasuni recommends using the /COPY:DATSO switch to copy all of the relevant data without the NTFS Auditing information.

Copy directory structure only

This command copies the directory structure but does not copy the files into that directory structure.

robocopy <source> <dest> /COPY:DATSO /DCOPY:T /V /MIR /TEE /R:0 /W:0 /E /FFT /Lev:3 /XF *.* /NDL /NFL /NP /log+:<name-of-log-file-on-another-drive.log>

Specific switches for this purpose include the following:

  • /COPY - can include the S (security) or O (ownership) values if permissions and ownership information are necessary.

Important: Do not use the /COPYALL switch, or the /COPY switch with the U (auditing) option (namely, /COPY:U). Nasuni does not support NTFS Auditing information. Instead, Nasuni recommends using the /COPY:DATSO switch to copy all of the relevant data without the NTFS Auditing information.

  • /Lev:3 - copies only the top 3 levels of the source directory tree.

  • /XF *.* - excludes all files.

  • No /MT needed because multi-threading is not necessary when copying only directories.

Copying newest files or directories first

Often, you want to copy the newest files or directories first, then add in older files later. To do this, use the /MAXAGE switch. For example, /MAXAGE:90 specifies that the maximum age to copy a file is 90 days or less.
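The /MAXAGE switch accepts either an age in days or, for values of 1900 and above, a literal YYYYMMDD date. A small helper (hypothetical names) can turn a newest-first staging plan into concrete switches:

```python
from datetime import date

# /MAXAGE:n copies only files modified within the last n days;
# values of 1900 or more are interpreted as a YYYYMMDD date cutoff.

def maxage_for_days(days: int) -> str:
    """Switch limiting the copy to files no older than `days` days."""
    return f"/MAXAGE:{days}"

def maxage_for_cutoff(cutoff: date) -> str:
    """Switch limiting the copy to files modified on or after `cutoff`."""
    return f"/MAXAGE:{cutoff.strftime('%Y%m%d')}"
```

For example, a first wave might use /MAXAGE:90 for recent data, a second wave /MAXAGE:365, and a final pass with no /MAXAGE switch to pick up everything remaining.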

Common directory exclusions

These directories are commonly excluded during copies:

.etc

.TemporaryItems

.Trashes

$Recycle.Bin

~snapshot (Naming may vary depending on Storage Platform)

lost+found

AppData

DfsrPrivate

System Volume Information

Do not exclude directories needed for your app or use case.

Common file exclusions

These files are commonly excluded during copies:

.apdisk

.DS_Store

~*.*

*.pst

*.tmp

autorun.inf

desktop.ini

pagefile.sys

swapfile.sys

Thumbs.db (Even if you exclude Thumbs.db files, they might be automatically recreated by Windows clients, which could affect cache operation.)

Do not exclude files needed for your app or use case.

If you are blocking certain files in your share definition, you probably want to also exclude those files during copies.

Before Migrating

There are several considerations to address before performing the data migration.

Data Considerations

The nature and extent of the data being migrated can influence how easily and quickly the migration proceeds. You should be aware of these facts:

  • Migrations that include many small data objects take longer than migrations with a few large data objects. This is because Nasuni not only stores each object (which depends only on its size), but catalogs and tracks each object for later retrieval (which depends on the number of objects). For example, a 1-GB file requires far less processing than 1,000 1-MB files, even though they might be the same size.
    For this reason, when planning data migrations, you should know the total size of the migrated data and the number and type of data files.

Source file server discovery

Performing a data discovery scan of your network can be very helpful.

In the planning phase, consider your data sources, the required permissions, and how the data will be imported into Nasuni.

Data Hygiene

Nasuni recommends preparing the source data before migrating. Additional considerations include the following:

  • Clean up existing data.

  • If needed, reorganize data.

  • Review and, if required, update permissions.

Data Preparation

Nasuni recommends configuring new volumes to use the “NTFS Exclusive Mode” Permissions Policy. This mode is suitable for SMB-only workloads and produces full NTFS permissions, providing the greatest compatibility with Windows clients.

If multiple protocols are required for the volume, the Permissions Policy can be set to “NTFS Compatible Mode.” Before migrating data to a volume running in “NTFS Compatible Mode,” there are several steps customers should take to prepare the data. These preparation steps include the examination of the existing ACL structure to ensure access permissions are copied properly.

Bypass/Traverse Permissions: Unlike Windows, Unix does not natively support the bypass/traverse privilege. Addressing this is not required for NTFS Exclusive volumes, but is required for NTFS Compatible volumes: traverse permissions are required at the root of the volume and at intermediate levels for browsing, and even directly accessing a subfolder requires traverse permissions along the path.

Explicit Denies: When an object has an inherited Deny permission and a non-inherited Allow permission, it might be necessary to make the Deny permission explicit. This is not required for NTFS Exclusive volumes, but is required for NTFS Compatible volumes.

Local Groups: Permissions granted to Local groups do not transfer to Nasuni. Converting Local groups to domain groups is required with NTFS Exclusive and NTFS Compatible.

BUILTIN Groups: Converting BUILTIN groups to domain groups with NTFS Exclusive and NTFS Compatible is not required, but is recommended.

Additive Permissions: Nasuni does not add together all rights from multiple ACL entries the way Windows does. Fixing additive permissions is not required for NTFS Exclusive volumes, but is required for NTFS Compatible volumes.

Owner Rights: This is a well-known SID that allows a file or directory Owner the ability to change permissions. It is optional to set Owner rights to Modify to prevent Owners from changing permissions.

Orphaned SIDs

In Windows, a Security Identifier (SID) is a unique ID assigned to a security principal such as a user or group. The other attributes of a security principal, like its name, are associated with the SID. If a security principal listed in an ACE is removed from the Active Directory (for instance, if a former employee’s user account is deleted, but the ACE referencing that security principal is not removed from the ACL), the SID appears as “orphaned” when viewing the ACL. That is, you see a raw SID that cannot be resolved to the name of a security principal.

Copying orphaned SIDs across to Nasuni does not cause any difficulties with permissions on NTFS Compatible or on NTFS Exclusive. Removing orphaned SIDs is optional. If you decide to remove orphaned SIDs from an environment, you can use a utility such as SetACL from Helge Klein.

Using SetACL

SetACL manages permissions, auditing, and ownership information. It does everything that the tools built into Windows do and much more. It is inherently automatable and scriptable.

SetACL is available for free as a command-line utility at https://helgeklein.com/setacl/.

Just as with SubInACL, SetACL can be used to identify and remove orphaned SIDs.

To list the orphaned SIDs present in the directory structure at a given path, run a command like the following:

setacl.exe -on C:\DIRECTORYPATH\ -ot file -actn list -lst oo:y;f:tab -rec cont

  • -on: path on which to operate

  • -ot: object type; setting this to “file” includes files and directories

  • -actn: action; set this to “list” to view permissions

  • -lst: what to list; “oo:y” means OrphanedOnly, “f:tab” means tabular format

  • -rec: recursion; “cont” means directories only

To remove the orphaned SIDs from a given directory path, run a command like the following:

setacl.exe -on C:\DIRECTORYPATH\ -ot file -actn delorphanedsids -os dacl -rec cont_obj

  • -on: path on which to operate

  • -ot: object type; setting this to “file” includes files and directories

  • -actn: action; delete orphaned SIDs found on the path specified by -on

Historical SIDs

Historical SIDs can lead to unexpected permission behavior, such as users having unexpected access to an item or lacking expected access to an item. Cleaning up historical SIDs is not required for NTFS Exclusive volumes but is required for NTFS Compatible volumes.

ACL Issues

Besides cleaning up orphaned SIDs from existing ACLs, there are several other things to be aware of when migrating data.

Local Accounts

An ACL can include ACEs that refer to security principals stored locally on the server instead of centrally in Active Directory. These local accounts are only valid within the context of the server on which they are created. Thus, ACEs that reference these local accounts should be removed before migrating data to the Nasuni appliance.

Deny Rules

There are differences in the way that Windows and POSIX systems order permissions. Windows operating systems require a canonical ordering of permissions where non-inherited ACEs are listed first, followed by inherited ACEs. Further, “deny” ACEs must be listed first within their group (non-inherited or inherited). POSIX systems allow for “deny” and “allow” permissions to be interleaved.

If the ACL is written in the incorrect order, users might receive unexpected permissions to objects. For this reason, it is best to avoid using “deny” ACEs. When it is impossible to avoid using a “deny” ACE, folders with “deny” ACEs should be carefully evaluated to ensure that the ACL is ordered properly and that users receive the expected permissions.

Network and Internet

The speed of the network and Internet connection affects the speed of the data migration. It is advisable to remain aware of your connection speed from the data origin to the Edge Appliance, and from the Edge Appliance to the cloud.

You can use free tools to estimate how long it would take to transfer a given data set, such as File-Transfer Time Calculator.
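The underlying arithmetic is simple enough to sketch directly. Note the distinction between megabits (how link speeds are quoted) and bytes (how data sizes are quoted); the function name is illustrative.

```python
# Back-of-the-envelope transfer time: the same calculation a
# file-transfer time calculator performs. Ignores protocol overhead
# and per-file costs, so treat the result as a lower bound.

def transfer_hours(data_gib: float, link_mbps: float) -> float:
    """Hours to move data_gib GiB over a link_mbps megabit-per-second link."""
    data_bits = data_gib * (2**30) * 8        # GiB -> bits
    seconds = data_bits / (link_mbps * 1_000_000)  # megabits/s -> bits/s
    return seconds / 3600
```

For example, moving 1 TiB (1024 GiB) over a fully dedicated 100 Mbps link takes roughly a day, before any overhead.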

Permissions

For details about Nasuni permissions, see Permissions Best Practices Guide.

Configuring Nasuni

Before performing the data migration, several Nasuni configuration options must be considered.

Cache considerations

The cache is the local storage of a Nasuni Edge Appliance. When running a Nasuni Edge Appliance on a virtual platform, you can configure the size of the cache disk and the copy-on-write (COW) disk. On Nasuni hardware appliances, the sizes of the cache disk and the copy-on-write (COW) disk are pre-configured, and there is no need to configure them.

The cache serves two main purposes:

  • Any data written to a volume is first staged in the local cache.

  • All data and metadata that is accessed regularly is kept locally in the cache (“cache resident”).

Cache size

Because data written to a volume stages in the local cache, we recommend using a cache size large enough to accommodate the largest file in the incoming data set.

If the cache is too small to accommodate the largest file in the data set, you can change the cache size using the virtual machine tools. After the data migration, you can change the cache size to its original size. For details, see Resizing Cache and Snapshot Disks.

Note: If you change the cache size, you might need to perform a recovery procedure afterward.

Alternatively, if a file exceeds the cache size and your migration tool can resume where it left off after a full-disk error, you can migrate the file in multiple attempts. The incomplete file is snapshotted, and the cache is evicted to make space. The file copy tool then resumes where it left off and writes the remainder of the file. Once in the cloud, the file is complete.

Reserved space in the cache

By default, a Nasuni Edge Appliance automatically manages the percentage of local cache space reserved for new, incoming data, using an advanced algorithm to optimize cache usage. The remainder of the cache retains the data that users most likely need locally. However, you can override the percentage reserved for new, incoming data. The percentage that you set applies to all volumes on a Nasuni Edge Appliance. The maximum percentage of the cache reserved for new, incoming data is 90 percent of the cache size. The minimum percentage is 5 percent.

By setting the amount of reserved space, you disable the automatic management of this value.

On a Nasuni Edge Appliance, click Configuration, then select Cache Settings. On the Nasuni Management Console, click Filers, then click Cache Settings.

During data migration, you want most of the cache to be available for the new, incoming data you provide. For this reason, you should set the Reserved Space as high as possible (90 percent) during the data migration. When the data migration is complete, you can set the Reserved Space to a lower value, or allow the cache to set the Reserved Space automatically again.

Copy-on-Write (COW) disk (aka snapshot disk)

The snapshot (copy-on-write or COW) disk is used during the snapshot process. If any writes to a Nasuni Edge Appliance occur during a snapshot, the previous data from the cache disk is copied to the COW disk, and the new data is written to the cache disk. Hence, the term “copy-on-write”. This allows new writes to occur anytime, even during the snapshot process.

The larger the cache, the larger the copy-on-write (COW) disk should be. If the copy-on-write (COW) disk is too small, a snapshot might fail, requiring a retry. The snapshot will eventually succeed, but this is inefficient and can unnecessarily cause the data migration to take longer. If this occurs, you will see a notification stating, “Snapshot ran out of internal space.”

As with the cache, you can change the size of the COW disk. Generally, the size of the COW disk should be at least 1/4 of the size of the cache disk.

Note: You can use an extra-large COW during a migration and a smaller one for production use. To change a COW disk, shut down the Nasuni Edge Appliance. Delete the COW disk, add a new, appropriately sized one, and then power back on. If you prefer, our Support team can help with this process.

Remote Access

Other Edge Appliances can access the volume on the target Edge Appliance. During the data migration, you should disable remote access temporarily. This removes processing that involves other Edge Appliances.

Bandwidth (Quality of Service)

The Quality of Service (QoS) settings limit the inbound (from cloud to Edge Appliance) and outbound (from Edge Appliance to cloud) bandwidth for moving data, such as transmitting snapshots to cloud storage. The default inbound Quality of Service is unlimited. The default outbound Quality of Service is 10 megabits per second.

If outbound QoS is too low, fast output to cloud storage is not possible. However, if QoS is too high, this might prevent other processes from getting the network bandwidth that they need.

Tip: To minimize bandwidth during business hours, implement a scheduled QoS. This will restrict the Nasuni Edge Appliance's bandwidth during business hours but allow higher bandwidth after hours.

Warning: Datasets with many small files do not use throughput efficiently. In such situations, placing the source of the small data files close to the data pipe can help lower latency.

During data migration, allow as much bandwidth (Quality of Service) as can be spared, so that snapshots complete quickly when they do occur.

Global File Lock

The Global File Lock® feature prevents conflicts when two or more users attempt to change the same file on different Nasuni Edge Appliances. If you enable the Global File Lock feature for a directory and its descendants, any files in that directory or its descendants can only be changed by one user at a time. Any other users cannot change the same file at the same time. For details about Global File Lock, see Global File Lock.

Global File Lock requires special processing to coordinate access to files. If Global File Lock is enabled for the directory that is the destination of the data migration, this special processing can impede the data migration process, and increase the time necessary to complete the data migration.

For this reason, Nasuni strongly recommends disabling Global File Lock for the destination directory.

You enable or disable Global File Lock using the File System Browser in the Edge Appliance user interface or the NMC. You can examine the destination directory using the File System Browser and, if Global File Lock is enabled, disable Global File Lock for the destination directory.

Note: You can enable and disable Global File Lock using the NMC API. For details, see Nasuni Labs.

However, it is also possible that the destination directory might have Global File Lock enabled due to inheriting the Global File Lock setting from one of the directories above it in the directory tree. If this is the case, you must identify the directory above the destination directory that has Global File Lock enabled, and then disable Global File Lock for that higher-level directory.

If, for some reason, you must keep Global File Lock enabled for the destination directory, the data migration can proceed. However, you must be prepared that the data migration will take significantly longer to complete.

Global File Acceleration

Nasuni Global File Acceleration (GFA), a component of the Nasuni® file data platform, accelerates file synchronization across cloud regions or on-premises locations, helping customers to improve file sharing collaboration and optimize workforce productivity. For details about GFA, see Global File Acceleration.

GFA makes suggestions to an Edge Appliance about when to perform snapshots for the GFA-enabled volume. Because this scheduling can interfere with data migration, Nasuni recommends disabling GFA for the Edge Appliance on which you are migrating data. There are several approaches for arranging this:

  • The best approach is to schedule data migration activities when GFA processing is not scheduled for the Edge Appliance. This allows GFA processing to proceed during scheduled times but does not interfere with data migration. You can specify the GFA schedule using the NMC.

  • Another approach is to manually disable GFA during times that you are performing a data migration.

  • A final approach is to perform data migration during times when GFA will have no impact because no one is using the Edge Appliance.

Nasuni Antivirus Service

The Nasuni Antivirus Service scans new and changed files for malware. Enabling antivirus scanning using the Antivirus Service schedule generally has a low impact on performance, because files are scanned in batches. However, since files do not proceed to cloud storage until scanned, this can delay data propagation and file synchronization until after the scheduled scan occurs.

For this reason, Nasuni recommends disabling the Antivirus Service while performing the initial data migrations.

Auditing

Nasuni recommends disabling Auditing during data migration, unless you regard Auditing as necessary during this process.

Migration Strategies

There are many different approaches to migrating data. Careful consideration and planning are required before beginning any migration, to ensure that migration occurs efficiently and smoothly. Over the course of performing many migrations for a diverse customer population, Nasuni has developed techniques for several common scenarios. There might be overlap among these scenarios and the recommendations for each might be combined.

General Migration Guidance

Migrate data in stages:

  • Run initial migration jobs to capture the bulk of the data.

  • Run delta migration jobs to capture changes to the source dataset made since the initial migration job.

  • Run multiple delta migration jobs until the duration of the job is small enough to fit within an acceptable maintenance window.

  • Schedule a maintenance window and run a final delta job.

  • After the final delta job completes, run a snapshot.

Tip: To verify that a snapshot has been completed (both data phase and metadata phase), see Appendix: Verifying Snapshots.

  • Monitor the status of the appliance’s cache to ensure that all data has been protected.

  • Cut users over to the appliance.
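The delta-job loop above hinges on one decision: is the latest delta small enough to fit the maintenance window? A minimal sketch of that check follows; the function name and the idea of tracking delta durations as a list are assumptions, and how you measure job duration depends on your tooling.

```python
# Sketch of the cutover decision in the staged-migration flow above:
# keep running delta jobs until the most recent one fits within an
# acceptable maintenance window, then schedule the final delta.

def ready_for_cutover(delta_durations_hours, window_hours):
    """True once the latest delta job fits the maintenance window."""
    return bool(delta_durations_hours) and \
        delta_durations_hours[-1] <= window_hours
```

For example, with delta runs of 20, 6, and 1.5 hours and a 2-hour window, the third delta clears the bar and the final job can be scheduled.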

Migration workloads have different performance characteristics from normal production workloads. A customer’s Nasuni appliances might be sized appropriately for their production workload, but undersized for their initial migration. In those cases, Nasuni might be able to provide temporary hardware resources to handle the migration workload.

Initial migration before production users

When performing the initial migration of data, it is ideal if no production users are accessing the data currently. This enables the system to perform its tasks without the added complications of handling data that production users are accessing or changing.

During initial migration, migrate data that is unlikely to be accessed by production users. This would include data that has not been changed or accessed for a long time.

Since no production users are accessing the data, you should disable Global File Lock. It is not needed at this stage, and disabling it also reduces the processing overhead for newly migrated data.

Note: You can enable and disable Global File Lock using the NMC API. For details, see Nasuni Labs.

Similarly, you can also disable Global File Acceleration. Since no production users are trying to access data remotely, there is no need to use Global File Acceleration processing.

If migrating data to systems running version 9.9 or earlier:

  • As a rule of thumb, migrate data until the cache is about 85 percent full. At that point, pause Robocopy (or your file transfer tool), perform a manual snapshot, and wait until the snapshot completes. Repeat this cycle until the initial migration is complete.

Tip: To verify that a snapshot has been completed (both data phase and metadata phase), see Appendix: Verifying Snapshots.

If migrating data to systems running version 9.10 or later:

  • Perform a manual snapshot when the initial migration is complete.
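For the 85 percent rule of thumb on version 9.9 and earlier systems, a small helper can estimate how much more data a copy pass can move before it is time to pause for a snapshot. This is a sketch; the cache sizes are illustrative:

```python
def bytes_until_pause(cache_total_bytes: int,
                      cache_used_bytes: int,
                      threshold: float = 0.85) -> int:
    """Remaining headroom before the cache reaches the pause threshold
    (85 percent by default). Returns 0 when the threshold has already
    been reached and the copy should pause for a snapshot."""
    return max(0, int(cache_total_bytes * threshold) - cache_used_bytes)

# Example: a 2 TiB cache with 1.5 TiB already used.
TIB = 1024 ** 4
headroom = bytes_until_pause(2 * TIB, int(1.5 * TIB))
```

A monitoring script could call this periodically during a Robocopy run and pause the job when the returned headroom approaches zero.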

Subsequent migrations before production users

After the initial migration, you can migrate more recent data and data that production users are more likely to access. Again, ideally, this takes place before production users begin accessing data.

Your goal should be to ensure that as much of the necessary data as possible is migrated before production users begin to access it.

Migration after production users

When you begin to allow production users to access data, you can then enable Global File Lock, if you believe its protection is necessary. Be aware that Global File Lock affects migration performance. However, subsequent migrations should not be large, consisting only of the data that users accessed or changed while the initial migration steps were occurring.

Similarly, when users begin to access data remotely, you can enable Global File Acceleration, if you believe its processing is necessary.

Note: You can enable and disable Global File Lock using the NMC API. For details, see Nasuni Labs.

Eventually, further migration from the source system should become unnecessary. All data being accessed by users should now be present, and users are creating new files or new versions of existing files that should not be altered with any older data from the source system.

If it should be necessary to migrate additional data after production users have begun accessing data, this additional migration should be scheduled for times when user access is lowest, such as evenings and weekends.

Source Data Access

Ensure that the account being used to migrate data to the Nasuni appliance has sufficient permissions on the source data to read both file data and security information. This can be accomplished in several ways, including membership in the appropriate security groups. It can also be beneficial to grant the user the “Backup Operator” role (or the equivalent).

Administrative User

An Administrative User on an appliance is a user with full access to all SMB shares and has the necessary permissions to change file and folder permissions, regardless of the ACL associated with the file or folder. It is recommended to assign the user account performing migration operations the Administrative User role on the appliance to avoid issues with migrating and configuring permissions. The Administrative User setting is appliance-specific and must be managed directly on the appliance; it is not configurable via the NMC.

Traversal Rights

With NTFS Exclusive Mode, the default behavior for Windows is to bypass traverse rights checking. This means that, when connecting to a share, the user does not need to have permissions on the parent objects for the share. The traverse permission is not required from the root of the volume. The traverse permission IS required from the root of the share and intermediate levels for browsing. Directly accessing a subfolder does NOT require traverse permissions.

With NTFS Compatible Mode, traverse rights checking is enforced by default. Traverse permissions ARE required at the root of the volume and intermediate levels for browsing. Directly accessing a subfolder DOES require traverse permissions.

For details about permissions, see Permissions Best Practices Guide.

Access Control List Rewrite

It is sometimes desirable to completely rewrite the Access Control List (ACL) structure for a dataset as part of the migration process. Migrations can be an ideal time to clean up ACLs that have drifted from the established standard, or to apply a new standard ACL structure. Rewriting the ACL structure also means that you can skip much of the data preparation work on the source. Robocopy can be used to build out several levels of the file system tree on the destination without copying the actual data. Once the basic directory structure is in place, ACLs can be defined on the directories. Since there is no data in the file system yet, any changes to the ACL structure quickly propagate through the file system.

The “/lev:<N>” switch copies only the top N levels of the source directory tree. Adding “/e” and “/XF *.*” copies just the folders and not the files. For example: "robocopy.exe <source> <dest> /e /lev:2 /XF *.*" copies just two levels of the directory structure from the source to the destination, but does not copy any files. After that is complete, you can set your permissions on the destination and run a subsequent Robocopy process that copies data but not security by specifying “/COPY:DAT”. As the files are written to the destination, they inherit the previously defined ACL structure.
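The two-phase approach above can be expressed as command lines built up in a script. The paths below are placeholders, and the switches are the ones described in the text:

```python
def skeleton_command(source: str, dest: str, levels: int = 2) -> list[str]:
    """Phase 1: copy only the top N levels of the directory tree,
    excluding all files, so ACLs can be set on the empty skeleton."""
    return ["robocopy.exe", source, dest, "/e", f"/lev:{levels}",
            "/XF", "*.*"]

def data_only_command(source: str, dest: str) -> list[str]:
    """Phase 2: copy data, attributes, and timestamps but not security,
    so files inherit the ACLs already defined on the destination."""
    return ["robocopy.exe", source, dest, "/e", "/COPY:DAT"]

cmd1 = skeleton_command(r"\\oldfiler\share", r"\\nasuni\share")
cmd2 = data_only_command(r"\\oldfiler\share", r"\\nasuni\share")
```

Between the two phases, you would set the new standard ACLs on the skeleton directories before running the data-only copy.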

Migration Source Smaller Than Available Cache

This is the most straightforward migration scenario. When the dataset being migrated can be stored within the available cache of the appliance, the copy process can proceed at full speed with little to no interaction or customization required. Before beginning the migration, suspend the snapshot schedule for the volume. When the migration is complete, trigger a manual snapshot and then enable the appropriate snapshot schedule, based on the Recovery Point Objective (RPO) and data propagation requirements.

Tip: To verify that a snapshot has been completed (both data phase and metadata phase), see Appendix: Verifying Snapshots.

Migration Source Larger Than Available Cache

If migrating data to systems running version 9.9 or earlier:
In scenarios where you have a dataset to migrate that is larger than the available cache of the target Nasuni appliance, it is important to ensure that the cache does not fill up and interrupt the migration process.

If Robocopy is used to perform the migration, the migration should be broken up into distinct jobs that involve datasets that are small enough to fit within the available cache.

After the Robocopy job completes, a snapshot should be triggered so that the data can be protected.

Tip: To verify that a snapshot has been completed (both data phase and metadata phase), see Appendix: Verifying Snapshots.

The percentage of the cache reserved for new incoming data should be lowered so that the appliance starts moving data from the cache sooner, and maintains sufficient space for subsequent migrations to occur. After the data is protected, it is removed from the cache, creating free space for the next Robocopy job.

If migrating to systems running version 9.10 or later, this procedure is not necessary.
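For version 9.9 and earlier targets, one way to break a large source into cache-sized Robocopy jobs is a greedy grouping of top-level directories by size. This is a sketch with illustrative sizes, not a sizing tool:

```python
def plan_jobs(dir_sizes: dict[str, int], job_limit: int) -> list[list[str]]:
    """Greedily group directories into jobs whose combined size stays
    under job_limit (e.g. the usable cache space). A directory larger
    than the limit gets a job of its own and must be split further."""
    jobs: list[list[str]] = []
    current: list[str] = []
    current_size = 0
    for path, size in sorted(dir_sizes.items()):
        if current and current_size + size > job_limit:
            jobs.append(current)       # close the full job
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        jobs.append(current)
    return jobs

# Illustrative: sizes in GiB, with a 500 GiB per-job limit.
plan = plan_jobs({"dept-a": 300, "dept-b": 250, "dept-c": 120}, 500)
```

Each resulting group would become one Robocopy job, with a snapshot triggered and verified between jobs.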

Migrating Multiple Sites to Multiple Volumes

If each site involved in a migration has its own appliance and volume, then each site should be handled with the guidance from the other appropriate scenario, for example, Migration Source Smaller Than Available Cache.

Migrating Multiple Sites to a Single Volume

When migrating data from multiple sites to a single volume, the maximum push duration for data push phase snapshots can be increased during the migration (see Snapshot length) to load data into the cloud more efficiently and quickly.

Since the data push phase does not require a lock on the volume, multiple appliances can spend more time working concurrently on sending data to the cloud. This leads to longer metadata push phases, but it also allows more data to be loaded in a single snapshot than the normal behavior.

Migrating Oldest Data First

When the dataset being migrated to an appliance is larger than the available cache, the first data copied is evicted to free up cache space for subsequent migration jobs.

At the end of the migration process, the most recently migrated data is resident in the cache, while the older data only exists within the cloud. Using Robocopy commands, it is possible to copy the oldest data first, regardless of where it exists in the filesystem tree. By running multiple Robocopy jobs and decreasing the age threshold each time, you can migrate the most recently modified data last, ensuring that it is cache-resident when you cut over your users.

This technique involves using Robocopy’s “/MINAGE” switch to migrate older data first. For example, adding “/MINAGE:730” excludes files modified in the last two years. After the first Robocopy job completes, decrease the “/MINAGE” value for subsequent runs. For example, the next job could specify 365 for one year. For the final Robocopy job, remove the “/MINAGE” switch so that any remaining data is copied.
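The descending “/MINAGE” schedule can be generated programmatically. The age thresholds and paths below are illustrative:

```python
def minage_commands(source: str, dest: str,
                    thresholds_days: list[int]) -> list[list[str]]:
    """Build one Robocopy command per pass, starting with the oldest
    data (largest /MINAGE) and finishing with no age filter at all."""
    commands = []
    for days in sorted(thresholds_days, reverse=True):
        commands.append(["robocopy.exe", source, dest, "/e",
                         f"/MINAGE:{days}"])
    # Final pass: no /MINAGE, so any remaining (newest) data is copied.
    commands.append(["robocopy.exe", source, dest, "/e"])
    return commands

passes = minage_commands(r"\\oldfiler\share", r"\\nasuni\share",
                         [730, 365, 90])
```

The first pass copies only files older than two years; the last pass, with no age filter, picks up everything that remains and leaves the newest data cache-resident.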

Migrating Data Created by Mac Users

When migrating data written by Mac clients, start by blocking the creation of .DS_Store files at the share level. When running a migration job, there is a list of well-known files and directories that should be excluded from the migration process:

  • .DS_Store

  • .Trashes

  • .Spotlight-V100

  • .VolumeIcon.icns

  • .VolumeIcon.low.icns

  • .com.apple.timemachine.donotpresent

  • .fseventsd

  • .apdisk

  • .FB*

  • .HS*

  • .TemporaryItems

  • TheVolumeSettingsFolder

  • Thumbs.db (If you do exclude Thumbs.db files, they might be automatically recreated, which could affect cache operation.)

Do not block “._*” AppleDouble (dot-underscore) files when migrating (or at the share level). These files store Mac resource forks and metadata; in rare cases, excluding them from migration can cause data loss.
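The exclusion list above can be supplied to Robocopy with the “/XF” (exclude files) and “/XD” (exclude directories) switches. The split between files and directories below is our assumption about how these items typically appear on a source share; adjust it to match your data:

```python
# Items typically created as files vs. directories by macOS clients.
# This file/directory split is an assumption, not from Nasuni docs.
EXCLUDE_FILES = [".DS_Store", ".VolumeIcon.icns", ".VolumeIcon.low.icns",
                 ".com.apple.timemachine.donotpresent", ".apdisk",
                 ".FB*", ".HS*", "Thumbs.db"]
EXCLUDE_DIRS = [".Trashes", ".Spotlight-V100", ".fseventsd",
                ".TemporaryItems", "TheVolumeSettingsFolder"]

def mac_migration_command(source: str, dest: str) -> list[str]:
    """Robocopy command that skips well-known Mac housekeeping items.
    Note: '._*' AppleDouble files are deliberately NOT excluded,
    because they carry resource forks and metadata."""
    return (["robocopy.exe", source, dest, "/e"]
            + ["/XF"] + EXCLUDE_FILES
            + ["/XD"] + EXCLUDE_DIRS)

cmd = mac_migration_command(r"\\macsource\share", r"\\nasuni\share")
```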

Backend Configuration to Aid in Migration

Nasuni personnel can adjust settings on an appliance to improve migration performance, depending on the scenario.

Snapshot length

During the data push phase of a snapshot of a shared or local volume, by default, the appliance sends data to the cloud for up to 10 minutes. This is done to reduce the metadata push phase duration, because a lock is held on the volume for the duration of the metadata snapshot, and to allow data to sync to other appliances in a timely manner.

Limiting the data push phase duration can limit the metadata push phase duration and reduce the chances of snapshot contention on a shared or local volume. However, it can also lengthen the time it takes to protect a large amount of data, because metadata for the filesystem structure might be snapped many times across multiple snapshot cycles.

While the default 10-minute time is appropriate for most normal workloads, during migrations to shared or local volumes, it can be advantageous to increase the default duration. To increase the default duration, contact Nasuni Support.

When the migration is complete, it is important to request that Nasuni Support sets this back to the default 10 minutes.

Snapshot Database Connection Count

Increasing the number of database connections opened by the snapshot process might increase snapshot performance. To increase the number of database connections, contact Nasuni Support.

Note that increasing the connection count increases the amount of RAM used during a snapshot. If the connection count is increased during migration, it should be returned to normal before moving to normal production workloads.

Important considerations

Avoiding COW Overflow

When a snapshot is running on an appliance, the appliance uses a Copy-On-Write (COW) volume to track changes made to the filesystem during the snapshot. The COW is a temporary volume that is created and destroyed with each snapshot. During normal operations, the COW is large enough to handle all the user-generated modifications to the file system while a snapshot occurs.

During a migration, if many changes are being made to the filesystem, it might be too much for the COW to track during a snapshot. When this happens, the COW reaches capacity and must force the appliance to abort its snapshot so as not to lose incoming changes. While the appliance can handle these overflow scenarios, it is best to avoid them if possible.

Avoiding Cache Saturation

If the appliance’s cache utilization reaches capacity, any copy operation fails, due to insufficient disk space. The appliance reclaims cache space during the eviction process. If the cache reaches 100 percent utilization, this can cause issues with snapshot performance.

Robocopy jobs should be configured to ensure that a single copy operation does not completely fill the appliance’s cache.

Cutover Procedure

The migration process is not complete until active users are moved from the legacy file server to the Nasuni appliance. It is important not to move users to the Nasuni appliance until all data has been copied from the legacy file server and all the data has been fully protected via the snapshot process.

Tip: To verify that a snapshot has been completed (both data phase and metadata phase), see Appendix: Verifying Snapshots.

If there are large amounts of unprotected data from the migration process when users are cut over, their new changes are not protected until the backlog of migration data is processed. This means that the new data is at risk of missing RPO targets.

Additionally, when migrating to a shared volume, until the data has been written to the cloud, the other appliances attached to the volume do not see the data. This is of particular concern when a dedicated migration appliance is being used while users connect to another appliance that was not involved in the migration process.

Troubleshooting

Problem: Slow ingestion from client to Nasuni Edge Appliance.

  • Cause: Network topology is not optimal.
    Solution: Bring the migration closer to the source and destination Nasuni Edge Appliance.

  • Cause: Limited migration client machine resources.
    Solution: Increase client machine resources, or split the migration onto another client machine.

  • Cause: Small files in the data set.
    Solution: Split the migration into multiple instances of the migration tool (Robocopy).

Problem: Slow snapshot from Nasuni Edge Appliance to object store.

  • Cause: Slow internet connection.
    Solution: Increase internet bandwidth resources at the site.

  • Cause: Snapshots are pausing every 10 minutes.
    Solution: Override the default snapshot fairness algorithm.

  • Cause: Small files in the data set.
    Solution: Increase snapshot process threads, or split the migration by directory tree across multiple Nasuni Edge Appliances.

Problem: Robocopy fails to write certain files, or fails to copy timestamps or NTFS permissions.

  • ERROR 0 (0x00000000) Copying File - The operation completed successfully.
    Solution: Confirm that an error copying a file is the problem, and remove the problematic Alternate Data Streams.

  • ERROR 2 (0x00000002) Copying File - The system cannot find the file specified.
    Solution: During the final delta copy, make sure the source files are not in use.

  • ERROR 5 - Access Denied.
    Solution: Try FastCopy or another copy tool.

  • ERROR 32 (0x00000020) Copying File - The process cannot access the file because another process is using it.
    Solution: During the final delta copy, make sure the source files are not in use.

  • ERROR 59 (0x0000003B) Accessing Destination Directory \filerA\ - An unexpected network error occurred. Waiting 3 seconds... Retrying...
    Solution: The amount of data being copied exceeds the available cache space on the appliance. Increase the cache disk on the Nasuni Edge Appliance.

  • ERROR 87 (0x00000057) Accessing Destination Directory \\DIR NAME - The parameter is incorrect.
    Solution: Remove or resolve corrupted Alternate Data Streams.

  • ERROR 123 (0x0000007B) Copying File - The filename, directory name, or volume label syntax is incorrect.
    Solution: The filename or directory path is longer than 255 bytes. Shorten the UNC path or rename the file.

Appendix: Verifying Snapshots

A snapshot is a complete picture of the files and directories in your file system at a specific point in time. Snapshots are either manually initiated, or automatically performed as part of a Snapshot Schedule that you specify.

The snapshot process includes saving both the data and the associated metadata to cloud object storage. For this reason, a snapshot consists of both a data phase (sometimes called “phase 1”) and a metadata phase (sometimes called “phase 2”). To be sure that data is protected in the cloud, both phases of each snapshot must complete successfully. Only then can you be certain that no unprotected data remains in the cache.

Various procedures, including the recovery of an Edge Appliance, require you to perform a snapshot, and to then verify that the snapshot has completed successfully. This ensures that no unprotected data remains in the cache.

This section describes how to verify that a snapshot has completed successfully.

Verifying that a snapshot completed successfully

To verify that a snapshot has completed successfully, follow these steps:

  1. Log in to the NMC.

  2. Click the bell-shaped Notifications icon at the top right.

  3. Click View all Notifications. The Notifications page appears.

  4. In the Filter text box, type “snapshot”, then click “Apply Filter”.
    The list is limited to notifications that include the word “snapshot”.

  5. For the most recent snapshot, find the “Snapshot started” notification for your Edge Appliance and for your volume that contains the label “Data”.
    For that notification, find the corresponding “Snapshot completed” notification for the same Edge Appliance, volume, and version number.
    This verifies that the data phase of this snapshot completed.

  6. Similarly, for the most recent snapshot, find the “Snapshot started” notification for your Edge Appliance and for your volume that contains the label “Metadata”.
    For that notification, find the corresponding “Snapshot completed” notification for the same Edge Appliance, volume, and version number.
    This verifies that the metadata phase of this snapshot completed.

Unprotected Files list

The Unprotected Files list on the Edge Appliance UI or the NMC is not sufficient verification that a snapshot has completed. The files in the Unprotected Files list are not yet protected, so any snapshots containing any of those files have not completed.

However, even if the Unprotected Files list has no files in it, that does not mean that all snapshots have completed. It could be, for example, that the data phase of a snapshot has completed, but that the metadata phase has not completed.

“New Data in Cache (not yet protected)” chart

The “New Data in Cache (not yet protected)” chart on the Edge Appliance UI is not sufficient verification that a snapshot has completed. The files in the “New Data in Cache (not yet protected)” chart are not yet protected, so any snapshots containing any of those files have not completed.

However, even if the “New Data in Cache (not yet protected)” chart has no files in it, that does not mean that all snapshots have completed. It could be, for example, that the data phase of a snapshot has completed, but that the metadata phase has not completed.