Differences
This shows you the differences between two versions of the page.
docs_in_draft:snapraid [2023/04/22 20:05] – [Prepare for Drive Replacement] crashtest | docs_in_draft:snapraid [2023/04/28 23:45] (current) – removed crashtest | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | < | ||
- | < | ||
- | {{ : | ||
- | \\ | ||
- | ---- | ||
- | \\ | ||
- | < | ||
- | \\ | ||
- | {{ : | ||
- | |||
- | ====== SnapRAID Plugin For OMV6 ====== | ||
- | \\ | ||
- | \\ | ||
- | |||
- | ===== Summary ===== | ||
- | |||
- | SnapRAID is a backup program for JBOD disk arrays. SnapRAID stores data parity information, which enables recovery from disk failures. SnapRAID is targeted toward home media centers with many large files that rarely change. | ||
- | |||
- | Besides the ability to recover from disk failures, other features of SnapRAID are: | ||
- | * All data is hashed to ensure data integrity and to avoid silent corruption. | ||
- | * If too many disks fail to allow a recovery, only the data on the failed disks is lost. All data on the remaining disks is safe. | ||
- | * If files are accidentally deleted, they can be recovered. | ||
- | * SnapRAID can be used with disks that are already filled. | ||
- | * The disks of the array can be different sizes. | ||
- | * Data disks can be added at any time. | ||
- | * SnapRAID can be removed at any time without the need to reformat or move data. | ||
- | |||
- | ---- | ||
- | |||
- | ==== Third Party Software Note ==== | ||
- | |||
- | While this OMV plugin makes the SnapRAID package easy to integrate into openmediavault, | ||
- | |||
- | ===== Prerequisites ===== | ||
- | |||
- | * [[https:// | ||
- | * An additional disk drive is required to store parity data. SnapRAID' | ||
- | * For reports on drive health and automating SnapRAID administrative tasks, setting up -> [[https:// | ||
- | * Enabling -> [[https:// | ||
- | * Consider testing the server' | ||
- | |||
- | |||
- | |||
- | ===== How SnapRAID Works ===== | ||
- | |||
- | To explain SnapRAID, a comparison to RAID5 may be helpful.\\ | ||
- | ---- | ||
- | \\ | ||
- | |||
- | < | ||
- | |||
- | SnapRAID sits between RAID and a backup program, aiming to combine the benefits of both. During normal operation, SnapRAID does not affect data in any way.\\ | ||
- | \\ | ||
- | Features:\\ | ||
- | * Can protect the contents of multiple disks. | ||
- | * Filesystem types are irrelevant but simple filesystems, | ||
- | * Calculates file parity information on demand. | ||
- | * Different sized disks can be protected without losing storage space. | ||
- | * Can reconstruct a failed hard drive. | ||
- | * Can restore deleted files. | ||
- | * Uses a checksum hash that protects against silent corruption (bit-rot), with the ability to reconstruct corrupted files. | ||
- | * A disk can be added or removed at any time. | ||
- | * Can be removed at any time without the need to recover or move data. | ||
- | |||
- | |||
- | {{ : | ||
- | |||
- | ---- | ||
- | |||
- | < | ||
- | |||
- | |||
- | Traditional RAID5 stripes data and interleaves parity information across multiple drives.\\ | ||
- | \\ | ||
- | Features:\\ | ||
- | * Can aggregate a collection of disks into a pool that appears, to the OS, to be a single drive. | ||
- | * Can use differently sized disks (software RAID), but the array will limit each larger disk to the size of the smallest disk. (Hardware RAID may require identical disks.) | ||
- | * Calculates parity on the fly. | ||
- | * An array can operate with one member disk disabled. | ||
- | * Can reconstruct a failed hard drive. | ||
- | * Provides a parallel I/O speed boost. | ||
- | |||
- | {{ :: | ||
- | |||
- | |||
- | |||
- | ===== Installation ===== | ||
- | |||
- | In OMV6's GUI:\\ | ||
- | Under **System**, **Plugins**, | ||
- | \\ | ||
- | |||
- | ===== Initial Configuration ===== | ||
- | \\ | ||
- | < | ||
- | \\ | ||
- | Under **Services**, | ||
- | \\ | ||
- | In the **Drive** field: | ||
- | In the **Name** field: | ||
- | **Check the boxes** for **Content** and **Data**\\ | ||
- | When finished, click the **Save** button. | ||
- | \\ | ||
- | {{ : | ||
- | \\ | ||
- | < | ||
- | \\ | ||
- | \\ | ||
- | \\ | ||
- | < | ||
- | < | ||
- | <table width=" | ||
- | <tr> | ||
- | <td colspan=" | ||
- | < | ||
- | </td> | ||
- | </tr> | ||
- | <tr> | ||
- | <td style=" | ||
- | Since at least one good copy of the Content File is required for a full drive restoration, | ||
- | </tr> | ||
- | </ | ||
- | </ | ||
- | </ | ||
- | \\ | ||
- | ---- | ||
- | \\ | ||
- | < | ||
- | |||
- | \\ | ||
- | Again, the parity disk must be the same size, or larger, than the largest drive in the collection of disks to be protected. | ||
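The sizing rule above can be expressed as a one-line check. This is a minimal sketch, not part of the plugin: the function name and the example disk sizes are made up for illustration.

```python
def parity_disk_is_large_enough(parity_bytes, data_disk_bytes):
    """Return True when the parity disk is at least as large as the largest data disk."""
    return parity_bytes >= max(data_disk_bytes)


TB = 10**12  # decimal terabyte, as used on drive labels

# Hypothetical array: three data disks of different sizes.
data_disks = [4 * TB, 2 * TB, 3 * TB]

print(parity_disk_is_large_enough(4 * TB, data_disks))  # parity matches the largest disk
print(parity_disk_is_large_enough(3 * TB, data_disks))  # parity smaller than the 4 TB disk
```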
- | |||
- | In the **Drive** field: | ||
- | In the **Name** field: | ||
- | **Check the box** for **Parity**.\\ | ||
- | |||
- | {{ :: | ||
- | \\ | ||
- | < | ||
- | |||
- | ---- | ||
- | \\ | ||
- | < | ||
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | ---- | ||
- | |||
- | ===== SnapRAID Initialization ===== | ||
- | |||
- | The functions of SnapRAID are supported after the first running of the '' | ||
- | \\ | ||
- | Under **Services**, | ||
- | A window will pop up that will show the progress of the Sync operation. | ||
- | \\ | ||
- | The remainder of the tools, under the tools icon, can be used for manual operations within the GUI. Some of these tools are discussed in manual operations. | ||
- | |||
- | {{ :: | ||
- | ===== SnapRAID Administration ===== | ||
- | |||
- | There are two methods of SnapRAID Administration and maintenance, | ||
- | |||
- | ==== Basic Order of Operations ==== | ||
- | |||
- | The basic order of SnapRAID maintenance operations is: **Diff**, **Sync**, and **Scrub**. \\ | ||
- | \\ | ||
- | ---- | ||
- | === Diff === | ||
- | \\ | ||
- | The **Diff** plugin tool is found under **Services**, | ||
- | (The CLI command is '' | ||
- | \\ | ||
- | Diff (short for " | ||
- | |||
- | In accordance with the above, it's important to determine the following: | ||
- | |||
- | - Were there excessive deletes?\\ | ||
- | In normal data operations, a handful of user deletes is expected. | ||
- | - Was there an excessive number of updated or modified files?\\ | ||
- | In most cases, administrators will have a rough idea of what is normal for updated or modified files. | ||
- | |||
- | In either case, if there are excessive deletes or an excessive number of updated / modified files, **Diff** settings within this plugin can be used to __stop__ an automated sync operation, allowing for the recovery of deleted or modified files. | ||
- | |||
- | |||
- | < | ||
- | |||
- | {{ :: | ||
- | \\ | ||
- | ---- | ||
- | |||
- | |||
- | === Sync === | ||
- | |||
- | The **Sync** plugin tool is found under **Services**, | ||
- | (The CLI command is '' | ||
- | |||
- | After the initial sync, subsequent sync operations log new or changed file information into content file(s). Sync also creates new checksums and updates parity information for the same files. | ||
- | |||
- | **Sync considerations**: | ||
- | * It's important to note that when checksums and parity information are updated for changed files, it won't be possible to restore files or folders to their previous state. | ||
- | * When Sync is running, avoid adding or deleting files during the process. | ||
- | * If automation is used, schedule sync operations for after-hours periods when changing or adding files is unlikely to occur. | ||
- | |||
- | ---- | ||
- | |||
- | |||
- | === Scrub === | ||
- | |||
- | The **Scrub** plugin tool is found under **Services**, | ||
- | (The CLI command is '' | ||
- | \\ | ||
- | Scrub uses file information and checksums to check for the presence and health of files and to detect bad blocks.\\ | ||
- | \\ | ||
- | < | ||
- | |||
- | {{ :: | ||
- | ---- | ||
- | |||
- | If bad blocks are found, during the scrub, SnapRAID **status** will list them. (In the GUI, **SnapRAID status** is found under **Services**, | ||
- | |||
- | {{ :: | ||
- | \\ | ||
- | The **Fix** command, executed on the command line, will repair bad blocks.\\ | ||
- | Use '' | ||
- | Then use '' | ||
- | |||
- | ---- | ||
- | |||
- | |||
- | === Fix === | ||
- | The **Fix** plugin tool is found under **Services**, | ||
- | (The CLI command is '' | ||
- | \\ | ||
- | If files that were not intentionally deleted are missing, use the **Fix** tool to recover them. | ||
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | If using the plugin' | ||
- | \\ | ||
- | \\ | ||
- | === Summary === | ||
- | |||
- | * **Diff** checks for the number of added, changed and restored files, before a sync operation. | ||
- | * **Sync** adds new files to content file(s), assigns checksums to new files and updates checksums for existing but changed files. | ||
- | * **Scrub** checks for parity errors and bad blocks. | ||
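On the command line, the same order of operations could be sketched as below. This is only an illustration: it builds the commands without running them, the helper name is made up, and passing the scrub percentage through `-p` is an assumption about how a user might choose to run scrub by hand.

```python
def maintenance_commands(scrub_percent=None):
    """Build the Diff -> Sync -> Scrub command sequence without executing it."""
    commands = [
        ["snapraid", "diff"],  # report what changed since the last sync
        ["snapraid", "sync"],  # update content file(s), checksums and parity
    ]
    scrub = ["snapraid", "scrub"]
    if scrub_percent is not None:
        # Assumption: limit the scrub to a percentage of the array.
        scrub += ["-p", str(scrub_percent)]
    commands.append(scrub)
    return commands


for command in maintenance_commands(scrub_percent=25):
    print(" ".join(command))
```

Building the list first, rather than running each command immediately, leaves room to inspect the Diff output before deciding whether Sync should run at all.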
- | \\ | ||
- | ---- | ||
- | ===== Automation ===== | ||
- | |||
- | Automation of SnapRAID housekeeping is done with what is known as a " | ||
- | \\ | ||
- | In a Diff script, the first command, **Diff**, checks primarily for changed or added files. | ||
- | The second command, **Sync**, catalogues new files and assigns checksums and creates parity information for them. **Sync** also updates checksums and parity information for changed files.\\ | ||
- | Finally the third command, **Scrub**, is run to check the health of a specified percentage of existing files.\\ | ||
- | \\ | ||
- | This plugin provides **Diff script functionality**, | ||
- | \\ | ||
- | |||
- | ==== Diff Script Setup ==== | ||
- | \\ | ||
- | Under, **Services**, | ||
- | \\ | ||
- | The following screen is where various parameters for the SnapRAID plugin' | ||
- | |||
- | 1. The defaults in these fields are fine, for most users.\\ | ||
- | 2. **Send Mail** will work only if users have configured and tested notifications, | ||
- | 3. **Run Scrub** | ||
- | 4. **Pre-hash** is an option that is used together with the **Sync** command. | ||
- | 5. **Scrub Percentage** and **Scrub Frequency**. | ||
- | - When Scrub Frequency is specified, (in this instance " | ||
- | - When Scrub Percentage is specified, (in this instance " | ||
- | \\ | ||
- | With a scrub percentage of 25 and scrubs scheduled to run once a week, the entire array will be scrubbed about once a month. | ||
- | \\ | ||
- | 6. The **Update Threshold** and **Delete Threshold** are parameters for the Diff script. | ||
- | * **Update Threshold** sets the upper limit allowed for new files and updated / altered files. | ||
- | * **Delete Threshold** sets the upper limit for allowed file deletes. | ||
- | If either of the above thresholds is exceeded, the Diff script will halt and an email will be sent to the user admin advising of the result. | ||
- | (As noted in the GUI, if these thresholds are set to 0, Sync and Scrub will be performed regardless.) | ||
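The threshold and scrub settings described above can be sketched as two small functions. This is not the plugin's script: the function names are hypothetical, and treating "exceeded" as strictly greater than the threshold is an assumption.

```python
import math


def sync_allowed(updated, deleted, update_threshold, delete_threshold):
    """Decide whether Sync may proceed, given the counts reported by Diff.

    A threshold of 0 disables that check, so Sync runs regardless.
    Assumption: a threshold is "exceeded" only when the count is strictly greater.
    """
    def exceeded(count, threshold):
        return threshold > 0 and count > threshold

    return not (exceeded(updated, update_threshold)
                or exceeded(deleted, delete_threshold))


def runs_for_full_scrub(scrub_percentage):
    """Number of scheduled scrubs needed to cover the whole array once."""
    return math.ceil(100 / scrub_percentage)


print(sync_allowed(updated=10, deleted=3, update_threshold=500, delete_threshold=40))
print(sync_allowed(updated=10, deleted=50, update_threshold=500, delete_threshold=40))
print(runs_for_full_scrub(25))  # weekly runs at 25 percent each
```

With a weekly schedule, the value returned by `runs_for_full_scrub` is the number of weeks needed for one full pass over the array.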
- | \\ | ||
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | < | ||
- | |||
- | \\ | ||
- | \\ | ||
- | ---- | ||
- | | ||
- | |||
- | \\ | ||
- | === Scheduling the Diff Script === | ||
- | |||
- | In the screen shown above, click on **Schedule Diff**. | ||
- | |||
- | This example is configured as follows: | ||
- | * The **Enabled** box is checked. | ||
- | Under **Time of execution**: | ||
- | * As shown in **Minute** and **Hour**, the Diff Script will start at **01:05 AM**. | ||
- | * In this case, under **Day of the week**, jobs are run only on **Sunday**. | ||
- | * Check the **Send command output via email** box. | ||
- | * Finally, click the **Save** button. | ||
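For readers who think in cron terms, the schedule above corresponds to a standard five-field cron entry. The sketch below assumes conventional cron syntax, where day-of-week 0 is Sunday; the command path is a hypothetical placeholder, not the plugin's actual script location.

```python
def cron_line(minute, hour, day_of_week, command):
    """Format a five-field cron entry that runs on any day of month, in any month."""
    return f"{minute} {hour} * * {day_of_week} {command}"


# 01:05 AM every Sunday; "/path/to/diff-script" is a placeholder.
print(cron_line(minute=5, hour=1, day_of_week=0, command="/path/to/diff-script"))
```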
- | |||
- | {{ : | ||
- | \\ | ||
- | < | ||
- | < | ||
- | |||
- | \\ | ||
- | ---- | ||
- | |||
- | |||
- | |||
- | === Diff Script Considerations === | ||
- | |||
- | A consideration, | ||
- | \\ | ||
- | When it comes to speed of operations, in most use cases, **Diff** and **Sync** will be fast. However, depending on the scrub percentage chosen, the total amount of data on the collection of protected disks, the speed of protected disks and other factors, a **Scrub** may take **several hours**. | ||
- | \\ | ||
- | \\ | ||
- | |||
- | |||
- | ===== Notes ===== | ||
- | * Docker containers that are stored on data drives should be paused or stopped during a sync. Otherwise, sync errors may result. | ||
- | * It is recommended that SnapRAID' | ||
- | \\ | ||
- | \\ | ||
- | |||
- | |||
- | |||
- | |||
- | ==== Other Useful Command Line Tools ==== | ||
- | '' | ||
- | If files are detected with "zero sub-second timestamps", | ||
- | \\ | ||
- | '' | ||
- | If parity issues with the parity drive are persistent and the user admin is reasonably sure there are no data issues, the command '' | ||
- | |||
- | ---- | ||
- | |||
- | ===== Source Code ===== | ||
- | |||
- | -> [[https:// | ||
- | |||
- | \\ | ||
- | \\ | ||
- | {{ : | ||
- | ===== Recovery Operations ===== | ||
- | |||
- | Recovery operation examples for single files, missing files, etc., are provided in the -> [[https:// | ||
- | |||
- | ==== Recovering a Failed Drive ==== | ||
- | |||
- | === General === | ||
- | |||
- | One of the more desirable features of SnapRAID is its ability to restore data to a replacement drive. | ||
- | \\ | ||
- | Configuring [[https:// | ||
- | \\ | ||
- | When it has been determined that a drive is beginning to fail, it is crucial that user / admins **DO NOT** run the **Diff Script** OR a manual SYNC operation. | ||
- | \\ | ||
- | \\ | ||
- | === Prepare for Drive Replacement === | ||
- | | ||
- | Replacing a failing or failed drive requires a number of preliminary steps: | ||
- | |||
- | * First, it's crucial that the **Diff script**, if automated, is turned **OFF**. | ||
- | * Do not run a **Sync** operation until after the replacement is completed. | ||
- | * If user / admins have automated processes (downloaders, | ||
- | * Docker containers stored on protected drives should be paused or turned off. | ||
- | * Users should be informed to not use the server during the replacement. | ||
- | |||
- | |||
- | ---- | ||
- | |||
- | === Failure Scenario === | ||
- | \\ | ||
- | In the following scenario, a SnapRAID protected drive has failed completely.\\ | ||
- | \\ | ||
- | When server notifications are -> [[https:// | ||
- | |||
- | {{ :: | ||
- | |||
- | ---- | ||
- | |||
- | < | ||
- | |||
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | |||
- | Physically remove the drive. | ||
- | |||
- | {{ : | ||
- | ---- | ||
- | |||
- | After physically removing the failed or failing drive, add the new drive, noting its serial number. | ||
- | |||
- | {{ :: | ||
- | |||
- | In the majority of cases, a " | ||
- | ---- | ||
- | |||
- | Under Storage, Filesystems: | ||
- | |||
- | Click the " | ||
- | In this example case, the file system selected from the pop-down will be **EXT4**.\\ | ||
- | In the **Device *** pop-down, **/ | ||
- | Click the **Save** button. | ||
- | |||
- | {{ :: | ||
- | |||
- | When the format is complete, click the **Close** button. | ||
- | ---- | ||
- | The following **Mount** window will be immediately presented. | ||
- | In the **File system *** field, click the pop-down **arrow** and select the formatted drive (**/ | ||
- | Then click **Save** and **apply** the configuration change. | ||
- | |||
- | {{ :: | ||
- | \\ | ||
- | ---- | ||
- | Under Storage, File Systems: | ||
- | |||
- | /dev/sde1 appears, empty and formatted to EXT4. | ||
- | |||
- | {{ :: | ||
- | |||
- | ---- | ||
- | |||
- | Under Services, SnapRAID, Drives: | ||
- | \\ | ||
- | {{ : | ||
- | |||
- | \\ | ||
- | |||
- | Highlight each drive, one at a time, and click on the **Edit** icon {{: | ||
- | ---- | ||
- | |||
- | A normal drive entry appears as follows. | ||
- | |||
- | {{ :: | ||
- | ---- | ||
- | |||
- | A missing drive appears as follows. | ||
- | |||
- | {{ :: | ||
- | |||
- | In the Drive field, use the pop-down arrow to select, from the list, the new drive that has been wiped and formatted. | ||
- | |||
- | {{ :: | ||
- | |||
- | **Apply** the configuration change.\\ | ||
- | \\ | ||
- | At this point, the failed drive has been replaced with a new drive that is formatted but blank. | ||
- | ---- | ||
- | |||
- | === Restore Data === | ||
- | |||
- | In the same window (Services, SnapRAID, Drives) select the Tools icon {{: | ||
- | |||
- | Depending on the size and speed of the drive and the amount of data, the Fix command may run for several hours.\\ | ||
- | When END OF LINE is displayed, the FIX operation is complete.\\ | ||
- | \\ | ||
- | The following is the end of this example' | ||
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | (The UNRECOVERABLE error is likely due to a change, made after the last Sync operation, that cannot be restored.) | ||
- | |||
- | < | ||
- | < | ||
- | <table width=" | ||
- | <tr> | ||
- | <td colspan=" | ||
- | < | ||
- | </td> | ||
- | </tr> | ||
- | <tr> | ||
- | <td style=" | ||
- | Device names, /dev/sda1, /dev/sdb1, etc., may be reordered when a device goes " | ||
- | </tr> | ||
- | </ | ||
- | </ | ||
- | </ | ||
- | \\ | ||
- | \\ | ||
- | Add the **Mount Point** column: | ||
- | Under **Storage**, | ||
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | ---- | ||
- | \\ | ||
- | The result now shows mount points by UUID. Whereas device names may be reordered on bootup, UUIDs do not change.\\ | ||
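Outside the GUI, the relationship between UUIDs and the current device names can be inspected on most Linux systems through the symlinks in `/dev/disk/by-uuid`. The sketch below assumes that directory exists (it is created by udev); the function name is made up for illustration.

```python
import os


def uuid_to_device(by_uuid_dir="/dev/disk/by-uuid"):
    """Map each filesystem UUID to the device node its symlink currently points at."""
    mapping = {}
    for uuid in sorted(os.listdir(by_uuid_dir)):
        # Each entry is a symlink such as <uuid> -> ../../sde1
        mapping[uuid] = os.path.realpath(os.path.join(by_uuid_dir, uuid))
    return mapping


if os.path.isdir("/dev/disk/by-uuid"):
    for uuid, device in uuid_to_device().items():
        print(uuid, "->", device)
```

Running this before and after a reboot shows the point made above: the device names on the right may change, while the UUIDs on the left stay the same.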
- | \\ | ||
- | {{ :: | ||
- | \\ | ||
- | Note the "Copy and Paste" Icon{{: | ||
- | |||
- | | ||