Backups service
snapperd was introduced in HyperCloud 2.x as a high-performance replacement for the previous snapshot daemon.
Info
HyperCloud 2.2.x introduced support for retrying missed or failed backups.
- In the event a backup window was missed due to cluster maintenance or other activities, if the next backup window has not been crossed, the backup service will immediately kick off a backup.
- In the event a backup fails, the backup service will retry it until the next backup window.
- In either event, crossing the threshold of the next backup window will result in that backup being taken instead. That is, only one missed or failed backup will be queued at any given time.
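The three rules above can be sketched as a single decision: at any moment, at most one backup is owed, and it always belongs to the most recent window that has opened. The following is a minimal illustration, not snapperd's actual implementation; the function name `owed_window` and its parameters are hypothetical, and fixed-length windows are assumed.

```python
from datetime import datetime, timedelta
from typing import Optional


def owed_window(first_window: datetime, interval: timedelta, now: datetime,
                last_success: Optional[datetime]) -> Optional[datetime]:
    """Return the start of the backup window still owed at `now`, or None.

    Hypothetical helper: assumes windows open every `interval`, starting
    at `first_window`.
    """
    if now < first_window:
        return None  # no window has opened yet
    # Start of the most recent window that opened at or before `now`.
    windows_elapsed = (now - first_window) // interval
    current_window = first_window + windows_elapsed * interval
    # Older windows are never returned: once a new window opens, an earlier
    # missed or failed backup is dropped in favour of the new one.
    if last_success is None or last_success < current_window:
        return current_window
    return None


start = datetime(2024, 1, 1, 0, 0)
hour = timedelta(hours=1)
# The 01:00 and 02:00 windows were both missed; only the 02:00 backup is owed.
print(owed_window(start, hour, datetime(2024, 1, 1, 2, 30),
                  datetime(2024, 1, 1, 0, 5)))
```

A caller that polls this function would start (or retry) a backup whenever it returns a window, and stop once a success is recorded for that window.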
Advisement
There are several simple rules to follow when creating snapshots for recovery:
- Snapshot deltas (changes since last snap) directly and linearly impact transfer times when remote and/or archive backups are configured.
- Only take as many snapshots as needed for recovery point objectives (RPO). Each additional snapshot degrades performance.
- Be cautious when setting snapshot frequency. Scheduling more than one snapshot per hour can negatively affect performance for the system's workload, or the entire cluster. Always prioritize the actual RPO needs of the system.
- The total storage usage for an image is the sum of all its snapshot deltas, plus the base image size. Images that undergo significant changes quickly will have increased storage usage. This happens because of how copy-on-write snapshots work.
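The storage rule above is simple arithmetic, sketched here for capacity planning. The function name and the sample sizes are illustrative assumptions, not values from a real system.

```python
def image_storage_usage(base_image_bytes, snapshot_delta_bytes):
    """Total usage of an image: base image size plus every snapshot delta."""
    return base_image_bytes + sum(snapshot_delta_bytes)


GIB = 1024 ** 3
# Hypothetical image: a 40 GiB base with four retained snapshots whose
# deltas are 2, 5, 1 and 8 GiB.
total = image_storage_usage(40 * GIB, [2 * GIB, 5 * GIB, 1 * GIB, 8 * GIB])
print(total // GIB, "GiB")
```

Because every retained delta counts toward the total, a fast-changing image with a long retention count can use considerably more space than its base size suggests.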
- Multiple factors impact snapshot transfer speeds for remote and/or archive backups. Depending on the configuration, snapshot transfers may be skipped if a transfer takes too long and rolls into the next scheduled transfer. Some things to consider include:
  - Local or remote cluster I/O capability, which is dramatically impacted by running workloads.
  - Network connectivity between clusters, which includes latency, packet loss, bandwidth, and MTU.
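As a rough planning aid, the linear relationship between delta size and transfer time can be estimated as below. This is a deliberately simplified model that assumes bandwidth is the only bottleneck; it ignores latency, packet loss, compression, and cluster I/O load, so real transfers will usually be slower. All names are hypothetical.

```python
def transfer_seconds(delta_bytes, bandwidth_bits_per_s):
    """Best-case time to move a snapshot delta over a link of given bandwidth."""
    if bandwidth_bits_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return delta_bytes * 8 / bandwidth_bits_per_s


def overruns_next_transfer(delta_bytes, bandwidth_bits_per_s, interval_seconds):
    """True if the transfer would still be running when the next one is due."""
    return transfer_seconds(delta_bytes, bandwidth_bits_per_s) >= interval_seconds


ONE_GBIT = 1_000_000_000  # bits per second
# A 90 GB delta fits inside an hourly interval on a 1 Gbit/s link;
# a 900 GB delta does not.
print(overruns_next_transfer(90_000_000_000, ONE_GBIT, 3600))
print(overruns_next_transfer(900_000_000_000, ONE_GBIT, 3600))
```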
Usage
There are three types of snapshots:
- Local
- Remote
- Archive
Local snapshots are stored on the local HyperCloud system.
Remote snapshots are copies of local snapshots that are stored on another (remote) HyperCloud system.
Archive snapshots are copies of local snapshots that are stored on another system that supports the S3 protocol.
Snapshot schedules
The schedule is a mix of destinations:
- Local
- Remote
- Archive
And frequencies:
- Hourly
- Daily
- Weekly
- Monthly
- Yearly
Hourly
An hourly schedule for local snapshots has the format:
Where:
- NUM (required) is the number of snapshots to keep on the local system.
- @MINUTE (optional) is the minute of the hour to take the snapshot.
  - Valid MINUTE values are 0 - 59.
Example
This denotes that the system will keep 4 hourly snapshots on the local system and the snapshots for this VM will occur on the 20th minute of each hour; that is, at 1:20, 2:20, 3:20, etc.
Daily
A daily schedule for local snapshots has the format:
Where:
- NUM (required) is the number of snapshots to keep on the local system.
- @HOUR (optional) is the hour of the day to take the snapshot.
  - Valid HOUR values are 0 - 23 (24-hour clock).
Example
This denotes that the system will keep 3 daily snapshots on the local system and the snapshots for this VM will occur once per day at 3 AM.
Weekly
A weekly schedule for local snapshots has the format:
Where:
- NUM (required) is the number of snapshots to keep on the local system.
- @DAYOFWEEK (optional) is the day of the week to take the snapshot.
  - Valid DAYOFWEEK values are:
    - Numbering the days of the week (starting with Sunday): 0 - 6
    - Abbreviating the days of the week: Sun, Mon, Tue, Wed, Thu, Fri, Sat
    - Writing out the full name of the days of the week: Sunday, Monday, Tuesday, etc.
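The three accepted DAYOFWEEK spellings all name the same seven days, so they can be reduced to the numeric form. The sketch below shows one way to do that when checking a value before it is applied to a VM. It is not the parser used by snapperd; the function name is hypothetical, and treating names as case-insensitive is an assumption.

```python
DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]


def normalize_day_of_week(value):
    """Map 0 - 6, 'Wed', or 'Wednesday' to the numeric day (Sunday = 0)."""
    text = str(value).strip()
    if text.isdecimal():
        number = int(text)
        if 0 <= number <= 6:
            return number
        raise ValueError(f"day number out of range 0 - 6: {value!r}")
    lowered = text.lower()  # assumption: names are matched case-insensitively
    for number, name in enumerate(DAY_NAMES):
        if lowered in (name.lower(), name[:3].lower()):
            return number
    raise ValueError(f"not a valid day of the week: {value!r}")


# All three spellings of Wednesday normalize to the same number.
print(normalize_day_of_week(3),
      normalize_day_of_week("Wed"),
      normalize_day_of_week("Wednesday"))
```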
Example
This denotes that the system will keep 2 weekly snapshots on the local system and the snapshots for this VM will occur once per week on Wednesday. The latter example is equivalent to the former.
Monthly
A monthly schedule for local snapshots has the format:
Where:
- NUM (required) is the number of snapshots to keep on the local system.
- @DAYOFMONTH (optional) is the day of the month to take the snapshot.
  - Valid DAYOFMONTH values are the numbers of the day of the month: 1 - 31.
Example
This denotes that the system will keep 5 monthly snapshots on the local system and the snapshots for this VM will occur once per month on the 15th day of the month.
Yearly
A yearly schedule for local snapshots has the format:
Where:
- NUM (required) is the number of snapshots to keep on the local system.
- @DAYOFYEAR (optional) is the day of the year to take the snapshot.
  - Valid DAYOFYEAR values are:
    - Numbered months of the year: 01 - 12
    - Days of the year: 001 - 366
    - Abbreviated names of the month: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, or Dec
    - Full names of the months: January, February, March, etc.
Example
This denotes that the system will keep 2 yearly snapshots on the local system and the snapshots for this VM will occur once per year on January 15th.
Or,
Additional example
This denotes that the system will keep 2 yearly snapshots on the local system and the snapshots for this VM will occur once per year on May 1st.
Combining Schedules
You can combine local, remote, and archive schedules.
Example
This denotes that the system will take hourly local snapshots at the top of each hour and keep the most recent two (2). Additionally, at the top of each hour, it will copy the local snapshot to a remote system, where it will keep the most recent four (4). At the same time, it will copy the snapshot to an archive system, where it will keep the most recent three (3).
Hourly, Daily, Weekly, Monthly, and Yearly schedules can be combined.
Example
This denotes that the system will take hourly local snapshots at the top of each hour and keep the most recent two (2). It will also take a daily snapshot at midnight and keep the most recent seven (7), a full week. Additionally, it will take a weekly snapshot on Sunday at midnight and keep the most recent four (4).
Scheduling Snapshots on a VM
To schedule snapshots for a VM, add a SNAPSHOT_SCHEDULE Attribute to the VM and set its Value to the snapshot schedule.
An example of the Attribute and Value is depicted below:
Configuring remote and archive backup targets
Remote configuration
Remote snapshots are copies of local snapshots that are stored on another (remote) HyperCloud system. We use RBD commands within Ceph to export a local snapshot to a remote HyperCloud system. The snapshot is then transferred to the remote system via SSH.
Since SSH is used as a transport, the public key for the local system must be copied to the remote system.
The public key for the user oneadmin must be used.
This key can be found on the local system in:
Here is a sample remote.json:
{
  "destination": "remote",
  "host": "10.1.2.3",
  "image_prefix": "sailfish",
  "pool": "rbd",
  "ssh_options": "",
  "compress": "",
  "decompress": ""
}
- destination: Must be remote.
- host: The IP address of the remote HyperCloud cluster's dashboard.
- image_prefix: The label used as the prefix of the RBD image on the remote system that holds the snapshots.
- pool: The name of the pool on the remote system in which the snapshots should be stored.
- ssh_options: Any extra SSH options needed to log in to the remote system.
- compress: Optional. Must be a valid compress command (e.g., bzip2 --compress).
- decompress: Optional. Must be a valid decompress command (e.g., bzip2 --decompress).
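Before deploying a remote.json, it can help to check it against the field descriptions above. The following is a minimal sketch of such a check, not a tool shipped with HyperCloud. The function name is hypothetical, and treating destination, host, image_prefix, and pool as mandatory is an assumption based on only compress and decompress being marked optional.

```python
import ipaddress
import json

# Assumption: these keys must always be present.
REQUIRED_REMOTE_KEYS = ("destination", "host", "image_prefix", "pool")


def validate_remote_config(text):
    """Parse remote.json content and check it against the documented fields."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_REMOTE_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    if config["destination"] != "remote":
        raise ValueError('"destination" must be "remote"')
    # The host field is documented as an IP address; this raises ValueError
    # for anything that does not parse as one.
    ipaddress.ip_address(config["host"])
    return config


sample = """
{
  "destination": "remote",
  "host": "10.1.2.3",
  "image_prefix": "sailfish",
  "pool": "rbd",
  "ssh_options": "",
  "compress": "",
  "decompress": ""
}
"""
print(validate_remote_config(sample)["pool"])
```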
Archive configuration
Archive snapshots are copies of local snapshots that are stored on another system that supports the S3 protocol.
The archive snapshot configuration is defined in a file named:
Here is a sample archive.json:
{
  "destination": "archive",
  "bucket": "sailfish",
  "accesskey": "JQ37MHYZBMKZ1GKJ1Y78",
  "secretkey": "0fYiDTZqv4nbTHIbpgo15zvpQ8qhUVOziGmhxbIB",
  "host": "10.1.2.3",
  "port": 7480,
  "options": "--no-ssl",
  "compress": "cat",
  "decompress": "cat"
}
- destination: Must be archive.
- bucket: The name of the S3 bucket to store the snapshots.
- accesskey: Your S3 access key.
- secretkey: Your S3 secret key.
- host: The IP address of the S3 endpoint.
- port: The port number of the S3 endpoint.
- options: Optional. Extra flags that s3cmd needs to upload snapshots.
- compress: Optional. Must be a valid compress command (e.g., bzip2 --compress).
- decompress: Optional. Must be a valid decompress command (e.g., bzip2 --decompress).
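The archive.json fields can be checked in the same spirit before the file is put in place. This is an illustrative sketch only: the function name is hypothetical, the keys not marked optional above are assumed to be mandatory, and the credentials in the example are placeholders.

```python
import json

# Assumption: every key not marked "Optional" above must be present.
REQUIRED_ARCHIVE_KEYS = ("destination", "bucket", "accesskey",
                         "secretkey", "host", "port")


def validate_archive_config(text):
    """Parse archive.json content and check it against the documented fields."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_ARCHIVE_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    if config["destination"] != "archive":
        raise ValueError('"destination" must be "archive"')
    port = config["port"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError('"port" must be an integer between 1 and 65535')
    return config


example = json.dumps({
    "destination": "archive",
    "bucket": "sailfish",
    "accesskey": "EXAMPLE-ACCESS-KEY",   # placeholder, not a real key
    "secretkey": "EXAMPLE-SECRET-KEY",   # placeholder, not a real key
    "host": "10.1.2.3",
    "port": 7480,
})
print(validate_archive_config(example)["bucket"])
```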
snapctl commands
The syntax for the snapctl command is snapctl [argument(s)] [option(s)].
list schedules
# snapctl list schedules --help
List snapshot schedules for all VMs
Usage:
  snapctl list schedules [flags]
Flags:
      --config-dir string   path to configuration files (default "/var/run/cluster-control/facts/snapshot")
      --debug               print API calls
      --endpoint string     url for ONE API
  -h, --help                help for schedules
      --log string          log level (default "warn")
      --password string     password for authentication to the ONE API
      --username string     username for authentication to the ONE API
list snapshots
# snapctl list snapshots --help
Lists existing snapshots
Usage:
  snapctl list snapshots [flags]
Flags:
      --all                 list local,archive and remote snapshots
      --archive             list archive snapshots
      --config-dir string   path to configuration files (default "/var/run/cluster-control/facts/snapshot")
      --debug               print API calls
      --endpoint string     url for ONE API
  -h, --help                help for snapshots
      --local               list local snapshots (default true)
      --log string          log level (default "warn")
      --manual              list manual snapshots (default true)
      --password string     password for authentication to the ONE API
      --remote              list remote snapshots
      --username string     username for authentication to the ONE API
list work
# snapctl list work --help
List status of snapshot daemon
Usage:
  snapctl list work [flags]
Flags:
  -h, --help          help for work
      --host string   snapperd hostname (default "localhost")
      --port int      snapperd port number (default 7627)
nuke snapshots
# snapctl nuke snapshots --help
Nuke existing snapshots
Usage:
  snapctl nuke snapshots [flags]
Flags:
      --all                 nuke local,archive and remote snapshots
      --archive             nuke archive snapshots
      --config-dir string   path to configuration files (default "/var/run/cluster-control/facts/snapshot")
      --debug               print API calls
      --endpoint string     url for ONE API
  -h, --help                help for snapshots
      --local               nuke local snapshots (default true)
      --log string          log level (default "warn")
      --password string     password for authentication to the ONE API
      --remote              nuke remote snapshots
      --username string     username for authentication to the ONE API
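When snapctl is driven from a script, it is safer to build the argument list programmatically than to concatenate strings. The sketch below only assembles the arguments for `snapctl list snapshots` using flags that appear in the help text above; it does not run the command. The helper name and its parameters are hypothetical.

```python
# Flags taken from the "snapctl list snapshots --help" output.
SCOPE_FLAGS = {
    "local": "--local",
    "remote": "--remote",
    "archive": "--archive",
    "all": "--all",
}


def list_snapshots_argv(scope="local", log_level=None):
    """Build the argument list for `snapctl list snapshots`."""
    if scope not in SCOPE_FLAGS:
        raise ValueError(f"unknown scope: {scope!r}")
    argv = ["snapctl", "list", "snapshots", SCOPE_FLAGS[scope]]
    if log_level is not None:
        argv += ["--log", log_level]
    return argv


print(list_snapshots_argv("remote", log_level="warn"))
```

On a host where snapctl is installed, the resulting list can be passed to `subprocess.run(argv, check=True)`.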