Datastore Resilience

Motivation

Resilience is vital for data storage. Loss of a single component should never cause data loss. Loss of a single component should not remove our ability to determine if data has been corrupted.

Resilience in Neutron

Neutron provides resilience using a 3-way replica by default. This means that data written to Neutron always exists in three copies, enabling us to easily decide if one of those copies has become corrupt on-disk and repair it. Detecting corruption this way is vital to prevent data degradation and bit-rot in storage.
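
Why three copies make corruption detectable can be sketched with a simple majority vote. This is an illustration only, not Neutron's actual scrubbing mechanism: the function name `find_corrupt_replicas` and the use of SHA-256 digests are assumptions made for the example.

```python
from collections import Counter
import hashlib

def find_corrupt_replicas(replicas):
    """Return (good_data, corrupt_indexes) by majority vote over replica digests.

    Hypothetical helper for illustration; not part of any Neutron API.
    """
    digests = [hashlib.sha256(r).hexdigest() for r in replicas]
    winner, votes = Counter(digests).most_common(1)[0]
    if votes <= len(replicas) // 2:
        # Without a strict majority we cannot tell which copy is correct.
        raise ValueError("no majority: cannot tell which copy is correct")
    good = replicas[digests.index(winner)]
    corrupt = [i for i, d in enumerate(digests) if d != winner]
    return good, corrupt

# Three copies of the same block, one of which has a flipped byte.
copies = [b"block-data", b"block-data", b"block-dat\x00"]
good, corrupt = find_corrupt_replicas(copies)
for i in corrupt:
    copies[i] = good  # repair the bad copy from the agreed-upon data
```

With only two copies a mismatch could be detected but not resolved, since neither copy would hold a majority.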

Neutron’s default datastore is created as a 3-way replica so that smaller clusters with only 3 storage servers still get resilience and the ability to detect and repair corrupt data.

A 3-way replica is computationally cheap and provides good performance to VMs. However, because every block is stored three times, it limits storage efficiency to 33.33% of the raw capacity of the underlying devices.

Erasure coding

Erasure coding improves storage efficiency and enables VMs to use a higher percentage of the underlying capacity. The caveat is that a larger number of hosts is required to keep the data resilient. Erasure coding splits data into k data chunks and uses these to generate m additional coded chunks. The coded chunks can be used to rebuild lost data chunks, so the data survives the loss of up to m chunks.
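
The idea of rebuilding a lost chunk from coded chunks can be shown with the simplest possible case: a single XOR parity chunk (m = 1). This is a deliberately simplified sketch and not the coding scheme used by the product; real profiles with m greater than 1 need a more general code such as Reed-Solomon. The function names `encode` and `recover` are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data, k):
    """Split data into k equal chunks (zero-padded) plus one XOR parity chunk."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\x00")
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, chunks)
    return chunks, parity

def recover(chunks, parity, lost_index):
    """Rebuild the chunk at lost_index from the surviving chunks and the parity."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index]
    return reduce(xor_bytes, survivors, parity)

chunks, parity = encode(b"erasure coding demo", k=4)
rebuilt = recover(chunks, parity, lost_index=2)  # pretend chunk 2 was lost
```

Here the k = 4 data chunks plus one parity chunk occupy 5 chunks of raw space, so any single chunk can be lost and rebuilt from the other four.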

VM Squared supports the following Erasure Coding profiles:

    * 4 + 2 (k = 4, m = 2) (66.67% Storage Efficiency)
    * 8 + 4 (k = 8, m = 4) (66.67% Storage Efficiency)
    * 8 + 3 (k = 8, m = 3) (72.73% Storage Efficiency)
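
The efficiency figures in this list follow from the ratio of data chunks to total chunks, k / (k + m). A minimal sketch of that arithmetic, where the helper name `storage_efficiency` is introduced only for this example:

```python
def storage_efficiency(k, m):
    """Fraction of raw capacity that holds usable data: k / (k + m)."""
    return k / (k + m)

profiles = {"4 + 2": (4, 2), "8 + 4": (8, 4), "8 + 3": (8, 3)}
for name, (k, m) in profiles.items():
    print(f"{name}: {storage_efficiency(k, m):.2%}")

# A 3-way replica stores one usable copy out of three written copies.
print(f"3-way replica: {1 / 3:.2%}")
```

The same ratio explains why 4 + 2 and 8 + 4 have identical efficiency: both store one coded chunk for every two data chunks.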

When creating additional datastores you can elect to use 3-way replica or one of the above Erasure Coding schemes.

SoftIron recommends at least k + m + 1 storage nodes, which allows recovery in the case where a whole storage node goes offline.
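
As a rough sizing aid, the k + m + 1 recommendation can be worked out per profile. The sketch below assumes the figure refers to the number of storage nodes, with the extra node giving somewhere to rebuild chunks when one node is offline; `recommended_nodes` is a hypothetical helper, not a product command.

```python
def recommended_nodes(k, m, spares=1):
    """Node count under the k + m + 1 guideline (assumed to count storage nodes)."""
    return k + m + spares

for k, m in [(4, 2), (8, 4), (8, 3)]:
    print(f"{k} + {m}: {recommended_nodes(k, m)} storage nodes")
```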

Setting m to a value of 2 creates a system with the minimum level of robustness and increases the risk of data becoming inaccessible.