B2 Resiliency, Durability and Availability

Christopher

Backblaze's B2 Cloud Storage uses the Backblaze Vault architecture to provide a highly resilient, durable, and available storage service. The Vault architecture is calculated to deliver 99.999999999% (eleven 9s) annual durability.

Distributing Data

A Backblaze Vault consists of 20 Storage Pods, with data spread evenly across all 20 pods. Every Storage Pod in a given Vault has the same number of drives, and all of the drives are the same size.

Drives in the same drive position in each of the 20 Storage Pods are grouped together into a storage unit we call a “tome”. Each file is stored in a single tome and is spread across that tome's 20 drives for reliability and availability.
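The tome grouping described above can be sketched in a few lines of Python. This is only an illustration of the layout, not Backblaze's code: the function name `tome_members` and the zero-based pod and drive-position indices are assumptions made for the example.

```python
from typing import List, Tuple

PODS_PER_VAULT = 20  # from the text: a Vault has 20 Storage Pods


def tome_members(drive_position: int,
                 pods: int = PODS_PER_VAULT) -> List[Tuple[int, int]]:
    """Return the (pod_index, drive_position) pairs that make up one tome.

    A tome takes the drive at the same position from every pod,
    so it always contains exactly one drive per pod.
    """
    return [(pod, drive_position) for pod in range(pods)]


tome = tome_members(drive_position=7)
print(len(tome))  # 20 (one drive from each pod)
print(tome[:3])   # [(0, 7), (1, 7), (2, 7)]
```

Because every tome holds exactly one drive per pod, losing a whole pod removes at most one drive from any tome.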

Every file uploaded to a Vault is broken into pieces before being stored. Each of those pieces is called a “shard”. Parity shards are added for redundancy, so that a file can be fetched from a Vault even if some of its shards are not available.

Each file is stored as 20 shards: 17 data shards and 3 parity shards. Because those shards are distributed across 20 Storage Pods in 20 cabinets, the Vault is resilient to the failure of a Storage Pod, power loss to an entire cabinet, or even a cabinet-level networking outage.

Files can still be written to the Vault when one pod is down, and those files still have 2 parity shards to protect the data. Even in the extreme and unlikely case where three Storage Pods in a Vault are offline, the files in the Vault remain available because they can be reconstructed from the 17 shards that are still reachable.
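The arithmetic behind these cases is simple enough to write down. The sketch below assumes one shard per pod, as described above, and uses hypothetical helper names (`shards_available`, `is_readable`, `spare_shards`) purely for illustration.

```python
from typing import Iterable

DATA_SHARDS = 17
PARITY_SHARDS = 3
TOTAL_SHARDS = DATA_SHARDS + PARITY_SHARDS  # 20, one shard per pod


def shards_available(offline_pods: Iterable[int]) -> int:
    """Number of shards still reachable when the given pods are offline."""
    return TOTAL_SHARDS - len(set(offline_pods))


def is_readable(offline_pods: Iterable[int]) -> bool:
    """A file can be rebuilt as long as any 17 of its 20 shards are reachable."""
    return shards_available(offline_pods) >= DATA_SHARDS


def spare_shards(offline_pods: Iterable[int]) -> int:
    """How many more shards could be lost before the file becomes unreadable."""
    return shards_available(offline_pods) - DATA_SHARDS


for offline in ([], [4], [4, 9, 15], [1, 4, 9, 15]):
    print(len(offline), is_readable(offline), spare_shards(offline))
# 0 True 3
# 1 True 2
# 3 True 0
# 4 False -1
```

With one pod down there are still 2 shards of margin, and with three pods down the file is readable with no margin left, which matches the cases described above.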

Reed-Solomon Erasure Coding Implementation

Just like RAID implementations, the Vault software uses Reed-Solomon erasure coding to create the parity shards. But unlike Linux software RAID, which offers just 1 or 2 parity blocks, our Vault software allows for an arbitrary mix of data and parity shards. We are currently using 17 data shards plus 3 parity shards.

The beauty of Reed-Solomon is that we can re-create the original file from any 17 of the 20 shards. If one of the original data shards is unavailable, it can be re-computed from the other 16 data shards plus one of the parity shards. Even if three of the original data shards are unavailable, they can be re-created from the remaining 17 shards: the other 14 data shards plus the 3 parity shards.
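The "any 17 of 20" property can be demonstrated with a small toy model. The sketch below is not the Vault software: it assumes a simplified Reed-Solomon-style code over the prime field of integers modulo 257, with one symbol per shard, whereas production Reed-Solomon libraries commonly work over GF(2^8) and encode every byte position of much larger shards. The names `interpolate_at`, `encode`, and `reconstruct` are hypothetical.

```python
from typing import List, Optional, Tuple

P = 257          # prime modulus for the toy field (an assumption of this sketch)
K, M = 17, 3     # data shards and parity shards, as in the text
N = K + M        # 20 shards in total


def interpolate_at(points: List[Tuple[int, int]], x: int) -> int:
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (x, y) points, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data: List[int]) -> List[int]:
    """Return the 17 data symbols followed by 3 parity symbols."""
    if len(data) != K:
        raise ValueError("expected exactly 17 data symbols")
    points = list(enumerate(data))  # data symbol i is the value at x = i
    return list(data) + [interpolate_at(points, x) for x in range(K, N)]


def reconstruct(shards: List[Optional[int]]) -> List[int]:
    """Recover the 17 data symbols from a list of 20 entries (None = missing)."""
    known = [(x, y) for x, y in enumerate(shards) if y is not None]
    if len(known) < K:
        raise ValueError("need at least 17 of the 20 shards")
    points = known[:K]  # any 17 known shards determine the polynomial
    return [shards[x] if shards[x] is not None else interpolate_at(points, x)
            for x in range(K)]


data = list(b"Backblaze Vault!!")      # 17 byte values, one per data shard
shards: List[Optional[int]] = list(encode(data))
shards[2] = shards[9] = shards[16] = None  # lose three data shards
print(bytes(reconstruct(shards)))      # b'Backblaze Vault!!'
```

The 17 data symbols define a polynomial of degree at most 16, and the parity symbols are three more points on it. Any 17 points pin that polynomial down, which is why it does not matter which three shards are missing.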