Note
This technote is not yet published.
We describe the architectural choices and tradeoffs for the large-scale object storage to be used at the USDF.
1 Introduction
Over the decade-long planned operation of the Rubin telescope, we expect on the order of 50+ PB of raw images and over 600 PB of other data. The challenges and decisions around the storage architecture, deployment, maintenance, and lifecycle are documented here.
2 Data Requirements and Challenges
2.1 Data Volume, Variety, and Access Patterns
- overview of the types of data products and their sizes
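As a rough sense of scale, the volumes quoted in the introduction imply a sustained ingest rate of tens of PB per year. A back-of-envelope sketch (the 10-year survey duration is an assumption here, and both volumes are lower bounds):

```python
# Back-of-envelope ingest rate from the figures in the introduction:
# 50+ PB of raw images and over 600 PB of other data.
RAW_PB = 50        # lower bound, raw images
OTHER_PB = 600     # lower bound, other data products
SURVEY_YEARS = 10  # assumed survey duration

total_pb = RAW_PB + OTHER_PB
annual_pb = total_pb / SURVEY_YEARS
print(f"total >= {total_pb} PB, average ingest >= {annual_pb:.0f} PB/year")
```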
2.2 Challenges and Remediations
- highlight specific things we need to worry about
- high-level possible solutions
3 Architectural Motivation
- repeatable: well documented
- scalable: both management and performance
- robust: tiering
4 Technical Design
4.1 Hardware
- define standard building blocks of storage
- define performance envelopes
- define resilience of solutions
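To make the capacity/resilience tradeoff of a building block concrete, a minimal sketch of usable-capacity arithmetic under erasure coding (the drive count, drive size, and 8+3 coding scheme below are illustrative, not chosen values):

```python
def usable_capacity_tb(drives, drive_tb, data_shards, parity_shards):
    """Usable capacity of an erasure-coded pool of identical drives.

    Efficiency is data/(data+parity); the parity shards are the raw
    capacity traded away for resilience to shard loss.
    """
    efficiency = data_shards / (data_shards + parity_shards)
    return drives * drive_tb * efficiency

# Illustrative: a 102-drive enclosure of 18 TB drives, 8+3 erasure coding
raw_tb = 102 * 18
usable_tb = usable_capacity_tb(102, 18, 8, 3)
print(f"raw {raw_tb} TB -> usable {usable_tb:.0f} TB")
```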
4.2 Software
- overview of software solutions
- pros/cons of Ceph and MinIO
- supportability of solutions
5 Operational Processes
5.1 Deployment
- large amounts of storage added per year
- easy to deploy, consistent and repeatable
5.2 Monitoring
- hardware and software steady state and failure reporting requirements
- environmentals and zoning?
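Both candidate software stacks export metrics in the Prometheus text format, so steady-state and failure reporting can share one pipeline. A minimal sketch of consuming that format (the sample payload and metric names are illustrative, modeled on MinIO's exporter, not captured output):

```python
# Illustrative Prometheus text-format payload; real metrics would be
# scraped from the storage service's metrics endpoint.
sample = """\
# HELP minio_cluster_disk_offline_total Total offline disks
# TYPE minio_cluster_disk_offline_total gauge
minio_cluster_disk_offline_total 2
minio_cluster_disk_online_total 100
"""

def parse_metrics(text):
    """Parse simple Prometheus text-format lines into {name: value}."""
    metrics = {}
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comments and blank lines
        name, value = line.rsplit(" ", 1)
        metrics[name] = float(value)
    return metrics

m = parse_metrics(sample)
if m.get("minio_cluster_disk_offline_total", 0) > 0:
    print("ALERT: offline disks:", int(m["minio_cluster_disk_offline_total"]))
```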
5.3 Common Tasks
- what does hardware failure look like? disks, servers, racks?
- what are the high level responsibilities and roles required?
5.4 Lifecycle
- what procedures required to replace hardware?
6 Proof of Concept
6.1 Scope and Objectives
- why kubernetes?
- describe why MinIO and direct-csi
6.2 Deployment Experience
- deployment steps and references
6.3 Operational Experience
- performance benchmarking
- simulating failures
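For benchmarking, a small harness that times PUTs against any storage callable keeps the measurement logic separate from the backend under test. A sketch (the in-memory stand-in below exists only so the harness runs anywhere; in the proof of concept the callable would wrap the S3 client's `put_object`):

```python
import time

def measure_put_throughput(put, payload, n=100):
    """Time n PUTs of `payload` via callable put(key, data); return MB/s."""
    start = time.perf_counter()
    for i in range(n):
        put(f"bench/obj-{i}", payload)
    elapsed = time.perf_counter() - start
    return n * len(payload) / elapsed / 1e6

# In-memory stand-in backend so the harness is runnable as-is; swap in a
# wrapper around the real object-store client for actual benchmarking.
store = {}
mbps = measure_put_throughput(
    lambda key, data: store.__setitem__(key, data),
    b"x" * 1_000_000,  # 1 MB objects
    n=50,
)
print(f"{mbps:.0f} MB/s (in-memory stand-in)")
```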
7 Initial Hardware and Software Choices
- Dell XE7100 vs WD Data102 vs Seagate 4U100
- what and why