DMTN-201: USDF Object Storage Architecture and Planning

  • Yee-Ting Li

Latest Revision: 2021-08-13

Note

This technote is not yet published.

We describe the architectural choices and tradeoffs for the large-scale object storage to be used at the USDF.

1   Introduction

Over the planned decade-long operation of the Rubin telescope, we expect on the order of 50+ PB of raw images and over 600 PB of other data. The challenges and decisions around the architecture, deployment, maintenance, and lifecycle of the object storage for this data are documented here.
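As a rough illustration of what these volumes imply for yearly growth, the following is a minimal sketch and not a planning figure. It uses only the lower bounds quoted above and makes two assumptions that are not in this technote: that "decade-long" means exactly 10 years, and that data accumulates uniformly over that period. The function and constant names are hypothetical.

```python
# Back-of-envelope sizing from the lower bounds quoted in the introduction.
RAW_PB = 50      # "50+ PB" of raw images (lower bound from the text)
OTHER_PB = 600   # "over 600 PB" of other data (lower bound from the text)
YEARS = 10       # ASSUMPTION: "decade-long" taken as exactly 10 years


def total_pb(raw_pb: float, other_pb: float) -> float:
    """Return the combined data volume in PB."""
    return raw_pb + other_pb


def average_growth_pb_per_year(total: float, years: int) -> float:
    """Return the mean yearly growth in PB.

    ASSUMPTION: growth is uniform across the operating period; the real
    ingest profile is not specified in this technote.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return total / years


total = total_pb(RAW_PB, OTHER_PB)
per_year = average_growth_pb_per_year(total, YEARS)
print(f"total >= {total} PB, average growth >= {per_year} PB/year")
```

Under these assumptions the floor is 650 PB in total, or about 65 PB of usable capacity added per year on average, before any overhead for replication or erasure coding.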

2   Data Requirements and Challenges

2.1   Data Volume, Variety and Access Patterns

  • overview of the types of data products and size

2.2   Challenges and Remediations

  • highlight specific things we need to worry about
  • high-level possible solutions

3   Architectural Motivation

  • repeatable: well documented
  • scalable: both management and performance
  • robust: tiering

4   Technical Design

4.1   Hardware

  • define standard building blocks of storage
  • define performance envelopes
  • define resilience of solutions

4.2   Software

  • overview of software solutions
  • pros/cons of Ceph/MinIO
  • supportability of solutions

5   Operational Processes

5.1   Deployment

  • large amounts of storage added per year
  • easy to deploy, consistent and repeatable

5.2   Monitoring

  • hardware and software steady state and failure reporting requirements
  • environmentals and zoning?

5.3   Common Tasks

  • what does hardware failure look like? disks, servers, racks?
  • what are the high-level responsibilities and roles required?

5.4   Life-cycle

  • what procedures are required to replace hardware?

6   Proof of Concept

6.1   Scope and What We Are Trying to Achieve

  • why Kubernetes?
  • describe why MinIO and direct-csi

6.2   Deployment Experience

  • deployment steps and references

6.3   Operational Experience

  • performance benchmarking
  • simulating failures

7   Initial Hardware and Software Choices

  • Dell XE7100 vs WD Data102 vs Seagate 4U100
  • what and why