scaleoutSean's demos and how-to's

4:00

Velero CSI Snapshot Data Mover with NetApp Trident and SolidFire

This feature performs incremental "direct backup to S3" with the advantage of doing it off a CSI volume cloned from a CSI snapshot. So rather than copying files one by one (while they change) off a live volume using Kopia or Restic, this thing snaps the PV, clones the volume and backs it up to S3. More here: https://scaleoutsean.github.io/2023/09/15/velero-csi-snapshot-data-movement-with-netapp-solidfire.html
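As a rough sketch of what requesting such a backup could look like, the snippet below builds a Velero Backup custom resource with snapshot data movement turned on. The field names follow the Velero 1.12+ Backup API as I understand it; the backup name, namespace list and storage location are placeholders, not values from the video.

```python
import json


def velero_backup_manifest(name, namespaces, storage_location="default"):
    """Build a Velero Backup resource that requests CSI snapshot data movement.

    All argument values are hypothetical placeholders.
    """
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "includedNamespaces": list(namespaces),
            "storageLocation": storage_location,
            # Ask Velero to copy data from the CSI snapshot to the backup
            # repository instead of leaving the snapshot on the array only.
            "snapshotMoveData": True,
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(velero_backup_manifest("demo-backup", ["demo"]), indent=2))
```

The output can be saved to a file and applied with kubectl; check the field names against the Velero release you actually run.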

8 months ago 13 views

1:27

SolidFire Volume Clone Demo

Just a quick demo of creating a volume clone from a volume with 143G of Android source code data. Some platforms may be able to create clones faster, but their clones may suffer from contention, whereas SolidFire clones do not. More here: https://scaleoutsean.github.io/2023/08/30/monitoring-solidfire-clone-and-backup-jobs.html
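For readers who want to script the same operation, here is a minimal sketch of the JSON-RPC request body for the Element API CloneVolume method. The volume ID and clone name are made up, and sending the request (endpoint URL, authentication, TLS) is left out on purpose.

```python
import json


def clone_volume_request(volume_id, clone_name, request_id=1):
    """Return a JSON-RPC payload for the SolidFire Element API CloneVolume method.

    volume_id and clone_name are hypothetical example values.
    """
    return {
        "method": "CloneVolume",
        "params": {"volumeID": volume_id, "name": clone_name},
        "id": request_id,
    }


if __name__ == "__main__":
    print(json.dumps(clone_volume_request(42, "android-src-clone")))
```

The clone runs as an asynchronous job on the cluster, so a real script would also poll for the job's completion.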

8 months ago

6:40

SolidBackup with Kopia

Use SolidBackup scripts to clone and sync volumes and make them available to a "backup VM" which also has Kopia installed. Then, with the cloned volumes accessible in this VM, define and run Kopia backup jobs to back up data to S3 or B2. This may be suitable for department-scale environments where different application owners can give the administrator access to their data for data protection purposes. The entire process is administrator-managed; there are no user-managed steps. Additional details: https://scaleoutsean.github.io/2023/09/03/solidbackup-with-kopia.html

8 months ago 8 views

7:28

Example of using per-site and Erasure Coding rules and ILM policies

I use StorageGRID 11.7 to illustrate how to change the default StorageGRID "2 copies (on randomly selected storage nodes)" ILM rule to a custom "1 copy per site DC1 and DC2 with EC for larger objects". On the one hand, that takes one minute to do; on the other, you don't want to get it wrong, and there's usually some additional rule or rules that people want. So don't do this on production storage without reading the docs (RTFM) and testing in the lab first.

8 months ago 1 view

Expand E-Series DPP (pool) and grow and shrink its preservation capacity

2:06

6:13

1:46

Simple multi-site or hybrid cloud workflow for S3 analytics with ONTAP S3

In this video I use the native S3 bucket created in the earlier video at https://rumble.com/v32fmic-brief-walk-through-over-ontap-native-and-multi-protocol-s3-services.html

The idea is to show how one could organize multi-location ingress and processing:

- Site A generates or ingests data via S3 or NFS (in the case of multi-protocol ONTAP S3 buckets)
- We use SnapMirror S3 to replicate to another ONTAP system, such as AWS FSxN in the cloud. For multi-protocol buckets we could use SnapMirror (not SnapMirror S3!), CloudSync, rclone or some other utility
- Site B can read data using NFS (multi-protocol) or S3, whichever works better

It is recommended to use the same config (either multi-protocol, or "pure" S3) on both sides to avoid incompatibility issues. The video shows an example with native S3 buckets on both sides.

On-the-fly conversion from a Parquet file to a pandas DataFrame is meant to show that sometimes data doesn't even need to be copied off S3 to local disk to be converted, which is convenient as the clients don't even need to mount NFS.

The video is a bit short for so many steps, but you can check ONTAP S3-related and analytics-related solutions documentation for more comprehensive descriptions of such workflows.
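The Parquet-to-DataFrame step could look roughly like the sketch below. It assumes pandas, pyarrow and s3fs are installed on the client; the endpoint URL, credentials, bucket and object key are placeholders rather than values from the video.

```python
def s3_storage_options(endpoint_url, access_key, secret_key):
    """Build s3fs-style options for a non-AWS S3 endpoint such as ONTAP S3."""
    return {
        "key": access_key,
        "secret": secret_key,
        # Point the S3 client at the on-premises or cloud ONTAP endpoint.
        "client_kwargs": {"endpoint_url": endpoint_url},
    }


def read_parquet_from_s3(bucket, key, options):
    """Load a Parquet object straight from S3 into a pandas DataFrame."""
    # Imported here so the helper above works without pandas installed.
    import pandas as pd

    return pd.read_parquet(f"s3://{bucket}/{key}", storage_options=options)


if __name__ == "__main__":
    # Hypothetical endpoint and credentials.
    opts = s3_storage_options("https://s3.example.internal", "ACCESS", "SECRET")
    print(opts)
    # df = read_parquet_from_s3("analytics", "data/sample.parquet", opts)
```

Nothing is written to local disk here: the object is read over S3 and decoded in memory.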

9 months ago 8 views

3:04

Brief walk-through over ONTAP "native" and multi-protocol S3 services

TL;DR:

- ONTAP "native" S3 has more complete AWS S3 API support, but is S3-only
- ONTAP "multiprotocol" S3 is an S3 service that usually runs on NFS shares; it lets you combine NFS uploads/downloads with S3 PUTs/GETs, but its S3 API support is more limited because it's impossible to perfectly translate the differences between protocols

Users could run both, for different use cases and applications. Applications that use complex S3 API methods are less likely to work well on multi-protocol buckets.

9 months ago 6 views

3:03

3:53

How versioning and WORM-like ACLs work on NetApp StorageGRID

Versioning is used to provide access to previous revisions of an object (e.g. GET object.mp3?v=2 gets revision #2 of the object). *If* users are allowed to overwrite objects but *not allowed* to delete old versions (not the default!), then objects practically become indelible. But each revision is a copy that takes up disk space, so the benefit of versioning should be higher than its cost.

One popular feature used in conjunction with S3 versioning is S3 Object Lock with specified retention, which guarantees retention until a certain date, but then unlocks and allows deletion of older objects. That is very useful for backups that need to be ransomware-resistant for as long as they're needed.

But even without any of these tricks, the versioning feature protects files from accidental deletion or change, as you can always GET object.mp3?v=2 and re-upload it to recover from deleting the object or uploading a wrong revision 3.

Wondering about Object Lock with S3 versioning vs. legacy "Compliance"? See https://docs.netapp.com/us-en/storagegrid-117/ilm/managing-objects-with-s3-object-lock.html

"Software WORM" or ACLs-based WORM is simpler: it aims to prevent users from modifying (and hence also deleting) existing objects. To do that we craft a bucket policy ACL that denies these requests for non-admin users. Obviously this isn't as robust, but it serves many purposes, including prevention of accidental modification or deletion of files, and unlike versioning, it does not take extra storage space.
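A minimal sketch of such a "software WORM" bucket policy follows. It assumes StorageGRID's custom s3:PutOverwriteObject permission, which should be verified against the bucket policy documentation for your release. The bucket name is a placeholder, and this fragment denies the actions for every principal; carving out an exception for administrators and granting normal read/write access are left out.

```python
import json


def worm_like_policy(bucket):
    """Return a bucket policy fragment that denies overwrites and deletes.

    The action names are assumptions to check against StorageGRID docs.
    """
    return {
        "Statement": [
            {
                "Sid": "DenyOverwriteAndDelete",
                "Effect": "Deny",
                "Principal": "*",
                "Action": [
                    "s3:PutOverwriteObject",   # overwrite of an existing key
                    "s3:DeleteObject",
                    "s3:DeleteObjectVersion",
                ],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(worm_like_policy("worm-demo"), indent=2))
```

Note that real S3 requests select an object version with the versionId query parameter; the "?v=2" form in the description above is shorthand.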

9 months ago 7 views

6:19

Use Elasticsearch to store NetApp StorageGRID audit log and build search index for objects

Prior to StorageGRID 11.6, StorageGRID couldn't forward audit log to external syslog servers. You had to copy it off the primary admin node, convert to JSON and upload. https://github.com/scaleoutsean/storagegrid-audit-analysis

Version 11.6 has audit log forwarding. This demo shows StorageGRID 11.7 and Elasticsearch 8.7.1:

a) Audit log forwarding: forwards audit log to Logstash, which processes it and forwards it to Elasticsearch. See https://docs.netapp.com/us-en/storagegrid-enable/tools-apps-guides/elk-instructions.html for more.

b) Platform services (search): configure StorageGRID to send event updates to an Elasticsearch API endpoint. These updates are JSON files with system and object metadata, and allow us to search for various properties in Elasticsearch. See https://docs.netapp.com/us-en/storagegrid-117/tenant/using-search-integration-service.html
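To give an idea of the "convert to JSON" step used before 11.6, here is a small sketch that turns one audit message into a dictionary. It assumes the bracketed `[KEY(TYPE):value]` layout of StorageGRID audit messages; the sample line is made up for illustration, and this is not the code from the linked repository.

```python
import json
import re

# One field looks like [SIZE(UI64):1024] or [S3KY(CSTR):"some/key"].
FIELD_RE = re.compile(
    r'\[([A-Z0-9]{4})\(([A-Z0-9]+)\):("(?:[^"\\]|\\.)*"|[^\]]*)\]'
)


def parse_audit_line(line):
    """Convert one audit log line into a flat dict ready for JSON upload."""
    record = {"timestamp": line.split(" ", 1)[0]}
    for key, kind, raw in FIELD_RE.findall(line):
        if kind in ("UI32", "UI64"):
            record[key] = int(raw)          # unsigned integer fields
        else:
            record[key] = raw.strip('"')    # strings and four-char codes
    return record


if __name__ == "__main__":
    # Hypothetical sample message.
    sample = (
        '2023-06-01T10:00:00.000000 [AUDT:[RSLT(FC32):SUCS]'
        '[SIZE(UI64):1024][S3KY(CSTR):"dir/object.mp3"][ATYP(FC32):SPUT]]'
    )
    print(json.dumps(parse_audit_line(sample)))
```

With audit log forwarding in 11.6 and later, Logstash does this kind of parsing instead.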

9 months ago 12 views

11:10

Use Kasten to backup and restore E-Series Performance Analyzer application

E-Series Performance Analyzer (EPA) is a collector that gathers NetApp E-Series storage arrays' metrics and events and stores them in InfluxDB. You can get it on GitHub. This video shows backup and restore actions using Kasten. EPA consists of two or more application containers, plus a Grafana and an InfluxDB v1 instance.

- Config files
- Secrets
- Deployment
- Service
- PVC

More: https://scaleoutsean.github.io/2023/02/10/backup-epa-data-on-kubernetes.html

1 year ago 5 views

4:08

KubeVirt with Trident, SolidFire and Kasten

KubeVirt is still a young and maturing product, but it's good enough for experimenting (it's just about to cross that line between frustration and experimentation). This video shows how a KubeVirt VM's persistent volume (PVC) can be protected with storage snapshots, using NetApp Trident v23.01 with SolidFire 12.5. https://scaleoutsean.github.io/2023/02/12/backup-restore-kubevirt-vms-with-solidfire-kasten-kubernetes.html
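Underneath, a storage snapshot of a PVC is requested through a Kubernetes VolumeSnapshot object. The sketch below builds one as a Python dict; the snapshot name, namespace, PVC name and snapshot class name are hypothetical, and in the video Kasten drives this step rather than a hand-written manifest.

```python
import json


def volume_snapshot_manifest(name, namespace, pvc_name, snapshot_class):
    """Build a VolumeSnapshot resource for an existing PVC.

    All argument values used below are hypothetical placeholders.
    """
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,
            # The PVC backing the VM disk to be snapshotted.
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


if __name__ == "__main__":
    manifest = volume_snapshot_manifest(
        "vm-disk-snap", "vms", "vm-disk-pvc", "trident-snapshot-class"
    )
    print(json.dumps(manifest, indent=2))
```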

1 year ago 40 views

4:44

NetApp E-Series Performance Analyzer (EPA) v3.2.0 for Kubernetes

This fork of EPA aimed to separate E-Series collector container(s) from each other and from InfluxDB and Grafana, and that's been achieved in v3.2.0: now it's very easy to run EPA in containers using Docker, Docker Compose, Kubernetes or Nomad. Docker Compose works very similarly - see https://github.com/scaleoutsean/eseries-perf-analyzer There's a video for Docker Compose using EPA v3.1.0 here on Rumble, but in v3.2.0 it's even simpler as there's no "make build" for collector containers. Anyway, just see the repo README.md. FAQs: https://github.com/scaleoutsean/eseries-perf-analyzer/blob/master/FAQ.md Blog post for v3.2.0: https://scaleoutsean.github.io/2023/01/14/eseries-performance-analyzer-container-orchestrator-kubernetes.html

1 year ago 18 views

Containerized NetApp Cloud Sync Data Broker

2:49

2:04

E-Series Performance Analyzer 3.1.0 on Kubernetes

3:16

NetApp E-Series SANtricity API with JWT Bearer Tokens

1:55

NetApp E-Series Performance Analyzer walk-through

2:26

1:22

12:22

VMware Tanzu, vSphere CSI Plugin and NetApp E-Series Storage

This video shows the entire process of using NetApp E-Series arrays with VMware Tanzu with vSphere CSI plugin. While this isn't a substitute for reading the official Tanzu and E-Series documentation, it takes you through the whole process in 10 minutes.

2 years ago 26 views

4:48

BeeGFS on ARM64 with BeeGFS CSI

This is an all-CLI video with a walk-through of configuring the BeeGFS CSI plugin for Kubernetes on ARM64 servers running Ubuntu 20.04 (AWS Graviton, in this particular case). BeeGFS 7.3.0 is the first release with ARM64 support. Some additional information is available at this link: https://scaleoutsean.github.io/2022/04/30/beegfs-csi-on-arm64.html

2 years ago 5 views

4:03

Scaling out IO-intensive parametrized jobs with Nomad and BeeGFS

I call them parametric and HashiCorp calls them parametrized. But in any case, when such jobs run they need plenty of IO bandwidth and parallel file systems still beat Object Stores by a large margin. This video shows how we can scale such jobs with Nomad 1.3.0 with BeeGFS 7.3.0. More here: https://scaleoutsean.github.io/2022/04/24/nomad-batch-job-scale-out-parallel-filesystem-beegfs-e-netapp-series.html

2 years ago 1 view

2:56

HashiCorp Nomad batch jobs with BeeGFS and NetApp E-Series

3:06

Velero CSI Snapshot Data Mover with NetApp Trident and SolidFire

SolidFire Volume Clone Demo

SolidBackup with Kopia

Example of using per-site and Erasure Coding rules and ILM policies

Expand E-Series DPP (pool) and grow and shrink its preservation capacity

Thanos with NetApp S3 storage

Simple multi-site or hybrid cloud workflow for S3 analytics with ONTAP S3

Brief walk-through over ONTAP "native" and multi-protocol S3 services

Velero with ONTAP S3 backup repository

How versioning and WORM-like ACLs work on NetApp StorageGRID

Use Elasticsearch to store NetApp StorageGRID audit log and build search index for objects

Use Kasten to backup and restore E-Series Performance Analyzer application

KubeVirt with Trident, SolidFire and Kasten

NetApp E-Series Performance Analyzer (EPA) v3.2.0 for Kubernetes

Containerized NetApp Cloud Sync Data Broker

NetApp Cloud Sync API and Elasticsearch

E-Series Performance Analyzer 3.1.0 on Kubernetes

NetApp E-Series SANtricity API with JWT Bearer Tokens

NetApp E-Series Performance Analyzer walk-through

E-Series sizing for BeeGFS

VMware Tanzu, vSphere CSI Plugin and NetApp E-Series Storage

BeeGFS on ARM64 with BeeGFS CSI

Scaling out IO-intensive parametrized jobs with Nomad and BeeGFS

Kanister with BeeGFS CSI and E-Series

HashiCorp Nomad batch jobs with BeeGFS and NetApp E-Series