Cache Smarter,
Compute Anywhere
GPU capacity is scarce. Your data shouldn't be the bottleneck. Amazon FSx for NetApp ONTAP FlexCache moves your data logically — not physically — to wherever compute exists.
GPU Scarcity Is a Storage Problem
AI and ML workloads are exploding, but GPU and accelerated compute capacity is unevenly distributed across AWS regions and availability zones. When your data lives in us-east-1 but available GPU capacity is in us-west-2, you face an impossible choice: wait months for GPU allocation or perform an expensive, time-consuming full data migration.
Estimated cost impact: Data copy vs. FlexCache for a 500 TB dataset
FSx ONTAP FlexCache
FlexCache creates a sparse, read-through cache volume that appears as a full dataset to compute nodes — without physically copying data. Only the blocks your workload actually reads are transferred and cached.
Single source of truth.
Only hot blocks stored.
Full training throughput.
Break Free from Compute Constraints
FlexCache turns a capacity problem into a non-event. Instead of waiting for GPU allocations to open up in your home region, spin up a cache in any region where instances are available and start training within minutes.
Time-to-Train
Start training in hours instead of weeks. No waiting for data migration pipelines.
Egress Savings
Cache only the data you read. Avoid paying to transfer a 500 TB dataset you use 10% of.
Data Governance
Origin remains the master copy — security policies, encryption, and auditing stay centralized.
Low Latency
After the first read, hot blocks are served from the local cache at full NVMe/SSD speeds.
Typical read latency: cross-region direct access vs. FlexCache (after warm-up)
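To put rough numbers on the egress savings: a back-of-envelope sketch for the 500 TB example, assuming an illustrative $0.02/GB inter-region transfer rate and a 10% hot working set (check current AWS data-transfer pricing for your actual region pair).

```python
# Back-of-envelope egress comparison: copying the full 500 TB dataset
# vs. caching only the ~10% hot working set via FlexCache.
# The $0.02/GB rate is an assumption for illustration, not current pricing.

TRANSFER_RATE_USD_PER_GB = 0.02   # assumed inter-region transfer price
DATASET_TB = 500
HOT_FRACTION = 0.10               # share of the dataset actually read

dataset_gb = DATASET_TB * 1024
full_copy_cost = dataset_gb * TRANSFER_RATE_USD_PER_GB
flexcache_cost = dataset_gb * HOT_FRACTION * TRANSFER_RATE_USD_PER_GB

print(f"Full copy:  ${full_copy_cost:,.0f}")
print(f"FlexCache:  ${flexcache_cost:,.0f}")
print(f"Savings:    ${full_copy_cost - flexcache_cost:,.0f}")
```

Under these assumptions the full copy runs about $10,240 in transfer charges versus roughly $1,024 for the cached working set, before counting the weeks of transfer time the copy also costs.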
FlexCache Beyond GPU Scarcity
Any workload that reads data across geographic or network boundaries benefits from bringing the data closer to compute.
Multi-AZ Read Scaling
Deploy FlexCache volumes in multiple AZs within the same region. Read-heavy applications (web serving, analytics dashboards) get sub-millisecond local reads without replicating the full volume.
Dev / Test Acceleration
Give each developer team a FlexCache volume pointing at the production dataset. Teams read live data without cloning terabytes, while writes are directed to separate sandbox volumes so production stays untouched.
Global Collaboration
Engineering teams in APAC and EU read the same media, seismic, genomic, or design datasets without suffering trans-Atlantic or trans-Pacific latency. Each region gets a FlexCache; the origin stays authoritative.
Hybrid Cloud Bursting
On-premises data served via ONTAP can be cached in AWS for burst compute jobs. FlexCache over AWS Direct Connect gives cloud workloads near-local read latency against on-prem datasets — no lift-and-shift required.
DR Read Access
DR sites often sit idle, burning budget. FlexCache lets the DR region serve live read traffic — reporting, analytics, model inference — while SnapMirror keeps the origin authoritative. Your DR investment starts delivering ROI on day one.
Edge / Satellite Offices
Retail, manufacturing, and media organizations with remote sites can run FlexCache in the nearest AWS region, pulling only the files each site actually needs. Employees get fast local-equivalent performance without maintaining separate storage infrastructure at every site.
Solution approach comparison across key dimensions
vs. The Alternatives
Teams facing compute scarcity typically reach for one of three alternatives — each with significant drawbacks.
Full Dataset Copy (S3 / EFS)
Weeks of transfer time, full egress cost, ongoing sync complexity. Datasets drift out of sync, causing silent training errors.
Wait for GPU Availability
Weeks or months of delay. Missed product deadlines, wasted engineering cycles, and competitive disadvantage.
Read Directly Cross-Region
100–250 ms inter-region latency cripples random-read workloads. GPU utilization collapses waiting on I/O.
FlexCache (Recommended)
Spin up in minutes. Pay only for blocks read. Single source of truth. No app changes. GPU utilization stays high.
Get Running in 4 Steps
Peer the SVMs
Create a cluster peer relationship and an SVM peer relationship between the origin FSx for ONTAP file system (us-east-1) and the cache file system (us-west-2).
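The peering step maps to ONTAP CLI commands run against each file system's management endpoint. Cluster names, SVM names, and intercluster LIF addresses below are placeholders for illustration:

```shell
# On the cache cluster (us-west-2): peer with the origin cluster.
# 172.16.0.10,172.16.0.11 stand in for the origin's intercluster LIF IPs.
cluster peer create -address-family ipv4 -peer-addrs 172.16.0.10,172.16.0.11

# Peer the SVMs, authorizing the FlexCache application.
# cache_svm, origin_svm, and origin_cluster are placeholder names.
vserver peer create -vserver cache_svm -peer-vserver origin_svm \
    -peer-cluster origin_cluster -applications flexcache
```

Run the matching `cluster peer create` on the origin side with the cache cluster's intercluster LIF addresses to complete the relationship.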
Create FlexCache
Run volume flexcache create on the cache cluster, pointing to the origin volume. Size the cache to cover your hot working set.
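A sketch of that command with placeholder names; the 50 TB size is illustrative and should be tuned to your hot working set rather than the full origin size:

```shell
# On the cache cluster: create a sparse cache of the origin volume.
# Volume, SVM, and aggregate names plus the size are placeholders.
volume flexcache create -vserver cache_svm -volume train_data_cache \
    -origin-vserver origin_svm -origin-volume train_data \
    -aggr-list aggr1 -size 50TB -junction-path /train_data_cache
```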
Mount & Train
Mount the FlexCache volume on your GPU instances via NFS. Point your training framework at the mount path. No code changes needed.
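On the GPU instances this is a standard NFS mount against the cache SVM's NFS endpoint. The DNS name, export path, and mount options below are illustrative; `nconnect` needs a reasonably recent Linux kernel:

```shell
# Mount the FlexCache volume on a GPU instance.
# The SVM DNS name and junction path are placeholders.
sudo mkdir -p /mnt/train
sudo mount -t nfs -o nfsvers=3,nconnect=16,rsize=262144,wsize=262144 \
    svm-cache.fsx.us-west-2.amazonaws.com:/train_data_cache /mnt/train

# Point the training job at the mount path; no code changes needed,
# e.g. python train.py --data-dir /mnt/train
```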
Pre-warm (Optional)
Use volume flexcache prepopulate to pre-stage hot files before training starts — eliminating even first-read latency.
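The pre-warm step, again with placeholder names; path lists are relative to the origin volume root:

```shell
# On the cache cluster: pre-stage hot paths before training starts.
# /shards/train is a placeholder path within the origin volume.
volume flexcache prepopulate start -cache-vserver cache_svm \
    -cache-volume train_data_cache -path-list /shards/train

# Prepopulation runs as a background job; check its progress.
job show -description "*prepopulate*"
```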