Why Infrastructure Still Matters for Research and HPC
Translating “Wicked Smaht” for the Rest of Us
The first time I walked into a serious HPC / Research Computing conversation, I felt like I had landed on another planet.
Not metaphorically. Literally.
Different language. Different acronyms. Different priorities.
These weren’t “IT people.” These were scientists. “Wicked smaht”. The kind of wicked smaht where you start questioning your own SAT scores from 1996.
They weren’t talking about SLAs or replication networks.
They were talking about modeling climate 40 years into the future. Running genomic pipelines across petabytes. Multi-node GPU jobs that make your enterprise cluster look like a science fair project. They were trying to win Nobel Prizes.
And there I was. With my enterprise storage vocabulary.
For a long time, I didn’t pretend to understand it — because I didn’t. I asked basic questions. I Googled acronyms later. I listened more than I spoke. I felt like an exchange student who accidentally walked into an advanced physics lecture.
But slowly, something clicked.
Underneath all the brilliance — all the math, all the “wicked smaht” — the same fundamental truth applies:
The plumbing still matters.
Years ago, during a large state government modernization effort, I had developers tell me infrastructure was irrelevant. Everything was abstracted. Everything was cloud-native. Hardware was “just there.”
We quietly introduced a Pure Storage FlashArray into their environment.
Build times collapsed. Latency flattened. Pipelines accelerated. The same engineers who declared infrastructure irrelevant were suddenly asking what kind of sorcery had been introduced into their environment.
Even the cleanest abstraction eventually lands on something physical.
That lesson came back to me hard this week while studying the architecture of a major R1 university.
From the outside, you imagine one giant, beautiful supercomputer humming in perfect harmony.
From the inside?
It’s more like a carefully balanced ecosystem built over years of constraint.
Multiple parallel filesystem clusters — not because anyone enjoys complexity, but because historical hardware limitations forced isolation. Separate domains to prevent metadata storms from taking down entire environments. Storage carved along grant boundaries because funding models demand it. Admins acting like I/O referees so one runaway workload doesn’t saturate shared infrastructure.
And then there’s the trade-off researchers live with every day.
Let me rename it so it makes sense outside of academia.
They choose between Warp Speed and Seatbelt Mode.
Warp Speed is blistering. It’s where simulations fly and GPUs stretch their legs. But it’s volatile. Limited protection. If something goes sideways, recovery is painful.
Seatbelt Mode is governed. Protected. Snapshotted. Backed up.
But slower.
Imagine being a Nobel-level researcher forced to choose between raw velocity and data protection.
That’s not theoretical. That’s Wednesday.
And this is where I stopped feeling like an alien… and started feeling like a translator.
Because once you understand the daily trade-offs researchers are living with, you start asking a different question:
What if that trade-off isn’t actually required anymore?
In serious HPC environments, the conversation frequently turns to Pure Storage FlashBlade — not because it’s shiny, but because its architecture attacks those compromises directly.
Let me translate why.
Traditional parallel filesystems often rely on dedicated metadata nodes. When workloads spike, those nodes can “storm.” Think of a toll booth suddenly facing 10,000 cars at once.
FlashBlade distributes metadata across every blade. As you scale out, metadata performance scales linearly.
Translation:
You don’t hit a ceiling just because your GPUs decided to get busy.
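If you want the toll-booth picture in numbers, here’s a minimal sketch. The ops-per-second figures are illustrative assumptions I picked for the example, not FlashBlade specs; the point is the shape of the curve, not the values.

```python
# Back-of-envelope sketch (illustrative numbers, not vendor specs):
# a fixed metadata-server ceiling vs. metadata spread across every blade.

def centralized_ceiling(ops_per_node: int = 200_000) -> int:
    """One dedicated metadata node: the ceiling never moves."""
    return ops_per_node

def distributed_ceiling(blades: int, ops_per_blade: int = 50_000) -> int:
    """Metadata distributed across blades: the ceiling grows with the system."""
    return blades * ops_per_blade

for blades in (7, 15, 30, 60):
    print(f"{blades:>2} blades -> "
          f"centralized cap {centralized_ceiling():>9,} ops/s, "
          f"distributed cap {distributed_ceiling(blades):>9,} ops/s")
```

Same GPUs, same burst of tiny files. One architecture saturates a single toll booth; the other keeps adding lanes as it grows.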
Then there’s Zero-Move Tiering — the part that honestly made me pause.
Historically, research environments have forced humans to decide where data lives. Fast tier? Capacity tier? Temporary? Protected?
With FlashBlade, TLC and QLC flash operate within a single namespace. Data doesn’t need to be manually shuffled between performance and capacity tiers over time.
Translation:
You stop playing data Tetris.
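For anyone who hasn’t lived it, “data Tetris” usually looks like a nightly script somebody has to write, schedule, and babysit. The paths and the 30-day threshold below are hypothetical; the script itself is the point, because it’s exactly the kind of thing a single namespace makes unnecessary.

```python
# A hypothetical nightly "demotion" job: move files that haven't been touched
# in 30 days from a fast scratch tier to a capacity tier. Paths and threshold
# are made up for illustration.
import shutil
import time
from pathlib import Path

SCRATCH = Path("/scratch/fast-tier")      # hypothetical performance tier
ARCHIVE = Path("/archive/capacity-tier")  # hypothetical capacity tier
MAX_AGE_DAYS = 30

def demote_cold_files() -> None:
    cutoff = time.time() - MAX_AGE_DAYS * 86_400
    for path in SCRATCH.rglob("*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            dest = ARCHIVE / path.relative_to(SCRATCH)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), dest)  # the "Tetris" move, every single night

if __name__ == "__main__":
    demote_cold_files()
```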
And then there’s the uncomfortable reality no one likes discussing in research computing — ransomware.
Open academic environments are collaborative by design. That openness is powerful. It’s also risky.
SafeMode™ provides immutable snapshots that can’t be altered or deleted, even by a compromised admin account. Rapid Restore brings data back fast.
Translation:
Seatbelt Mode protection… without forcing everyone to drive the minivan.
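To be clear about what “immutable” buys you here: the sketch below is not the SafeMode API, just the underlying idea in plain Python. Deletion eligibility becomes a function of time, never of who is asking.

```python
# Conceptual sketch of a retention-locked snapshot -- not the SafeMode API.
# Once locked, no credential (not even admin) can delete it early.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class LockedSnapshot:
    name: str
    created: datetime
    retention: timedelta

    def eligible_for_deletion(self, now: datetime | None = None) -> bool:
        """Deletion depends only on elapsed time, never on the caller."""
        now = now or datetime.now(timezone.utc)
        return now >= self.created + self.retention

snap = LockedSnapshot("genomics-nightly",
                      created=datetime.now(timezone.utc),
                      retention=timedelta(days=14))
print(snap.eligible_for_deletion())  # False until the lock expires
```

That’s why “delete the backups first” stops being a viable ransomware move: there is no code path that honors the request.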
This is what shifted my thinking.
The conversation in research computing shouldn’t just be about raw performance anymore.
It should be about eliminating artificial and legacy trade-offs.
When distributed metadata removes storm ceilings…
When zero-move tiering removes manual friction…
When immutable snapshots remove fear…
Researchers don’t have to choose between Warp Speed and Seatbelt Mode. They just work.
And if there’s one thing I’ve learned from walking into rooms full of wicked smaht scientists feeling like I’m on another planet, it’s this:
My job isn’t to understand their equations.
My job is to make sure the plumbing never becomes the bottleneck to discovery.
Infrastructure might look like plumbing.
But in research computing, it’s the difference between a trickle…
and a breakthrough.
P.S. If this resonated:
That little ❤️ at the bottom? That means Like. (Yes, I had to Google that the first time.)
The circular arrow? That’s Restack — which is Substack’s fancy way of saying “share this with your people.”
And the Subscribe button… well… that one’s not optional. Let’s be honest.
Appreciate you reading.
Dmitry Gorbatov
© 2025 Dmitry Gorbatov | #dmitrywashere



