Making the Case for 100 GbE Networking with Azure Local and Storage Spaces Direct
Why dense NVMe nodes change the networking conversation sooner than most designs expect
As NVMe becomes the default choice for modern hyperconverged infrastructure, one question comes up repeatedly in both Storage Spaces Direct (S2D) and Azure Local designs: when does 100 GbE networking actually make sense? For many years, 10 GbE, and later 25 GbE, was sufficient for most storage workloads. Even today, dual‑port 25 GbE designs are often considered "more than enough." But once you start building dense all‑NVMe nodes, those assumptions begin to break down. This article explains:
- Why 100 GbE becomes justified earlier than expected in all‑NVMe S2D and Azure Local deployments
- Why roughly 12 NVMe drives per node is a practical, conservative inflection point
- How the economics of 4 × 25 GbE versus 2 × 100 GbE work out in real designs
- Why the same logic applies across vendors like Dell and Cisco
Before going further, it's important to anchor the discussion in realistic design practice:
No serious production S2D or Azure Local design uses a single storage network port. All scenarios discussed here assume:
- At least two network ports per node
- SMB Multichannel enabled
- SMB Direct (RDMA) in use
- A baseline of 2 × 25 GbE networking for storage/compute traffic
This reflects real‑world Azure Stack HCI and Azure Local designs and avoids straw‑man comparisons like "100 GbE vs a single 10 GbE link."
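These baseline assumptions are easy to verify on a running node with the in‑box SMB and NetAdapter cmdlets. This is a quick sanity check, not a full validation; adapter and interface details will differ per system:

```powershell
# Confirm the storage NICs are RDMA-capable and have RDMA enabled
Get-NetAdapterRdma | Where-Object Enabled |
    Format-Table Name, InterfaceDescription, Enabled

# Confirm SMB sees the interfaces as RDMA-capable
Get-SmbClientNetworkInterface |
    Format-Table FriendlyName, InterfaceIndex, RdmaCapable, RssCapable, Speed

# Confirm SMB Multichannel is spreading connections across both ports
# (run this while storage traffic is actually flowing)
Get-SmbMultichannelConnection |
    Format-Table ClientInterfaceIndex, ServerInterfaceIndex, ClientRdmaCapable
```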
A single enterprise NVMe drive is capable of multiple gigabytes per second of throughput. When a node contains 12, 16, or 24 NVMe drives, the aggregate storage capability becomes enormous. In Storage Spaces Direct, all east‑west storage traffic — including mirrored writes — flows over the network using SMB 3 with RDMA. Azure Local uses the same underlying S2D mechanisms for its local storage.
As NVMe density increases, the limiting factor stops being the media and quickly becomes available network bandwidth (the sketch below makes this concrete). At that point:
- Adding more NVMe raises the node's peak potential
- Realized performance does not improve unless the network scales with it
- If the network doesn't keep up, you effectively pay for NVMe performance that never reaches the workload
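A back‑of‑envelope comparison shows how wide the gap is. The per‑drive figure below is an assumption (a conservative large‑block read rate); substitute the numbers from your drives' spec sheets:

```powershell
# Aggregate NVMe media bandwidth vs. NIC bandwidth per node (illustrative).
$drives      = 12
$gbPerDrive  = 3.0                            # GB/s per drive (assumed)
$mediaGBps   = $drives * $gbPerDrive          # aggregate media capability

$nicPorts    = 2
$gbpsPerPort = 25                             # line rate in Gbit/s
$nicGBps     = ($nicPorts * $gbpsPerPort) / 8 # Gbit/s -> GB/s

"Media capability : {0:N1} GB/s" -f $mediaGBps          # 36.0 GB/s
"NIC capability   : {0:N1} GB/s" -f $nicGBps            # 6.3 GB/s
"Media/NIC ratio  : {0:N1}x" -f ($mediaGBps / $nicGBps) # 5.8x
```

Even if real workloads never touch the theoretical media peak, a gap of roughly 5x means the NICs saturate long before the drives do.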
VMFleet, Microsoft's official S2D performance and validation tool, is designed to generate highly parallel I/O, stress storage, CPU, and network simultaneously, and expose architectural bottlenecks rather than idealized peaks. In all‑NVMe S2D clusters, VMFleet runs consistently show that nodes can sustain well over 5 GB/s of storage throughput, network links saturate before NVMe does, and scaling flattens once aggregate NIC bandwidth is consumed. This behavior appears even when SMB Multichannel is active and multiple NICs are available. Azure Local, running on the same S2D foundations, exhibits the same pattern when you push it with dense NVMe and realistic I/O mixes.
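One way to see this on your own cluster is to watch RDMA throughput while VMFleet runs. Below is a minimal sketch using the in‑box Get-Counter cmdlet; note that the "RDMA Activity" counter set is exposed by RDMA‑capable adapters, and its instance names vary by NIC vendor:

```powershell
# Sample total RDMA throughput every 2 seconds during a VMFleet run.
# If the total plateaus near NIC line rate while NVMe latency stays low,
# the network -- not the storage -- is the bottleneck.
Get-Counter -Counter @(
    '\RDMA Activity(*)\RDMA Inbound Bytes/sec',
    '\RDMA Activity(*)\RDMA Outbound Bytes/sec'
) -SampleInterval 2 -MaxSamples 10 |
ForEach-Object {
    $total = ($_.CounterSamples | Measure-Object CookedValue -Sum).Sum
    '{0:T}  {1:N0} MB/s total RDMA' -f $_.Timestamp, ($total / 1MB)
}
```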
With 2 × 25 GbE, SMB Multichannel allows traffic to flow across both links concurrently. This meaningfully increases available bandwidth compared to a single port, but it's not a perfect doubling.
It's crucial to understand what SMB Multichannel does — and does not — provide:
- It balances connections, not individual I/O operations
- Storage traffic includes replica writes, metadata, and control flows
- Mirror traffic competes with client I/O for the same NICs
- Utilization across ports is rarely perfectly even under sustained load
As a result, dual‑port 25 GbE does not behave like a single flat 50 GbE pipe for storage workloads.
In practice, a well‑tuned pair of 25 GbE ports lands somewhere between the ~3.1 GB/s of a single link and the ~6.25 GB/s theoretical aggregate. That's a substantial improvement over a single link, but it's also a ceiling that dense NVMe nodes can hit surprisingly quickly.
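That unevenness is easy to observe directly. Sampling the per‑interface counters under sustained load (instance names will differ on your systems) typically shows one port carrying noticeably more traffic than the other:

```powershell
# Watch per-port utilization to see how evenly Multichannel spreads load.
# List valid instance names with:
#   (Get-Counter -ListSet 'Network Interface').PathsWithInstances
Get-Counter -Counter '\Network Interface(*)\Bytes Total/sec' `
    -SampleInterval 2 -MaxSamples 5 |
ForEach-Object {
    $_.CounterSamples |
        Sort-Object CookedValue -Descending |
        Select-Object InstanceName,
            @{ n = 'MBps'; e = { [math]::Round($_.CookedValue / 1MB, 1) } }
}
```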
Once we assume 2 × 25 GbE as the baseline, a clear pattern emerges:
- NVMe drives per node: ≤ 8 — Network rarely limits performance
- NVMe drives per node: ~12 — Network pressure becomes visible under sustained load
- NVMe drives per node: 16+ — Network is consistently the bottleneck in stress scenarios
At roughly 12 NVMe drives per node, the node is often capable of driving more throughput than dual‑port 25 GbE can sustain under continuous load. Beyond that point, performance gains from additional NVMe are increasingly constrained by the network. This is the point where 100 GbE stops being "nice to have" and becomes architecturally justified.
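Part of the reason the inflection point arrives this early is resiliency traffic: with a three‑way mirror, every write a node accepts is also written to two other nodes, so the wire carries roughly twice the host's write rate in replica egress alone. A rough sketch, with assumed rates for illustration:

```powershell
# Wire load generated by three-way mirror writes (illustrative numbers).
$hostWriteGBps = 2.5                      # assumed sustained ingest write rate
$mirrorCopies  = 2                        # three-way mirror = 2 remote copies
$egressGBps    = $hostWriteGBps * $mirrorCopies   # 5.00 GB/s replica egress

$nicGBps       = (2 * 25) / 8             # 6.25 GB/s aggregate for 2 x 25 GbE
$headroomGBps  = $nicGBps - $egressGBps   # what's left for reads, rebuilds,
                                          # Live Migration, and VM traffic
"Replica egress : {0:N2} GB/s" -f $egressGBps
"NIC aggregate  : {0:N2} GB/s" -f $nicGBps
"Headroom       : {0:N2} GB/s" -f $headroomGBps   # ~1.25 GB/s
```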
Moving to 100 GbE does more than increase peak bandwidth:
- It removes the network as the primary limiter
- It improves performance consistency under contention
- It provides headroom for growth and future workloads
- It avoids costly retrofits later (re‑cabling, new switches, downtime)
With dual‑port 100 GbE:
- Aggregate bandwidth is high enough that storage or CPU once again become the limiting factors
- NVMe density can scale without immediately flattening performance
- The system behaves as customers expect when they invest in NVMe: adding drives actually results in more usable performance
This is why many all‑NVMe S2D and Azure Local reference architectures move directly to 100 GbE rather than trying to "stretch" 25 GbE with additional ports.
"The hybrid cloud is no longer a compromise—it's the optimal architecture for modern business."
— Core principle in modern HCI design
Once you've decided that a single dual‑port design won't scale to your NVMe density, you face a practical choice:
Option A: Four Ports of 25 GbE (4 × 25)
- Theoretical aggregate: ~12.5 GB/s
- More mature ecosystem (widely deployed)
- Easier to find replacement NICs and cable support
- Scales in smaller increments
- Higher operational overhead (more ports to maintain, monitor, and troubleshoot)
- More switch port consumption for the same throughput
Option B: Two Ports of 100 GbE (2 × 100)
- Theoretical aggregate: ~25 GB/s
- Lower operational overhead (fewer ports, fewer cables, simpler monitoring)
- Smaller physical footprint (fewer slots in the host)
- Only two switch ports consumed per node (lower switch-port density)
- Growing ecosystem (increasingly common in new hardware)
- Still more expensive per port (though total cost can be competitive)
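To make the trade‑off concrete, here is a small sketch comparing the two options on throughput, switch‑port consumption, and a rough cost model. The per‑port prices are hypothetical placeholders, not vendor quotes:

```powershell
# Compare the options on aggregate throughput, ports, and rough cost.
# PortCost values are hypothetical -- substitute real quotes, including
# optics, cables, and switch ports.
$options = @(
    [pscustomobject]@{ Name = '4 x 25 GbE';  Ports = 4; Gbps = 25;  PortCost = 300 }
    [pscustomobject]@{ Name = '2 x 100 GbE'; Ports = 2; Gbps = 100; PortCost = 900 }
)
$options | Select-Object Name,
    @{ n = 'GBps';        e = { ($_.Ports * $_.Gbps) / 8 } },
    @{ n = 'SwitchPorts'; e = { $_.Ports } },
    @{ n = 'NicCost';     e = { $_.Ports * $_.PortCost } }
```

With these placeholder numbers, 2 × 100 GbE costs more per node but delivers twice the bandwidth over half the ports; plug in real quotes and the cost per GB/s is often comparable or better.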
For dense all‑NVMe configurations with 12+ drives per node, 2 × 100 GbE is becoming the cleaner, simpler, and more cost‑justifiable choice — even when factoring in the per‑port NIC cost.
If you're building a node with 12 or more NVMe drives, plan for 100 GbE networking from the start.
This isn't a hard constraint. You can certainly run 12 drives with dual‑port 25 GbE, and performance will still be good for many workloads. But you'll likely find the network limiting further scaling, and you'll have missed the opportunity for a cleaner, operationally simpler design.
If you start with 100 GbE on a 12-drive node, you have clear headroom for:
- Future NVMe additions without network rework
- VM-to-VM traffic growth
- Management and replication traffic that competes for the same pipes
For smaller nodes (4–8 drives), dual‑port 25 GbE is appropriate and cost-effective.
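When you do commit to 100 GbE up front, Network ATC (available on Azure Local and recent Azure Stack HCI builds) makes the design easy to codify: declare the intent once and the platform applies consistent RDMA and QoS settings across all nodes. A minimal sketch, assuming the Network ATC module is present and using placeholder adapter names:

```powershell
# Declare a converged management/compute/storage intent across two
# 100 GbE ports. Adapter names are placeholders; list yours with Get-NetAdapter.
Add-NetIntent -Name 'ConvergedIntent' `
    -Management -Compute -Storage `
    -AdapterName 'pNIC1', 'pNIC2'

# Verify the intent has been applied across the cluster
Get-NetIntentStatus -Name 'ConvergedIntent'
```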
The economics and architecture of all‑NVMe Storage Spaces Direct and Azure Local deployments have shifted. NVMe is no longer the bottleneck; the network is. At around 12 drives per node, that bottleneck becomes actively visible, and 100 GbE transitions from "nice to have" to "required for proper scale." Whether you choose 4 × 25 GbE or 2 × 100 GbE, the answer depends on your operational preferences and total cost model. But for designs with dense NVMe, skipping the networking upgrade is a false economy.