Engineering
The Bulkhead Architecture: Why We Isolate Every Organisation
This is overkill. We know.
Let's get this out of the way: isolating every single organisation into its own dedicated infrastructure pod is, by conventional SaaS wisdom, overkill. Multi-tenant architectures with shared databases and shared compute have powered successful platforms for decades. Thousands of apps operate this way. It works. We're not here to tell you it doesn't.
But we've been building software for the Shopify ecosystem for a long time. Long enough to have seen what happens when shared infrastructure meets the real world. Long enough to have watched cross-store data access show up as a recurring pattern — not because anyone set out to build insecure software, but because shared tenancy creates a class of vulnerability that even careful engineering can't fully eliminate. One missed scope check. One query that doesn't filter by tenant ID. One cache key collision. These aren't theoretical risks. They're patterns we've seen play out, and they tend to surface at the worst possible time.
We don't think we're incapable of building a shared-tenancy system that avoids these issues. We probably could. But we'd rather not rely on "probably". We'd rather make cross-tenant data access physically impossible — not prevented by application logic, but eliminated by architecture.
Defence in depth, not defence in hope
There's a philosophy in security engineering called defence in depth: don't rely on a single layer to keep you safe. Assume every layer will eventually fail, and design so that no single failure is catastrophic.
This isn't a niche idea. It's how the most security-conscious organisations in the world build their systems. Apple doesn't just write careful code and hope its applications are free of vulnerabilities; it builds mitigations directly into the processor and operating system that limit what a vulnerability can do even if it's exploited. The Secure Enclave handles biometric data and encryption keys in a dedicated hardware subsystem that is isolated from the main processor. It is designed so that a compromised app can't reach that data, and neither can a compromised kernel. The isolation is architectural, not aspirational.
At Alloy, we don't build apps — we build solutions. We'd rather build a kernel that powers your online store than a lightweight app that sits on top of it. And if we're going to think like kernel engineers, we need to think about isolation the same way Apple does. The Bulkhead is our Secure Enclave: it doesn't just protect your data with good code — it makes entire categories of breach structurally impossible.
The Bulkhead model
Every organisation that uses an Alloy solution gets its own isolated pod — a dedicated set of infrastructure resources that are never shared with another tenant.
┌─────────────────────────────────────────┐
│ Organisation: acme-corp                 │
├─────────────────────────────────────────┤
│ Database   db-acme-corp-a1b2c3          │
│ Compute    pod-acme-corp-a1b2c3         │
│ Storage    bucket-acme-corp-a1b2c3      │
│ Secrets    vault-acme-corp-a1b2c3       │
│ Network    vpc-acme-corp-a1b2c3         │
└─────────────────────────────────────────┘
Each pod runs in its own Cloud Run service with dedicated Firestore collections, Cloud Storage buckets, and Secret Manager entries. There is no shared database. There is no shared compute. The blast radius of a failure inside a pod is exactly one organisation.
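As a rough illustration of the naming pattern in the diagram, the sketch below derives the five resource names from an organisation slug and a suffix. The names `PodResources` and `pod_resource_names` are hypothetical, and the suffix is treated as an opaque value handed in by provisioning; how Alloy actually generates it is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodResources:
    """Names of the dedicated resources that make up one organisation's pod."""
    database: str
    compute: str
    storage: str
    secrets: str
    network: str


def pod_resource_names(org_slug: str, suffix: str) -> PodResources:
    """Derive per-organisation resource names following the diagram's pattern.

    `suffix` is an opaque identifier assumed to be chosen at provisioning time.
    """
    tag = f"{org_slug}-{suffix}"
    return PodResources(
        database=f"db-{tag}",
        compute=f"pod-{tag}",
        storage=f"bucket-{tag}",
        secrets=f"vault-{tag}",
        network=f"vpc-{tag}",
    )


acme = pod_resource_names("acme-corp", "a1b2c3")
print(acme.database)  # db-acme-corp-a1b2c3
```

Because every name embeds the organisation's own tag, two organisations never resolve to the same resource name in this sketch.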
The name comes from naval architecture. A bulkhead is a watertight wall inside a ship's hull. If one compartment floods, the bulkhead stops the water from spreading. The ship stays afloat. The same principle applies here: if something goes wrong in one organisation's pod, it stays there. No other organisation is affected. No data leaks sideways. No cascade failures.
Why this matters more than you'd think
Shared-tenancy architectures create an invisible contract with every customer: trust us to never make a mistake with your data. Every database query needs to be correctly scoped. Every cache key needs to include a tenant identifier. Every background job needs to verify it's operating on the right store's data. Every API response needs to be filtered. Every log line needs to be scrubbed.
That's a lot of "every". And it only takes one miss.
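To make "one miss" concrete, here is a minimal sketch of a shared table where a single forgotten filter exposes another tenant's rows. The schema, tenant names, and function names are invented for illustration, and SQLite stands in for whatever shared datastore a multi-tenant app might use.

```python
import sqlite3

# A shared table: every row carries a tenant identifier (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (tenant_id TEXT, order_ref TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme-corp", "A-1"), ("acme-corp", "A-2"), ("globex", "G-1")],
)


def orders_scoped(tenant_id: str) -> list[str]:
    """Correct: the query filters by tenant."""
    rows = conn.execute(
        "SELECT order_ref FROM orders WHERE tenant_id = ? ORDER BY order_ref",
        (tenant_id,),
    )
    return [ref for (ref,) in rows]


def orders_unscoped(tenant_id: str) -> list[str]:
    """Buggy: the tenant filter was forgotten, so every tenant's rows come back."""
    rows = conn.execute("SELECT order_ref FROM orders ORDER BY order_ref")
    return [ref for (ref,) in rows]


print(orders_scoped("acme-corp"))    # ['A-1', 'A-2']
print(orders_unscoped("acme-corp"))  # ['A-1', 'A-2', 'G-1']
```

Both functions look almost identical in review, which is why this class of bug tends to slip through.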
And even if you do get it right — even if your row-level security is flawless and every query is perfectly scoped — that guarantee only holds for as long as the data stays inside that one system. The moment you bolt on a microservice that consumes tenant data, or integrate with a third-party platform downstream, the isolation boundary you carefully built inside your database doesn't extend to those systems. Each new service, each new integration, each new data pipeline becomes another place where tenant scoping needs to be implemented and maintained correctly. The attack surface isn't static — it grows with your architecture. What started as a solved problem in your primary datastore quietly becomes an unsolved problem across a distributed system.
With the Bulkhead model, there is no tenant ID to forget. There is no shared table to accidentally query without a filter. A pod physically cannot access another pod's database because it doesn't have the credentials, the network path, or the IAM permissions to do so. The isolation isn't enforced by application code that could have a bug — it's enforced by infrastructure that would need to be explicitly and intentionally reconfigured to break.
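The sketch below is an in-process analogy for that idea, not Alloy's implementation: each hypothetical `Pod` object holds the only handle to its own SQLite database, so its queries need no tenant filter at all. In the real architecture the boundary comes from credentials, network paths, and IAM rather than from Python objects.

```python
import sqlite3


class Pod:
    """One organisation's pod, holding the only handle to its own database."""

    def __init__(self, org: str) -> None:
        self.org = org
        # Each pod gets a separate in-memory database; no other pod holds this handle.
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE orders (order_ref TEXT)")

    def add_order(self, order_ref: str) -> None:
        self._db.execute("INSERT INTO orders VALUES (?)", (order_ref,))

    def orders(self) -> list[str]:
        # No tenant filter exists to forget: the table only ever holds this org's rows.
        rows = self._db.execute("SELECT order_ref FROM orders ORDER BY order_ref")
        return [ref for (ref,) in rows]


acme_pod, globex_pod = Pod("acme-corp"), Pod("globex")
acme_pod.add_order("A-1")
globex_pod.add_order("G-1")
print(acme_pod.orders())    # ['A-1']
print(globex_pod.orders())  # ['G-1']
```

Note that the unfiltered `SELECT` here is the same shape of query that would leak data in a shared table; with one database per organisation it has nothing else to return.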
This is the difference between preventing cross-tenant access and eliminating it. Prevention relies on every engineer, on every commit, getting it right every time. Elimination means it doesn't matter if someone gets it wrong, because the architecture won't let the mistake have consequences beyond the boundaries of a single pod.
What this means in practice
Performance isolation
Your workloads don't compete with anyone else's. A large snapshot operation on one organisation has zero impact on another organisation's deploy pipeline. There are no noisy neighbours. Your performance characteristics are yours alone.
Data sovereignty
Your data lives in dedicated storage that is never commingled with another organisation's data. When you delete your account, we delete your pod, so there's no residual data hiding in a shared table, no orphaned rows, no ghost records in a cache. Deletion is complete and verifiable.
Security boundaries
Each pod has its own encryption keys, its own network boundaries, and its own access controls. A vulnerability in one pod cannot propagate to another. Even in a worst-case scenario — a full pod compromise — the attacker has access to one organisation's data and nothing else. The blast radius is bounded by design.
Independent lifecycles
Each pod can be updated, scaled, and maintained independently. A migration that needs to run for one organisation doesn't require a maintenance window for everyone. A configuration change for one customer doesn't risk a regression for another. This also means we can canary changes to individual pods before rolling them out broadly.
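One way to picture a per-pod canary is the sketch below. It assumes a deploy step and a health check are available as plain callables; `canary_rollout`, `deploy`, and `healthy` are hypothetical names, and the real tooling is not described in this article.

```python
from typing import Callable, Iterable


def canary_rollout(
    pods: Iterable[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    canary_count: int = 1,
) -> list[str]:
    """Deploy to a few pods first; continue to the rest only if they stay healthy.

    Returns the pods that were updated. Raises RuntimeError if a canary fails,
    leaving all remaining pods untouched.
    """
    pod_list = list(pods)
    canaries, rest = pod_list[:canary_count], pod_list[canary_count:]
    updated: list[str] = []

    for pod in canaries:
        deploy(pod)
        updated.append(pod)
        if not healthy(pod):
            raise RuntimeError(f"canary {pod} unhealthy; rollout halted")

    for pod in rest:
        deploy(pod)
        updated.append(pod)
    return updated


deployed: list[str] = []
result = canary_rollout(["pod-a", "pod-b", "pod-c"], deployed.append, lambda pod: True)
print(result)  # ['pod-a', 'pod-b', 'pod-c']
```

If the health check fails on the canary, the exception fires before any other pod is touched, which is the property independent lifecycles make possible.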
The trade-offs
We'd be dishonest if we didn't acknowledge what this costs.
This architecture is more expensive to operate than a shared-tenancy model. More infrastructure, more resources, more operational surface area. We accept that trade-off because the security and reliability guarantees are worth it for the merchants who depend on Alloy.
Provisioning is more involved. Spinning up a new organisation means provisioning dedicated infrastructure — compute, storage, secrets, networking. We've automated this entirely, but it does mean there's a brief setup period rather than instant access. We think that's a reasonable exchange for the guarantees you get in return.
Operational complexity is higher. Monitoring, alerting, and deployment pipelines all need to be tenant-aware. We've invested heavily in tooling to make this manageable, but it's undeniably more work than operating a single shared deployment. We see that investment as part of the product, not overhead.
Request pipeline
Every request that enters an Alloy solution passes through five layers before it touches your data:
- Edge authentication — JWT validation at the edge, before the request reaches any application code
- Tenant resolution — Map the request to the correct organisation pod using verified claims
- Permission evaluation — Check RBAC policies for the authenticated user within their organisation
- Rate limiting — Per-tenant rate limits that are never shared or pooled across organisations
- Pod routing — Route to the organisation's dedicated infrastructure with no cross-tenant network path
This pipeline ensures that even at the routing level, there is no cross-tenant interaction. A request for one organisation never passes through another organisation's infrastructure. The network paths are distinct. The credentials are distinct. The compute is distinct.
The bottom line
Could we build a secure multi-tenant system with shared infrastructure? Yes. Plenty of good teams do. But we've chosen to make a different bet: that the merchants who trust us with their stores deserve an architecture where entire categories of security failure simply cannot happen. Not because we wrote perfect code, but because the infrastructure won't allow it.
That's the Bulkhead. It's overkill by design. And we think that's exactly the point.
Further reading
- Platform Architecture — overview of the Bulkhead model
- Security — encryption, authentication, and access control details
Alloy Engineering