Modern CI/CD has outgrown the network.
Compute has become elastic, but data hasn't. The software dependencies produced during development, testing, building, and packaging (collectively referred to as "artifacts": modules, containers, Helm charts, and other related scripts and binaries) are growing faster than the bandwidth available to move them. Even small inefficiencies compound across distributed CI/CD pipelines, and the result is the same story everywhere: longer builds, overloaded registries, and unpredictable developer wait times.
At Varnish Software, we’ve seen this pattern before. Twice, in fact. Each generation of caching has been shaped by the data workloads of its time.
Wave 1: Web & API Caching (2010 – 2018)
The first era of Varnish adoption focused on accelerating web and API traffic. Applications needed to serve dynamic content globally without collapsing backend services. Varnish made it possible to handle millions of requests per second while maintaining deterministic latency and stability. This was the foundation of modern edge architectures.
Wave 2: Streaming & Private CDN (2018 – 2024)
The second wave emerged as enterprises began moving massive volumes of video and media over the internet. Caching evolved from an HTTP accelerator into a distributed delivery fabric. Persistent storage (MSE, the Massive Storage Engine), multi-region replication, and adaptive routing enabled large organizations to operate their own CDNs with the same efficiency as hyperscalers.
Wave 3: Software, AI/ML & Data (2024 →)
The latest challenge isn't web traffic but data movement throughout CI/CD. Modern CI/CD now moves gigabytes to terabytes of artifacts through build systems thousands of times per day. Registry services such as Docker Hub, Artifactory, GitHub Packages, and PyPI sit at the center of this flow, and they're straining under the load.
- Repository size and object count have exploded.
- SaaS tiers, egress fees, and rate limits add hidden latency and unpredictable cost.
- The same artifacts are pulled and replicated across multiple regions and clouds.
We’ve entered an era where data is heavier than compute. Even in high-performance environments, bandwidth and latency—not CPU or GPU—are the real bottlenecks. And since network propagation can’t outrun the speed of light, scaling compute without local data only makes the problem worse.
Introducing Varnish Orca
Varnish Orca (Objects, Registry, Cache and Artifacts) is a virtual registry manager: a distributed caching layer purpose-built for artifact and object acceleration.
It consolidates multiple registries (public or private) into a single, high-performance cache that sits near CI/CD, build, and runtime environments.
Orca accelerates Docker images, npm packages, Helm charts, PyPI and Maven dependencies, Go modules, RPMs, and more: any immutable binary artifact that slows down builds or saturates origin networks.
Deployed as a lightweight container or bare-metal service, Orca can:
- Cache and serve artifacts locally across build clusters and data centers
- Persist data with MSE4 for cold-start resilience and offline operation
- Deduplicate concurrent requests, eliminating backend storms
- Integrate seamlessly with existing registries via simple YAML configuration, with no re-tooling or new APIs required (see the sketch after this list)
- Export OpenTelemetry metrics for observability across artifact access, cache hit rates, registry offload, latency, and cost reduction
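As a rough sketch of what that YAML can look like, here is a hypothetical configuration that fronts three public registries through one local endpoint. The key names (`listen`, `registries`, `upstream`, `cache`, and so on) are illustrative assumptions, not Orca's documented schema:

```yaml
# Hypothetical Orca configuration; key names are illustrative
# assumptions, not the product's actual schema.
listen: 0.0.0.0:8080                        # one local endpoint for all clients

registries:
  docker:
    upstream: https://registry-1.docker.io  # Docker Hub origin
  pypi:
    upstream: https://pypi.org/simple
  npm:
    upstream: https://registry.npmjs.org

cache:
  memory: 8GB       # hot artifacts served from RAM
  coalesce: true    # collapse concurrent requests for the same object into one fetch
```

Build clients would then point at the local endpoint instead of each upstream (for pip, something like `pip install --index-url http://orca.internal:8080/pypi/simple <package>`, where the host and path are again placeholders), so every runner behind the cache shares one warm copy of each artifact.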
What It Solves
Across large engineering organizations, the same patterns repeat:
- 90–99% of artifact requests are served directly from cache
- Git clone and build times improve 2–3×
- Backend load shrinks, and annual egress costs drop by six to seven figures
- CI/CD pipelines stay operational even when registry backends fail
Varnish Orca doesn’t just cache data; it also introduces routing flexibility, allowing teams to define where artifacts are fetched from, cached, or mirrored. This effectively creates a virtual registry layer that unifies multiple sources (Artifactory, GitHub, Nexus, S3, and more) under a single access point—without rewriting tooling or pipelines.
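To make the routing idea concrete, a virtual registry of this kind might be expressed as an ordered list of sources with a local mirror policy. As above, the keys are assumptions for illustration, not Orca's documented syntax:

```yaml
# Hypothetical routing policy for a virtual "docker" registry:
# prefer the private Artifactory, fall back to Docker Hub, and
# keep a local mirror of whatever passes through.
virtual-registries:
  docker:
    sources:
      - url: https://artifactory.internal/docker   # private source, tried first
      - url: https://registry-1.docker.io          # public fallback
    mirror: true    # persist fetched artifacts locally
    ttl: 30d        # immutable artifacts tolerate long cache lifetimes
```

Because clients only ever see the single access point, sources can be reordered, added, or dropped without touching a single pipeline definition.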
The result is deterministic performance, predictable cost, and freedom from vendor lock-in.
Data Gravity and Locality
At large scale, moving data dominates every performance curve. A single pipeline might reference thousands of packages totaling hundreds of gigabytes, and multiplied across geographies, that becomes petabytes of repeated transfer: pulling a 200 GB dependency set fifty times a day in each of five regions already moves 50 TB daily, roughly 18 PB a year. No WAN or CDN architecture can make that efficient if the data always lives elsewhere.
The most efficient solution isn’t more bandwidth or more replication but locality.
Bring the data as close as possible to where it’s used, keep it warm in cache, and synchronize intelligently.
That’s the principle behind Orca: don’t move the data unless you have to.
Simplicity by Design
Despite its performance profile, Orca’s design goal is simplicity.
A complete deployment can be defined in a single YAML file: no VCL editing or complex orchestration required. Metrics are exposed via OpenTelemetry, and persistence can be configured with a few lines of policy, as the sketch below suggests. The free edition accelerates public registries out of the box; the premium edition extends that to private, persistent, and multi-site use cases.
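Under the same illustrative assumptions as the earlier sketches, persistence and telemetry might each take only a few lines:

```yaml
# Hypothetical persistence and telemetry policy; keys are
# illustrative assumptions, not documented configuration.
persistence:
  engine: mse4               # MSE4 store survives restarts and cold starts
  path: /var/lib/orca/cache
  size: 500GB

telemetry:
  opentelemetry:
    endpoint: http://otel-collector.internal:4317   # OTLP/gRPC collector
    metrics: [hit-rate, offload, latency]
```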
Start Local, Scale Global
Varnish Orca is part of the third wave of caching, built for data-intensive, cloud-native software delivery.
It brings the proven performance characteristics of Varnish to the world of CI/CD and artifact distribution, helping teams release faster, reduce cost, and regain control of how their data moves.
Because sometimes, the most elegant way to accelerate data isn’t to move it faster but to stop moving it at all.
Get started for free in a few minutes: www.varnish-software.com/orca