After Redis’s License Shift: Valkey, Garnet, and Dragonfly in 2025—Compatibility, Performance, and a Safe Migration Playbook
Redis’s licensing change in 2024 catalyzed a split in the in-memory data store ecosystem. While Redis remains popular and mature, the community now has multiple serious alternatives that aim to preserve open development models, push performance forward, or optimize for cloud-native durability. In 2025, the most discussed options in technical teams are Valkey (the Linux Foundation fork of Redis OSS), Microsoft’s Garnet, Dragonfly, and KeyDB.
This article compares these engines across the dimensions that matter to practitioners—protocol and client compatibility, modules, clustering and failover, persistence and durability, and real-world performance characteristics. It concludes with a pragmatic, low-risk migration and rollback playbook you can apply regardless of your target.
The bottom line: there is no universally “best” choice. The right engine depends on your workload shape, feature usage, and operational model. Move deliberately—with measurement and reversible steps.
TL;DR for busy engineers
- If you need near drop-in compatibility with open governance: choose Valkey. It adheres closely to Redis OSS semantics, supports modules, cluster, Lua, RDB/AOF, and popular clients.
- If you need extreme single-instance throughput at low tail latency and can live without modules: consider Dragonfly. It’s multi-threaded, Redis/Memcached compatible, and increasingly mature with snapshots/journaling.
- If you want a cloud-native, durable cache/store with a Redis-compatible protocol and strong disk-backed performance: evaluate Microsoft Garnet. It focuses on hot-path commands with a log-structured durable core (via the FASTER lineage). Expect partial command coverage; plan to validate.
- If you want a multi-threaded Redis fork that keeps modules and familiar semantics: KeyDB is a practical option. It offers good throughput while preserving much of Redis’s feature set, though its cluster-mode story is not as strong as Redis/Valkey.
- Migration: inventory features, benchmark with your data, use dual writes and shadow reads, cut over behind a feature flag, and keep a tested rollback path. Favor RDB seeding + incremental sync where supported; otherwise, use key-by-key copy or traffic mirroring.
The context: Redis’s licensing pivot and the new landscape
In early 2024, Redis Ltd. moved Redis from the BSD license to dual source-available licensing (RSALv2 and SSPLv1), prompting community concern and the formation of an independent, vendor-neutral fork: Valkey, now under the Linux Foundation. In parallel, high-performance alternatives like Dragonfly and KeyDB had already matured, and Microsoft made Garnet publicly available, emphasizing a durable, cloud-optimized design with Redis protocol compatibility.
As of 2025, that leaves engineering teams with four credible options beyond proprietary Redis Enterprise:
- Valkey: community-governed continuation of Redis OSS
- Dragonfly: performance-first, multi-threaded, Redis/Memcached compatible server
- Microsoft Garnet: durable cache/store with Redis-compatible protocol and disk-backed engine
- KeyDB: multi-threaded Redis fork with broad feature compatibility and modules
Each has distinct design tradeoffs.
Compatibility and features: what works with your workloads
This section covers the practical compatibility questions teams face when trying to move existing Redis workloads.
Wire protocol and client libraries
RESP2/RESP3 compatibility
- Valkey: full RESP2 and RESP3 support as continuity from Redis OSS. Works with redis-cli, redis-benchmark, and mainstream clients (redis-py, Jedis, node-redis, go-redis, etc.).
- Dragonfly: implements the Redis protocol with good client compatibility; supports RESP2 and RESP3 in practice for standard commands. Most popular clients “just work.”
- Microsoft Garnet: advertises Redis protocol compatibility for a growing subset of commands. Works with common clients for supported commands. Verify RESP3 features and edge behavior with your specific client.
- KeyDB: as a Redis fork, supports RESP2/RESP3; clients work out of the box.
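As a quick smoke test of the compatibility claims above, the sketch below connects with redis-py and exercises a few basic commands. It assumes redis-py 5.x, where `protocol=3` negotiates RESP3 via HELLO; `candidate-host` is a placeholder for the engine under evaluation.

```python
import redis

# Hypothetical endpoint; point this at the engine under evaluation.
r = redis.Redis(host="candidate-host", port=6379, protocol=3)  # protocol=3 requests RESP3

r.ping()                                       # liveness over the negotiated protocol
r.set("smoke:key", "value", ex=60)             # basic write with TTL
assert r.get("smoke:key") == b"value"          # basic read
print(r.info("server").get("redis_version"))   # most engines report a version here
```

Downgrading to `protocol=2` (or omitting the flag) is a quick way to isolate RESP3-specific incompatibilities.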
TLS and ACLs
- Valkey: supports TLS and ACLs similar to Redis OSS.
- Dragonfly: supports authentication and TLS in recent builds; verify your distro/packaging.
- Garnet: supports TLS and authentication; ACL parity with Redis is not 1:1—validate your policy needs.
- KeyDB: supports TLS and ACLs.
Pub/Sub
- Valkey: full support.
- Dragonfly: supports Pub/Sub.
- Garnet: check feature matrix—Pub/Sub may be present but validate message semantics and scaling.
- KeyDB: supported as per Redis.
Takeaway: If you rely on standard commands over standard clients, Valkey and KeyDB are closest to drop-in. Dragonfly is highly compatible for mainstream usage. Garnet targets compatibility for common operations; treat it as a subset to validate rather than assume full parity.
Data types and command coverage
- Strings, Hashes, Lists, Sets, Sorted Sets: broadly supported by Valkey, Dragonfly, KeyDB. Garnet supports core operations; validate advanced or less common commands.
- Streams: fully supported in Valkey; supported in KeyDB. Dragonfly’s Streams support has improved, but verify feature completeness and performance under your patterns (e.g., XREADGROUP semantics). Garnet’s Streams support should be validated.
- Geospatial, HyperLogLog, Bitmaps/Bitfields: Valkey and KeyDB provide expected functionality. Dragonfly implements many but check edge cases and performance characteristics. Garnet may omit or partially implement some of these—don’t assume availability.
- Transactions (MULTI/EXEC): supported in Valkey, Dragonfly, KeyDB. Garnet supports transactional patterns for supported commands—verify semantics, especially with persistence enabled.
- Scripting (Lua): Valkey supports Redis-style Lua. KeyDB supports Lua. Dragonfly provides Lua/EVAL functionality with caveats (e.g., differences in blocking behavior and in deterministic execution under its multi-threaded model; check the documentation). Garnet prioritizes core command performance; treat Lua scripting as unsupported or limited unless you have verified it.
Rule of thumb: if you rely heavily on Streams, Lua scripting, or niche commands, Valkey offers the lowest surprise factor. Dragonfly and KeyDB cover most mainstream patterns. Garnet should be matched to workloads that mostly use String/Hash/List/Set/ZSet operations and don’t rely on advanced features.
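To make the Lua point concrete, here is a minimal redis-py sketch of a script that ports cleanly because every input arrives via KEYS/ARGV; scripts that call TIME or otherwise depend on server-side nondeterminism are where engines with different execution models diverge. Key names are hypothetical.

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Deterministic: all inputs come through KEYS/ARGV, so the script behaves
# identically regardless of the engine's threading or replication model.
SET_IF_GREATER = """
local current = tonumber(redis.call('GET', KEYS[1]) or '0')
local candidate = tonumber(ARGV[1])
if candidate > current then
    redis.call('SET', KEYS[1], candidate)
    return 1
end
return 0
"""

updated = r.eval(SET_IF_GREATER, 1, "scores:high", 42)
print("updated" if updated == 1 else "kept existing value")
```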
Modules and extensibility
- Valkey: aims to maintain module API compatibility with Redis OSS. Community and third-party modules (e.g., Bloom-filter or JSON equivalents, where available under permissive terms) can be used, subject to their licensing.
- Dragonfly: does not support Redis modules. Its multi-threaded architecture and different internal data structures make binary-compatible modules impractical.
- Garnet: no Redis module support; its engine design differs significantly from Redis internals.
- KeyDB: supports modules (as a Redis fork). If you rely on modules, KeyDB and Valkey are your viable options.
Clustering, sharding, and HA
Sentinel-style HA (single primary, replicas, automatic failover)
- Valkey: yes, parity with Redis OSS. Sentinel works as expected.
- Dragonfly: supports replication and failover; cluster features have been expanding. Check current version docs for sentinel-equivalent behavior or recommended orchestrations.
- Garnet: supports replication and durability features; it does not aim to be Redis Sentinel/Cluster identical. Plan for orchestrated failover via your platform.
- KeyDB: supports replication; Sentinel may work for failover similar to Redis, with KeyDB-specific tuning.
Cluster mode (hash-slot partitioning, CLUSTER MEET, etc.)
- Valkey: Redis Cluster compatible.
- Dragonfly: implements a cluster mode compatible with Redis Cluster client semantics in recent versions; confirm coverage.
- Garnet: generally does not implement Redis Cluster protocol. Use client-side sharding or a proxy-based approach.
- KeyDB: historically weaker cluster-mode story than Redis; some versions offer compatibility options, but production parity is not universally reported. Many operators prefer standalone + replicas or external sharding.
Client-side sharding and proxies
- Any engine can be sharded with clients that support consistent hashing or via proxies like Twemproxy, Envoy's Redis proxy filter, or custom routers; a minimal client-side sketch follows.
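A minimal sketch, assuming two standalone instances and naive modulo-hash routing; a production router would add virtual nodes, rebalancing, and replica awareness. Hostnames are placeholders.

```python
import hashlib

import redis

# Hypothetical shard endpoints.
shards = [
    redis.Redis(host="cache-a", port=6379),
    redis.Redis(host="cache-b", port=6379),
]

def shard_for(key: str) -> redis.Redis:
    # Stable hash: the same key always routes to the same shard.
    digest = hashlib.sha1(key.encode()).digest()
    return shards[int.from_bytes(digest[:4], "big") % len(shards)]

shard_for("user:1001").set("user:1001", "payload")
print(shard_for("user:1001").get("user:1001"))
```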
Persistence and durability
This is where engines diverge the most.
Valkey
- RDB snapshots: Yes. Traditional fork-based snapshotting.
- AOF (Append Only File): Yes, with rewrite policies (always/everysec/no) and mixed RDB/AOF modes.
- Replication: PSYNC-based with backlog; standard.
- Operational implication: Forking for RDB/AOF rewrite can cause transient latency spikes; tune save policies and memory allocator (jemalloc) and consider CPU pinning.
Dragonfly
- Snapshotting: Provides snapshot capabilities to disk without classic OS fork overhead, mitigating long GC/fork stalls common in Redis-like models.
- AOF/journaling: Journal-style durability exists; check the exact guarantees (fsync modes, replay semantics) in your version.
- Replication: Supports replica-of patterns and failover; verify cross-compatibility if importing from Redis.
- Operational implication: Reduced pause times during persistence under load compared to fork-based designs; different memory utilization patterns vs. Redis/Valkey.
Microsoft Garnet
- Durable core: Built on concepts related to FASTER (hybrid log with checkpointing and page caching). Offers disk-backed durability with high throughput.
- Snapshots/checkpoints: Yes, with periodic checkpointing and log replay.
- Replication: Supported; semantics differ from Redis replication. It’s not a drop-in for Redis replication tooling.
- Operational implication: Strong durability characteristics; designed for cloud environments and NVMe-backed storage. Verify fsync semantics and recovery times under your SLAs.
KeyDB
- RDB/AOF: Yes, largely compatible with Redis but with multi-threaded execution that can reduce some stalls.
- Replication: Supports standard Redis replication semantics and additional active-active options depending on version; review conflict policies if multi-master is used.
- Operational implication: Higher throughput on multi-core boxes than Redis while keeping familiar persistence knobs.
If you have strict durability requirements and plan to sustain heavy write loads, Garnet’s design is particularly interesting. If you need low pause-times for large heaps without giving up in-memory performance, Dragonfly is compelling. For broadly compatible, mature persistence semantics, Valkey and KeyDB are straightforward.
Security and multi-tenancy
- TLS, ACLs, protected-mode: broadly supported across Valkey, KeyDB, and Dragonfly. Garnet supports secure configurations but ACL parity may differ—validate before assuming policy-level compatibility with Redis.
- Namespaces/tenant isolation: not a Redis feature per se; multi-tenancy is typically handled at deployment level. If you need hard isolation, run separate instances/pods and leverage network policies.
Performance: what to expect and how to measure it
Performance is context-dependent. Vendor benchmarks can be directionally informative but rarely match your workload. Focus on architectural traits and measure with your data and access patterns.
Architectural differences that matter
Threading model
- Valkey: single-threaded command execution with optional I/O threading. Predictable and mature, but a single instance is bounded by one core for command processing; scale via sharding or cluster mode.
- KeyDB: multi-threaded execution with work-stealing; improves throughput on multi-core machines while retaining Redis semantics.
- Dragonfly: shard-per-core architecture with user-level fibers and careful memory layout. Typically delivers very high throughput and low p99 latency on multi-core hardware.
- Garnet: optimized for durable throughput using a log-structured design and modern storage; scales with cores and fast NVMe.
Persistence overheads
- Fork-based snapshots (Valkey/KeyDB) can incur latency spikes during copy-on-write if the dataset is large and heavily mutated.
- Dragonfly avoids classic fork pauses; snapshotting and journaling are designed to be less disruptive under load.
- Garnet’s checkpoint+log design targets predictable durability with high throughput on SSD/NVMe.
Memory allocator and fragmentation
- jemalloc (Valkey/KeyDB) is robust, but fragmentation can become notable under churn.
- Dragonfly’s internal allocators and data structures may have different memory/CPU tradeoffs; often more memory-efficient for certain workloads.
- Garnet’s on-heap/off-heap management is tuned for durability and large datasets; measure resident set size and cache ratios.
Workload patterns
- Small-object GET/SET at high QPS: Dragonfly and KeyDB often show 2–3x throughput scaling vs single-threaded designs on the same box. Valkey scales horizontally via shards/cluster. Garnet is competitive, especially when durability is enabled, but raw in-memory peak might trail Dragonfly’s top-end on a single node.
- Large values (hundreds of KB): Network and serialization dominate; copy avoidance and sendfile-like optimizations matter. Validate latency percentiles and tail.
- Sorted set heavy workloads: Implementations differ in internal structures (e.g., skip-lists). Test throughput and memory.
- Streams and consumer groups: Valkey is a safe baseline. Dragonfly’s performance can be strong but check feature fidelity. Garnet may not target Streams-heavy workloads; validate first.
- Pub/Sub fanout: All engines can deliver strong results, but scaling semantics and backpressure differ. Observe dropped connections, slow consumer handling, and memory growth.
Benchmarking checklist
- Use representative payload sizes and command mix. Avoid redis-benchmark defaults as your sole source of truth.
- Run with persistence enabled as you will in production (AOF fsync policy, snapshot cadence, Garnet checkpointing, etc.).
- Measure p50/p95/p99 latency under steady-state and during disruptive events (snapshot, AOF rewrite, replica promotion, re-sharding).
- Test failover and recovery time under load.
- Observe memory footprint, allocator fragmentation, and RSS after churn.
- Validate client behavior (timeouts, retries, backoff) against each engine’s tail latencies.
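As a starting point for the checklist above, the sketch below measures client-observed p50/p95/p99 with nothing but redis-py and the standard library. It is illustrative only; use a real load generator (redis-benchmark, memtier_benchmark) for concurrency and sustained load. The host is a placeholder.

```python
import statistics
import time

import redis

r = redis.Redis(host="candidate-host", port=6379)  # hypothetical host
payload = b"x" * 256  # match your real value sizes

samples_ms = []
for i in range(10_000):
    start = time.perf_counter()
    r.set(f"bench:{i}", payload)
    samples_ms.append((time.perf_counter() - start) * 1000)

# quantiles(n=100) returns 99 cut points: index 49 ~ p50, 94 ~ p95, 98 ~ p99.
q = statistics.quantiles(samples_ms, n=100)
print(f"p50={q[49]:.3f}ms  p95={q[94]:.3f}ms  p99={q[98]:.3f}ms")
```

Run the same loop during a forced snapshot or AOF rewrite to see the disruptive-event percentiles the checklist calls for.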
Choosing the right engine for your constraints
- Valkey: Choose for maximum compatibility, open governance, and straightforward ops. Ideal if you rely on modules, Streams, Lua, and Redis Cluster.
- Dragonfly: Choose for high throughput per node and low tail latency with a Redis-compatible API, if you don’t need modules and are comfortable validating advanced command parity. Good for high-QPS caches, ephemeral session stores, and real-time workloads.
- Microsoft Garnet: Choose for durable cache/store scenarios with sustained write throughput and large datasets backed by fast storage. Best when your command set is well within its supported subset and when durability is first-class.
- KeyDB: Choose for a familiar Redis experience with multithreaded performance and module support. If you need modules but want more throughput than single-threaded designs deliver, it’s a pragmatic option.
A safe migration and rollback playbook
Migrations fail when teams assume compatibility and skip measurement. The antidote is a phased approach with fast abort paths.
Phase 0: Inventory and risk assessment
- Capture your command set and feature usage for at least a week (or a representative peak) using INFO commandstats and keyspace hit/miss sampling; a sketch for pulling this inventory follows this list.
- Identify “special” features: Lua scripts, Streams, Pub/Sub patterns, modules, cluster mode specifics, eviction policies, and latency SLOs.
- Decide the target engine based on must-haves (e.g., modules → Valkey/KeyDB; cluster parity → Valkey/Dragonfly; durability-first → Garnet).
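The inventory sketch referenced above: INFO commandstats is available on Redis-lineage engines, and redis-py returns it as a parsed dict. The host is a placeholder.

```python
import redis

r = redis.Redis(host="prod-redis", port=6379)  # hypothetical host

# Entries look like 'cmdstat_get' -> {'calls': ..., 'usec': ..., 'usec_per_call': ...}.
stats = r.info("commandstats")
by_calls = sorted(stats.items(), key=lambda kv: kv[1]["calls"], reverse=True)

for name, s in by_calls[:20]:
    cmd = name.removeprefix("cmdstat_")
    print(f"{cmd}: calls={s['calls']} usec_per_call={s['usec_per_call']}")
```

Any command near the top of this list that is not in the target engine's supported set is a migration blocker worth discovering in Phase 0, not Phase 4.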
Phase 1: Lab validation
- Stand up the candidate with production-like configuration:
- Persistence settings (AOF fsync, snapshot cadence or checkpointing interval)
- Replication topology (replica count, backlog sizes)
- TLS and ACLs as in prod
- Eviction policy (allkeys-lru, volatile-ttl, etc.)
- Run synthetic benchmarks that match your payload sizes and mixes.
- Reproduce “bad days”: force snapshots, AOF rewrites, replica promotion, and network partitions.
- Validate client behavior: timeouts, backpressure, connection pooling.
- Verify command parity: run a command-coverage test suite or record-replay your prod traffic.
Example: replay production-like traffic
```bash
# Capture a window of commands (if you have a proxy that can mirror traffic or a tap),
# then replay via a tool or simple script; example with redis-benchmark custom payloads
redis-benchmark -h candidate -p 6379 -n 500000 -P 64 -t get,set -d 256
```
Phase 2: Data seeding and consistency plan
Pick one of these approaches, in order of preference:
1. Seed via RDB/AOF, then incremental sync
- Valkey → Valkey or Redis → Valkey/KeyDB/Dragonfly: create an RDB snapshot during a low-traffic window and load it into the target. For engines that can act as a replica of Redis/Valkey (e.g., Valkey, KeyDB, and in many cases Dragonfly), configure replicaof to ingest a live stream of changes, then break replication at cutover.
- Garnet: load via supported import tools (e.g., its own checkpoint/log restore). If direct RDB/AOF import is not supported, proceed with method 2.
2. Online key-by-key copy with dual writes
- Use SCAN-based copy with pipelined RESTORE to the target. Keep dual writes enabled in your app to capture deltas during the copy.
3. Traffic mirroring and rebuild
- Mirror production traffic to the target (read shadowing). For writes, enable dual write in the app to both source and target; for reads, shadow to the target and compare responses.
Seed verification:
- Compare dbsize and memory usage (expect differences). Sample random keys: DUMP on source and DUMP on target; compare payloads where portable.
- For Streams, verify consumer group state.
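A spot-check sketch for the DUMP comparison above, assuming payload encodings are portable between source and target (across different engines they often are not, in which case fall back to type-aware reads):

```python
import redis

src = redis.Redis(host="source", port=6379)
dst = redis.Redis(host="target", port=6379)

mismatches = 0
for _ in range(1000):
    k = src.randomkey()
    if k is None:
        break  # empty database
    if src.dump(k) != dst.dump(k):  # byte-level comparison; encodings must match
        mismatches += 1
        print("divergent key:", k)

print("sampled mismatches:", mismatches)
```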
Phase 3: Dual writes and read shadowing
- Implement a feature-flagged dual-write layer in your application for a subset of operations. Example (Python):
```python
import os

import redis

primary = redis.Redis(host=os.getenv("PRIMARY_HOST"), port=6379, socket_timeout=0.05)
shadow = redis.Redis(host=os.getenv("SHADOW_HOST"), port=6379, socket_timeout=0.05)

dual_writes_enabled = os.getenv("DUAL_WRITES", "0") == "1"

def kv_set(k, v):
    res = primary.set(k, v)
    if dual_writes_enabled:
        try:
            shadow.set(k, v)
        except Exception:
            # log but don't fail the request
            pass
    return res

def kv_get(k):
    val = primary.get(k)
    # optional: shadow read for comparison, async to avoid latency penalty
    return val
```
- Enable dual writes for a small tenant or traffic slice. Monitor write failures, command latencies, and error logs on the target.
- Shadow reads: sample a small fraction of reads, compare values and TTLs. Track divergence metrics.
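A sketch of the read-shadowing pattern, with the comparison done inline for brevity; in production, push it to a background worker or queue so the shadow path never adds latency. Hostnames and the sample rate are placeholders.

```python
import random

import redis

primary = redis.Redis(host="primary-host", port=6379)  # hypothetical hosts
shadow = redis.Redis(host="shadow-host", port=6379, socket_timeout=0.05)

SAMPLE_RATE = 0.01  # compare ~1% of reads
divergences = 0

def shadowed_get(k):
    global divergences
    val = primary.get(k)
    if random.random() < SAMPLE_RATE:
        try:
            if shadow.get(k) != val:
                divergences += 1  # export this as your divergence metric
        except Exception:
            pass  # the shadow path must never fail the request
    return val
```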
Phase 4: Canary cutover
- Promote a small tenant or shard to read/write from the target only. Keep source hot and current via dual writes or replication for instant rollback.
- Observe for at least a business cycle (peak periods, background jobs). Track SLOs and functional correctness.
- Expand canary gradually until 100% of traffic is on target.
Phase 5: Decommission or keep warm for rollback
- Maintain the source in read-only sync for an agreed window (e.g., a week). If no rollbacks are needed, decommission to save cost.
- If issues occur, roll back by flipping the feature flag to route back to source. Because you kept the source current, rollback is instant.
Rollback strategies that actually work
- Keep bi-directional safety: During canary, maintain either dual writes (source→target) or replication from target back to source if supported, to avoid losing updates.
- Maintain a compatibility abstraction in code (e.g., a storage interface) so a revert is a config change, not a code change; a sketch follows this list.
- Keep observability on both sides: identical metrics, logs, and dashboards make anomalies obvious.
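A sketch of that storage interface, assuming the backend host comes from configuration; swapping engines (or rolling back) then means changing one environment variable, not code. Names are illustrative.

```python
import os

import redis

class KVStore:
    """Thin abstraction so the backing engine is a config choice, not a code change."""

    def __init__(self, host: str, port: int = 6379):
        self._client = redis.Redis(host=host, port=port)

    def get(self, key):
        return self._client.get(key)

    def set(self, key, value, ttl_seconds=None):
        return self._client.set(key, value, ex=ttl_seconds)

# Rollback = point KV_BACKEND_HOST back at the source and redeploy config.
store = KVStore(host=os.environ["KV_BACKEND_HOST"])
```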
Configuration snippets and operational examples
Below are examples to help you get started. Adjust versions and security settings for your environment.
Valkey basic config (valkey.conf excerpts)
```ini
# Network and security
bind 0.0.0.0
port 6379
tls-port 0
protected-mode yes
requirepass "${VALKEY_PASSWORD}"
daemonize no

# Persistence
save 900 1
save 300 10
save 60 10000
appendonly yes
appendfsync everysec
no-appendfsync-on-rewrite yes

# Replication
replica-read-only yes
repl-backlog-size 256mb
```
Dragonfly minimal CLI startup
```bash
dfly --port=6379 --dbfilename=dfly.rdb --dir=/data --logtostderr --enable_tls=false
```
Notes:
- Use flags to enable snapshotting/journaling as desired.
- Pin CPU cores and set file descriptor limits for high connection counts.
Microsoft Garnet example (conceptual)
Garnet packaging varies. The typical pattern is to configure checkpoint intervals and storage paths for durability. Validate your deployment method (container, systemd unit) and enable TLS/auth as required. Expect a YAML or CLI-based config specifying log directories, checkpoint cadence, and memory limits.
KeyDB config highlights (keydb.conf excerpts)
```ini
# Multi-threading
server-threads 4

# Persistence similar to Redis
save 900 1
appendonly yes
appendfsync everysec

# Replication
replica-read-only yes
```
Migrating data with SCAN+RESTORE (fallback method)
If direct replication or RDB import isn’t available, you can stream keys. Example Python script:
```python
import redis

src = redis.Redis(host='source', port=6379, password='source_pw')
dst = redis.Redis(host='target', port=6379, password='target_pw')

cursor = 0
pipeline = dst.pipeline(transaction=False)
count = 0

while True:
    cursor, keys = src.scan(cursor=cursor, count=1000)
    for k in keys:
        ttl = src.pttl(k)
        dumped = src.dump(k)
        if dumped is None:
            continue
        if ttl < 0:
            # no TTL or TTL unknown; restore without expiry
            pipeline.restore(k, 0, dumped, replace=True)
        else:
            pipeline.restore(k, ttl, dumped, replace=True)
        count += 1
        if count % 1000 == 0:
            pipeline.execute()
    if cursor == 0:
        break

pipeline.execute()
print("Copied", count, "keys")
```
Caveats:
- Requires RESTORE support on the target and compatible DUMP encoding. If DUMP/RESTORE encoding differs, fall back to type-aware GET/SET/HGETALL/ZADD restoration; see the sketch after these caveats.
- For Streams and more complex types, write dedicated copy routines.
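A sketch of that type-aware fallback for the common types; empty-collection guards and Streams handling are omitted for brevity:

```python
import redis

src = redis.Redis(host="source", port=6379)
dst = redis.Redis(host="target", port=6379)

def copy_key(k: bytes):
    t = src.type(k)  # e.g., b'string', b'hash', b'zset', b'set', b'list'
    if t == b"string":
        dst.set(k, src.get(k))
    elif t == b"hash":
        dst.hset(k, mapping=src.hgetall(k))
    elif t == b"zset":
        members = src.zrange(k, 0, -1, withscores=True)
        dst.zadd(k, {m: score for m, score in members})
    elif t == b"set":
        dst.sadd(k, *src.smembers(k))
    elif t == b"list":
        dst.rpush(k, *src.lrange(k, 0, -1))
    ttl = src.pttl(k)
    if ttl > 0:
        dst.pexpire(k, ttl)  # carry over the remaining TTL in milliseconds
```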
Operational gotchas and best practices
- Eviction policies: allkeys-lru vs volatile-ttl semantics can differ in edge cases. Confirm memory behavior under pressure.
- Snapshot pauses: on large heaps with high write rates, fork-based snapshots in Valkey/KeyDB can spike p99 latency. Schedule snapshots off-peak or reduce frequency; consider AOF everysec with no-appendfsync-on-rewrite.
- Lua scripts: ensure pure determinism and avoid time-based or non-deterministic constructs when migrating to engines with different execution models.
- Cluster resharding: if using Valkey/Dragonfly cluster, test resharding under load and client behavior (ASK/MOVED handling, retry budgets).
- Network: enable TCP keepalive, tune somaxconn, and client timeouts (connect/read) to match engine tail latencies.
- Observability: expose and scrape engine-specific metrics. Track command latencies, instantaneous ops/sec, memory fragmentation, replication offsets, AOF fsync lag, checkpoint durations, and slowlog.
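If you do not already export such metrics, a minimal polling sketch over INFO fields common to Redis-lineage engines (field names may differ or be absent on Garnet; verify against its output):

```python
import time

import redis

r = redis.Redis(host="candidate-host", port=6379)  # hypothetical host

while True:
    info = r.info()
    print(
        "ops/sec:", info.get("instantaneous_ops_per_sec"),
        "fragmentation:", info.get("mem_fragmentation_ratio"),
        "clients:", info.get("connected_clients"),
    )
    time.sleep(10)
```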
Feature-by-feature comparison (at a glance)
Modules
- Valkey: yes
- Dragonfly: no
- Garnet: no
- KeyDB: yes
Cluster mode
- Valkey: yes (Redis Cluster compatible)
- Dragonfly: yes (compatible API; verify details)
- Garnet: no (use sharding/proxies)
- KeyDB: limited/uneven; many use replication or external sharding
Persistence/durability
- Valkey: RDB + AOF
- Dragonfly: snapshot + journaling (no classic fork stalls)
- Garnet: checkpoint + log (disk-first design)
- KeyDB: RDB + AOF (multithreaded)
Lua scripting
- Valkey: yes
- Dragonfly: yes (with caveats; validate)
- Garnet: limited/not primary goal; validate
- KeyDB: yes
Streams
- Valkey: yes
- Dragonfly: improving; validate completeness
- Garnet: validate support
- KeyDB: yes
Pub/Sub
- All: supported; validate scaling characteristics
What we’d choose, and when (opinionated)
- We default to Valkey when migrating open-source Redis workloads that use Streams, Lua, and modules. It provides the least friction and a clear governance story.
- We recommend Dragonfly when throughput per node and latency stability under persistence pressure matter more than modules. It’s a strong fit for high-QPS caches, session stores, and real-time ranking where Redis’s fork pauses used to hurt.
- We’d choose Garnet when durability and disk-backed efficiency are first-class requirements and the command set is relatively standard. It’s compelling for cost-efficient, large datasets with strong recovery guarantees.
- We consider KeyDB when teams want module compatibility and materially better throughput on the same hardware without retooling—especially for single-node or simple replicated deployments.
FAQ
Can I run mixed fleets (e.g., Valkey for Streams, Dragonfly for hot cache)?
- Yes. Many organizations segment use cases by feature set and latency tolerances.
Can I point my existing Redis clients at these engines without code changes?
- Often yes for Valkey, KeyDB, and Dragonfly. For Garnet, validate the exact commands you use.
Do these engines replicate cross-vendor?
- Valkey and KeyDB can replicate with Redis/each other in many cases thanks to protocol parity. Dragonfly supports replica-of style ingestion for migration. Garnet’s replication is distinct; plan RDB import, checkpoint restore, or key-by-key copy.
What about hosted options?
- Cloud offerings vary. Evaluate SLAs, persistence modes, and the provider’s operational maturity with your chosen engine.
Conclusion
The Redis ecosystem’s diversification is good news for engineering teams. In 2025, you can optimize for compatibility (Valkey), raw performance (Dragonfly), durability at scale (Garnet), or a balanced, module-friendly middle ground (KeyDB). The winning choice depends on your workload profile and constraints.
Regardless of the engine, the migration discipline is the same: measure your own workloads, test under failure, seed and verify data, use dual writes and read shadowing, cut over behind flags, and keep a warm rollback path. With that approach, you can adopt the engine that best fits your technical and governance needs—without betting the business on day one.
Happy—and safe—migrating.