Summary: How DoorDash Reduced Feature Store Costs by 75% using CockroachDB
Recently I came across this article from DoorDash: Using CockroachDB to Reduce Feature Store Costs by 75%.
The following is a summary of that article.
DoorDash is a food delivery platform that needed a feature store able to keep up with the rapid growth of its machine-learning workloads. The company found that combining different databases could boost efficiency and simplify operations. At first, DoorDash used Redis as its online store for ML features, but as the number of features increased, it became clear that Redis was neither cost-effective nor easy to maintain. DoorDash therefore decided to supplement Redis with another database, CockroachDB. After adding CockroachDB to its online serving platform, DoorDash reduced its cloud spend per value stored by 75% on average, with a minimal increase in latency.
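To make the idea of supplementing one database with another more concrete, here is a minimal sketch of a feature store that routes each feature to one of two backends. This summary does not describe how DoorDash actually splits features between Redis and CockroachDB, so everything here is an assumption for illustration: the class names `InMemoryBackend` and `TieredFeatureStore`, the `fast_features` routing rule, and the key format are all hypothetical, and plain dictionaries stand in for the real databases.

```python
from typing import Dict, Iterable, Optional


class InMemoryBackend:
    """Hypothetical stand-in for a real store such as Redis or CockroachDB."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class TieredFeatureStore:
    """Routes every feature to exactly one backend.

    Assumption: features listed in `fast_features` go to the low-latency
    backend; all other features go to the cheaper backend.
    """

    def __init__(
        self,
        fast: InMemoryBackend,
        cheap: InMemoryBackend,
        fast_features: Iterable[str],
    ) -> None:
        self.fast = fast
        self.cheap = cheap
        self.fast_features = set(fast_features)

    def _backend_for(self, feature: str) -> InMemoryBackend:
        # Pick the backend by feature name only (hypothetical routing rule).
        return self.fast if feature in self.fast_features else self.cheap

    @staticmethod
    def _key(entity_id: str, feature: str) -> str:
        # Hypothetical key layout: "<entity_id>:<feature>".
        return f"{entity_id}:{feature}"

    def put(self, entity_id: str, feature: str, value: bytes) -> None:
        self._backend_for(feature).put(self._key(entity_id, feature), value)

    def get(self, entity_id: str, feature: str) -> Optional[bytes]:
        return self._backend_for(feature).get(self._key(entity_id, feature))
```

With a layout like this, callers read and write through a single interface, and moving a feature from one tier to the other only changes the routing set (plus a data migration, which the sketch leaves out).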
DoorDash ran into maintenance overhead with large-scale Redis clusters (>100 nodes). Upscaling with native AWS ElastiCache consumed extra CPU, which increased latencies and made the time needed to complete a run indeterminate. DoorDash had to create its own approach to scaling Redis with almost no downtime. DoorDash’s process for upscaling large Redis clusters with zero downtime involves spinning up a Redis cluster with the desired number of nodes from the…