Database and cache design for a user system. This blog is based on our experience and materials collected online. Written by Les_baguettes and Charle.
Design User System
- Authentication service
- Friendship service
Scenarios
- Sign up, login, request user profile, update profile. Requesting the user profile is the most frequent operation, and most other operations also need to read the profile.
- Support 100M DAU
- Sign up, login, update profile are less frequent:
100M * 0.1 (once every 10 days) / 86400 (seconds in a day) ≈ 115 QPS
- Request user profile:
100M * 100 (100 times a day) / 86400 (seconds in a day) ≈ 115,740 QPS
Peak ≈ 100k * 3 = 300k QPS
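The back-of-the-envelope numbers above can be reproduced in a few lines (the 3x peak multiplier is the usual rule of thumb from the text, not a measured value):

```python
# Back-of-the-envelope QPS estimation using the numbers from the text.
DAU = 100_000_000
SECONDS_PER_DAY = 86_400

# Sign up / login / update profile: roughly once every 10 days per user.
write_qps = DAU * 0.1 / SECONDS_PER_DAY     # ≈ 115

# Request user profile: roughly 100 times a day per user.
read_qps = DAU * 100 / SECONDS_PER_DAY      # ≈ 115,740

# Rule of thumb: peak traffic ≈ 3x average (the text rounds ~115k to 100k).
peak_read_qps = 100_000 * 3                 # 300,000

print(int(write_qps), int(read_qps), peak_read_qps)
```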
Service
- Authentication service: Sign up, login
- User service: User profile CRUD
- Friendship service
Storage
- MySQL/PostgreSQL: ~1k QPS
- MongoDB/Cassandra NoSQL: 10k QPS
- Redis/Memcached: 100k ~ 1M QPS. These are rough estimates; actual numbers vary with the data model, schema design, hardware, etc.
So what do we need for these two groups of services?
- Sign up, login, update profile: relational database (MySQL/PostgreSQL) is enough.
- Request user profile: this service has far more reads than writes. At ~300k QPS, we can use a cache (Redis/Memcached).
Cache
Cache is a concept, not a specific technology. It can be an in-memory cache (Redis/Memcached) or a distributed cache (e.g. a CDN). It can also be a file system cache; even the CPU has caches. Caching is not only server-side: the client caches too (browser cache, local storage).
Cache Aside pattern
The cache and the database are two different systems, coordinated by the web server. Read path: first check the cache; on a miss, read from the database and set the value into the cache.
```python
cache.set("key", "value")
cache.get("key")
cache.set("foo", "bar", time=60)  # expire in 60 seconds
```
```python
class UserService:
    def get_user(self, user_id):
        """Cache-aside read path (memoization)."""
        key = f"user:{user_id}"
        user = cache.get(key)
        if user:                        # cache hit
            return user
        user = database.get(user_id)    # cache miss: read from DB
        cache.set(key, user)            # populate the cache
        return user

    # Four candidate write paths, compared below:
    def set_userA(self, user_id, user):
        """Invalidate cache, then write DB."""
        key = f"user:{user_id}"
        cache.delete(key)
        database.set(user_id, user)

    def set_userB(self, user_id, user):
        """Write DB, then set cache."""
        key = f"user:{user_id}"
        database.set(user_id, user)
        cache.set(key, user)

    def set_userC(self, user_id, user):
        """Set cache, then write DB."""
        key = f"user:{user_id}"
        cache.set(key, user)
        database.set(user_id, user)

    def set_userD(self, user_id, user):
        """Write DB, then invalidate cache (recommended, see below)."""
        key = f"user:{user_id}"
        database.set(user_id, user)
        cache.delete(key)
```
Problem:
Concurrent update and read can lead to inconsistency between cache and database.
Scenario 1
```mermaid
sequenceDiagram
autonumber
participant R as Thread 1 (read get_user)
participant C as Cache (Redis/Memcached)
participant D as Database
participant W as Thread 2 (write set_user)
R->>C: GET user:{user_id}
C-->>R: miss
R->>D: SELECT user WHERE id=user_id
D-->>R: user=v1 (old)
Note over R,W: Context switch / interleaving to another thread
W->>C: DEL user:{user_id}
W->>D: UPDATE user(id)=v2 (new)
D-->>W: ok
Note over R,W: Context switch / interleaving to original thread
R->>C: SET user:{user_id}=v1 (old)
R-->>R: return v1
```
Failure mode (stale cache written after a concurrent update):
- `get_user` misses the cache and fetches `v1` from the database.
- Before `get_user` can populate the cache, a concurrent `set_user` deletes the cache key and writes `v2` to the database.
- `get_user` resumes and writes its previously fetched `v1` back into the cache.
- Now the database has `v2` but the cache has `v1`, so most subsequent reads hit the cache and see stale data until the key is invalidated again or expires (TTL).
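The interleaving above can be replayed deterministically without real threads — a minimal sketch where plain dicts stand in for the cache and the database (`cache`, `db`, `v1`, `v2` are illustrative names, not real APIs):

```python
# Deterministic replay of Scenario 1: the reader's stale SET lands after
# the writer's DEL + UPDATE, leaving cache=v1 while the DB holds v2.
cache, db = {}, {"user:1": "v1"}
key = "user:1"

# Reader (get_user): cache miss, fetch the old value from the database.
assert cache.get(key) is None    # miss
value_read = db[key]             # "v1" (old)

# Context switch: writer (set_user) runs delete-then-update.
cache.pop(key, None)             # DEL user:1
db[key] = "v2"                   # UPDATE to v2

# Reader resumes and writes its previously fetched value back.
cache[key] = value_read          # SET user:1 = v1 (stale!)

print(cache[key], db[key])       # v1 v2 -> cache and DB now disagree
```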
Scenario 2
```mermaid
sequenceDiagram
autonumber
participant W as Thread 1 (write set_user)
participant C as Cache (Redis/Memcached)
participant D as Database
participant R as Thread 2 (read get_user)
W->>C: DEL user:{user_id}
Note over W,R: Context switch / interleaving to another thread
R->>C: GET user:{user_id}
C-->>R: miss
R->>D: SELECT user WHERE id=user_id
D-->>R: user=v1 (old)
W->>D: UPDATE user(id)=v2 (new)
D-->>W: ok
Note over W,R: Context switch / interleaving to original thread
R->>C: SET user:{user_id}=v1 (old)
R-->>R: return v1
```
Failure mode (read repopulates cache with stale data during a write):
- `set_user` deletes the cache key first (intending to force the next read to go to the DB).
- A concurrent `get_user` lands after the delete but before the DB update; it misses the cache and reads `v1` from the database.
- The write completes and commits `v2` to the database.
- The read (still holding `v1`) writes `v1` back into the cache, recreating a stale cache entry even though the write succeeded.

These steps are not atomic, and failures can occur in between, so we may read an old user after setting a new one.
This causes inconsistency (dirty data) between the cache and the database. Of the two operations in a write path, we are not worried about the first one failing: it raises an exception and leaves no dirty data. The concern is the second operation failing, which leaves the cache and the database inconsistent.
Consistency Solutions:
Can we add a lock to make these four steps atomic? The answer is no.
- The database and the cache are two different systems; we cannot run a distributed transaction across them.
- A third-party distributed lock service like ZooKeeper slows performance and can itself fail.
- The processes do not share memory, so an in-process lock will not work.
Best practice:
- database.set(key, user)
- cache.delete(key)
Why set before delete
When using cache-aside, a write needs to update the database and invalidate the cache. Two common orders:
- Delete then set (invalidate → write): `cache.delete(key)` → `database.set(key, user)`
- Set then delete (write → invalidate, recommended): `database.set(key, user)` → `cache.delete(key)`
Assume:
- `t_cache` = cache op latency (GET/SET/DEL), usually sub-millisecond
- `t_db_r` = database read latency
- `t_db_w` = database write latency, typically `t_db_w >> t_cache`
- Reads are much more frequent than writes (`λ_read >> λ_write`)
Why “set then delete” is more optimal
- With delete → set, you create a cache-miss window of roughly `t_db_w` (after the delete, before the DB write commits). During this window, a read is forced to the DB and can still see the old value `v1`, then repopulate the cache with `v1`. This can "poison" the cache with stale data after the write commits.
- With set → delete, the database already holds `v2` before invalidation, so a cache miss will read `v2` and stale-cache poisoning is avoided under normal conditions. The remaining inconsistency is a short stale-read window where some requests can still hit old cache data until the delete happens (≈ `t_cache`), or the rare case where the delete fails (then stale data lasts until TTL).
You can estimate "how often it happens" per write by comparing the windows:
- delete → set: expected affected reads ≈ `λ_read * t_db_w`
- set → delete: expected affected reads ≈ `λ_read * t_cache`

Since `t_db_w >> t_cache`, the first order exposes many more reads to the race, and it can create longer-lived stale cache entries.
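Plugging in illustrative numbers (the latencies here are assumptions for the sake of the comparison, not measurements: `t_db_w` = 10 ms, `t_cache` = 0.5 ms, and the ~300k peak read QPS from earlier):

```python
# Expected number of reads exposed to the race window, per write.
lam_read = 300_000      # reads/sec (peak estimate from earlier)
t_db_w = 0.010          # DB write latency in seconds (assumed 10 ms)
t_cache = 0.0005        # cache op latency in seconds (assumed 0.5 ms)

affected_delete_then_set = lam_read * t_db_w    # 3000 reads per write
affected_set_then_delete = lam_read * t_cache   # 150 reads per write

# Under these assumptions, delete-then-set exposes 20x more reads.
print(affected_delete_then_set / affected_set_then_delete)
```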
Potential inconsistency flows
A) Delete then set (invalidate → write): stale cache poisoning
```mermaid
sequenceDiagram
autonumber
participant W as Writer (set_user)
participant C as Cache
participant D as Database
participant R as Reader (get_user)
W->>C: DEL user:{id}
Note over W,R: During t_db_w (DB write not committed yet), in another thread
R->>C: GET user:{id}
C-->>R: miss
R->>D: SELECT user:{id}
D-->>R: v1 (old)
Note over W,R: Original thread successfully updates the DB
W->>D: UPDATE user:{id} = v2 (new)
D-->>W: ok
Note over W,R: The other thread still holds the old user data and pollutes the cache
R->>C: SET user:{id} = v1 (old)
Note over C,D: Cache now has v1 while DB has v2
```
In this order, cache inconsistency is common because the vulnerable window is about t_db_w (relatively large), and reads are frequent.
B) Set then delete (write → invalidate): rare inconsistency
```mermaid
sequenceDiagram
autonumber
participant B as Reader B (get_user)
participant C as Cache
participant D as Database
participant A as Writer A (set_user)
B->>C: GET user:{id}
C-->>B: miss
B->>D: SELECT user:{id}
D-->>B: v1 (old)
Note over A,B: Context switch / interleaving
A->>D: UPDATE user:{id} = v2 (new)
D-->>A: ok
A->>C: DEL user:{id}
Note over A,B: Reader B resumes with v1 in memory
B->>C: SET user:{id} = v1 (old)
Note over C,D: Cache now has v1 while DB has v2
```
Failure mode (reader overwrites cache with old data after invalidation):
- Reader B misses the cache and reads `v1` from the database.
- Writer A updates the database to `v2`.
- Writer A deletes the cache key.
- Reader B (still holding `v1`) writes `v1` into the cache, making the cache stale again.
For this to happen, Reader B’s end-to-end execution time must fully overlap and outlast Writer A’s work (DB write + cache delete). Under typical latency assumptions (t_db_w and t_cache are small compared to long-tail delays), this is usually rarer than scenario A, but when it does occur it is worse, because it recreates a stale cache entry (not just a transient stale read).
We need to set a TTL on cache entries to bound staleness. For example, with TTL = 7 days, a stale entry expires after at most 7 days and the next read fetches the latest data from the database, so any inconsistency lasts at most 7 days. This is called eventual consistency.
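How a TTL bounds staleness can be shown with a minimal in-process cache (a sketch; a real deployment would rely on Redis `EXPIRE`/`SETEX` rather than this hand-rolled class):

```python
import time

# Minimal in-process cache with TTL: an expired entry behaves like a miss,
# so a stale value can survive at most `ttl` seconds.
class TTLCache:
    def __init__(self):
        self._store = {}                       # key -> (value, expire_at)

    def set(self, key, value, ttl):
        self._store[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expire_at = entry
        if time.time() >= expire_at:           # expired: treat as a miss
            del self._store[key]
            return None
        return value

cache = TTLCache()
cache.set("user:1", "v1", ttl=0.05)            # very short TTL for the demo
assert cache.get("user:1") == "v1"             # still fresh
time.sleep(0.06)
assert cache.get("user:1") is None             # expired -> forces a DB read
```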
Strong consistency vs eventual consistency?
A cache can optimize read performance, but it cannot improve write performance, because every write must still go to the database.
Cache through pattern
All read and write operations go through the cache; the cache is the only entry point to the database. Redis is often the choice for this pattern. Cache and database stay in sync at all times.
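The idea can be sketched with a tiny write-through class (illustrative names; dicts stand in for Redis and the database):

```python
# Write-through cache: callers only talk to the cache layer; writes update
# the database synchronously, so cache and DB never disagree.
class WriteThroughCache:
    def __init__(self, db):
        self._cache = {}
        self._db = db

    def get(self, key):
        if key not in self._cache:             # miss -> load from the DB
            self._cache[key] = self._db.get(key)
        return self._cache[key]

    def set(self, key, value):
        self._db[key] = value                  # write the DB first
        self._cache[key] = value               # then update the cache in sync

db = {"user:1": "v1"}
store = WriteThroughCache(db)
store.set("user:1", "v2")
print(store.get("user:1"), db["user:1"])       # v2 v2 -> always in sync
```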
Write throughput bottleneck
Since every write still hits the database, we need to scale out the database (more database instances, e.g. via sharding).
Authentication service
Session
After login, the server creates a session object for the user and sends the session key back to the browser as a cookie. The browser sends the cookie back with each request; the server reads the session key from the cookie, looks up the user_id in the session table, and checks that the session exists and has not expired.
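The per-request check described above can be sketched as follows (illustrative helper names, with a dict standing in for the session table; not a real framework API):

```python
import time

# Session lookup on each request: cookie -> session_key -> user_id,
# rejecting missing, unknown, or expired sessions.
sessions = {  # stand-in for the session table
    "abc123": {"user_id": 42, "expire_at": time.time() + 3600},
}

def authenticate(cookies):
    session_key = cookies.get("session_key")
    if session_key is None:
        return None                            # no cookie -> not logged in
    session = sessions.get(session_key)
    if session is None or session["expire_at"] < time.time():
        return None                            # unknown or expired session
    return session["user_id"]

print(authenticate({"session_key": "abc123"}))   # 42
print(authenticate({"session_key": "bogus"}))    # None
```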
Session table
The server will store the session info in session table.
| Field | Type | Description |
|---|---|---|
| session_key (Token) | string | A hash value, globally unique, unpredictable (UUID, etc) |
| user_id | Foreign Key | References the User table |
| expire_at | timestamp | Expiration time |
| device_token | string | optional, used to identify the device |
- The server normally does not delete session keys; it just checks whether a key is valid or expired.
- One-device login vs multi-device login
  - One-device login: when the user logs in from a new device, invalidate the old session key. The server can also notify the old device, as WhatsApp or WeChat do.
  - Multi-device login: allow the user to log in from multiple devices; each device gets its own session key.
- What data system to use for session table?
- For a small-scale system, we can keep the session table in the cache; even if it disappears, the user can simply log in again and receive a new session key.
- For a large-scale system, keep the session table in a relational database like MySQL/PostgreSQL and add a cache in front.
- We can add an index on user_id to quickly query all sessions for a user.
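Issuing a session row matching the table above can be sketched like this (illustrative function and field names; the key is generated with a CSPRNG so it is globally unique and unpredictable, as the table requires):

```python
import secrets
import time

SESSION_TTL = 7 * 24 * 3600                    # e.g. 7 days

def create_session(sessions, user_id, device_token=None):
    """Insert a session row after a successful login (sketch)."""
    session_key = secrets.token_urlsafe(32)    # unpredictable random token
    sessions[session_key] = {
        "user_id": user_id,
        "expire_at": time.time() + SESSION_TTL,
        "device_token": device_token,          # optional device identifier
    }
    return session_key

sessions = {}
key = create_session(sessions, user_id=42, device_token="iphone-1")
print(sessions[key]["user_id"])                # 42
```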
Cookie
The session_key is stored in a cookie on the browser side. A cookie is a small piece of data stored on the client (browser) and sent to the server with every request; it is used to maintain session state and store user preferences. You can think of it as a hash table stored on the client. Because cookies are sent with every request, the larger the cookie, the more bandwidth it consumes and the slower each request becomes.
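Handing the session key to the browser amounts to emitting a `Set-Cookie` header, which can be sketched with Python's stdlib `http.cookies` (a real app would set this via its web framework; the key value here is a placeholder):

```python
from http.cookies import SimpleCookie

# Build the Set-Cookie header that carries the session key to the browser.
cookie = SimpleCookie()
cookie["session_key"] = "abc123"               # placeholder session key
cookie["session_key"]["httponly"] = True       # not readable from JS
cookie["session_key"]["max-age"] = 7 * 24 * 3600

header = cookie.output()                       # "Set-Cookie: ..." line
print(header)
```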