In the previous article
, I showed you how to design a URL shortener service.
I presented the basics of capacity planning, requirements gathering, DB sharding, etc.
Now, I would like to cover another interesting system design interview question: how would you design Twitter?
Requirements and capacity planning
We want a system that allows a user to share short text messages (posts).
Those messages will be displayed in 2 places:
- on the user board
- on the home board
User X's board displays the posts that user X published.
Every logged-in user Y also has their own private home board.
The home board displays posts from the users that user Y subscribes to.
You can ask many questions here to clarify the requirements, e.g.:
- How do users subscribe to each other? Is it mutual (bidirectional, like Facebook friends) or not (like Twitter follows)?
- Do we need to support videos and images in posts?
This time, we will use simple text messages and a Twitter-like (unidirectional) follow mechanism, and focus on building the news feed mechanism (user and home boards), as this is the most interesting part. Note that a similar mechanism could be used to build a Facebook wall.
Images and videos are not a big problem anyway. We could just store links to video/image files in the post and upload the binary content to a separate CDN-based service, e.g. one backed by S3.
So as not to bore you with capacity math, let's just say that we want a highly available, global-scale service (check the previous article
to see how to do that if you want).
Naive solution - technically these are just 2 SQL tables
In the SQL world, you would have 2 tables:
posts (storing messages) and
subscriptions (storing who subscribes to whom).
So the user board is just:
select * from posts
where author = :user_id
and the home board is:
select * from posts
join subscriptions on subscriptions.subscribed_user = posts.author
where subscriptions.subscriber = :user_id
(of course, both with some ordering and pagination).
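To make the naive approach concrete, here is a minimal, runnable sketch of both queries against an in-memory SQLite database. The schema, column names, and sample data are illustrative assumptions, not the real Twitter schema:

```python
import sqlite3

# Minimal in-memory sketch of the two tables (illustrative schema).
db = sqlite3.connect(":memory:")
db.executescript("""
    create table posts (id integer primary key, author integer,
                        body text, created_at integer);
    create table subscriptions (subscriber integer, subscribed_user integer);
""")
db.executemany("insert into posts (author, body, created_at) values (?, ?, ?)",
               [(1, "hello from user 1", 100),
                (2, "hello from user 2", 200)])
# User 3 follows users 1 and 2.
db.executemany("insert into subscriptions values (?, ?)", [(3, 1), (3, 2)])

def user_board(user_id):
    # Posts that user X published, newest first.
    return db.execute(
        "select body from posts where author = ? order by created_at desc",
        (user_id,)).fetchall()

def home_board(user_id):
    # Posts of everyone user Y subscribes to - note the join.
    return db.execute(
        "select body from posts"
        " join subscriptions on subscriptions.subscribed_user = posts.author"
        " where subscriptions.subscriber = ?"
        " order by posts.created_at desc",
        (user_id,)).fetchall()

print(user_board(1))
print(home_board(3))
```

The join in `home_board` is exactly the operation that becomes painful at global scale.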
As you might notice, this approach would not scale well and can become very slow in a global-scale scenario (mostly because of the
join operation - check out why
joins are "bad" in the NoSQL mindset article).
To support a global-scale, high-availability use case, everything runs on multiple hosts (including the database).
If you don't know how you could shard a DB or how global-scale DB systems work, go back to the URL shortener article
and the DynamoDB design article.
The trick in this design is to store both boards precomputed in some distributed in-memory store (Twitter uses Redis for that).
You can separate reads from writes using the CQRS pattern, so we will have 2 sub-services here: the Feeds reader and the Feeds publisher.
When user X wants to publish a new post, the request will be forwarded to the Feeds publisher, which will:
- Store that post in Posts DB.
- Update user X board.
- Fetch all subscribers of user X from Subscribers DB.
- Update each subscriber's home board with this post in Feeds memDB.
- Publish a "post published" event to the new feeds queue.
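The publisher steps above can be sketched in a few lines of Python. The dicts and deques below are stand-ins for the real Posts DB, Subscribers DB, Feeds memDB, and queue; all names are illustrative assumptions, not Twitter's actual API:

```python
from collections import defaultdict, deque

# Stand-ins for the real stores (illustrative only):
posts_db = {}                       # persistent Posts DB: post_id -> post
subscribers_db = defaultdict(set)   # Subscribers DB: author -> followers
feeds_memdb = defaultdict(lambda: deque(maxlen=1000))  # capped boards
new_feeds_queue = deque()           # decoupling queue for other services

def publish(author, post_id, body):
    # 1. Store the post durably in Posts DB.
    posts_db[post_id] = {"author": author, "body": body}
    # 2. Update the author's own user board.
    feeds_memdb[("user", author)].appendleft(post_id)
    # 3. Fan out: update every subscriber's precomputed home board.
    for follower in subscribers_db[author]:
        feeds_memdb[("home", follower)].appendleft(post_id)
    # 4. Publish an event for downstream services (trends, search, ...).
    new_feeds_queue.append(("post_published", post_id))

def read_home_board(user_id):
    # The read path is just a memory lookup over precomputed post ids.
    return [posts_db[pid] for pid in feeds_memdb[("home", user_id)]]
```

Note how all the work happens at write time; `read_home_board` does no joins at all, which is the whole point of the precomputation.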
When user X wants to read the home board or another user's board, the request will be forwarded to the Feeds reader, which will just read it from Feeds memDB - this is very fast!
We can accept some delay on write but want to be as fast as possible on read.
Note that the new feeds queue is not needed right now, but it is a good way to separate the core of the system (the news feed) from other services like trends, new content discovery, search, etc. Alternatively, other services could just query the Feeds reader. It depends on the use case, of course.
Also, all the data is stored in a persistent DB (Posts DB), so in case of disaster, you should have some tool that can republish all posts to Feeds memDB.
You should also consider high availability here.
Data stored in memory is volatile.
To mitigate that, you should replicate each piece of data on a few nodes, e.g. 3.
All of that (storing a few copies of precomputed data in memory) is expensive.
That’s the cost of being fast.
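One way to assign those replicas is rendezvous (highest-random-weight) hashing, sketched below under the assumption that the memDB is a flat cluster of named nodes. This is only an illustration; a real deployment (e.g. Redis) would typically use primary/replica groups instead:

```python
import hashlib

# Hypothetical memDB cluster node names (illustrative).
NODES = ["mem-node-a", "mem-node-b", "mem-node-c", "mem-node-d", "mem-node-e"]

def replica_nodes(key, n=3):
    # Rank every node by a hash of (node, key) and keep the top n.
    # Rendezvous hashing keeps assignments stable as nodes come and go.
    ranked = sorted(
        NODES,
        key=lambda node: hashlib.sha256(f"{node}:{key}".encode()).hexdigest(),
        reverse=True,
    )
    return ranked[:n]
```

Each board key deterministically maps to the same 3 nodes, so a reader knows where to look and a single node loss leaves 2 live copies.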
You should put a lot of thought into what data you really need to store in memDB!
Twitter stores a list of fewer than 1k entries for every precomputed board, and each entry is just 20 bytes long (
tweet_id - 8 bytes,
user_id - 8 bytes,
internal_bit_field - 4 bytes).
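That 20-byte entry layout can be reproduced with Python's struct module. The field sizes come from the article; the field order and endianness here are my assumptions:

```python
import struct

# 8-byte tweet_id + 8-byte user_id + 4-byte bit field = 20 bytes per entry.
ENTRY = struct.Struct("<QQI")  # little-endian, no padding (assumed layout)

def pack_entry(tweet_id, user_id, flags=0):
    return ENTRY.pack(tweet_id, user_id, flags)

def unpack_entry(raw):
    return ENTRY.unpack(raw)

entry = pack_entry(tweet_id=123456789, user_id=42)
print(len(entry))  # 20
```

At fewer than 1k entries per board, that is under 20 KB of precomputed feed data per user - which is what makes keeping every board in memory affordable at all.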
A few problems with this architecture are:
- A celebrity with 10 million subscribers posting something generates an explosion of messages. It is slow to precompute millions of boards for just 1 post. That kind of post also causes a massive reaction (an avalanche effect).
- Time-bounded events like New Year's Eve cause a massive explosion of messages in a short period of time, coming from a significant number of users.
- Can memDB run out of memory? What will happen then?
We just learned how to design a news feed system using a precomputation approach.
It is not just about Twitter.
The same approach is used in Facebook and Instagram walls (where the photos are the news) or even in dating apps like Tinder.
As a bonus, I leave a link to Raffi Krikorian's (Twitter VP) talk on how Twitter really works.
Article by Tomasz Kuczma
A software engineer with a passion. Interested in computer networks and large-scale distributed computing.
He loves to optimize and simplify software at various levels of abstraction, from memory ordering through non-blocking algorithms up to system design and end-user experience. Geek. Linux user.
The views I express are mine alone and do not necessarily express the views of my employer or ex-employers.
They are not investment advice and are not based on any non-public information of any kind.