Evolution of the Netflix API Architecture

Sep 14, 2023

This week’s issue brings you the following:

Evolution of the Netflix API Architecture
Why Amazon Abandoned Microservices Architecture in Favor of Monolith?
How Slack Sends a Message to a Million Clients in Real Time

So, let’s dive in.

Evolution of the Netflix API Architecture

Netflix is well known for its highly scalable, loosely coupled microservice architecture. Independent services can scale independently and evolve at different rates, but they add complexity to use cases that span several services. Rather than exposing hundreds of microservices to UI developers, Netflix provides a unified API aggregation layer at the edge.

Over time, the Netflix API architecture went through four main stages:

Monolith: A complete application is packaged as a single deployment unit.
Direct Access: Client apps call the microservices directly, which becomes impractical with many different clients.
Gateway Aggregation Layer: After observing a lot of duplicated data fetching, Netflix built a graph API that provides a unified abstraction over data and relationships.
Federated Gateway: As the number of consumers and the amount of data in the graph grew, the API team became disconnected from the domain expertise. Netflix introduced a federated gateway that provides a unified API for consumers while giving backend developers flexibility and service isolation.

Their GraphQL Gateway is based on Apollo’s reference implementation and is written in Kotlin.
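
To make the aggregation idea concrete, here is a minimal Python sketch of an edge layer that fans out one client request to several backend services and merges the results. It is an illustration only: the function names (fetch_title, fetch_artwork, fetch_rating, get_video_card) and the returned fields are hypothetical, and Netflix's real gateway is GraphQL-based and written in Kotlin, not Python.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for independent backend microservices.
# In a real system each of these would be a network call.
def fetch_title(video_id: str) -> dict:
    return {"id": video_id, "title": f"Title for {video_id}"}


def fetch_artwork(video_id: str) -> dict:
    return {"image_url": f"https://example.invalid/art/{video_id}.jpg"}


def fetch_rating(video_id: str) -> dict:
    return {"maturity_rating": "PG-13"}


def get_video_card(video_id: str) -> dict:
    """Aggregation layer: one client call fans out to several services."""
    services = (fetch_title, fetch_artwork, fetch_rating)
    # Call the services concurrently, then merge their partial answers.
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        parts = list(pool.map(lambda fetch: fetch(video_id), services))
    card: dict = {}
    for part in parts:
        card.update(part)
    return card


if __name__ == "__main__":
    print(get_video_card("v42"))
```

The UI makes a single call to get_video_card instead of three separate calls, which is the kind of duplicated client-side fetching an aggregation layer is meant to remove.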

Evolution of an API Architecture (Credits: Netflix)

Why Amazon Abandoned Microservices Architecture in Favor of Monolith?

In a recent post, a team at Amazon Prime Video explained how they ensure customers receive high-quality content. They use a tool that monitors every stream viewed by customers to identify quality issues.

The tool was originally designed to run at a small scale, and the team found that onboarding more streams to the service was very expensive. So, they decided to revise the architecture.

The initial architecture consisted of serverless components orchestrated by AWS Step Functions. To revise it, they moved the expensive operations that passed data between components into a single process, keeping the data transfer within process memory.

Building the initial solution with serverless components was a good choice because it let the team build quickly and scale each component independently. However, the way some components were used caused issues at 5% of the expected load.

After the analysis, they concluded that the distributed approach didn't bring many benefits, so they packed all the components into a single process. Moving their service to a monolith reduced their infrastructure cost by over 90% and increased scaling capabilities.

The diagram represents the control and data planes of the updated architecture. All the components run within a single ECS task, so control does not go through the network. Data is shared through instance memory, and only the final results are uploaded to an S3 bucket.

The updated architecture for monitoring a system (Source: Amazon Prime Tech)
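
A rough Python sketch of the single-process idea follows. It is not Prime Video's code: the step names (convert_stream, detect_defects), the Frame type, the toy "all-zero frame" rule, and the dict that stands in for an S3 bucket are all assumptions made for illustration. The point is only that intermediate data is passed between steps as in-memory objects, and just the final result leaves the process.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    index: int
    pixels: bytes


def convert_stream(raw_chunks: list[bytes]) -> list[Frame]:
    """Hypothetical conversion step: turn raw chunks into in-memory frames."""
    return [Frame(i, chunk) for i, chunk in enumerate(raw_chunks)]


def detect_defects(frames: list[Frame]) -> list[int]:
    """Hypothetical detector: flag non-empty frames made only of zero bytes."""
    return [f.index for f in frames if f.pixels and not any(f.pixels)]


def upload_results(store: dict, key: str, defects: list[int]) -> None:
    """Stand-in for the final upload; a dict plays the role of the bucket."""
    store[key] = defects


def analyze(stream_id: str, raw_chunks: list[bytes], store: dict) -> list[int]:
    # Frames stay in process memory between steps; no intermediate
    # storage or network hop is involved until the final upload.
    frames = convert_stream(raw_chunks)
    defects = detect_defects(frames)
    upload_results(store, f"results/{stream_id}", defects)
    return defects


if __name__ == "__main__":
    bucket: dict = {}
    print(analyze("s1", [b"\x01\x02", b"\x00\x00", b"\x03"], bucket))
    print(bucket)
```

In the distributed version, each arrow between steps would be a storage write plus a read and an orchestrator state transition; here it is a plain function call.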

How Slack Sends a Message to a Million Clients in Real Time

Slack sends millions of messages every day across many real-time channels. Traffic peaks generally fall during local working hours.

Their architecture consists of the following core services, written in Java:

Channel Servers: Stateful, in-memory servers that hold some channel history. Each one is mapped to a subset of channels and sends and receives messages for those channels.
Gateway Servers: In-memory servers that hold user information. They are the interface between Slack clients and Channel Servers and are deployed in multiple regions.
Admin Servers: In-memory interfaces between the web app backend and Channel Servers.
Presence Servers: In-memory servers that track which users are online and drive the green presence dots.

Every Slack client keeps a persistent WebSocket connection to Slack servers to receive real-time events. On startup, the client fetches its settings from the web app backend, a Hacklang codebase that hosts all of Slack's APIs and also includes the JavaScript code that renders Slack clients. The client then opens a WebSocket connection to the nearest region (Envoy proxy and Gateway Servers).

Once everything is set up, a message sent to a channel is broadcast to all online clients in that channel. Here is what happens:

The client calls the Webapp API to send a message.
Webapp then sends that message to the Admin Server, which looks at the channel ID in the message.
The message is routed to the appropriate Channel Server, which sends it out to every Gateway Server across the world that is subscribed to the channel.
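
The steps above can be sketched in a few lines of Python. This is a toy model, not Slack's implementation: the class and method names, the join helper, and the hash-modulo rule for mapping a channel to a Channel Server are all assumptions (the text only says each Channel Server handles a subset of channels).

```python
import hashlib
from collections import defaultdict


class GatewayServer:
    """Holds client connections for one region (hypothetical model)."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.clients_by_channel: dict[str, set[str]] = defaultdict(set)
        self.delivered: list[tuple[str, str, str]] = []

    def connect(self, client_id: str, channel_id: str) -> None:
        self.clients_by_channel[channel_id].add(client_id)

    def deliver(self, channel_id: str, text: str) -> None:
        # Push the message to every connected client in this channel.
        for client_id in sorted(self.clients_by_channel.get(channel_id, ())):
            self.delivered.append((client_id, channel_id, text))


class ChannelServer:
    """Owns a subset of channels and fans messages out to gateways."""

    def __init__(self) -> None:
        self.subscribers: dict[str, list[GatewayServer]] = defaultdict(list)

    def subscribe(self, channel_id: str, gateway: GatewayServer) -> None:
        if gateway not in self.subscribers[channel_id]:
            self.subscribers[channel_id].append(gateway)

    def publish(self, channel_id: str, text: str) -> None:
        for gateway in self.subscribers.get(channel_id, []):
            gateway.deliver(channel_id, text)


class AdminServer:
    """Routes a message to the Channel Server that owns its channel."""

    def __init__(self, channel_servers: list[ChannelServer]) -> None:
        self.channel_servers = channel_servers

    def channel_server_for(self, channel_id: str) -> ChannelServer:
        # Assumed mapping: stable hash of the channel ID modulo server count.
        digest = hashlib.sha256(channel_id.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.channel_servers)
        return self.channel_servers[index]

    def send(self, channel_id: str, text: str) -> None:
        self.channel_server_for(channel_id).publish(channel_id, text)


def join(admin: AdminServer, gateway: GatewayServer,
         client_id: str, channel_id: str) -> None:
    """A client connects via a gateway, which subscribes to the channel."""
    gateway.connect(client_id, channel_id)
    admin.channel_server_for(channel_id).subscribe(channel_id, gateway)


if __name__ == "__main__":
    admin = AdminServer([ChannelServer(), ChannelServer()])
    us, eu = GatewayServer("us"), GatewayServer("eu")
    join(admin, us, "alice", "C1")
    join(admin, eu, "bob", "C1")
    join(admin, eu, "carol", "C2")
    admin.send("C1", "hello")  # what Webapp would do after the API call
    print(us.delivered, eu.delivered)
```

Note that the Channel Server sends one copy per subscribed Gateway Server, not one per client; each gateway then handles delivery to its own connected clients.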

To read more, see the post on the Slack Engineering blog.

Bonus: Linux Commands Cheat Sheet

An excellent overview of common Linux commands, such as:

Basic File Operations: ls, cp, mv, rm, ...
File Viewing: cat, less, head, tail, nl, ...
Dates and Times: xclock, cal, date, ...
Network: traceroute, ifconfig, netstat, who, ...
Viewing Processes: ps, uptime, w, top, ...

Check the high-resolution image here.

Tech World With Milan Newsletter