Design a YouTube/Netflix-like Platform

Difficulty: medium

Design a video-sharing service, similar to YouTube, where users can upload videos, watch videos, and discover content through search.

Solution

System requirements

Functional:

  1. Authentication: allow users to either create an account directly or sign up via third-party auth providers such as Google, Facebook, etc.
  2. Upload large videos of effectively unlimited length
  3. Store videos, encode them, process them in chunks, store them in different resolutions, and distribute them all over the globe.
  4. Serve videos through a CDN
  5. Ability to search/navigate videos
  6. Ability to stream videos without downloading them in full

Non-Functional:

  1. Support very high scale: about 1.5 billion monthly active users (roughly 1.5 * 10^9 monthly, with tens of millions of users active daily) spread across the globe.
  2. Durability when uploading: we can't lose a video once someone has uploaded it.
  3. Latency is very important when viewing videos, more so than when uploading. We need to make sure we can load the first minute of playtime within 400 ms and then keep loading chunks in the background.

The storage, CDN, and throughput numbers behind these requirements are worked out in the Capacity estimation section below.

Capacity estimation

Estimate the scale of the system you are going to design...

  1. Total storage: let's assume there are about 1B videos on our service. If every video averages about 200 MB, that's 10^9 * 200 MB = 2 * 10^11 MB, i.e. roughly 200 PB of data (a quick sanity-check calculation follows this list).
  2. Hot content: let's assume about 2% of videos are hot (2% of 1B = 20 million). 20 million * 200 MB is about 4 PB of required CDN storage around the globe.
  3. Read throughput: let's assume we get around 1M concurrent views, which works out to roughly 200-400k req/s fetching chunks of the videos at different resolutions.
  4. Our network ingress/egress is very high, possibly in the terabits-per-second range.
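As a sanity check on these numbers, here is a minimal back-of-the-envelope calculation. The per-video size, hot-video percentage, and the assumption that each viewer requests a new chunk every 2.5-5 seconds are carried over from the estimates above, not measured values:

    // Back-of-the-envelope capacity estimates (inputs are the assumptions listed above).
    const TOTAL_VIDEOS = 1e9;          // ~1B videos
    const AVG_VIDEO_MB = 200;          // ~200 MB per video
    const HOT_FRACTION = 0.02;         // ~2% of videos are "hot"
    const CONCURRENT_VIEWS = 1e6;      // ~1M concurrent viewers
    const MB_PER_PB = 1e9;             // 10^9 MB in a petabyte

    const totalStoragePB = (TOTAL_VIDEOS * AVG_VIDEO_MB) / MB_PER_PB;               // ~200 PB
    const cdnStoragePB = (TOTAL_VIDEOS * HOT_FRACTION * AVG_VIDEO_MB) / MB_PER_PB;  // ~4 PB

    // If each viewer requests a new 1-4 MB chunk every ~2.5-5 seconds:
    const chunkReqPerSecLow = CONCURRENT_VIEWS / 5;    // ~200k req/s
    const chunkReqPerSecHigh = CONCURRENT_VIEWS / 2.5; // ~400k req/s

    console.log({ totalStoragePB, cdnStoragePB, chunkReqPerSecLow, chunkReqPerSecHigh });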

API design

  1. Reading data:
  • A client makes a read request, which is directed to a server via a LoadBalancer.
  • The server retrieves the video's metadata from the Metadata database.
  • If the data is not in the cache (CDN), the server fetches it from the S3 chunks.
  2. Writing data:
  • A client sends a write request to a server through a LoadBalancer.
  • The server asks Zookeeper for a write node.
  • Zookeeper provides a node ID for the server to execute a stateful write.
  • The server then performs a stateful write and acknowledges the write back to the client.
  3. Video uploading and processing:
  • A client begins streaming a video to a Stateful Node.
  • The Stateful Node stores the raw video in an S3 store and sends the raw S3 URL to SQS for processing.
  • An Encoding Worker picks the message up from SQS, processes the video (encoding and chunking), and stores the processed chunks into the S3 chunks store.
  • Additionally, the metadata related to the video is updated in the Metadata database.

This architecture depicts a distributed system designed for scalability and efficiency, leveraging services like load balancers, caches, distributed storage (S3), and a coordination service (Zookeeper) for managing distributed nodes. It specifically addresses the needs of video processing and storage, including the initial upload, processing, and retrieval of video data.
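The flows above imply a small public API surface. Here is a minimal sketch of what those endpoints could look like; the paths, field names, and the upload-target approach are illustrative assumptions rather than a confirmed contract:

    // Hypothetical API surface for the video service (names and shapes are illustrative).

    // POST /videos — register an upload and receive a target for streaming the raw bytes.
    interface CreateVideoRequest {
      title: string;
      description?: string;
      tags?: string[];
    }
    interface CreateVideoResponse {
      videoId: string;
      uploadUrl: string;       // stateful node endpoint (or presigned URL) for the raw video bytes
    }

    // GET /videos/{videoId} — fetch playback metadata.
    interface GetVideoResponse {
      videoId: string;
      title: string;
      durationSeconds: number;
      // One entry per encoded resolution; each chunk URL points at the CDN when cached.
      renditions: Array<{
        resolution: '144p' | '360p' | '720p' | '1080p' | '2160p';
        chunkUrls: string[];   // ordered URLs of ~1-4 MB chunks
      }>;
    }

    // GET /search?q=... — keyword search over video metadata.
    interface SearchResponse {
      results: Array<{ videoId: string; title: string; thumbnailUrl: string }>;
      nextPageToken?: string;
    }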

Database design

There are going to be two primary databases + cache/CDN solutions.

First, we want a store for the videos themselves. The key idea is to break each video up into chunks of about 1-4 MB. Storing video as small chunks lets the player fetch it piece by piece and lets the user switch between resolutions based on how quickly they are able to receive data.

We also want to implement a PUSH-based CDN: we preload 'hot' videos into our CDN (in this case we can consider CloudFront) by pushing the S3 chunk links/content ahead of demand, which brings read latency down.

Lastly, we need a database to store the metadata of our videos/movies. We will store that semi-structured metadata in a document store like MongoDB for easy horizontal scalability and schema flexibility, while still getting transactional guarantees where we need them.
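As a concrete illustration, a video's metadata document could look roughly like the shape below; the collection layout and field names are assumptions for illustration, not a fixed schema:

    // Hypothetical shape of a document in a MongoDB `videos` metadata collection.
    interface VideoMetadataDoc {
      _id: string;                 // video ID
      ownerId: string;             // uploading user
      title: string;
      description?: string;
      tags: string[];
      status: 'uploading' | 'processing' | 'published' | 'rejected';
      rawS3Url: string;            // location of the original upload
      renditions: Array<{
        resolution: string;        // e.g. '144p', '1080p'
        chunkKeys: string[];       // ordered S3 keys of the 1-4 MB chunks
        readyAt?: Date;            // lower resolutions become ready first
      }>;
      viewCount: number;           // also feeds the CDN 'hotness' decision
      createdAt: Date;
    }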

High-level design

Initially, a client requests to read data by connecting to a load balancer, which forwards the request to a Server. The Server interacts with the Metadata database to access the necessary metadata. For read operations, the Server first tries to serve the content through the CDN; if the required chunks aren't cached there, it uses the metadata to locate and fetch the S3 chunks directly.

For write operations, a client sends a write request to the LoadBalancer, which is then directed to a server. This server asks Zookeeper for a write node, receiving a node ID in response. With this ID, the server performs a stateful write operation and confirms the action back to the client.

When a client begins sending a video, the data is routed to a Stateful Node. This node is responsible for storing the raw video data in an S3 store and then sends the raw S3 URL for processing to an AWS SQS. An Encoding Worker retrieves this task from SQS, processes the video into S3 chunks, and updates the Metadata database accordingly.

The read flow also updates the 'hotness' of the video, which determines whether the video's chunks should be pushed to the CDN and how long they should live there.

flowchart LR
    B[client] -- Read video/id -->  C[LoadBalancer]
    C[LoadBalancer] -- credentials -->  Q[Auth Service]
    Q[Auth Service] -- signed auth token -->  B[Client]
    C -->  D[Server]
    D[Server] -->  Y[Metadata database]
    D[Server] -- READ | Queries metadata -->  F[CDN]
    D[Server] -- READ | Requests data from S3 -->  M[S3 Chunks]
    D[Server] -- video id not found -->  B[client]

    A[client] -- write video data --> Z[LoadBalancer] --> I[Server] -- request write node --> K[Zookeeper]
    K[Zookeeper] -- provide node ID for writing -->  I[Server]
    I[Server] -- acknowledge write / return write node -->  A[client]

    A[client] <-->  Q[Auth Service]
    A[client] -- stream video content -->  U[Stateful Node]
    U[Stateful Node] -- upload raw data to S3 -->  H[Raw S3 Store]
    U[Stateful Node] -- provide S3 URL for processing -->  T[SQS]
    T[SQS] -- encoding task --> J[Encoding Worker]
    J[Encoding Worker] -->  M[S3 Chunks]
    J[Encoding Worker] -->  Y[Metadata database]

Request flows

In a system with multiple components handling reads, writes, and video processing, the sequence of interactions begins with a client initiating a read request. This request is first received by a LoadBalancer, which forwards it to a Server. The Server then performs several tasks: it accesses the Metadata database to retrieve the necessary metadata, checks the CDN (Content Delivery Network) first for cached content, and falls back to the S3 chunks for data not found in the CDN. Reads are therefore served from either the CDN or directly from the S3 chunks, depending on where the requested data resides.

For write operations, the Client sends a write request to the LoadBalancer, which again forwards this request to a Server. This time, the Server requests a write node from Zookeeper, a service that manages the distribution of write operations across nodes. Zookeeper responds with the node ID that should take the write operation. The Server then communicates back to the Client, indicating it can perform a stateful write.

When the Client begins sending a video, it directly interacts with a Stateful Node. This node is responsible for putting the raw video data into an S3 store designated for raw content. It also sends the raw S3 URL for processing to SQS (Simple Queue Service), which queues the video for processing. An Encoding Worker picks up this task from SQS, processes the video into S3 Chunks, and updates the Metadata Database with the new metadata related to the processed video. This series of interactions ensures that read and write operations, along with video processing, are efficiently managed across different components of the system.
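To make the read path concrete, here is a minimal sketch of the server-side chunk lookup with CDN-first fallback. The dependency names (findVideo, cdnUrlFor, s3UrlFor, bumpHotness) are hypothetical helpers, not calls from a specific library:

    // Assumed dependencies for the read path; names are illustrative only.
    interface ReadDeps {
      findVideo(videoId: string): Promise<{ renditions: { resolution: string; chunkKeys: string[] }[] } | null>;
      cdnUrlFor(chunkKey: string): Promise<string | null>;  // returns null on a CDN miss
      s3UrlFor(chunkKey: string): Promise<string>;          // e.g. a presigned S3 URL
      bumpHotness(videoId: string): Promise<void>;          // feeds the CDN preloading decision
    }

    // CDN-first chunk lookup with S3 fallback, mirroring the read flow above.
    async function getChunkUrl(
      deps: ReadDeps,
      videoId: string,
      resolution: string,
      chunkIndex: number,
    ): Promise<string | null> {
      const meta = await deps.findVideo(videoId);
      if (!meta) return null;                               // "video id not found" branch

      const rendition = meta.renditions.find(r => r.resolution === resolution);
      const chunkKey = rendition?.chunkKeys[chunkIndex];
      if (!chunkKey) return null;

      const cdnUrl = await deps.cdnUrlFor(chunkKey);        // 1) try the CDN cache first
      if (cdnUrl) {
        await deps.bumpHotness(videoId);                    // reads feed back into CDN 'hotness'
        return cdnUrl;
      }
      return deps.s3UrlFor(chunkKey);                       // 2) fall back to the S3 chunk store
    }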

Detailed component design

The first clarification is that we only poll from the SQS queue when the encoding service is ready to take on more load. With that in mind, the encoding service reads the raw video from the S3 bucket and processes it with the following steps:

  1. Decode the video from whatever source format it arrives in (.mov, .mp4, .mkv, etc.) into the standard video format of our service.
  2. Encode the video into multiple resolutions (from 144p up to 4K).
  3. Note that encoding takes longer for higher resolutions, so we'll make sure the video can be published sooner at a lower resolution.
  4. Another important point is that uploading the chunks to S3 should be idempotent: we don't care if a chunk is uploaded more than once as long as it's uploaded at least once. That allows us to use a workflow engine like AWS Step Functions to make the encoding process easier to implement (see the worker sketch after this list).
  5. We also want to ensure that the video being uploaded doesn't contain restricted material; the encoding service handles these checks asynchronously alongside the encoding steps.
  6. Encoding is very computationally heavy and is the main bottleneck of our system.
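Here is a minimal sketch of what the encoding worker loop could look like, assuming the AWS SDK v3 clients for SQS and S3; the queue URL, bucket name, message shape, and the transcode helper are assumptions for illustration. Idempotency comes from deriving the S3 key deterministically from the video ID, resolution, and chunk index, so retries overwrite the same object instead of creating duplicates:

    import { SQSClient, ReceiveMessageCommand, DeleteMessageCommand } from '@aws-sdk/client-sqs';
    import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

    const sqs = new SQSClient({});
    const s3 = new S3Client({});
    const QUEUE_URL = process.env.ENCODING_QUEUE_URL!;   // assumed configuration
    const CHUNKS_BUCKET = process.env.CHUNKS_BUCKET!;    // assumed configuration

    // Hypothetical transcoder: turns a raw S3 URL into ~1-4 MB chunks for one resolution.
    declare function transcode(rawS3Url: string, resolution: string): Promise<Buffer[]>;

    // Idempotent upload: the key is fully determined by (videoId, resolution, index).
    async function uploadChunk(videoId: string, resolution: string, index: number, body: Buffer) {
      await s3.send(new PutObjectCommand({
        Bucket: CHUNKS_BUCKET,
        Key: `${videoId}/${resolution}/${index}.chunk`,
        Body: body,
      }));
    }

    // Worker loop: poll only when there is spare capacity; encode lower resolutions first
    // so the video can be published early at 144p while higher resolutions are still running.
    async function pollOnce() {
      const { Messages } = await sqs.send(new ReceiveMessageCommand({
        QueueUrl: QUEUE_URL,
        MaxNumberOfMessages: 1,
        WaitTimeSeconds: 20,                              // long polling
      }));
      for (const msg of Messages ?? []) {
        const { videoId, rawS3Url } = JSON.parse(msg.Body!);
        for (const resolution of ['144p', '360p', '720p', '1080p', '2160p']) {
          const chunks = await transcode(rawS3Url, resolution);
          await Promise.all(chunks.map((c, i) => uploadChunk(videoId, resolution, i, c)));
          // ...update the Metadata database here to mark this rendition as ready (omitted).
        }
        // Delete only after successful processing; redelivery gives at-least-once semantics.
        await sqs.send(new DeleteMessageCommand({ QueueUrl: QUEUE_URL, ReceiptHandle: msg.ReceiptHandle! }));
      }
    }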

The second thing we want to focus on is our CDN and playback design. To ensure users have a good experience, we'd like to provide these core capabilities in our service:

  1. The chunks are small enough that we can buffer a few of them in the client's memory to allow for smoother video playback.
  2. We can dynamically upgrade/downgrade the resolution on the fly by maintaining a stateful read flow and checking whether the user's connection is receiving chunks fast enough (see the sketch after this list).
  3. The user is able to preload/download the entire video for offline playback at any resolution.
  4. Our CDN stores the S3 chunk URLs/content (not the entire video file) to allow fast, sequential-like reads.
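A minimal client-side sketch of the adaptive-resolution idea in point 2: measure how fast recent chunks arrived and step the resolution up or down accordingly. The rendition ladder, bandwidth thresholds, and 1.5x headroom factor are illustrative assumptions, not tuned values:

    // Simple client-side adaptive-resolution heuristic (illustrative thresholds).
    const LADDER = ['144p', '360p', '720p', '1080p', '2160p'] as const;
    type Resolution = typeof LADDER[number];

    // Rough bandwidth needed to keep up with each rendition, in Mbit/s (assumed values).
    const REQUIRED_MBPS: Record<Resolution, number> = {
      '144p': 0.3, '360p': 1, '720p': 3, '1080p': 6, '2160p': 20,
    };

    function chooseResolution(current: Resolution, recentChunkBytes: number, downloadMs: number): Resolution {
      const measuredMbps = (recentChunkBytes * 8) / 1e6 / (downloadMs / 1000);
      const idx = LADDER.indexOf(current);

      // Downgrade if we can't keep up with the current rendition...
      if (measuredMbps < REQUIRED_MBPS[current] && idx > 0) return LADDER[idx - 1];

      // ...upgrade only when we comfortably exceed the next rendition's needs (1.5x headroom).
      const next = LADDER[idx + 1];
      if (next && measuredMbps > REQUIRED_MBPS[next] * 1.5) return next;

      return current;
    }

    // Example: a 2 MB chunk that took 4 s to download (~4 Mbit/s) while playing 1080p
    // triggers a downgrade to 720p.
    console.log(chooseResolution('1080p', 2_000_000, 4000)); // -> '720p'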

Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...

The key idea is that we trade off write latency and consistency (writes can be relatively slow) for availability (our system relies on reading from the highly available S3 service), read latency (most 'hot' chunks are served from our CloudFront CDN), and durability (we persist the encoding task in SQS to make sure the message is processed at least once).

We do have eventual consistency in the system because we don't want to wait until MongoDB finishes replicating the metadata across all nodes before the video is considered available.

Additionally, we trade off waiting for the high-resolution encode by letting viewers watch the 144p encode first, since it completes faster.

Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.

1. We have a notable single point of failure: our S3 bucket. If S3 is down, we pretty much can't do any reads or writes. The good news is that AWS S3 is highly available, so this is not a major concern, but if it were down it would bring the entire system down.

2. We should also be concerned that if the Auth Service is down we completely lose the ability to write, but we don't have to lose reads as well. We can consider letting users watch videos even when they can't sign in (a small sketch of this degradation follows this list).

3. Another point of failure/bottleneck is the availability and placement of the CDN: we want the CDN, as well as our database nodes, deployed across the globe to minimize latency worldwide.
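A small sketch of the degradation described in point 2: if the auth check fails because the Auth Service is unreachable, allow anonymous read access while still rejecting writes. The AuthServiceDown error type and verifyToken interface are hypothetical:

    // Hypothetical auth dependency; verifyToken throws AuthServiceDown when the service is unreachable.
    class AuthServiceDown extends Error {}
    interface Auth {
      verifyToken(token: string | undefined): Promise<{ userId: string } | null>;
    }

    // Reads degrade to anonymous access when auth is down; writes always require a verified user.
    async function authorize(auth: Auth, token: string | undefined, operation: 'read' | 'write') {
      try {
        const user = await auth.verifyToken(token);
        if (operation === 'write' && !user) throw new Error('writes require sign-in');
        return user;                                    // may be null for anonymous reads
      } catch (err) {
        if (err instanceof AuthServiceDown && operation === 'read') {
          return null;                                  // serve the video without personalization
        }
        throw err;                                      // uploads are rejected while auth is unavailable
      }
    }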

Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?

The system can definitely be improved by adding AWS Step Functions as the workflow engine for processing videos. We would need to ensure that each step (such as uploading a chunk) is idempotent so that it can safely run at least once. That makes this an ideal workflow use case, since we have a sequence of tasks that each need to be performed at least once.

The second area to figure out would be the CDN approach. I'd like to do more research on the video author and their audience to determine whether a new video should be preloaded into the cache to allow for faster reads.

It would be great to dive deeper into the moderation service. The moderation service would take the entire video and use ML/AI to detect whether it infringes on copyrighted material (in either the audio or the video). If the copyrighted material is only in the audio, we can remove the offending audio track from the encoded video, notify the user, and publish the video. If the copyrighted material is in the video itself, we can notify the author about the specific frames we removed and ask them to explain or remove the content. If the video is outright malicious or inappropriate (excessive violence, adult content), it can be rejected outright and the author warned and/or given a strike.

The auth service can definitely be improved to allow sign-up either with email/password or with a token from SSO providers like Google, Facebook, etc. User data can be stored in a MySQL database.
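A minimal sketch of those two sign-up paths, assuming a relational users table behind a UserStore interface; the table shape, the scrypt-based password hashing, and the verifyIdToken helper are assumptions rather than a prescribed design:

    import { randomBytes, scryptSync } from 'node:crypto';

    // Assumed persistence interface over a MySQL `users` table.
    interface UserStore {
      insertUser(row: {
        email: string;
        passwordHash?: string;
        ssoProvider?: string;
        ssoSubject?: string;
      }): Promise<string>;                               // returns the new user ID
    }

    // Path 1: email/password sign-up with salted password hashing.
    async function signUpWithPassword(store: UserStore, email: string, password: string): Promise<string> {
      const salt = randomBytes(16).toString('hex');
      const hash = scryptSync(password, salt, 64).toString('hex');
      return store.insertUser({ email, passwordHash: `${salt}:${hash}` });
    }

    // Path 2: SSO sign-up; verifyIdToken stands in for the provider's token verification
    // (e.g. validating a Google ID token) and is a hypothetical helper, not a real SDK call.
    declare function verifyIdToken(provider: 'google' | 'facebook', idToken: string): Promise<{ sub: string; email: string }>;

    async function signUpWithSso(store: UserStore, provider: 'google' | 'facebook', idToken: string): Promise<string> {
      const claims = await verifyIdToken(provider, idToken);
      return store.insertUser({ email: claims.email, ssoProvider: provider, ssoSubject: claims.sub });
    }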


Score: 9