Design the Uber Backend System

Difficulty: hard

Design the architecture of an Uber-like ride-sharing platform that acts as an intermediary between riders looking for transportation and drivers with available vehicles.

Solution

System requirements

Functional:

  1. Registration
  2. Ride request
  3. Matching algorithm
  4. Tracking the driver's location
  5. Payment processing
  6. Passengers and drivers have ratings and can leave feedback

Non-Functional:

  1. 10 different towns
  2. 100k active users
  3. 3 rides per week per user
  4. 5k drivers in each town

Capacity estimation

  1. 100k users × 3 rides per week / (7 × 86,400 s) ≈ 0.5 RPS for ride requests
  2. roughly 3,500 drivers online at peak across all towns (out of 5k × 10 = 50k registered drivers)
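
A quick back-of-the-envelope check of the request-rate number (a sketch in Python; the inputs come straight from the requirements above, while the peak-driver figure is a separate assumption rather than something derived here):

    # Back-of-the-envelope capacity check (inputs from the requirements above).
    ACTIVE_USERS = 100_000
    RIDES_PER_USER_PER_WEEK = 3
    SECONDS_PER_WEEK = 7 * 24 * 3600            # 604,800 seconds

    rides_per_week = ACTIVE_USERS * RIDES_PER_USER_PER_WEEK
    avg_ride_request_rps = rides_per_week / SECONDS_PER_WEEK
    print(f"average ride request rate: {avg_ride_request_rps:.2f} RPS")   # ~0.50 RPS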

API design

Registration

Input:

  • phone number

Output:

  • registration token

Phone verification

Input:

  • registration token
  • OTP code

Output:

  • JWT (with user id)
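
A minimal sketch of how the registration and phone verification endpoints could fit together. It assumes PyJWT for token issuance, an in-memory dict in place of a real OTP store, and a hypothetical SMS gateway call; the secret, expiry, and naming are illustrative only.

    import secrets, time
    import jwt  # PyJWT, assumed here for JWT issuance

    JWT_SECRET = "change-me"                     # illustrative secret
    pending = {}                                 # registration token -> (phone, otp, created_at)

    def register(phone: str) -> str:
        """Registration: take a phone number, return a registration token."""
        reg_token = secrets.token_urlsafe(16)
        otp = f"{secrets.randbelow(10**6):06d}"  # 6-digit one-time code
        pending[reg_token] = (phone, otp, time.time())
        # send_sms(phone, otp)                   # hypothetical SMS gateway call
        return reg_token

    def verify_phone(reg_token: str, otp: str) -> str:
        """Phone verification: exchange registration token + OTP for a JWT with the user id."""
        phone, expected, created_at = pending.pop(reg_token)
        if otp != expected or time.time() - created_at > 300:
            raise ValueError("invalid or expired OTP")
        user_id = phone                          # in reality: create/look up the Users row
        return jwt.encode({"sub": user_id, "iat": int(time.time())},
                          JWT_SECRET, algorithm="HS256")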

Ride request

Input:

  • user JWT
  • start coordinates
  • finish coordinates

Output:

  • driver data
      • id
      • name
      • rating
  • ride data
      • total distance
      • price
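
As a sketch, the request and response of this endpoint can be expressed as plain data classes that mirror the input/output lists above; the field names and units are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class RideRequest:
        user_jwt: str
        start_lat: float
        start_lon: float
        finish_lat: float
        finish_lon: float

    @dataclass
    class DriverData:
        id: str
        name: str
        rating: float

    @dataclass
    class RideData:
        total_distance_km: float
        price: float

    @dataclass
    class RideResponse:
        driver: DriverData
        ride: RideData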


Get driver location

Input:

  • driver id

Output:

  • coordinates

Update driver location

Input:

  • driver id
  • coordinates

Output:

  • success

Add payment card

Input:

  • JWT
  • card data
      • number
      • expiration date
      • CVV

Output:

  • payment url (from external payment service)

Leave feedback

Input:

  • JWT (with user id)
  • driver id
  • rating

Output:

  • success
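
A sketch of how this endpoint could be handled: validate the rating range and update the reviewee's aggregate rating incrementally. A plain dict stands in for the database here; in the real system this would write a row to the Feedback table and update the driver's stored rating.

    ratings = {}  # driver_id -> (rating_sum, rating_count); stand-in for the database

    def leave_feedback(reviewer_id: str, driver_id: str, rating: int) -> float:
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        total, count = ratings.get(driver_id, (0, 0))
        ratings[driver_id] = (total + rating, count + 1)
        # also insert a Feedback row: reviewer id, reviewee id, rating, date
        return (total + rating) / (count + 1)    # new average rating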

Database design

Users table

  • id
  • phone
  • registered_at

Driver table

  • id
  • name
  • registered_at

Feedback table

  • reviewer type (user or driver)
  • reviewer id
  • reviewee type (user or driver)
  • reviewee id
  • rating (from 1 to 5)
  • date

Driver location

  • driver id
  • coordinates
  • last update date

Payment data

  • user id
  • card token (for identification in external payment system)
  • masked card number (like ***1234)
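
A small sketch of the stored payment record: only the token issued by the external payment provider and a masked number for display are persisted, never the raw card number or CVV. The names here are illustrative.

    from dataclasses import dataclass

    def mask_card_number(pan: str) -> str:
        return "***" + pan[-4:]                  # e.g. "***1234"

    @dataclass
    class StoredCard:
        user_id: str
        card_token: str                          # identifier issued by the external payment system
        masked_number: str                       # e.g. mask_card_number("4242424242421234")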

High-level design

  1. Load balancer. We can use Nginx because it is a popular, well-proven solution.
  2. API Service. The service that provides the API for users and drivers.
  3. Database. The database stores the main data about users, drivers, feedback, and users' payment cards. A relational database like PostgreSQL can be used here.
  4. Driver location cache. A store for the drivers' current locations. We will use an in-memory database like Redis, because the data is updated very frequently and users need to read these updates quickly.
  5. Graph service. The service that runs the algorithm for matching users and drivers based on their locations.
  6. Graph database. The database stores the current driver locations, so the Graph service can use them to find nearby drivers for a user.
  7. Payment integration service. The service that integrates with the external payment provider. It encapsulates all interaction with the external system and isolates its failures, which increases the availability of the overall system.
  8. External payment service.
  9. Message broker. The broker is used for asynchronous communication between services.

flowchart TD
    B[client] --> LB{Load Balancer}
    LB --> API{API Service}
    API --> D[Database]
    API --> G{Graph Service}
    G --> GD[Graph database]
    API --> MB{Message broker}
    MB --> I{Payment Integration Service}
    I --> P{External Payment Service}
    API --> C[Driver location cache]
    MB --> G

Request flows

Ride request

  1. The request goes through the LB. The LB checks the user's JWT.
  2. The request goes to the API service.
  3. The API Service retrieves all nearby drivers from the Graph service. The Graph service searches for drivers near the user and also calculates the ride price based on the start and finish coordinates.
  4. The API Service filters the nearby drivers using driver ratings from the database and picks a suitable driver based on location and rating (see the selection sketch after this list).
  5. The API Service retrieves the user's payment information and sends the payment data and price to the payment integration service via the message bus. The payment service charges the user for the total amount.
  6. The API Service returns the data to the user.
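
A sketch of the driver-selection step (step 4): rank nearby drivers by a simple score that combines distance from the Graph service with the rating from the database. The weighting and the minimum-rating cut-off are illustrative assumptions, not part of the original design.

    def pick_driver(nearby, ratings, min_rating=4.0):
        """nearby: list of (driver_id, distance_km); ratings: dict of driver_id -> rating."""
        candidates = [
            (driver_id, dist, ratings.get(driver_id, 0.0))
            for driver_id, dist in nearby
            if ratings.get(driver_id, 0.0) >= min_rating
        ]
        if not candidates:
            return None                          # no suitable driver nearby
        # Lower distance is better, higher rating is better.
        return min(candidates, key=lambda c: c[1] - 0.5 * c[2])[0]

    # Example: pick_driver([("d1", 1.2), ("d2", 0.8)], {"d1": 4.9, "d2": 4.2}) returns "d2"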

Update driver location

  1. Every few seconds the driver app sends its current location.
  2. The request goes to the API service.
  3. The API Service updates the current location in the cache.
  4. The API service publishes the location update to the message bus (see the cache-and-publish sketch after this list).
  5. The Graph service receives the location update and applies it to the graph database.
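
A sketch of steps 3 and 4 of this flow: write the driver's position into the Redis cache and publish the same update to the message bus. The key and topic names are assumptions, the raw GEOADD command is sent via execute_command to stay independent of redis-py version differences, and publish stands for whatever broker client is used.

    import json
    import redis

    r = redis.Redis(host="localhost", port=6379)

    def update_driver_location(driver_id: str, lon: float, lat: float, publish) -> None:
        # Step 3: update the in-memory cache (Redis geospatial index).
        r.execute_command("GEOADD", "driver:locations", lon, lat, driver_id)
        # Step 4: publish the update for the Graph service to consume.
        publish("driver-location-updates",
                json.dumps({"driver_id": driver_id, "lon": lon, "lat": lat}))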

User gets driver location

  1. The client sends a request for the driver's current location.
  2. The API service retrieves the driver's location from the cache and returns it to the user (see the read sketch below).
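
A matching read-path sketch: fetch the driver's last known coordinates from the Redis geospatial index populated by the update flow above (same assumed key name).

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def get_driver_location(driver_id: str):
        positions = r.execute_command("GEOPOS", "driver:locations", driver_id)
        if not positions or positions[0] is None:
            return None                          # driver is offline or unknown
        lon, lat = (float(x) for x in positions[0])
        return {"lon": lon, "lat": lat}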

Detailed component design

Graphs

The Graph service and its database must use technologies and algorithms designed for map graphs (e.g. road-network routing). The service must be stateless and horizontally scalable. For database scalability we will use:

  • partitioning (sharding) by town, as sketched below
  • replication within each town, which increases availability and decreases read latency
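
A sketch of the town-based partitioning mentioned above: each ride request is routed to the Graph service / graph database shard for the town where the ride starts. The shard map and the coordinate-to-town lookup are illustrative assumptions.

    SHARDS = {f"town-{i}": f"graph-db-{i}.internal" for i in range(10)}   # one shard per town

    def shard_for_request(start_lat: float, start_lon: float, resolve_town) -> str:
        """resolve_town maps coordinates to one of the 10 town ids (e.g. via a geo lookup)."""
        town_id = resolve_town(start_lat, start_lon)
        return SHARDS[town_id]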

Trade-offs/Tech choices

PostgreSQL for main database

We will use PostgreSQL because it is mature, popular, and has a large community. The load on this database is modest, so a NoSQL option is not required. To increase availability and read throughput, replication will be used.

Kafka for message broker

Kafka is a popular solution for message broker tasks. It is highly available and horizontally scalable.
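
A minimal producer sketch for the location-update topic, assuming the kafka-python client; the broker address, topic name, and payload are illustrative.

    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="kafka:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    producer.send("driver-location-updates", {"driver_id": "d1", "lon": 37.6, "lat": 55.7})
    producer.flush()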

Redis for driver location cache

We need an in-memory database because the location data is written and read very frequently. To increase availability we can use Redis Cluster with replication.

Failure scenarios/bottlenecks

As mentioned above, all the services are stateless, so they can be scaled horizontally and are not a bottleneck.

All the databases are replicated, so they are not a single point of failure either. The external payment service remains a dependency we do not control, which is why it is isolated behind the payment integration service and the message broker.

Future improvements

The graph service may be improved with more advanced routing and matching algorithms. Machine learning could also be used for better matching and dynamic price calculation.


Score: 9