Difficulty: hard
Solution
System requirements
Functional:
- Registration
- Ride request
- Matching algorithm
- Tracking the driver location
- Payment processing
- Passengers and drivers have ratings and can leave feedback
Non-Functional:
- 10 different towns
- 100k active users
- 3 rides per week per user
- 5k drivers in each town
Capacity estimation
- 100k * 3 / (7 * 86,400 s) ≈ 0.5 RPS for ride requests on average
- 3,500 active drivers at peak
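The arithmetic above can be checked with a short sketch. The ride-request figures come from the requirements; the 5-second location update interval is an assumption made here for illustration (the design only says "every few seconds"), and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def ride_requests_per_second(active_users: int, rides_per_user_per_week: float) -> float:
    """Average ride-request rate, spread evenly over a week."""
    rides_per_week = active_users * rides_per_user_per_week
    return rides_per_week / (7 * SECONDS_PER_DAY)


def location_updates_per_second(active_drivers: int, interval_seconds: float) -> float:
    """Write rate produced by drivers reporting their position periodically."""
    return active_drivers / interval_seconds


ride_rps = ride_requests_per_second(active_users=100_000, rides_per_user_per_week=3)
# ASSUMPTION: one location update every 5 seconds per active driver.
location_rps = location_updates_per_second(active_drivers=3_500, interval_seconds=5)

print(f"ride requests: {ride_rps:.2f} RPS")
print(f"location updates: {location_rps:.0f} RPS")
```

Under that assumed interval, location updates outnumber ride requests by roughly three orders of magnitude, which is consistent with keeping driver locations in an in-memory store.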
API design
Registration
Input:
- phone number
Output:
- registration token
Phone verification
Input:
- registration token
- OTP code
Output:
- JWT (with user id)
Ride request
Input:
- user JWT
- start coordinates
- finish coordinates
Output:
- driver data
  - id
  - name
  - rating
- ride data
  - total distance
  - price
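As an illustration, the ride request input and output could be modeled with typed records like the following. This is only a sketch: the class and field names, the kilometre unit, and the use of integer minor currency units for the price are assumptions, not part of the original API description.

```python
from dataclasses import asdict, dataclass
from typing import Tuple

# (latitude, longitude) in decimal degrees; the representation is an assumption.
Coordinates = Tuple[float, float]


@dataclass(frozen=True)
class RideRequest:
    user_jwt: str
    start: Coordinates
    finish: Coordinates


@dataclass(frozen=True)
class DriverData:
    id: int
    name: str
    rating: float


@dataclass(frozen=True)
class RideData:
    total_distance_km: float  # ASSUMPTION: distance reported in kilometres
    price_minor_units: int    # ASSUMPTION: price in minor currency units (e.g. cents)


@dataclass(frozen=True)
class RideResponse:
    driver: DriverData
    ride: RideData


# Example values are made up for illustration.
response = RideResponse(
    driver=DriverData(id=42, name="Alex", rating=4.8),
    ride=RideData(total_distance_km=7.5, price_minor_units=1250),
)
payload = asdict(response)  # nested dict, ready to be serialized as JSON
```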
Get driver location
Input:
- driver id
Output:
- coordinates
Update driver location
Input:
- driver id
- coordinates
Output:
- success
Add payment card
Input:
- JWT
- card data
  - number
  - expiry date
  - CVV
Output:
- payment URL (from external payment service)
Leave feedback
Input:
- JWT (with user id)
- driver id
- rating
Output:
- success
Database design
Users table
- id
- phone
- registered_at
Driver table
- id
- name
- registered_at
Feedback table
- reviewer type (user or driver)
- reviewer id
- reviewee type (user or driver)
- rating (from 1 to 5)
- date
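A minimal sketch of the feedback table follows, using Python's built-in sqlite3 module as a stand-in for the relational database. The column names and types are assumptions, and the `reviewee_id` column is an addition not listed above: it is included here only because an average rating per driver or user cannot be computed without knowing who was rated.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the main database
conn.execute(
    """
    CREATE TABLE feedback (
        reviewer_type TEXT    NOT NULL CHECK (reviewer_type IN ('user', 'driver')),
        reviewer_id   INTEGER NOT NULL,
        reviewee_type TEXT    NOT NULL CHECK (reviewee_type IN ('user', 'driver')),
        reviewee_id   INTEGER NOT NULL,  -- ASSUMPTION: not in the original table
        rating        INTEGER NOT NULL CHECK (rating BETWEEN 1 AND 5),
        created_at    TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def add_feedback(db, reviewer_type, reviewer_id, reviewee_type, reviewee_id, rating):
    """Insert one feedback row; the CHECK constraints reject invalid values."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO feedback (reviewer_type, reviewer_id, reviewee_type,"
            " reviewee_id, rating) VALUES (?, ?, ?, ?, ?)",
            (reviewer_type, reviewer_id, reviewee_type, reviewee_id, rating),
        )


def average_rating(db, reviewee_type, reviewee_id) -> Optional[float]:
    """Average rating for one user or driver, or None if there is no feedback."""
    row = db.execute(
        "SELECT AVG(rating) FROM feedback WHERE reviewee_type = ? AND reviewee_id = ?",
        (reviewee_type, reviewee_id),
    ).fetchone()
    return row[0]


add_feedback(conn, "user", 1, "driver", 7, 5)
add_feedback(conn, "user", 2, "driver", 7, 4)
```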
Driver location
- driver id
- coordinates
- last update date
Payment data
- user id
- card token (for identification in external payment system)
- masked card number (like ***1234)
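The masked card number can be derived before the raw number is discarded. The helper below is a hypothetical sketch of that step only; obtaining the card token from the external payment service is not shown.

```python
def mask_card_number(card_number: str) -> str:
    """Return a display-safe form such as '***1234' (last four digits only)."""
    digits = "".join(ch for ch in card_number if ch.isdigit())
    if len(digits) <= 4:
        raise ValueError("card number is too short to mask")
    return "***" + digits[-4:]


masked = mask_card_number("4111 1111 1111 1234")
```

Only the masked value and the token would be stored; the full number and CVV are passed to the external payment service and not persisted.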
High-level design
- Load balancer. We can use Nginx, because it is a popular solution.
- API Service. The service that provides the API for the users.
- Database. The database stores the main data about users, drivers, feedback, and users' payment cards. A relational database like PostgreSQL may be used here.
- Driver location cache. A database that stores the current driver locations. We will use an in-memory database like Redis, because the data is updated very frequently and the users need to get these updates fast.
- Graph service. The service that runs the algorithm for matching users and drivers based on their locations.
- Graph database. The database stores the current driver locations, so the Graph service can use them to find drivers near the user.
- Payment integration service. A service for integration with the external payment service. It increases the availability of the system by encapsulating all interaction with the external system.
- External payment service
- Message broker. The broker will be used for communication between services.
flowchart TD
    B[client] --> LB{Load Balancer}
    LB --> API{API Service}
    API --> D[Database]
    API --> G{Graph Service}
    G --> GD[Graph database]
    API --> MB{Message broker}
    MB --> I{Payment Integration Service}
    I --> P{External Payment Service}
    API --> C[Driver location cache]
    MB --> G
Request flows
Ride request
- The request goes through the LB, which checks the user's JWT
- The request goes to the API service
- API Service retrieves all nearby drivers from the Graph service, which searches for drivers near the user. The Graph service also calculates the ride price from the start and finish coordinates
- API Service filters the nearby drivers using the driver ratings from the database. The service picks a suitable driver based on location and rating
- API Service retrieves the user's payment information and sends the payment data and price to the Payment integration service via the message broker. The payment service then charges the user the total amount
- API Service returns data to the user
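The filtering and selection step in this flow can be sketched as follows. This is a simplified illustration, not the Graph service's algorithm: it uses straight-line (haversine) distance instead of road-network distance, and the search radius, minimum rating, and all names are assumptions.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Optional

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Driver:
    driver_id: str
    lat: float
    lon: float
    rating: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pick_driver(
    drivers: Iterable[Driver],
    pickup_lat: float,
    pickup_lon: float,
    max_km: float = 3.0,       # ASSUMPTION: search radius
    min_rating: float = 4.0,   # ASSUMPTION: rating threshold
) -> Optional[Driver]:
    """Drop low-rated and distant drivers, then return the closest remaining one."""
    candidates = []
    for driver in drivers:
        if driver.rating < min_rating:
            continue
        distance = haversine_km(pickup_lat, pickup_lon, driver.lat, driver.lon)
        if distance <= max_km:
            candidates.append((distance, driver))
    if not candidates:
        return None
    # Closest first; on equal distance prefer the higher rating.
    return min(candidates, key=lambda item: (item[0], -item[1].rating))[1]


drivers = [
    Driver("a", 52.520, 13.405, rating=3.5),  # at the pickup point, but low rating
    Driver("b", 52.530, 13.410, rating=4.8),  # about 1.2 km away
    Driver("c", 52.600, 13.500, rating=5.0),  # outside the search radius
]
chosen = pick_driver(drivers, pickup_lat=52.520, pickup_lon=13.405)
```

A production matcher would likely weigh distance against rating rather than apply hard thresholds, and would use estimated travel time from the Graph service instead of straight-line distance.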
Update driver location
- Every few seconds the driver app sends its current location
- The request goes to the API service
- API Service updates the current location in the cache
- API Service sends the location update to the message broker
- Graph service receives the location update and updates it in the database
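The update and read paths above can be sketched with in-memory stand-ins: a dict in place of the Redis cache and a `queue.Queue` in place of the message broker topic. The class and method names are hypothetical, and the rule that discards out-of-order updates is an assumption added for illustration.

```python
import queue
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LocationUpdate:
    driver_id: str
    lat: float
    lon: float
    updated_at: float  # Unix timestamp in seconds


class LocationService:
    def __init__(self) -> None:
        self.cache: Dict[str, LocationUpdate] = {}  # stand-in for the location cache
        self.bus: "queue.Queue[LocationUpdate]" = queue.Queue()  # stand-in for the broker

    def update_location(
        self, driver_id: str, lat: float, lon: float, now: Optional[float] = None
    ) -> bool:
        """Store the latest position and publish it for the Graph service."""
        update = LocationUpdate(driver_id, lat, lon, time.time() if now is None else now)
        current = self.cache.get(driver_id)
        # ASSUMPTION: updates that arrive out of order are dropped.
        if current is not None and current.updated_at > update.updated_at:
            return False
        self.cache[driver_id] = update
        self.bus.put(update)
        return True

    def get_location(self, driver_id: str) -> Optional[LocationUpdate]:
        """Read path used when a user asks for the driver's position."""
        return self.cache.get(driver_id)


service = LocationService()
service.update_location("driver-1", 52.520, 13.405, now=10.0)
service.update_location("driver-1", 52.521, 13.406, now=15.0)
stale_accepted = service.update_location("driver-1", 52.000, 13.000, now=12.0)
```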
User gets driver location
- The client sends a request for the current driver location
- API Service retrieves the driver location from the cache and returns it to the user
Detailed component design
Graphs
The Graph service and database must use technologies and algorithms designed for map graphs. The service must be stateless and highly scalable. For database scalability we will use:
- partitioning based on the different towns
- replication in each town. It will increase availability and decrease read latency
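One way to picture this layout is a router that maps each town to its own partition, sends writes to that partition's primary, and spreads reads across its replicas. The sketch below assumes a single-primary replication model and round-robin read balancing; the node names and classes are hypothetical.

```python
import itertools
from typing import Dict, List


class TownShard:
    """One partition: a primary for writes and replicas for reads."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        # Fall back to the primary when a town has no replicas configured.
        self._read_cycle = itertools.cycle(replicas or [primary])

    def node_for_write(self) -> str:
        return self.primary

    def node_for_read(self) -> str:
        return next(self._read_cycle)  # simple round-robin over replicas


class ShardRouter:
    def __init__(self, shards: Dict[str, TownShard]) -> None:
        self._shards = shards

    def route(self, town: str, write: bool = False) -> str:
        """Return the node that should serve a request for the given town."""
        try:
            shard = self._shards[town]
        except KeyError:
            raise ValueError(f"no partition configured for town {town!r}") from None
        return shard.node_for_write() if write else shard.node_for_read()


# Hypothetical node names for two of the ten towns.
router = ShardRouter(
    {
        "town-a": TownShard("town-a-primary", ["town-a-replica-1", "town-a-replica-2"]),
        "town-b": TownShard("town-b-primary", []),
    }
)
reads = [router.route("town-a") for _ in range(3)]
```

Partitioning by town fits this workload because a ride is matched within a single town, so a matching query never needs to touch more than one partition.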
Trade offs/Tech choices
PostgreSQL for main database
We will use PostgreSQL because it is very popular and has a big community. The load on this database is not very high, so we don't need to consider a NoSQL option. To increase availability and read throughput, replication will be used.
Kafka for message broker
Kafka is a popular solution for message broker tasks. It is highly available and scalable.
Redis for driver location cache
We need an in-memory database, because the data is updated and requested frequently. To increase availability we can use Redis Cluster replication.
Failure scenarios/bottlenecks
As mentioned above, the services are stateless, so they can be scaled horizontally. They are not a bottleneck.
All the databases are replicated, so they are not a bottleneck either.
Future improvements
The Graph service may be improved with more advanced algorithms. Machine learning may also be used for better matching and price calculation.
Score: 9