Design a Weather Reporting System

Difficulty: easy

Develop a system that provides users with current weather information for a specified location.

Solution

System requirements

Functional:

List functional requirements for the system (Ask the chat bot for hints if stuck.)...

Provide current weather information for a location, including temperature, humidity, wind speed, and precipitation.
Provide hourly temperature and precipitation predictions for the rest of the day.
Provide temperature and precipitation predictions for the next 7 days.

Non-Functional:

List non-functional requirements for the system...

Short end-to-end latency.
High availability.

Capacity estimation

Estimate the scale of the system you are going to design...

Assume there are 1 million active users, each user checks the weather about 3 times a day, and each check issues 2 API calls. The total daily traffic is then

1,000,000 * 3 * 2 = 6 * 10^6 requests/day

A day has 86,400 seconds, which we round up to 100k to keep the arithmetic simple:

6 * 10^6 requests/day / 100k seconds/day = 60 requests/second (about 69 requests/second with the exact figure).

This is an average, so peak traffic will be higher. Generally speaking, a single robust server can handle well over 60 requests/second, but we still need replicas to maintain availability.
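The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the user and call counts come from the assumptions in this section, while `PEAK_FACTOR` is a hypothetical multiplier added here to show how a peak estimate could be derived.

```python
# Back-of-the-envelope capacity estimate using the assumptions stated above.
ACTIVE_USERS = 1_000_000
CHECKS_PER_USER_PER_DAY = 3
API_CALLS_PER_CHECK = 2
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = ACTIVE_USERS * CHECKS_PER_USER_PER_DAY * API_CALLS_PER_CHECK

# Rounded figure (100k seconds/day) and the exact one.
rounded_rps = requests_per_day / 100_000
exact_rps = requests_per_day / SECONDS_PER_DAY

# Hypothetical assumption: peak traffic is 3x the daily average.
PEAK_FACTOR = 3
peak_rps = exact_rps * PEAK_FACTOR

print(f"{requests_per_day:,} requests/day")
print(f"~{rounded_rps:.0f} requests/second (rounded), ~{exact_rps:.1f} (exact)")
print(f"~{peak_rps:.0f} requests/second at an assumed {PEAK_FACTOR}x peak")
```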

API design

Define what APIs are expected from the system...

API: getCurrentWeather

request: locationInfo

response: weatherInfo, temperatureInfo, precipitationInfo, humidity, windSpeed

API: getWeatherPrediction

request: locationInfo

response:

List<WeatherPrediction>

The list should contain 7 entries, one for each day of the coming week.

Each WeatherPrediction object should include a date indicating the day the prediction is for, along with the predicted weather information for that day, including temperature and the predicted weather conditions.
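One possible shape for these request and response types, sketched as Python dataclasses. The field names, units, and the latitude/longitude form of `LocationInfo` are assumptions made for illustration; the design above only names the objects. The prediction function returns placeholder values rather than real vendor data.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass(frozen=True)
class LocationInfo:
    # Assumed representation; a city name or postal code would also work.
    latitude: float
    longitude: float


@dataclass(frozen=True)
class CurrentWeather:
    weather_info: str        # e.g. "cloudy"
    temperature_c: float
    precipitation_mm: float
    humidity_pct: float
    wind_speed_kmh: float


@dataclass(frozen=True)
class WeatherPrediction:
    day: date                # the date this prediction is for
    temperature_c: float
    precipitation_mm: float


def get_weather_prediction(location: LocationInfo, today: date) -> List[WeatherPrediction]:
    """Return 7 entries, one per day of the coming week (placeholder values)."""
    return [
        WeatherPrediction(day=today + timedelta(days=offset),
                          temperature_c=20.0,
                          precipitation_mm=0.0)
        for offset in range(1, 8)
    ]


predictions = get_weather_prediction(LocationInfo(47.6, -122.3), date(2024, 1, 1))
print(len(predictions), predictions[0].day, predictions[-1].day)
```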

Database design

Defining the system data model early on will clarify how data will flow among different components of the system. Also you could draw an ER diagram using the diagramming tool to enhance your design...

Not applicable: the system does not store data internally to serve its functionality, though we might store data for BI purposes.

High-level design

You should identify enough components that are needed to solve the actual problem from end to end. Also remember to draw a block diagram using the diagramming tool to augment your design. If you are unfamiliar with the tool, you can simply describe your design to the chat bot and ask it to generate a starter diagram for you to modify...

The basic system is very simple: it consists only of the client, the server, and the third-party vendor. No internal data storage is needed for the functionality.

We should add a cache to the system to store weather data already fetched for each location. This helps alleviate traffic to the vendor and potentially reduces cost.

However, storing Business Intelligence data (such as the number of clicks) requires three more components: a queue to hold data-processing jobs, job handler worker machines, and a data warehouse. The queue and workers process the data asynchronously, so that the latency of handling BI data does not affect user traffic. The data warehouse stores the BI data.

flowchart TD
    B[client] --> C{server}
    C --> H[Cache]
    H --> D[Vendor]
    C --> E[BI data queue]
    E --> F[Job Handler]
    F --> G[Data Warehouse]

Request flows

Explain how the request flows from end to end in your high level design. Also you could draw a sequence diagram using the diagramming tool to enhance your explanation...

For both API calls, the client sends a request to the server. The server first checks the cache to see whether the weather or forecast information has already been fetched for the user's location. If so, the server returns the cached value. Otherwise, the server fetches the weather information from the vendor and returns the vendor's result.

Detailed component design

Dig deeper into 2-3 components and explain in detail how they work. For example, how well does each component scale? Any relevant algorithm or data structure you like to use for a component? Also you could draw a diagram using the diagramming tool to enhance your design...

Cache:

The workload is read-heavy: the system mostly reads data from the vendor. Therefore, we can use a read-through cache. The server reads from the cache first; if the data is not in the cache, the cache layer reads it from the vendor and caches the result.
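A minimal in-memory sketch of the read-through behavior described above. The `ReadThroughCache` class, the `fetch_from_vendor` stub, and the 10-minute TTL are all assumptions for illustration; the design does not specify an expiry policy, but weather data goes stale, so some form of expiry is assumed here. A production system would more likely use an external cache than a process-local dictionary.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Serve from cache when fresh; otherwise load from the vendor and store."""

    def __init__(self, loader: Callable[[str], dict], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        # location key -> (expiry time, cached vendor response)
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, location_key: str) -> dict:
        now = self._clock()
        entry = self._entries.get(location_key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, still fresh
        value = self._loader(location_key)  # miss or expired: read through
        self._entries[location_key] = (now + self._ttl, value)
        return value


vendor_calls = []


def fetch_from_vendor(location_key: str) -> dict:
    # Hypothetical stand-in for the third-party vendor call.
    vendor_calls.append(location_key)
    return {"location": location_key, "temperature_c": 21.5}


cache = ReadThroughCache(fetch_from_vendor, ttl_seconds=600)
cache.get("seattle")
cache.get("seattle")  # served from cache, no second vendor call
print(len(vendor_calls))
```

The injectable `clock` parameter exists only to make expiry easy to test without sleeping.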

BI Data Queue:

We use a queue in the system to buffer metrics for later processing. The queue could be a Kafka topic or an SQS job queue. This decouples BI data handling from real-time traffic, so that processing BI data does not affect user-facing latency.
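The decoupling can be illustrated with Python's standard-library `queue` and `threading` modules standing in for Kafka or SQS. This is a single-process sketch under that assumption: the event fields, the `handle_request` function, and the `warehouse` list are hypothetical placeholders, not part of the design above.

```python
import queue
import threading
from typing import List

bi_queue: queue.Queue = queue.Queue()  # stand-in for Kafka / SQS
warehouse: List[dict] = []             # stand-in for the data warehouse


def handle_request(location_key: str) -> dict:
    """Serve the user, then enqueue a BI event without waiting on it."""
    response = {"location": location_key, "temperature_c": 21.5}  # placeholder
    bi_queue.put({"event": "weather_viewed", "location": location_key})
    return response


def job_handler() -> None:
    """Worker loop: drain events and write them to the warehouse."""
    while True:
        event = bi_queue.get()
        if event is None:  # sentinel: stop the worker
            bi_queue.task_done()
            break
        warehouse.append(event)
        bi_queue.task_done()


worker = threading.Thread(target=job_handler, daemon=True)
worker.start()

handle_request("seattle")
handle_request("tokyo")

bi_queue.put(None)  # signal shutdown after the pending events
bi_queue.join()     # wait until every queued item has been processed
worker.join()
print(len(warehouse))
```

The request path only pays for an in-memory `put`; the slower warehouse write happens on the worker.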

Trade offs/Tech choices

Explain any trade offs you have made and why you made certain tech choices...

The system uses a cache to store vendor responses. This makes the system slightly more complicated than it would be without one. However, this extra layer reduces both the cost and the latency of vendor calls.

Failure scenarios/bottlenecks

Try to discuss as many failure scenarios/bottlenecks as possible.

If the server fails, all clients will be denied service. Therefore, the system needs redundancy at the server layer. Traffic should first go through a load balancer, which routes each request to one of the available servers.
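As a rough illustration of routing to "one of the available servers", here is a sketch of round-robin selection that skips servers marked unhealthy. Round-robin is an assumption, since the design does not name a balancing policy, and the `RoundRobinBalancer` class and server names are hypothetical. In practice this logic lives in a managed or off-the-shelf load balancer rather than in application code.

```python
from typing import Dict, List


class RoundRobinBalancer:
    """Rotate through servers, skipping any that are marked unhealthy."""

    def __init__(self, servers: List[str]) -> None:
        self._servers = list(servers)
        self._healthy: Dict[str, bool] = {s: True for s in servers}
        self._next = 0

    def mark(self, server: str, healthy: bool) -> None:
        # In a real system a health check would call this.
        self._healthy[server] = healthy

    def pick(self) -> str:
        # Try each server at most once per call.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if self._healthy[server]:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
balancer.mark("server-b", healthy=False)  # simulate a failed server
picks = [balancer.pick() for _ in range(4)]
print(picks)
```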

If the cache fails, the server can temporarily fetch results from the vendor directly. This results in degraded service and higher cost. Therefore, the system should choose a cache that supports a high-availability deployment, such as Redis.

If the vendor fails, the system will be out of service. Therefore, the vendor's availability SLA needs to be considered carefully when choosing a vendor.

Another option is to have 2 vendors, with one used as a backup. However, this introduces additional cost.
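A sketch of how a backup vendor might be used: try vendors in priority order and fall back to the next one on failure. The function and vendor names are hypothetical, and the stub vendors simulate an outage rather than calling any real API.

```python
from typing import Callable, List, Sequence


class AllVendorsFailed(Exception):
    """Raised when every configured vendor call fails."""


def fetch_with_failover(location_key: str,
                        vendors: Sequence[Callable[[str], dict]]) -> dict:
    """Return the first successful vendor response, in priority order."""
    errors: List[Exception] = []
    for vendor in vendors:
        try:
            return vendor(location_key)
        except Exception as exc:  # any vendor error triggers the fallback
            errors.append(exc)
    raise AllVendorsFailed(
        f"all {len(vendors)} vendors failed for {location_key!r}: {errors}"
    )


def primary_vendor(location_key: str) -> dict:
    # Hypothetical primary vendor, simulated as down.
    raise ConnectionError("primary vendor unavailable")


def backup_vendor(location_key: str) -> dict:
    # Hypothetical backup vendor returning placeholder data.
    return {"location": location_key, "temperature_c": 19.0, "source": "backup"}


result = fetch_with_failover("seattle", [primary_vendor, backup_vendor])
print(result["source"])
```

A real implementation would also want timeouts on each vendor call, so that a slow primary does not hold up the fallback.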

Future improvements

What are some future improvements you would make? How would you mitigate the failure scenario(s) you described above?

Add a second vendor and work out the best way to incorporate results from both.

Score: 9