搜寻名人

难度: Medium

Submission

运行时间: 664 ms

内存: 16.4 MB

# The knows API is already defined for you.
# return a bool, whether a knows b
# def knows(a: int, b: int) -> bool:


class Solution:
    def __init__(self):
        self.cache = {}

    def cache_knows(self, i, j):
        """
        We can use a local variable to cache the result of knows API since its network latency may be high
        """
        if (i, j) in self.cache:
            return self.cache[(i, j)]
        else:
            self.cache[(i, j)] = knows(i, j)
            return self.cache[(i, j)]

    def is_celebrity(self, candidate, n):
        for i in range(n):
            if i == candidate:
                continue
            if not (self.cache_knows(i, candidate) and not self.cache_knows(candidate, i)):
                return False
        return True

    def findCelebrity(self, n: int) -> int:

        candidate = 0
        # loop invariant: (candidate, i) does not contain the celebrity which can be approved by mathematical induction
        for i in range(1, n):
            # knows(candidate, i) == 0 -> i is not the celebrity
            # knows(candidate, i) == 1 -> candidate is not the celebrity
            if self.cache_knows(candidate, i) == 1:
                candidate = i

        if self.is_celebrity(candidate, n):
            return candidate
        return -1

Explain

这个题解使用了两次遍历来找到名人。第一次遍历从候选人0开始，将候选人与之后的每个人进行比较，如果候选人认识当前人，说明候选人不是名人，将当前人设为新的候选人。这样第一次遍历后，我们就能找出潜在的名人候选人。第二次遍历用来验证这个候选人是否真的是名人，即所有其他人都认识候选人，且候选人不认识任何其他人。同时使用了缓存来存储已经调用过的knows API结果，避免重复调用。

时间复杂度: O(n)

空间复杂度: O(n^2)

# The knows API is already defined for you.
# return a bool, whether a knows b
# def knows(a: int, b: int) -> bool:


class Solution:
    def __init__(self):
        self.cache = {}

    def cache_knows(self, i, j):
        """
        We can use a local variable to cache the result of knows API since its network latency may be high
        """
        if (i, j) in self.cache:
            return self.cache[(i, j)]
        else:
            self.cache[(i, j)] = knows(i, j)
            return self.cache[(i, j)]

    def is_celebrity(self, candidate, n):
        for i in range(n):
            if i == candidate:
                continue
            # candidate必须不认识i，且i必须认识candidate，否则candidate就不是名人
            if not (self.cache_knows(i, candidate) and not self.cache_knows(candidate, i)):
                return False
        return True

    def findCelebrity(self, n: int) -> int:
        candidate = 0
        # 遍历寻找可能的名人候选人
        # 循环不变式：区间(candidate, i)内不包含名人，可以用数学归纳法证明
        for i in range(1, n):
            # knows(candidate, i) == 0 -> i 不是名人
            # knows(candidate, i) == 1 -> candidate 不是名人
            if self.cache_knows(candidate, i) == 1:
                candidate = i
        
        # 再次遍历，验证候选人是否真的是名人
        if self.is_celebrity(candidate, n):
            return candidate
        return -1

Explore

在第一次遍历中，如果`candidate`认识`i`，则根据名人的定义（名人不认识任何其他人，但所有其他人都认识名人），`candidate`显然不能是名人。因此，我们需要更换候选人。选择`i`作为新的候选人是基于假设如果`candidate`不是名人，而`candidate`认识`i`，那么`i`有可能是名人。这种方法不会错过名人，因为每次更换候选人都是因为当前候选人违反了名人的条件。最终，这个过程将筛选出一个唯一的可能的名人候选人，然后需要通过第二次遍历来验证这个候选人是否真的是名人。

如果`cache`中没有存储`knows`函数的某次调用结果，那么`is_celebrity`函数将直接调用`knows`函数来获取结果，并将其存储在`cache`中。这将确保每次`knows`调用的结果都被缓存，减少了重复调用的需要。如果没有使用缓存，每次需要验证候选人与其他所有人的关系时都可能需要重新调用`knows`，这会增加API调用次数，导致性能下降。因此，使用缓存可以显著优化性能，尤其是在网络延迟或API调用成本较高的情况下。

缓存`knows`API的结果特别有效于减少网络或API调用延迟，以及调用次数较多的情况。例如，在高并发环境或网络延迟高的情况下，缓存可以显著提升性能。然而，如果缓存过多数据，的确有可能影响性能，尤其是在内存有限的系统中。缓存占用过多内存可能导致系统其他部分的性能下降，或者触发更频繁的垃圾收集，从而影响整体性能。因此，合理设计缓存的大小和清理策略是很重要的，以确保缓存机制能够提供最大的性能优势而不引入额外的性能负担。