< Back





295. Find Median from Data Stream

We're asked to implement a class that maintains the median from a stream of numbers that we've seen
in realtime. If we're asked for the median and the number of numbers we've seen is even, we need
to return the average of the two middle numbers.

Using two heaps, we can maintain a max heap that contains all of the numbers less than the median,
and a min heap which contains all numbers greater than the median. We'll arbitrarily choose the max
heap to be the larger heap if the number of numbers we've seen is odd. This also means that the top
of the max heap will be the median if the number of numbers we've seen is odd.

When a new number is seen in the stream, we add it to the max heap, causing the max heap to
rebalance. If the number was lower than the median, nothing really changes. If the number was
higher, it'll be at the top of the max heap, and we'll definitely need to move it to the min heap.

Regardless, we'll pop off the max heap and push to the min heap, to maintain equal size between the
two heaps. If the min heap is ever greater in size than the max heap, then we pop off the min heap
and push to the max heap.

When searching for the median, we know that the heaps are equal in size, save for the median on the
max heap. If the heaps are equal in size, we return the average of the two tops. Otherwise, we
return the top of the max heap.

The solution is as follows:


  from heapq import *

  class MedianFinder:

      def __init__(self):
          self.min = []
          self.max = []
          self.min_size = self.max_size = 0

      def addNum(self, num: int) -> None:
          heappush(self.max, -num)
          heappush(self.min, -heappop(self.max))
          self.min_size += 1

          if self.min_size > self.max_size:
              heappush(self.max, -heappop(self.min))
              self.max_size += 1
              self.min_size -= 1

      def findMedian(self) -> float:
          if self.max_size > self.min_size:
              return -self.max[0]
          return (self.min[0] - self.max[0]) / 2


_ Time Complexity:

  O(log n) - Heap insertions and deletions are O(log n).

_ Space Complexity:

  O(n) - We require two heaps to store the numbers.