Find the k-th largest element in an array of values
[1, 2, 3, 5, 4] When k = 1, the 1st largest element is 5 When k = 3, the 3rd largest element is 3
Solution | Time | Space |
---|---|---|
Sorting | O(n log n) | O(1) to O(n) |
Min-heap | O(n + k log n) | O(n) |
Max-heap | O(n log k) | O(k) |
Quickselect | O(n) to O(n2) | O(log n) to O(n) |
Median of medians | O(n) | O(log n) |
To find the k-th largest value:
Sorting the entire array is fine when n is small, but fully sorting the array is doing a lot of unnecessary work when all we want is to find a single number. Plus you would want more-efficient solutions when n is large.
def kth_largest_element(values, k) values.sort![-k] end
Time | O(n log n) | Average time complexity for sorting n numbers |
Space | O(1) to O(n) | Varies depending on the sorting algorithm |
To find the k-th largest value:
Runtime analysis:
Looking at different values of k:
require 'max_heap' def kth_largest_element(values, k) heap = MaxHeap.new(values) kth_largest = nil k.times do kth_largest = heap.pop end kth_largest end
Time | O(n + k log n) | O(n) to build the heap + k pops at O(log n) each |
Space | O(n) | n elements in the heap |
Alternatively, we can use a min heap to save space and for the opposite trend in time complexity as the value of k changes.
To find the k-th largest value:
Time | O(n log k) | n insertions into a k-sized heap take O(log k) time each |
Space | O(k) | k elements in the heap |
Quickselect is a popular selection algorithm that finds the kth-smallest element in O(n) time on average.
By only partially sorting the array, we can avoid doing unnecessary work.
To find the k-th largest value:
def kth_largest_element(values, k) n = values.length quickselect(values, 0, n-1, n-k) end def quickselect(values, low, high, i_target) return values[low] if low == high i = partition(values, low, high) if i == i_target values[i] elsif i < i_target quickselect(values, i + 1, high, i_target) else quickselect(values, low, i - 1, i_target) end end def partition(values, low, high) i_pivot = (low + (high - low)*rand).to_i pivot = values[i_pivot] values[high], values[i_pivot] = values[i_pivot], values[high] j = low (low .. high).each do |i| if values[i] < pivot values[i], values[j] = values[j], values[i] j += 1 end end values[j], values[high] = values[high], values[j] j end
Time | O(n) to O(n2) | Worst case O(n2) if you get unlucky |
Space | O(log n) to O(n) | Recursion stack depth, worst case O(n) |
The previous solution used quickselect with random pivots, which randomly has poor worst-case if you get unlucky and select a bunch of bad pivots.
By improving the way that pivots are chosen, we eliminate the possibility of running into the worst-case performance scenario. The tradeoff is that we'll do slightly more work on average, but still in linear time.
Time | O(n) | |
Space | O(log n) | Recursion stack depth |