More high performance iterable work with bisect
In my last post, I explored the Python array data structure. In keeping with the vibe of iterable data structures, I'm moving on to cover the
bisect uses a simple bisection algorithm to insert items into a list without botching the sort. The gist of the algorithm is simple: we're going to create a midpoint using the low and high marks of the list, check if our value is greater than or less than the midpoint, then use the midpoint to slice the list into smaller and smaller lists until our item falls right at the midpoint.
You can see this in action here:
def bisect_pseudo(a_list, my_value): low = 0 high = len(a_list) while low < high: midpoint = (low+high)//2 if my_value < a_list[midpoint]: high = midpoint else: low = midpoint + 1 print "low:", low, "high", high return low my_list = [1, 4, 9, 16, 25] print "Value greater than midpoint: " bisect_pseudo(my_list, 10) print "Value less than midpoint: " bisect_pseudo(my_list, 2)
In this example, we see the algorithm sifting through our list, searching for the point in the list where our test value would naturally slide in. This is a left bisection algorithm - where we start from the left side of the list and work up - and the reverse would be a right bisection algorithm. The
bisect module includes both.
import bisect names = ['Joe', 'Beth', 'Steve'] names.sort() print names print bisect.bisect_left(names, 'David') bisect.insort_left(names, 'David') print names
In this example, we're using the
bisect_left() function, an implementation of the same algorithm described above, to find the midpoint in our sorted list. The
insort_left() function will perform this algorithm and then insert our value exactly where it needs to be. The code using the right bisection algorithm looks similar:
print bisect.bisect_right(names, 'Shamyl') bisect.insort_right(names, 'Shamyl') print names
bisect module is fairly quick, but it will use a faster C implementation if one's available. That's all there is to it!