#问题
二分查找
list.index()无法应对大规模数据的查询,需要用其它方法解决,这里谈的就是二分查找
#思路说明
在查找方面,python中有list.index()的方法。例如:
>>> a=[2,4,1,9,3] #list可以是无序,也可以是有序
>>> a.index(4) #找到后返回该值在list中的位置
1
>>> a.index(5) #如果没有该值,则报错
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: 5 is not in list
这是python中基本的查找方法,虽然简单,但是,如果由于其时间复杂度为O(n),对于大规模的查询恐怕是不足以胜任的。二分查找就是一种替代方法。
二分查找的对象是:有序数组。这点特别需要注意。要把数组排好序先。怎么排序,可以参看我这里多篇排序问题的文章。
基本步骤:
1. 从数组的中间元素开始,如果中间元素正好是要查找的元素,则搜素过程结束;
2. 如果某一特定元素大于或者小于中间元素,则在数组大于或小于中间元素的那一半中查找,而且跟开始一样从中间元素开始比较。
3. 如果在某一步骤数组为空,则代表找不到。
这种搜索算法每一次比较都使搜索范围缩小一半。时间复杂度:O(logn)
# 递归
def binary_search(lst, value, lo, hi):
if lo > hi:
return -1
half = (lo + hi)/2
if lst[half] == value:
return half
elif lst[half] > value:
return binary_search(lst, value, lo, half-1)
else:
return binary_search(lst, value, half+1, hi)
# 循环
def binary_search_while(lst, value):
lo, hi = 0, len(lst)-1
while lo <= hi:
half = (lo + hi)/2
if lst[half] > value:
hi = half - 1
elif lst[half] < value:
lo = half + 1
else:
return half
return -1
对于python,不能忽视其强大的标准库。经查阅,发现标准库中就有一个模块,名为:bisect。
- 模块接受排序后的列表。
- 本模块同样适用于长列表项。因为它就是用二分查找方法实现的,有兴趣可以看其源码(源码是一个很好的二分查找算法的例子,特别是很好地解决了边界条件极端的问题.)
- 关于bisect模块的更多内容,可以参看[官方文档](https://docs.python.org/2/library/bisect.html)
"""Bisection algorithms."""
def insort_right(a, x, lo=0, hi=None):
"""Insert item x in list a, and keep it sorted assuming a is sorted.
If x is already in a, insert it to the right of the rightmost x.
Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if x < a[mid]: hi = mid
else: lo = mid+1
a.insert(lo, x)
insort = insort_right # backward compatibility
def bisect_right(a, x, lo=0, hi=None):
"""Return the index where to insert item x in list a, assuming a is sorted.
The return value i is such that all e in a[:i] have e <= x, and all e in
a[i:] have e > x. So if x already appears in the list, a.insert(x) will
insert just after the rightmost x already there.
Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if x < a[mid]: hi = mid
else: lo = mid+1
return lo
bisect = bisect_right # backward compatibility
def insort_left(a, x, lo=0, hi=None):
"""Insert item x in list a, and keep it sorted assuming a is sorted.
If x is already in a, insert it to the left of the leftmost x.
Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x: lo = mid+1
else: hi = mid
a.insert(lo, x)
def bisect_left(a, x, lo=0, hi=None):
"""Return the index where to insert item x in list a, assuming a is sorted.
The return value i is such that all e in a[:i] have e < x, and all e in
a[i:] have e >= x. So if x already appears in the list, a.insert(x) will
insert just before the leftmost x already there.
Optional args lo (default 0) and hi (default len(a)) bound the
slice of a to be searched.
"""
if lo < 0:
raise ValueError('lo must be non-negative')
if hi is None:
hi = len(a)
while lo < hi:
mid = (lo+hi)//2
if a[mid] < x: lo = mid+1
else: hi = mid
return lo
# Overwrite above definitions with a fast C implementation
try:
from _bisect import *
except ImportError:
pass
参考资料:https://github.com/qiwsir/algorithm