stl容器使用中的经验（六）--考虑 map::opreator[]和map::insert()的效率问题

程序员文章站 2022-03-21 17:40:44

...

结论：当效率至关重要时，如果要更新一个容器的值，优先选择map::operatot[],如果需要插入一个新的元素，则优先选择map::insert()。

假设定义如下类：

class Widget{
public:
    Widget();
    Widget(const double& val);
    Widget& opeartor=(double val);
    ...
}

创建一个从int到map的映射容器。

std::map<int, Widget> m;

m[1] = 1.5;
m[2] = 0.14;
m[10] = 2.45;

上面的例子使用了map的[]操作符，我们知道，map的[]操作符和其他的一些容器，比如vector、deque等的[]操作符是不一样的。map的map::operator[]操作符被设计的目的是为了提供方便更新和插入的功能。

map<key, val> m;

m[k] = v;

也就是说对于上面的容器而言，表达式m[k] = v;首先会检查容器m中是否存在键为k的元素，如果存在，则会更新该元素的值，如果不存在，则会在容器中插入一个键值对pair<k, v>。

_Tp& operator[](const key_type& __k) {
    iterator __i = lower_bound(__k);
    // __i->first is greater than or equivalent to __k.
    if (__i == end() || key_comp()(__k, (*__i).first))
      __i = insert(__i, value_type(__k, _Tp()));
    return (*__i).second;
}

查看源码我们可以发现，map::operator[]会返回一个与 k 相关联的值元素的引用，如果容器中不存在该 k 对应的元素，则会向容器中插入一个默认的值类型的空对象。

所以，我们对最上面开始的例子做分析。

std::map<int, Widget> m;
m[1] = 1.5;

m[1]是m.opeartor[](1)的缩写。该函数必须会返回一个Widget的引用，如果当容器中原本没有键 1 对应的元素时，容器会首先插入一个默认的对象，并返回该对象的引用。所以上面表达式在功能上等同于：

std::map<int, Widget> m;

typedef std::map<int, Widget> IntWidgetMap;
pair<IntWidgetMap::iterator, bool> iter = m.insert(IntWidgetMap::value_type(1, Widget()));

iter.first->second = 1.5;

我们通过上面的代码片段能够看出来，使用map::operator[]向容器中插入新元素时，首先会插入一个新的默认空对象，然后将值赋给该空对象。在这个过程中，我们其实已经使用了insert方法。

所以，在向容器中插入新元素时，使用insert()方法会节省开销。直接调用:

m.insert(IntWidgetMap::value_type(1, 1.5));

这样的效果和上面的函数是一样的，但是会节约三个函数的调用，一个是使用默认函数构造的临时对象，一个是对该临时对象的析构，一个是调用Widget的复制操作符。

上面我们已经看了像容器中插入元素时，使用map::inset()会比使用map::operatot[]效率更高，那么如果容器中已经存在键为k的元素，又会怎样呢？

下面是一个测试用例。

#include<iostream>
#include<map>
using namespace std;

int main()
{
	map<int, int> m;
	typedef map<int, int> IntMap;
	
	m.insert(IntMap::value_type(1, 5));
	pair<IntMap::iterator, bool> rst = m.insert({1, 4});
	
	cout << m[1] << " " << rst.second;	// 5 0
	
	rst.first->second = 7;
	
	cout << m[1] << " " << rst.second;	// 7 0
	return 0;
}

我们发现，如果已经存在，只是调用 map::insert 的话是不会修改其值的，必须要对map::insert 返回的元素的引用进行赋值操作。

m.insert({1, 4}).first->second = 7;

和

m[1] = 9;

从语法上我们就已经给更倾向于使用map::operator[]。并且呢，map::insert 方法的参数是一个 IntMap::value_type（也就是pair<int, int>）类型的对象，所以当调用时，我们需要创建好析构一个它的对象，如果元素值是类对象的话，还会有其他对象的创建和析构的开销。而map::operator[]操作不使用 pair 对象，因此会节省开销。

上面的两种方式选择权完全在开发者的手中，那么有没有一种可能，stl提供一个新的方法，自己判断是否存在键为k的元素，如果存在，则更新元素的值，如果不存在，则进行插入操作。

#include<iostream>
#include<map>

using namespace std;

template<typename MapType, 
		 typename KeyType, 
		 typename ValueType > 
		 typename MapType::iterator
efficientAddorUpdate(MapType& map, const KeyType& key, const ValueType& val)
{
	typename MapType::iterator iter = map.lower_bound(key);
	if(iter != map.end() && !(map.key_comp()(key, iter->first)))
	{
		iter->second = val;
		return iter;
	}
	else
	{
		typedef typename MapType::value_type MVT;
		return map.insert(iter, MVT(key, val));
	}
}

int main()
{
	map<int, int> m;
	typedef map<int, int> IntMap;
	
	m.insert(IntMap::value_type(1, 5));
	pair<IntMap::iterator, bool> rst = m.insert({1, 4});
	
	cout << m[1] << " " << rst.second;	// 5 0
	
	rst.first->second = 7;
	
	cout << m[1] << " " << rst.second;	// 7 0
	
	IntMap::iterator iter = efficientAddorUpdate(m, 1, 6);

	cout << m[1] << " " << iter->second;	// 6 6

	return 0;
}

上面的逻辑是，首先先判断容器中是否已经存在k的值，如果在，找到他的位置并进行值的修改，如果不在，找到它应该被插入的位置。而面对已经是排序的 map 容器，lower_bound方法可直观的找到它的位置。然后对其找到的元素和我们给定的键做等价对比，判断容器中是否已经存在 k 。

在这个函数中，执行上述的操作，也就意味着如果我们需要插入元素的时候，其实已经确定了元素被插入的位置，也就是说，插入元素时间复杂度是常数级。

以最开始的容器为例，调用下面的操作:

IntWidgetMap::iterator iter = efficientAddorUpdate(m, 1, 6.0);

也能够进行数值的推导，直接将6.0赋值给其中为 k 为1的元素，调用Widget类中的 Widget::operator=(double)方法，而不是重新进行类临时变量的创建和析构。

上一篇： matplotlib.gridspec改变网格间距（分区）

下一篇： 17.位运算符