WebRTC Histogram algorithm study
程序员文章站
2022-07-01 17:39:14
Note: Histogram is used by the DelayManager in NetEq to estimate network delay.
Key data structures:
 private:
  std::vector<int> buckets_;
  int forget_factor_;  // Q15
  const int base_forget_factor_;
  int add_count_;
  const absl::optional<double> start_forget_weight_;
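buckets_: the histogram buckets. The bucket at index i holds the share of packets whose delay is i units, stored in Q30 (a fixed-point format in which the integer 1 << 30 represents 1.0). All buckets together sum to 100%. For example:
buckets_[1] = 10%: packets delayed by one unit make up 10%
buckets_[2] = 20%: packets delayed by two units make up 20%
buckets_[3] = 30%: packets delayed by three units make up 30%
…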
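To make the Q notation concrete, here is a minimal sketch of converting between a fraction and its Q15 or Q30 integer form. The helper names ToQ and FromQ are hypothetical and not part of WebRTC, and truncation toward zero is an assumption, chosen because it reproduces the constants 1041529569 (0.97 in Q30) and 32745 (0.9993 in Q15) that appear in DelayHistogramConfig.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; not part of WebRTC.

// Converts a fraction in [0, 1] to a fixed-point integer with |frac_bits|
// fractional bits (15 for Q15, 30 for Q30), truncating toward zero.
int ToQ(double x, int frac_bits) {
  assert(x >= 0.0 && x <= 1.0);
  assert(frac_bits >= 0 && frac_bits <= 30);
  return static_cast<int>(x * static_cast<double>(int64_t{1} << frac_bits));
}

// Converts a fixed-point integer back to a fraction.
double FromQ(int q, int frac_bits) {
  assert(frac_bits >= 0 && frac_bits <= 30);
  return static_cast<double>(q) / static_cast<double>(int64_t{1} << frac_bits);
}
```

With these helpers, a bucket holding 10% would store roughly ToQ(0.10, 30), and the value 1 << 30 stands for the full 100%.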
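forget_factor_: the forgetting factor, in Q15. On every update each existing bucket is multiplied by this factor, so a share of 1 - forget_factor_ of the old histogram is forgotten.
base_forget_factor_: the steady-state forgetting factor that forget_factor_ converges to.
add_count_: the number of samples added so far.
Network delay statistics algorithm: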
void Histogram::Add(int value) {
  RTC_DCHECK(value >= 0);
  RTC_DCHECK(value < static_cast<int>(buckets_.size()));
  int vector_sum = 0;  // Sum up the vector elements as they are processed.
  // Multiply each element in |buckets_| with |forget_factor_|.
  // Author's note: every bucket is faded by the forgetting factor and summed.
  for (int& bucket : buckets_) {
    bucket = (static_cast<int64_t>(bucket) * forget_factor_) >> 15;
    vector_sum += bucket;
  }
  // Increase the probability for the currently observed inter-arrival time
  // by 1 - |forget_factor_|. The factor is in Q15, |buckets_| in Q30.
  // Thus, left-shift 15 steps to obtain result in Q30.
  // Author's note: the bucket of the new sample gains (1 - forget_factor_).
  buckets_[value] += (32768 - forget_factor_) << 15;
  vector_sum += (32768 - forget_factor_) << 15;  // Add to vector sum.
  // |buckets_| should sum up to 1 (in Q30), but it may not due to
  // fixed-point rounding errors.
  // Author's note: bring the total back to exactly 1.
  vector_sum -= 1 << 30;  // Should be zero. Compensate if not.
  if (vector_sum != 0) {
    // Modify a few values early in |buckets_|.
    int flip_sign = vector_sum > 0 ? -1 : 1;
    for (int& bucket : buckets_) {
      // Add/subtract 1/16 of the element, but not more than |vector_sum|.
      int correction = flip_sign * std::min(std::abs(vector_sum), bucket >> 4);
      bucket += correction;
      vector_sum += correction;
      if (std::abs(vector_sum) == 0) {
        break;
      }
    }
  }
  RTC_DCHECK(vector_sum == 0);  // Verify that the above is correct.
  ++add_count_;
  // Update |forget_factor_| (changes only during the first seconds after a
  // reset). The factor converges to |base_forget_factor_|.
  // Author's note: this branch updates with the custom start weight.
  if (start_forget_weight_) {
    if (forget_factor_ != base_forget_factor_) {
      int old_forget_factor = forget_factor_;
      int forget_factor =
          (1 << 15) * (1 - start_forget_weight_.value() / (add_count_ + 1));
      forget_factor_ =
          std::max(0, std::min(base_forget_factor_, forget_factor));
      // The histogram is updated recursively by forgetting the old histogram
      // with |forget_factor_| and adding a new sample multiplied by |1 -
      // forget_factor_|. We need to make sure that the effective weight on the
      // new sample is no smaller than those on the old samples, i.e., to
      // satisfy the following DCHECK.
      RTC_DCHECK_GE((1 << 15) - forget_factor_,
                    ((1 << 15) - old_forget_factor) * forget_factor_ >> 15);
    }
  } else {  // Author's note: default update rule.
    forget_factor_ += (base_forget_factor_ - forget_factor_ + 3) >> 2;
  }
}
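Stripped of the fixed-point details, the core of Add is an exponentially weighted update. The following is a minimal floating-point sketch of that idea, assuming the forgetting factor is given as a fraction in [0, 1]. AddSample is a hypothetical name and this is not the WebRTC implementation: it omits the rounding compensation and the forgetting-factor update.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical floating-point model of Histogram::Add; not WebRTC code.
// |forget| plays the role of forget_factor_ expressed as a fraction in [0, 1].
void AddSample(std::vector<double>& buckets, std::size_t value, double forget) {
  assert(value < buckets.size());
  assert(forget >= 0.0 && forget <= 1.0);
  for (double& bucket : buckets) {
    bucket *= forget;  // Fade the old histogram.
  }
  buckets[value] += 1.0 - forget;  // Give the freed-up share to the new sample.
}
```

If the buckets sum to 1 before the call, they sum to forget + (1 - forget) = 1 afterwards, up to floating-point error. In the Q30 version every right shift by 15 truncates, which is why the real code has to measure the drift in vector_sum and push it back into the first few buckets.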
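1. Multiply every bucket by forget_factor_ and accumulate the sum.
2. Increase the weight of the bucket that the new sample falls into.
3. Bring vector_sum back to exactly 1; the drift comes from rounding in the fixed-point arithmetic.
4. Update forget_factor_ so that it approaches base_forget_factor_. DelayManager takes the start_forget_weight_ branch, with start_forget_weight_ = 2 and base_forget_factor_ = 0.9993 (32745 in Q15). The two update rules are:
With a custom start_forget_weight_: forget_factor_ = 1 - start_forget_weight_ / (add_count_ + 1), converted to Q15 and clamped to the range [0, base_forget_factor_].
With the default rule: forget_factor_ moves about a quarter of the remaining distance toward base_forget_factor_ on each call. The + 3 is easy to misread. It is 3 in Q15 units (about 0.00009), so it is tiny, and its purpose is to make the right shift by 2 round up, so that a factor starting below base_forget_factor_ reaches it exactly instead of stalling just short of it.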
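The two forgetting-factor rules can be tried in isolation. The sketch below lifts them out of Add into free functions. The function names and the bias parameter are hypothetical additions for illustration; the arithmetic follows the code shown above. It assumes add_count is the value of add_count_ after it has been incremented.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-alone versions of the two update rules; not WebRTC code.

// Start-weight rule. |add_count| is add_count_ after the increment in Add().
// Returns the new forgetting factor in Q15, clamped to [0, base_q15].
int StartWeightForgetFactor(int add_count, int base_q15, double start_weight) {
  assert(add_count >= 0);
  int candidate = static_cast<int>(
      (1 << 15) * (1 - start_weight / (add_count + 1)));
  return std::max(0, std::min(base_q15, candidate));
}

// Default rule, one step. |bias| is 3 in the original code; passing 0 shows
// what happens without the rounding bias.
int DefaultForgetFactorStep(int current_q15, int base_q15, int bias = 3) {
  assert(current_q15 <= base_q15);
  return current_q15 + ((base_q15 - current_q15 + bias) >> 2);
}
```

With start_weight = 2 and base 32745, the sketch yields 0, 10922 and 16384 for the first three samples, so early samples are forgotten quickly and the factor only settles at 32745 after a few thousand samples. Iterating the default step from 0 reaches 32745 with the bias of 3, but stops at 32742 with a bias of 0, because a remaining distance of 3 shifts down to zero.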
Getting the current delay:
int Histogram::Quantile(int probability) {
  // Find the bucket for which the probability of observing an
  // inter-arrival time larger than or equal to |index| is larger than or
  // equal to |probability|. The sought probability is estimated using
  // the histogram as the reverse cumulant PDF, i.e., the sum of elements from
  // the end up until |index|. Now, since the sum of all elements is 1
  // (in Q30) by definition, and since the solution is often a low value for
  // |iat_index|, it is more efficient to start with |sum| = 1 and subtract
  // elements from the start of the histogram.
  int inverse_probability = (1 << 30) - probability;
  size_t index = 0;   // Start from the beginning of |buckets_|.
  int sum = 1 << 30;  // Assign to 1 in Q30.
  sum -= buckets_[index];
  while ((sum > inverse_probability) && (index < buckets_.size() - 1)) {
    // Subtract the probabilities one by one until the sum is no longer greater
    // than |inverse_probability|.
    ++index;
    sum -= buckets_[index];
  }
  return static_cast<int>(index);
}
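Quantile turns the probability (given in Q30) into a delay.
It returns the smallest index B at which the cumulative sum of buckets_[0] through buckets_[B] reaches probability, or the last index if that sum is never reached.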
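A small worked example helps to check the loop. The sketch below copies the Quantile logic into a free function that takes the buckets as an argument. QuantileOf and ExampleBuckets are hypothetical names, and the four-bucket histogram is made up so that the shares (1/8, 1/2, 1/4, 1/8) are exact in Q30.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-alone copy of the Quantile loop for experimentation; QuantileOf is a
// hypothetical name, not a WebRTC function.
int QuantileOf(const std::vector<int>& buckets, int probability_q30) {
  assert(!buckets.empty());
  int inverse_probability = (1 << 30) - probability_q30;
  std::size_t index = 0;
  int sum = (1 << 30) - buckets[index];  // Share of samples above bucket 0.
  while (sum > inverse_probability && index < buckets.size() - 1) {
    ++index;
    sum -= buckets[index];
  }
  return static_cast<int>(index);
}

// Made-up histogram: shares 1/8, 1/2, 1/4, 1/8, which sum to exactly 1 << 30.
std::vector<int> ExampleBuckets() {
  return {1 << 27, 1 << 29, 1 << 28, 1 << 27};
}
```

For this histogram a probability of 0.5 (1 << 29) gives index 1, because buckets 0 and 1 together already cover 5/8 of the samples. The DelayManager quantile of 0.97 (1041529569) gives index 3, since the first three buckets only cover 7/8.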
Histogram parameters used by DelayManager:
struct DelayHistogramConfig {
  int quantile = 1041529569;  // 0.97 in Q30.
  int forget_factor = 32745;  // 0.9993 in Q15.
  absl::optional<double> start_forget_weight = 2;
};