Evaluation Metrics for Clustering
程序员文章站
2022-07-14 19:35:55
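Normalized Mutual Information (NMI)
1. Definition
Normalized mutual information is a symmetric measure of the statistical information shared between two clusterings of the same data, typically the labels produced by an algorithm and the ground-truth labels. With the normalization used here it lies between 0 and 1, and it reflects clustering quality: the larger the value, the more closely the two partitions agree and the better the clustering.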
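In symbols, the quantity computed by the code in this post can be written as below. The notation is introduced here for illustration and is not from the original post: $X$ and $Y$ are the two label assignments, $p(x)$ and $p(y)$ are the fractions of points with each label, and $p(x,y)$ is the fraction of points that carry label $x$ in one assignment and label $y$ in the other. This is the geometric-mean normalization; other variants divide by the arithmetic mean or the maximum of the two entropies instead.

```latex
I(X;Y) = \sum_{x}\sum_{y} p(x,y)\,\log_2\frac{p(x,y)}{p(x)\,p(y)},
\qquad
H(X) = -\sum_{x} p(x)\,\log_2 p(x),
\qquad
\mathrm{NMI}(X,Y) = \frac{I(X;Y)}{\sqrt{H(X)\,H(Y)}}
```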
2. Code
1) NMI function:
function [ NMI ] = NMI( id_new,id_real )
% Function_NMI: hard clustering measure: normalized mutual information
%   id_new  - cluster labels produced by the algorithm (integer vector)
%   id_real - ground-truth labels (integer vector of the same length)
data_num = length(id_real);
% Use the label range of both inputs, so labels that occur only in id_new are counted too
all_id = [id_new(:); id_real(:)];
id_label = min(all_id):max(all_id);
center_num = length(id_label);
%% compute the probability of each cluster
P_id_new = zeros(1,center_num);
P_id_real = zeros(1,center_num);
for i = 1:center_num
    P_id_new(i) = sum(id_new(:)==id_label(i))/data_num;
    P_id_real(i) = sum(id_real(:)==id_label(i))/data_num;
end
%% compute entropy
% Skip empty clusters: 0*log2(0) evaluates to NaN in MATLAB
H_new = 0;
H_real = 0;
for i = 1:center_num
    if P_id_new(i) > 0
        H_new = H_new - P_id_new(i)*log2(P_id_new(i));
    end
    if P_id_real(i) > 0
        H_real = H_real - P_id_real(i)*log2(P_id_real(i));
    end
end
%% compute the mutual information
count_new_real = zeros(center_num,center_num); % number of data points belonging to both Ri and Qj
for i = 1:center_num
    for j = 1:center_num
        for n = 1:data_num
            if ((id_new(n) == id_label(i)) && (id_real(n) == id_label(j)))
                count_new_real(i,j) = count_new_real(i,j) + 1;
            end
        end
    end
end
P_new_real = count_new_real/data_num;
NMI = 0;
for i = 1:center_num
    for j = 1:center_num
        if (P_new_real(i,j) ~= 0)
            NMI = NMI + P_new_real(i,j)*log2(P_new_real(i,j)/(P_id_new(i)*P_id_real(j)));
        end
    end
end
% Normalize by the geometric mean of the two entropies
% (the result is NaN if either labeling consists of a single cluster)
NMI = NMI/sqrt(H_new*H_real);
end
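The triple loop above scans every data point once per pair of labels. A more compact way to build the same contingency table is sketched below using `unique` and `accumarray`. The function name `nmi_vectorized` is hypothetical and not part of the original post; the sketch assumes both inputs are numeric vectors of equal length and would be saved in its own file, nmi_vectorized.m.

```matlab
function nmi = nmi_vectorized(id_new, id_real)
% NMI_VECTORIZED  Illustrative vectorized variant of NMI.m (hypothetical name).
% unique() remaps arbitrary numeric labels to indices 1..k.
[~, ~, ia] = unique(id_new(:));    % cluster index of each point in id_new
[~, ~, ib] = unique(id_real(:));   % class index of each point in id_real
n = numel(ia);

% Contingency table: C(i,j) = number of points in cluster i and class j.
C = accumarray([ia(:), ib(:)], 1);
P = C / n;                         % joint distribution
p_new  = sum(P, 2);                % marginal of id_new (column vector)
p_real = sum(P, 1);                % marginal of id_real (row vector)

% Entropies; every marginal entry is positive because unique() only
% returns labels that actually occur.
H_new  = -sum(p_new  .* log2(p_new));
H_real = -sum(p_real .* log2(p_real));

% Mutual information, summed over the nonzero joint cells only.
E  = p_new * p_real;               % outer product of the marginals
nz = P > 0;
MI = sum(P(nz) .* log2(P(nz) ./ E(nz)));

% Geometric-mean normalization; NaN if either input has a single cluster.
nmi = MI / sqrt(H_new * H_real);
end
```

Because the labels are remapped by `unique`, this variant does not need them to be integers or to form a contiguous range.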
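2) test.m:
close all;
clear;
clc;
A = [1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3];
B = [1 2 1 1 1 1 1 2 2 2 2 3 1 1 3 3 3];
% Store the result under a different name so the variable does not shadow the NMI function
nmi_value = NMI( A,B );
disp(nmi_value);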
3. Result
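For reference, the value expected for the two test vectors can be worked out by hand. The figures below are a hand calculation rounded to four decimals, not output reproduced from the original post. Here $n_{ij}$ counts the points with label $i$ in A and label $j$ in B, and $a_i$ and $b_j$ are the row and column sums of that table.

```latex
N = 17, \qquad
(n_{ij}) = \begin{pmatrix} 5 & 1 & 0 \\ 1 & 4 & 1 \\ 2 & 0 & 3 \end{pmatrix},
\qquad a = (6,\,6,\,5), \qquad b = (8,\,5,\,4)

H(A) = -2\cdot\tfrac{6}{17}\log_2\tfrac{6}{17} - \tfrac{5}{17}\log_2\tfrac{5}{17} \approx 1.5799

H(B) = -\tfrac{8}{17}\log_2\tfrac{8}{17} - \tfrac{5}{17}\log_2\tfrac{5}{17} - \tfrac{4}{17}\log_2\tfrac{4}{17} \approx 1.5222

I(A;B) = \sum_{n_{ij} > 0} \frac{n_{ij}}{N}\,\log_2\frac{N\,n_{ij}}{a_i\,b_j} \approx 0.5654

\mathrm{NMI}(A,B) = \frac{0.5654}{\sqrt{1.5799 \times 1.5222}} \approx 0.3646
```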