Evaluation Metrics for Clustering

程序员文章站 (Programmer Article Station), 2022-07-14 19:35:55

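Normalized Mutual Information (NMI)

1. Definition

Normalized mutual information (NMI) is a symmetric measure of the statistical information shared between two cluster assignments of the same data. When one of the assignments is the ground-truth labelling, NMI reflects the quality of the clustering: the larger the value, the better the clustering matches the reference labels.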
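
The definition can be written out explicitly. The formulas below are a sketch of what the code in the next section computes (base-2 logarithms, normalization by the geometric mean of the two entropies). $R_i$ and $Q_j$ denote the clusters of the two assignments, following the comment in the code; $N$ for the number of data points is notation assumed here.

```latex
\begin{aligned}
P(R_i) &= \frac{|R_i|}{N}, \qquad
P(Q_j) = \frac{|Q_j|}{N}, \qquad
P(R_i, Q_j) = \frac{|R_i \cap Q_j|}{N} \\[4pt]
H(R) &= -\sum_i P(R_i)\,\log_2 P(R_i), \qquad
H(Q) = -\sum_j P(Q_j)\,\log_2 P(Q_j) \\[4pt]
I(R;Q) &= \sum_i \sum_j P(R_i, Q_j)\,
          \log_2 \frac{P(R_i, Q_j)}{P(R_i)\,P(Q_j)} \\[4pt]
\mathrm{NMI}(R, Q) &= \frac{I(R;Q)}{\sqrt{H(R)\,H(Q)}}
\end{aligned}
```

Terms with $P(R_i, Q_j) = 0$ contribute nothing to the sum. Other normalizations appear in the literature (for example, dividing by the arithmetic mean of the two entropies); with the geometric mean used here the value lies between 0 and 1.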

2. Code

1) The NMI function (NMI.m):

function [ nmi ] = NMI( id_new, id_real )
% NMI  Hard clustering measure: normalized mutual information.
%   id_new  : cluster labels produced by the algorithm (vector)
%   id_real : reference labels (vector of the same length)
id_new = id_new(:);
id_real = id_real(:);
data_num = length(id_real);
if length(id_new) ~= data_num
    error('NMI:sizeMismatch', 'id_new and id_real must have the same length.');
end
% Take the label set of each vector separately, so labels need not be
% consecutive integers or shared between the two assignments.
label_new = unique(id_new);
label_real = unique(id_real);
num_new = length(label_new);
num_real = length(label_real);
%% compute the probability of each cluster
P_id_new = zeros(1,num_new);
P_id_real = zeros(1,num_real);
for i = 1:num_new
    P_id_new(i) = sum(id_new == label_new(i))/data_num;
end
for j = 1:num_real
    P_id_real(j) = sum(id_real == label_real(j))/data_num;
end
%% compute entropy (all probabilities are > 0, so log2 never sees 0)
H_new = -sum(P_id_new.*log2(P_id_new));
H_real = -sum(P_id_real.*log2(P_id_real));
%% compute the joint distribution
% count_new_real(i,j): number of points belonging to both Ri and Qj
count_new_real = zeros(num_new,num_real);
for i = 1:num_new
    for j = 1:num_real
        count_new_real(i,j) = sum(id_new == label_new(i) & id_real == label_real(j));
    end
end
P_new_real = count_new_real/data_num;
%% compute mutual information, skipping empty cells
MI = 0;
for i = 1:num_new
    for j = 1:num_real
        if P_new_real(i,j) ~= 0
            MI = MI + P_new_real(i,j)*log2(P_new_real(i,j)/(P_id_new(i)*P_id_real(j)));
        end
    end
end
%% normalize; the result is NaN if either assignment has a single cluster
nmi = MI/sqrt(H_new*H_real);
end

2) test.m:

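close all;
clear;
clc;
A = [1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3];
B = [1 2 1 1 1 1 1 2 2 2 2 3 1 1 3 3 3];
% Store the result under a different name so the variable does not shadow
% the function NMI on later calls.
nmi_value = NMI(A, B);
fprintf('NMI = %.4f\n', nmi_value);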

3. Results

[Figure: screenshot of the program output; image not available.]
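
As a cross-check, the value for the test vectors can be worked out by hand. This is a hand computation under the definitions used in the code (base-2 logarithms, normalization by $\sqrt{H(A)H(B)}$), not a transcription of the original screenshot. Rows of the count matrix correspond to the clusters of A and columns to the clusters of B; $N = 17$.

```latex
\begin{aligned}
\text{counts} &=
\begin{pmatrix} 5 & 1 & 0 \\ 1 & 4 & 1 \\ 2 & 0 & 3 \end{pmatrix},
\qquad
\text{row sums } (6, 6, 5), \quad \text{column sums } (8, 5, 4) \\[4pt]
H(A) &= -2\cdot\tfrac{6}{17}\log_2\tfrac{6}{17}
        - \tfrac{5}{17}\log_2\tfrac{5}{17} \approx 1.5799 \\[4pt]
H(B) &= -\tfrac{8}{17}\log_2\tfrac{8}{17}
        - \tfrac{5}{17}\log_2\tfrac{5}{17}
        - \tfrac{4}{17}\log_2\tfrac{4}{17} \approx 1.5222 \\[4pt]
I(A;B) &= \sum_{n_{ij} > 0} \frac{n_{ij}}{17}
          \log_2 \frac{17\, n_{ij}}{a_i\, b_j} \approx 0.5654 \\[4pt]
\mathrm{NMI}(A, B) &= \frac{I(A;B)}{\sqrt{H(A)\,H(B)}}
  \approx \frac{0.5654}{\sqrt{1.5799 \times 1.5222}} \approx 0.3646
\end{aligned}
```

Here $n_{ij}$ is the entry in row $i$ and column $j$ of the count matrix, and $a_i$, $b_j$ are the row and column sums.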
