2.3 The metrics.roc_curve evaluation metric in sklearn

程序员文章站 (Programmer Article Station), 2024-01-16 21:37:34

from sklearn.metrics import roc_curve

roc_curve(y_true, y_score, *, pos_label=None, sample_weight=None, drop_intermediate=True)

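 - Parameters
	y_true : ndarray of shape (n_samples,)
	True binary labels. If the labels are neither {-1, 1} nor {0, 1}, pos_label should be given explicitly.

	y_score : ndarray of shape (n_samples,)
	Predicted scores, with the same shape as y_true. These are normally probability estimates or other continuous confidence values for the positive class. Hard label predictions are also accepted, but the resulting curve then has at most one point besides its two end points.

	pos_label : int or str, default=None
	The label of the positive class. When pos_label=None, pos_label is set to 1 if y_true is in {-1, 1} or {0, 1}; otherwise an error is raised.

	sample_weight : array-like of shape (n_samples,), default=None
	Sample weights.

	drop_intermediate : bool, default=True
	Whether to drop some suboptimal thresholds that would not appear on a plotted ROC curve.

 - Returns
	fpr: false positive rate at each threshold, FP / (FP + TN)
	tpr: true positive rate (recall) at each threshold, TP / (TP + FN)
	thresholds: decreasing thresholds on y_score at which fpr and tpr are computed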
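
To make the drop_intermediate parameter more concrete, here is a minimal sketch that calls roc_curve twice on a small toy data set and compares how many thresholds come back. The arrays are illustrative assumptions, not output from a real model.

```python
import numpy as np
from sklearn.metrics import roc_curve

# Illustrative toy data (an assumption for this sketch).
y_true = np.array([0, 1, 1, 0])
y_score = np.array([0.1, 0.3, 0.6, 0.8])

# Default: thresholds that add nothing to the plotted curve may be dropped.
_, _, thr_dropped = roc_curve(y_true, y_score, drop_intermediate=True)

# Keep every distinct score as a threshold.
_, _, thr_full = roc_curve(y_true, y_score, drop_intermediate=False)

print("with dropping:   ", thr_dropped)
print("without dropping:", thr_full)
```

Turning the option off is mainly useful when every distinct score is needed as a candidate threshold; the plotted curve itself should look the same either way.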

1. Example

>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=2)
>>> fpr
array([0. , 0. , 0.5, 0.5, 1. ])
>>> tpr
array([0. , 0.5, 0.5, 1. , 1. ])
>>> thresholds
array([1.8 , 0.8 , 0.4 , 0.35, 0.1 ])
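
One way to check the arrays above is to recompute the two rates by hand at every returned threshold. The sketch below does that with a small helper, rates_at, which is a hypothetical name introduced only for this illustration and is not part of sklearn.

```python
import numpy as np
from sklearn.metrics import roc_curve

y = np.array([1, 1, 2, 2])
scores = np.array([0.1, 0.4, 0.35, 0.8])
fpr, tpr, thresholds = roc_curve(y, scores, pos_label=2)


def rates_at(threshold, y_true, y_score, pos_label):
    """Hypothetical helper: (fpr, tpr) when scores >= threshold count as positive."""
    is_pos = y_true == pos_label
    pred_pos = y_score >= threshold
    tp = np.sum(pred_pos & is_pos)    # positives correctly flagged
    fp = np.sum(pred_pos & ~is_pos)   # negatives wrongly flagged
    return fp / np.sum(~is_pos), tp / np.sum(is_pos)


# Recompute both rates at each threshold that roc_curve returned.
manual = np.array([rates_at(t, y, scores, 2) for t in thresholds])

print(np.allclose(manual[:, 0], fpr), np.allclose(manual[:, 1], tpr))
```

The first threshold lies above every score, so no sample is predicted positive there and both rates are 0.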

2. Discussion of y_score

1) y_score contains label data

# y_score contains label data
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 1, 1, 0])
y_score = np.array([0, 1, 0, 0])
fpr, tpr, threshold = roc_curve(y_true, y_score)
print('fpr=', fpr)
print('tpr=', tpr)
print('threshold=', threshold)

# fpr= [0. 0. 1.]
# tpr= [0.  0.5 1. ]
# threshold= [2 1 0]

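The returned threshold array consists of the distinct values in y_score plus one extra leading value, 'maximum + 1', sorted in descending order; each element is one threshold. With drop_intermediate=True some of the distinct values may be left out. scikit-learn 1.3 and later report np.inf in the leading position instead of 'maximum + 1'.
tpr and fpr are the rates at each of those thresholds. At threshold 2, no score reaches the threshold, so the thresholded predictions are [0, 0, 0, 0]; then TP=0 and FP=0, so tpr=0/2=0 and fpr=0/2=0. The remaining thresholds work the same way.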
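
The description of the thresholds can be checked with a short sketch. It assumes drop_intermediate=False so that no distinct score is removed, and it skips the first threshold because its value depends on the scikit-learn version.

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 1, 1, 0])
y_score = np.array([0, 1, 0, 0])

# drop_intermediate=False so that every distinct score is kept as a threshold.
_, _, thresholds = roc_curve(y_true, y_score, drop_intermediate=False)

# Distinct scores, sorted in descending order.
distinct_desc = np.unique(y_score)[::-1]

# Skip thresholds[0]: 'maximum + 1' in older releases, np.inf in newer ones.
print(np.array_equal(thresholds[1:], distinct_desc))
```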

2) y_score contains probability values

# y_score contains probability values
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 1, 1, 0])
y_score = np.array([0.1, 0.3, 0.6, 0.8])
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print('fpr=', fpr)
print('tpr=', tpr)
print('threshold=', thresholds)

# fpr= [0.  0.5 0.5 1. ]
# tpr= [0. 0. 1. 1.]
# threshold= [1.8 0.8 0.3 0.1]

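The thresholds are read the same way as in the previous case. Note that 0.6 is a value in y_score but is missing from the output: with drop_intermediate=True, roc_curve left it out because its point (fpr=0.5, tpr=0.5) lies on the straight segment between the points for 0.8 and 0.3.
With 0.8 as the threshold, scores in y_score greater than or equal to 0.8 are positive (1) and the rest are negative (0), so the thresholded predictions are [0, 0, 0, 1]; tpr=0/2=0 and fpr=1/2=0.5. The other thresholds follow the same pattern.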
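
The arithmetic for the 0.8 threshold can also be reproduced from a confusion matrix. This is only a sketch of one way to check it; using sklearn.metrics.confusion_matrix here is a choice made for the illustration, not something roc_curve requires.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 1, 0])
y_score = np.array([0.1, 0.3, 0.6, 0.8])

# Scores at or above the threshold become the positive class (1).
y_pred = (y_score >= 0.8).astype(int)

# For binary labels [0, 1], ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

tpr = tp / (tp + fn)  # true positive rate
fpr = fp / (fp + tn)  # false positive rate
print("tpr =", tpr, "fpr =", fpr)
```

Changing 0.8 to 0.3 or 0.1 gives the remaining points on the curve.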
