A walkthrough of the imutils source: how it calls basic OpenCV functions

程序员文章站 (Programmer Article Station), 2022-06-08 13:09:26

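imutils is an image-processing toolkit that wraps a number of OpenCV functions to make them simpler to use. OpenCV has a fairly steep learning curve, and many of its functions assume some background knowledge, so beginners can be slow to get started. imutils is more convenient, and reading it is also a good way to understand OpenCV itself.

This article walks through the imutils source to see how it calls OpenCV, covering its most commonly used image functions.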

Translation

Source:

def translate(image, x, y):
    # define the translation matrix and perform the translation
    M = np.float32([[1, 0, x], [0, 1, y]])
    shifted = cv2.warpAffine(image, M, (image.shape[1], image.shape[0]))

    # return the translated image
    return shifted

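translate parameters:

image: the input image
x: horizontal shift in pixels; positive values move the image right
y: vertical shift in pixels; positive values move the image down

The function relies on OpenCV's warpAffine. Its parameters are:
cv2.warpAffine(src, M, dsize, flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT, borderValue=0)

src: the input image
M: the 2x3 transformation matrix
dsize: size of the output image, given as (width, height)
flags: interpolation method, optionally combined with other flags (int)
borderMode: how pixels outside the source image are extrapolated (int)
borderValue: fill value used with a constant border; 0 by default

imutils uses only the first three parameters. The translation itself is encoded in M: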
$M= \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ \end{bmatrix}$

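A positive $t_x$ shifts the image right, and a positive $t_y$ shifts it down.

With imutils you only pass $x$ and $y$; the function uses NumPy to build the matrix from those two numbers.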
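
To make the role of M concrete, here is a minimal NumPy-only sketch that pushes a single pixel coordinate through the same 2x3 matrix. The helper names translation_matrix and apply_affine are illustrative assumptions and are not part of imutils or OpenCV.

```python
import numpy as np


def translation_matrix(tx, ty):
    """Build the same 2x3 matrix that imutils.translate hands to cv2.warpAffine."""
    return np.float32([[1, 0, tx], [0, 1, ty]])


def apply_affine(M, x, y):
    """Map one pixel coordinate (x, y) through a 2x3 affine matrix (hypothetical helper)."""
    out = M @ np.array([x, y, 1.0], dtype=np.float32)
    return float(out[0]), float(out[1])


# Same shift as the usage example: 25 px right, 75 px up (negative y is up).
M = translation_matrix(25, -75)
moved = apply_affine(M, 100, 100)
print(moved)  # (125.0, 25.0)
```

The y coordinate gets smaller, which in image coordinates means the pixel moved toward the top of the image.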

Usage example:

# shift the image 25 pixels right and 75 pixels up
translated = imutils.translate(image, 25, -75)

[Figure: the image before and after translation]

Rotation

Source:

def rotate(image, angle, center=None, scale=1.0):
    # grab the dimensions of the image
    (h, w) = image.shape[:2]

    # if the center is None, initialize it as the center of
    # the image
    if center is None:
        center = (w // 2, h // 2)

    # perform the rotation
    M = cv2.getRotationMatrix2D(center, angle, scale)
    rotated = cv2.warpAffine(image, M, (w, h))

    # return the rotated image
    return rotated

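rotate parameters:

image: the input image
angle: rotation angle in degrees; positive values rotate counterclockwise
center: the center of rotation (optional; defaults to the center of the image)
scale: scale factor

The function relies on OpenCV's getRotationMatrix2D and warpAffine. warpAffine was covered above, so here are the parameters of getRotationMatrix2D:
cv2.getRotationMatrix2D(center, angle, scale)

center: the center of rotation
angle: rotation angle in degrees
scale: scale factor

It builds an affine transformation matrix that is then handed to warpAffine; in other words, it turns the rotation settings into a matrix that warpAffine understands.

Rotating an image by an angle $\theta$ is achieved with a transformation matrix of the form:
$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \\ \end{bmatrix}$

OpenCV, however, supports scaled rotation about an adjustable center, so you can rotate around any point you like. The modified transformation matrix is: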
$M=\begin{bmatrix} \alpha & \beta & (1-\alpha)\cdot center.x-\beta \cdot center.y \\ -\beta & \alpha & \beta \cdot center.x+(1-\alpha)\cdot center.y \\ \end{bmatrix}$

where $\alpha=scale \cdot \cos\theta$ and $\beta=scale \cdot \sin\theta$. Computing this matrix is exactly what getRotationMatrix2D does.
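
The formula can be checked with a short NumPy sketch that builds the matrix by hand and confirms two properties: the rotation center maps to itself, and a positive angle moves a point to the right of the center upward on screen. The function name rotation_matrix_2d is an assumption made for illustration; it mirrors the formula above rather than calling OpenCV.

```python
import math

import numpy as np


def rotation_matrix_2d(center, angle_deg, scale=1.0):
    """Build the 2x3 matrix from the formula above (illustrative re-implementation)."""
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])


M = rotation_matrix_2d((50, 40), 90)

# The center of rotation is a fixed point of the transform.
center_out = M @ np.array([50, 40, 1.0])

# A point 10 px to the right of the center ends up 10 px above it
# (smaller y means higher in the image), i.e. counterclockwise on screen.
right_out = M @ np.array([60, 40, 1.0])

print(np.round(center_out, 6), np.round(right_out, 6))
```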

With imutils the two steps are combined, so you only need to pass the image and the rotation angle.

Usage example:

# rotate in 90-degree steps: 0, 90, 180, 270
for angle in range(0, 360, 90):
    # rotate and display
    rotated = imutils.rotate(bridge, angle=angle)
    cv2.imshow("Angle=%d" % (angle), rotated)

[Figure: the rotated images]

Resizing

Source:

def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    # initialize the dimensions of the image to be resized and
    # grab the image size
    dim = None
    (h, w) = image.shape[:2]

    # if both the width and height are None, then return the
    # original image
    if width is None and height is None:
        return image

    # check to see if the width is None
    if width is None:
        # calculate the ratio of the height and construct the
        # dimensions
        r = height / float(h)
        dim = (int(w * r), height)

    # otherwise, the height is None
    else:
        # calculate the ratio of the width and construct the
        # dimensions
        r = width / float(w)
        dim = (width, int(h * r))

    # resize the image
    resized = cv2.resize(image, dim, interpolation=inter)

    # return the resized image
    return resized

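resize parameters:

image: the input image
width: width of the output image
height: height of the output image (pass either width or height and the other is scaled to keep the aspect ratio; if both are given, width takes precedence)
inter: interpolation method, cv2.INTER_AREA by default

The function relies on OpenCV's resize, called here with these arguments:

image: the input image
dim: the output size, as a (width, height) tuple
interpolation: interpolation method

imutils only changes how the target size is specified. If you already know the full output size or a scale factor (cv2.resize also accepts fx and fy scale factors), you can call cv2.resize directly; if you know only the target width or height, imutils.resize computes the other side for you.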
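
The size calculation can be isolated from the actual resampling. The sketch below reproduces only the arithmetic from the source above in plain Python; target_size is a hypothetical helper name, not an imutils function.

```python
def target_size(w, h, width=None, height=None):
    """Return the (width, height) that imutils.resize would pass to cv2.resize.

    w, h are the original dimensions; this mirrors the branch logic of the
    source shown above and does no image processing.
    """
    if width is None and height is None:
        return (w, h)
    if width is None:
        r = height / float(h)
        return (int(w * r), height)
    # width is given; height, if also given, is ignored
    r = width / float(w)
    return (width, int(h * r))


print(target_size(800, 600, width=400))             # (400, 300)
print(target_size(800, 600, height=150))            # (200, 150)
print(target_size(800, 600, width=400, height=50))  # (400, 300)
```

The last call shows that when both arguments are supplied, the width decides the result.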

Usage example:

# loop over several target widths
for width in (400, 300, 200, 100):
    # resize and display
    resized = imutils.resize(workspace, width=width)
    cv2.imshow("Width=%dpx" % (width), resized)

[Figure: the resized images]

Skeletonization

Source:

def skeletonize(image, size, structuring=cv2.MORPH_RECT):
    # determine the area (i.e. total number of pixels in the image),
    # initialize the output skeletonized image, and construct the
    # morphological structuring element
    area = image.shape[0] * image.shape[1]
    skeleton = np.zeros(image.shape, dtype="uint8")
    elem = cv2.getStructuringElement(structuring, size)

    # keep looping until the erosions remove all pixels from the
    # image
    while True:
        # erode and dilate the image using the structuring element
        eroded = cv2.erode(image, elem)
        temp = cv2.dilate(eroded, elem)

        # subtract the temporary image from the original, eroded
        # image, then take the bitwise 'or' between the skeleton
        # and the temporary image
        temp = cv2.subtract(image, temp)
        skeleton = cv2.bitwise_or(skeleton, temp)
        image = eroded.copy()

        # if there are no more 'white' pixels in the image, then
        # break from the loop
        if area == area - cv2.countNonZero(image):
            break

    # return the skeletonized image
    return skeleton

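skeletonize parameters:

image: the input image
size: size of the structuring element (kernel)
structuring: shape of the structuring element (kernel)
Think of the structuring element as a blackboard eraser that is swept across the whole image, changing it as it goes.

The function uses several OpenCV functions. Taking them one at a time:
cv2.getStructuringElement(structuring, size)

structuring: shape of the structuring element
size: size of the structuring element
This only defines the size and shape of the "eraser"; what it does to the image is decided by the operations below.

cv2.erode(image, kernel, iterations=1)

image: the input image
kernel: the structuring element
iterations: number of times to apply the operation (optional; defaults to 1)
Erosion thins white regions and removes small specks of noise; how much depends on the kernel size.

cv2.dilate(image, kernel, iterations=1)

image: the input image
kernel: the structuring element
iterations: number of times to apply the operation (optional; defaults to 1)
Dilation thickens white regions. Erosion shrinks lines while removing noise, so dilating afterwards restores the surviving lines to roughly their original thickness.

cv2.subtract(image1, image2, dst=None, mask=None, dtype=None)

image1: the first input image
image2: the second input image
This subtracts the eroded-then-dilated image from the current image, leaving only the pixels that the erode/dilate pair removed.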
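
One detail worth knowing is that cv2.subtract uses saturating arithmetic on uint8 images, whereas the plain NumPy minus operator wraps around. The sketch below shows the difference using NumPy only; the saturating version is emulated with np.clip and is an illustration, not OpenCV's implementation.

```python
import numpy as np

a = np.array([10, 200], dtype=np.uint8)
b = np.array([20, 50], dtype=np.uint8)

# Plain uint8 subtraction wraps modulo 256: 10 - 20 becomes 246.
wrapped = a - b

# Saturating subtraction clamps negative results to 0 instead.
saturated = np.clip(a.astype(np.int16) - b.astype(np.int16), 0, 255).astype(np.uint8)

print(wrapped.tolist())    # [246, 150]
print(saturated.tolist())  # [0, 150]
```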

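cv2.bitwise_or(image1, image2, dst=None, mask=None)

image1: the first input image
image2: the second input image
mask: an optional mask that restricts the operation to selected pixels of the image
This applies a bitwise OR to every pixel: 1|1=1, 1|0=1, 0|1=1, 0|0=0. In skeletonize it accumulates the pixels found in each iteration into the skeleton.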
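
The OR truth table can be confirmed on pixel values with NumPy's bitwise_or, used here as a stand-in for the OpenCV call on two small hypothetical binary images (255 is white, 0 is black).

```python
import numpy as np

a = np.array([255, 255, 0, 0], dtype=np.uint8)
b = np.array([255, 0, 255, 0], dtype=np.uint8)

# A pixel is white in the result if it is white in either input.
combined = np.bitwise_or(a, b)
print(combined.tolist())  # [255, 255, 255, 0]
```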

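Skeletonization constructs the "topological skeleton" of an object in an image, where the object is assumed to be white on a black background.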
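
The loop in skeletonize can be followed step by step with a NumPy-only sketch. It assumes a binary uint8 image and a fixed 3x3 rectangular structuring element, and it treats everything outside the image as black so that the loop always terminates; that border handling is a simplification chosen for this sketch and may differ from cv2.erode's default. All function names are illustrative.

```python
import numpy as np


def _neighborhood_stack(img, pad_value):
    """Stack the nine shifted copies of img that form each pixel's 3x3 neighborhood."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="constant", constant_values=pad_value)
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])


def erode3(img):
    """3x3 erosion: a pixel stays white only if its whole neighborhood is white."""
    return _neighborhood_stack(img, 0).min(axis=0)


def dilate3(img):
    """3x3 dilation: a pixel becomes white if any neighbor is white."""
    return _neighborhood_stack(img, 0).max(axis=0)


def skeletonize_np(img):
    """Morphological skeleton following the same loop structure as the source above."""
    img = img.copy()
    skeleton = np.zeros_like(img)
    while img.any():
        eroded = erode3(img)
        opened = dilate3(eroded)
        # opened never exceeds img, so this uint8 subtraction cannot wrap
        residue = img - opened
        skeleton |= residue
        img = eroded
    return skeleton


# A white bar, 3 pixels tall and 5 pixels wide, on a black background.
bar = np.zeros((7, 7), dtype=np.uint8)
bar[2:5, 1:6] = 255
skeleton = skeletonize_np(bar)
print(skeleton)
```

For this bar the result is a single-pixel-high line along its middle row, which is the behavior the library function aims for on thicker shapes.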

Usage example:

gray = cv2.cvtColor(logo, cv2.COLOR_BGR2GRAY)
skeleton = imutils.skeletonize(gray, size=(3, 3))
cv2.imshow("Skeleton", skeleton)

[Figure: the image before and after skeletonization]

Converting to a Matplotlib-displayable format

Source:

def opencv2matplotlib(image):
    # OpenCV represents images in BGR order; however, Matplotlib
    # expects the image in RGB order, so simply convert from BGR
    # to RGB and return
    return cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

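Rather than converting to a special Matplotlib format, this simply converts BGR to RGB. In OpenCV's Python bindings, images are NumPy arrays in BGR channel order. That works fine with cv2.imshow, but Matplotlib expects images in RGB order, so a BGR image shown with Matplotlib has its red and blue channels swapped.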
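
For a 3-channel image, swapping BGR to RGB amounts to reversing the last axis of the array. The sketch below shows this with plain NumPy slicing on a hypothetical one-pixel image; it illustrates the channel order and is not how imutils implements the conversion.

```python
import numpy as np

# A single pure-blue pixel as OpenCV would store it: (B, G, R) = (255, 0, 0).
bgr = np.zeros((1, 1, 3), dtype=np.uint8)
bgr[0, 0] = (255, 0, 0)

# Reversing the channel axis yields RGB order: (R, G, B) = (0, 0, 255).
rgb = bgr[..., ::-1]
print(rgb[0, 0].tolist())  # [0, 0, 255]
```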

[Figure: the same image displayed in Matplotlib with the two channel orders]

Automatic Canny edge detection

Source:

def auto_canny(image, sigma=0.33):
    # compute the median of the single channel pixel intensities
    v = np.median(image)

    # apply automatic Canny edge detection using the computed median
    lower = int(max(0, (1.0 - sigma) * v))
    upper = int(min(255, (1.0 + sigma) * v))
    edged = cv2.Canny(image, lower, upper)

    # return the edged image
    return edged

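This function is a helper for cv2.Canny. First, the signature of cv2.Canny:
cv2.Canny(image, threshold1, threshold2[, edges[, apertureSize[, L2gradient]]])

image: the input image
threshold1, threshold2: the lower and upper thresholds of the hysteresis step, which decide which gradients count as edges

Good thresholds are hard to pick by hand, and calling cv2.Canny directly does not always give a usable result. imutils derives them from the image itself: it takes the median pixel intensity v and uses (1 - sigma) * v and (1 + sigma) * v, clamped to the range 0 to 255, which often gives reasonable thresholds.

Note that the function is intended for grayscale (single-channel) images.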
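
The threshold rule is easy to try without running edge detection at all. The sketch below copies only the arithmetic from the source; canny_thresholds is a hypothetical helper name, and sigma=0.5 is passed explicitly so the numbers are easy to verify by hand.

```python
import numpy as np


def canny_thresholds(image, sigma=0.33):
    """Return the (lower, upper) thresholds that auto_canny would pass to cv2.Canny."""
    v = np.median(image)
    lower = int(max(0, (1.0 - sigma) * v))
    upper = int(min(255, (1.0 + sigma) * v))
    return lower, upper


# A mid-gray image: the thresholds sit symmetrically around the median.
mid_gray = np.full((4, 4), 100, dtype=np.uint8)
print(canny_thresholds(mid_gray, sigma=0.5))  # (50, 150)

# A bright image: the upper threshold is clamped at 255.
bright = np.full((4, 4), 250, dtype=np.uint8)
print(canny_thresholds(bright, sigma=0.5))    # (125, 255)
```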

Usage example:

gray = cv2.cvtColor(logo, cv2.COLOR_BGR2GRAY)
edgeMap = imutils.auto_canny(gray)
cv2.imshow("Original", logo)
cv2.imshow("Automatic Edge Map", edgeMap)

[Figure: the image before and after edge detection]

Original article: https://blog.csdn.net/weixin_44613063/article/details/107043805
