OpenCV Study Notes (1): The core Module

程序员文章站 (Programmer Article Station), 2022-05-16 10:35:44


Mat
Scanning images
Kernel
Input and output
Pixel transformations
Basic drawing
Fourier transform
Parallel computing
Reading video
Writing video

Mat

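A Mat consists of two parts: a header (matrix size, storage method, address of the matrix data, and so on) and the pixel matrix itself. For speed, assignment and the copy constructor copy only the header and share the pixel matrix. The pixel matrix is reference-counted, and the last Mat that refers to it is responsible for freeing the memory.
Mat F = A.clone(); or Mat G; A.copyTo(G); copies the pixel matrix as well.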
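The difference between sharing and copying can be sketched without OpenCV. The TinyMat type below is a hypothetical stand-in that uses std::shared_ptr for the reference count; it is an illustration of the idea, not how cv::Mat is actually implemented.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for cv::Mat: a small header (rows, cols) plus a
// reference-counted pixel buffer. Not OpenCV's actual implementation.
struct TinyMat {
    int rows = 0;
    int cols = 0;
    std::shared_ptr<std::vector<unsigned char>> pixels;

    TinyMat(int r, int c)
        : rows(r), cols(c),
          pixels(std::make_shared<std::vector<unsigned char>>(
              static_cast<std::size_t>(r) * c, 0)) {}

    // Deep copy: allocates a new buffer and copies every pixel,
    // in the spirit of Mat::clone() and Mat::copyTo().
    TinyMat clone() const {
        TinyMat out(rows, cols);
        *out.pixels = *pixels;
        return out;
    }

    unsigned char& at(int y, int x) {
        return (*pixels)[static_cast<std::size_t>(y) * cols + x];
    }

    // Number of headers currently sharing this pixel buffer.
    long refCount() const { return pixels.use_count(); }
};
```

With TinyMat a(2, 2); TinyMat b = a; a write through b is visible through a, because the implicit copy only duplicates the header. A write through a.clone() leaves a untouched.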

See the official documentation for usage details.

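Color spaces

RGB: the most common, because the human eye builds up color in a similar way
BGR: the channel order OpenCV uses by default
HSV and HLS: split color into hue, saturation, and value/lightness, a more natural way to describe color
YCrCb: used by the JPEG image format
CIE L*a*b*: perceptually uniform, useful for measuring the distance between two colors

Data types are named CV_[The number of bits per item][Signed or Unsigned][Type Prefix]C[The channel number].
For example, CV_8UC3 is 8-bit unsigned with 3 channels.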
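To make the naming pattern concrete, here is a small sketch. Both helpers, cvTypeName and bytesPerPixel, are hypothetical names introduced for illustration; they only format a string and do arithmetic, and they are not part of OpenCV.

```cpp
#include <string>

// Hypothetical helper: builds a type name following the pattern
// CV_<bits><U|S|F>C<channels>. It only formats a string.
std::string cvTypeName(int bits, char kind, int channels) {
    return "CV_" + std::to_string(bits) + kind + "C" + std::to_string(channels);
}

// Bytes occupied by one pixel: (bits per item / 8) * channels.
int bytesPerPixel(int bits, int channels) {
    return bits / 8 * channels;
}
```

For instance, cvTypeName(8, 'U', 3) gives the string "CV_8UC3", and one such pixel takes bytesPerPixel(8, 3), that is 3 bytes.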

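Scanning images

An image often has more colors than an algorithm needs: with three 8-bit channels there are $256 \times 256 \times 256 = 16777216$ possible colors. That many values can hurt performance, and many algorithms do not need such precision, so we can reduce the number of colors (color reduction). For example, apply p = p/10*10 to every channel value, where the integer division truncates toward zero. Because multiplying and dividing for every pixel is comparatively slow, we can compute the result for each of the 256 possible input values once and store them in a lookup table; afterwards each pixel needs only a table read.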

const int divideWith = 10;  // reduction step; 10 is an example value
uchar table[256];
for (int i = 0; i < 256; ++i)
    table[i] = (uchar)(divideWith * (i / divideWith));
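The table-building loop above can be turned into a complete, OpenCV-free sketch. The function names buildReductionTable and reduceColors are made up for this example, the pixel data is assumed to be a flat vector of 8-bit channel values, and divideWith is assumed to be positive.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Precompute table[i] = divideWith * (i / divideWith) for all 256 values.
// Assumes divideWith > 0.
std::array<std::uint8_t, 256> buildReductionTable(int divideWith) {
    std::array<std::uint8_t, 256> table{};
    for (int i = 0; i < 256; ++i)
        table[i] = static_cast<std::uint8_t>(divideWith * (i / divideWith));
    return table;
}

// Replace every channel value in place by a table lookup.
void reduceColors(std::vector<std::uint8_t>& pixels,
                  const std::array<std::uint8_t, 256>& table) {
    for (std::uint8_t& p : pixels)
        p = table[p];
}
```

With a step of 10, the values 7, 19, and 255 become 0, 10, and 250.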

To measure how long a piece of code takes to run:

double t = (double)getTickCount();
// do something ...
t = ((double)getTickCount() - t)/getTickFrequency();
cout << "Times passed in seconds: " << t << endl;
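The same measure-before-and-after pattern can be written with the standard library alone. The sketch below swaps getTickCount() and getTickFrequency() for std::chrono::steady_clock; secondsTaken is a hypothetical helper name.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <typename F>
double secondsTaken(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Pass the code to be timed as a lambda, for example secondsTaken([&] { /* do something */ }). A single run is noisy, so averaging several runs gives a steadier number.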

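Use isContinuous() to check whether the matrix is stored as one contiguous block, with no padding at the end of each row. If it is, the whole image can be treated as a single long row, which speeds up scanning.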

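Pointer access: the efficient way
Iterators: safer, but a little slower
Mat::at<T>(): meant for random access; slow, not suited to scanning a whole image
cv::LUT(): the fastest option, since the library implementation is optimized and can run multithreaded

Mat lookUpTable(1, 256, CV_8U);  // the lookup table
uchar* p = lookUpTable.ptr();
for (int i = 0; i < 256; ++i)
    p[i] = table[i];
LUT(I, lookUpTable, J);  // I is the input image, J is the output image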

Kernel

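A kernel (mask) recomputes each element of a matrix from the element and its neighbors, weighted by the kernel entries. Sharpening an image with a Laplacian-style kernel is one example. Calling filter2D is faster than writing the loops by hand.

filter2D(src, dst1, src.depth(), kernel);  // input, output, depth of the output, kernel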
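For reference, this is roughly what the hand-written loop looks like. It is a simplified sketch, not filter2D itself: it assumes a single-channel 8-bit image stored row by row, handles only 3x3 kernels, and copies border pixels unchanged instead of extrapolating them. filter3x3 is a hypothetical name.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Apply a 3x3 kernel to a single-channel 8-bit image stored row by row.
// Border pixels are copied unchanged to keep the sketch short; this is
// a simplification, not what filter2D does at the border.
std::vector<std::uint8_t> filter3x3(const std::vector<std::uint8_t>& src,
                                    int rows, int cols,
                                    const int (&kernel)[3][3]) {
    std::vector<std::uint8_t> dst(src);
    for (int y = 1; y + 1 < rows; ++y) {
        for (int x = 1; x + 1 < cols; ++x) {
            int sum = 0;
            for (int ky = -1; ky <= 1; ++ky)
                for (int kx = -1; kx <= 1; ++kx)
                    sum += kernel[ky + 1][kx + 1] *
                           src[(y + ky) * cols + (x + kx)];
            // Clamp to the valid 8-bit range.
            dst[y * cols + x] =
                static_cast<std::uint8_t>(std::min(255, std::max(0, sum)));
        }
    }
    return dst;
}
```

A common sharpening kernel is {{0, -1, 0}, {-1, 5, -1}, {0, -1, 0}}: a pixel of 20 surrounded by 10s becomes 5 * 20 - 4 * 10 = 60, so it stands out more from its neighbors.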

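Input and output

// Read an image from a file. The default result is a 3-channel 8-bit BGR image (CV_8UC3);
// the second argument is a read flag, e.g. IMREAD_GRAYSCALE loads the image as grayscale.
Mat img = imread(filename);
imwrite(filename, img);  // write the image to a file
Rect r(10, 10, 100, 100);
Mat smallImg = img(r);  // region of interest (ROI); shares pixels with img
cvtColor(img, grey, COLOR_BGR2GRAY);  // convert the color space
src.convertTo(dst, CV_32F);  // convert the data type
// Read a pixel of a single-channel 8-bit image (row first); same as img.at<uchar>(Point(x, y)).
img.at<uchar>(y, x);

namedWindow("image", WINDOW_AUTOSIZE);  // optional; needed to change the window properties or when using cv::createTrackbar
// 8U images are shown as is; other depths are rescaled by imshow, so convert them yourself for a predictable result.
imshow("image", img);
waitKey();  // wait for a key press for the given number of milliseconds; the default 0 waits forever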

Mat img = imread("image.jpg");
Mat sobelx;
Sobel(img, sobelx, CV_32F, 1, 0);

See also the documentation on color space conversion and on Sobel.

Pixel transformations

Adding (blending) two images: $dst = \alpha \cdot src1 + \beta \cdot src2 + \gamma$

addWeighted(src1, alpha, src2, beta, 0.0, dst);  // src1 and src2 must have the same size and the same data type

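Changing the brightness and contrast of an image:
The documentation presents two methods, with a caveat: They are basic techniques and are not intended to be used as a replacement of a raster graphics editor!

$g(i, j) = \alpha \cdot f(i, j) + \beta$
1. Write the loop yourself and apply the formula above to every pixel. $\alpha$ is the contrast factor (gain) and $\beta$ controls the brightness (bias). Use cv::saturate_cast to make sure the values are valid.
2. Or call convertTo: image.convertTo(new_image, -1, alpha, beta);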
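The first method can be sketched as follows. saturateToByte is a hypothetical stand-in for cv::saturate_cast<uchar>: it rounds and clamps to [0, 255], though its tie-breaking on exact halves may differ from OpenCV's. The pixel data is assumed to be a flat vector of 8-bit values.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Round to the nearest integer and clamp to [0, 255]; a stand-in for
// cv::saturate_cast<uchar> (tie-breaking on exact halves may differ).
std::uint8_t saturateToByte(double v) {
    if (v <= 0.0) return 0;
    if (v >= 255.0) return 255;
    return static_cast<std::uint8_t>(std::lround(v));
}

// g(i, j) = alpha * f(i, j) + beta applied to every pixel value.
std::vector<std::uint8_t> adjust(const std::vector<std::uint8_t>& f,
                                 double alpha, double beta) {
    std::vector<std::uint8_t> g;
    g.reserve(f.size());
    for (std::uint8_t v : f)
        g.push_back(saturateToByte(alpha * v + beta));
    return g;
}
```

With alpha = 1.5 and beta = 20, the values 0, 100, and 200 map to 20, 170, and 255. The last one would be 320 without saturation, which shows where the linear transform clips bright regions.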

Gamma correction: a non-linear transform that, unlike the linear formula above, does not lose image information to clipping in cv::saturate_cast.

Mat lookUpTable(1, 256, CV_8U);
uchar* p = lookUpTable.ptr();
for (int i = 0; i < 256; ++i)
    p[i] = saturate_cast<uchar>(pow(i / 255.0, gamma_) * 255.0);
Mat res = img.clone();
LUT(img, lookUpTable, res);
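The table itself can be built and inspected without OpenCV. In the sketch below, buildGammaTable is a hypothetical name and std::lround stands in for saturate_cast; since $(i / 255)^{\gamma}$ stays between 0 and 1 for positive $\gamma$, every entry lands in [0, 255] without clamping.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// table[i] = round(255 * (i / 255)^gamma); the result always stays
// inside [0, 255], so no clamping is needed for gamma > 0.
std::array<std::uint8_t, 256> buildGammaTable(double gamma) {
    std::array<std::uint8_t, 256> table{};
    for (int i = 0; i < 256; ++i)
        table[i] = static_cast<std::uint8_t>(
            std::lround(std::pow(i / 255.0, gamma) * 255.0));
    return table;
}
```

The endpoints 0 and 255 map to themselves for any positive gamma. A gamma below 1 brightens the dark tones (with gamma 0.5, the value 64 maps to 128), and a gamma above 1 darkens them.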

Class for parsing the command line: cv::CommandLineParser

Basic drawing

Look these functions up when you need them. The official sample shows how to:

Draw a line by using the OpenCV function line()
Draw an ellipse by using the OpenCV function ellipse()
Draw a rectangle by using the OpenCV function rectangle()
Draw a circle by using the OpenCV function circle()
Draw a filled polygon by using the OpenCV function fillPoly()
Draw text by using the OpenCV function putText()

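Fourier transform

The transform takes an image from the spatial domain to the frequency domain. It rests on the idea that any function can be expressed as a sum of infinitely many $\sin$ and $\cos$ functions.
The Fourier transform of a two-dimensional image is:

$F(k, l) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i, j) e^{-i 2 \pi (\frac{k i}{N} + \frac{l j}{N})}$

$e^{i x} = \cos x + i \sin x$

Here f is the image in the spatial domain and F is its value in the frequency domain. The i that multiplies $2 \pi$ in the exponent, like the i in Euler's formula, is the imaginary unit, not the summation index.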
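The double sum can be evaluated literally for a small image. The sketch below is a naive $O(N^4)$ implementation for checking the formula on tiny inputs; it is not how cv::dft computes the transform, and naiveDft2d is a hypothetical name. The image is assumed to be square and stored row by row.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^4) DFT of an N x N real image stored row by row.
// Returns F(k, l) at index k * N + l.
std::vector<std::complex<double>> naiveDft2d(const std::vector<double>& f, int N) {
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> F(static_cast<std::size_t>(N) * N);
    for (int k = 0; k < N; ++k)
        for (int l = 0; l < N; ++l) {
            std::complex<double> sum(0.0, 0.0);
            for (int i = 0; i < N; ++i)
                for (int j = 0; j < N; ++j) {
                    double angle =
                        -2.0 * pi * (double(k * i) / N + double(l * j) / N);
                    // Euler's formula: e^{i x} = cos x + i sin x
                    sum += f[i * N + j] *
                           std::complex<double>(std::cos(angle), std::sin(angle));
                }
            F[k * N + l] = sum;
        }
    return F;
}
```

For a constant 4 x 4 image of ones, F(0, 0) is the sum of all pixels, 16, and every other coefficient is zero up to rounding error: a flat image has no frequency content beyond its mean.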

The speed of the DFT depends on the image size: it tends to be fastest when the size is a product of powers of 2, 3, and 5.

getOptimalDFTSize() returns this optimal size, and copyMakeBorder() can expand the borders of the image to reach it (the appended pixels are initialized with zero).

Mat padded;                            // expand input image to optimal size
int m = getOptimalDFTSize( I.rows );   // smallest size >= I.rows of the form 2^p * 3^q * 5^r
int n = getOptimalDFTSize( I.cols );
// on the border add zero values
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
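What "optimal size" means can be illustrated with a brute-force search for the next number whose only prime factors are 2, 3, and 5. This is a sketch of the idea under that assumption, not OpenCV's implementation; isFriendlySize and nextFriendlySize are hypothetical names.

```cpp
#include <initializer_list>

// True if n has no prime factors other than 2, 3, and 5.
bool isFriendlySize(int n) {
    if (n < 1) return false;
    for (int p : {2, 3, 5})
        while (n % p == 0) n /= p;
    return n == 1;
}

// Smallest size >= n of the form 2^p * 3^q * 5^r (linear search).
int nextFriendlySize(int n) {
    if (n < 1) n = 1;
    while (!isFriendlySize(n)) ++n;
    return n;
}
```

So the result is not limited to powers of two: 7 pads to 8, but 11 pads to 12, 13 to 15, and 17 to 18.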

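The result of a Fourier transform is complex, so each real input value produces two output values. Allocate space for the output first: create a second plane for the imaginary part and combine the two planes with merge().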

Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI);         // Add to the expanded another plane with zeros

Perform the transform:

dft(complexI, complexI);            // this way the result may fit in the source matrix

Compute the magnitude:

$M = \sqrt{Re(DFT(I))^2 + Im(DFT(I))^2}$

split(complexI, planes);                   // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];

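The magnitudes span too wide a range to display on screen: a few large values would show as white and everything else as black. Switching to a logarithmic scale, $M_1 = \log(1 + M)$, compresses the range: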

magI += Scalar::all(1);                    // switch to logarithmic scale
log(magI, magI);
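The magnitude and log steps amount to two elementwise operations, sketched here on plain vectors. logMagnitude is a hypothetical name, and the two input planes are assumed to have the same length.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// For each element: M = sqrt(re^2 + im^2), then log(1 + M).
// Assumes re and im have the same length.
std::vector<float> logMagnitude(const std::vector<float>& re,
                                const std::vector<float>& im) {
    std::vector<float> out(re.size());
    for (std::size_t i = 0; i < re.size(); ++i) {
        const float m = std::hypot(re[i], im[i]);  // magnitude
        out[i] = std::log1p(m);                    // log(1 + m)
    }
    return out;
}
```

Adding 1 before taking the logarithm keeps a zero magnitude at zero instead of sending it to minus infinity. A coefficient of 3 + 4i has magnitude 5 and is displayed as log(6).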

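Crop the spectrum and rearrange its quadrants. dft() puts the zero-frequency component at the top-left corner; swapping the quadrants diagonally moves the origin to the center of the image, which is the usual way to view a spectrum: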

// crop the spectrum, if it has an odd number of rows or columns
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
// rearrange the quadrants of Fourier image  so that the origin is at the image center
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy));   // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy));  // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy));  // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp;                           // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp);                    // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
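The effect of the swap is easier to see on a tiny array. The sketch below does the same diagonal exchange element by element on a row-major vector; swapQuadrants is a hypothetical name, and rows and cols are assumed to be even, which is what the crop step guarantees.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Swap quadrants diagonally (top-left with bottom-right, top-right with
// bottom-left) in a row-major image whose rows and cols are both even.
void swapQuadrants(std::vector<float>& img, int rows, int cols) {
    const int cx = cols / 2;
    const int cy = rows / 2;
    for (int y = 0; y < cy; ++y) {
        for (int x = 0; x < cx; ++x) {
            // top-left <-> bottom-right
            std::swap(img[y * cols + x], img[(y + cy) * cols + (x + cx)]);
            // top-right <-> bottom-left
            std::swap(img[y * cols + (x + cx)], img[(y + cy) * cols + x]);
        }
    }
}
```

On a 4 x 4 image, the element at row 0, column 0 (the zero-frequency term after a DFT) ends up at row 2, column 2, the center of the image.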

Again for display purposes, normalize the values to the range 0 to 1:

normalize(magI, magI, 0, 1, NORM_MINMAX); // Transform the matrix with float values into a
                                            // viewable image form (float between values 0 and 1).
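Min-max normalization to [0, 1] is a linear rescale, sketched here on a plain vector. normalizeMinMax is a hypothetical name, and mapping a flat input to all zeros is a choice made for this sketch rather than documented OpenCV behavior.

```cpp
#include <algorithm>
#include <vector>

// Linearly rescale values so the minimum maps to 0 and the maximum to 1.
void normalizeMinMax(std::vector<float>& values) {
    if (values.empty()) return;
    const auto mm = std::minmax_element(values.begin(), values.end());
    const float lo = *mm.first;
    const float hi = *mm.second;
    if (hi == lo) {  // flat input: avoid dividing by zero
        std::fill(values.begin(), values.end(), 0.0f);
        return;
    }
    for (float& v : values)
        v = (v - lo) / (hi - lo);
}
```

The values 2, 4, and 6 become 0, 0.5, and 1.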

Parallel computing

cv::parallel_for_ can run a loop in parallel to speed up computation.

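Reading video

Use the cv::VideoCapture class.

The constructor accepts either the path of a video file or the index of a camera device (starting from 0).
open() opens a file or a camera
isOpened() reports whether opening succeeded
The overloaded >> operator or read(Mat) reads the next frame into a Mat
set(key, val) sets a property
release() releases the resources held by the object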

Writing video

Use the cv::VideoWriter class.
Its usage is much like that of VideoCapture, so it is not repeated here; see the official example for details.
