opencv-python Image Processing Common Methods Summary (Part 1)

The cover image was published by Pexels on Pixabay; I blended the OpenCV logo into it.

I hadn't touched anything related to image processing since finishing my graduation project. After joining the company, I learned front-end development and worked as a front-end developer (hence the Vue study notes). When I came back to image processing, I found I had completely forgotten how to call the functions, so I decided to put together a guide.

I habitually write import cv2 as cv, so all the calls below go through cv. In the function formats, dst is the Mat object of the output image and src is the Mat object of the source image (the one read in with imread). In Python, both are NumPy arrays.

Get the dimensions of an image#

Printing src.shape shows (height, width, number of channels) for a color image; a grayscale image has only (height, width). To get the height and width of the image, use the following statement:

src_height, src_width = src.shape[0:2]
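As a small sketch of the statement above, the snippet below uses synthetic NumPy arrays in place of an image loaded with imread (an assumption, so that it runs without a file on disk). The array sizes are arbitrary. It shows why slicing with [0:2] is convenient: the same line works for both color and grayscale images.

```python
import numpy as np

# Stand-ins for images returned by cv.imread(); the sizes are arbitrary.
color_img = np.zeros((480, 640, 3), dtype=np.uint8)  # height 480, width 640, 3 channels
gray_img = np.zeros((480, 640), dtype=np.uint8)      # grayscale: no channel axis

# The [0:2] slice takes (height, width) and ignores the channel count if present.
src_height, src_width = color_img.shape[0:2]
gray_height, gray_width = gray_img.shape[0:2]

print(color_img.shape, gray_img.shape)
print(src_height, src_width)
```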

Image resizing function resize()#

Common function format:

dst = cv.resize(src, dsize)

Here dsize is the output size as a tuple in (width, height) order, for example (int(src_width / 2), int(src_height / 2)). It is the target size in pixels, not a scaling factor. This example halves both dimensions, so the aspect ratio is preserved; adjust the values as needed.

Image color conversion function cvtColor()#

Common function format:

dst = cv.cvtColor(src, colorCode)

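Here colorCode is one of the conversion codes defined in the library, such as cv.COLOR_BGR2GRAY. Images read with imread are in BGR channel order, so the BGR codes are usually the right choice rather than their RGB counterparts such as cv.COLOR_RGB2GRAY.

This function is most commonly used for grayscale conversion.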

Image denoising method GaussianBlur()#

Common function format:

dst = cv.GaussianBlur(src, ksize, sigmaX)

ksize is the size of the convolution kernel, given as a tuple of positive odd numbers such as (3, 3) or (5, 5). Put simply, it is the neighborhood size: each output pixel is computed from the pixels inside that window around it. The larger the ksize, the blurrier the result.

sigmaX is σX, the standard deviation of the Gaussian kernel in the X direction. It is a required parameter. If σY is not specified, it is set equal to σX; if sigmaX is 0, it is computed from ksize.

Image binarization function threshold()#

Common function format:

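retval, dst = cv.threshold(src, thresh, maxval, type)

Note that the function returns two values: retval is the threshold that was actually used, and dst is the resulting image. thresh is the threshold value, and maxval is the value assigned to pixels that pass the test; maxval only takes effect when type is cv.THRESH_BINARY or cv.THRESH_BINARY_INV.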

type is the binarization method, which has corresponding values in the library, as shown below:

cv.THRESH_BINARY: Set the value to maxval if the current point is greater than thresh, otherwise set it to 0.

cv.THRESH_BINARY_INV: Set the value to 0 if the current point is greater than thresh, otherwise set it to maxval.

cv.THRESH_TRUNC: Set the value to thresh if the current point is greater than thresh, otherwise leave it unchanged.

cv.THRESH_TOZERO: Leave the value unchanged if the current point is greater than thresh, otherwise set it to 0.

cv.THRESH_TOZERO_INV: Set the value to 0 if the current point is greater than thresh, otherwise leave it unchanged.

Canny edge detection Canny()#

Common function format:

dst = cv.Canny(src, thresh1, thresh2)

The two thresholds are applied to the gradient magnitude at each pixel, not to the pixel value itself. Pixels whose gradient is below thresh1 are treated as non-edges, pixels whose gradient is above thresh2 are treated as edges, and pixels in between are kept as edges only if they are connected to a pixel that is already an edge.

Contour detection findContours() and contour drawing drawContours()#

Common function formats:

contours, hierarchy = cv.findContours(src, mode, method)
dst = cv.drawContours(src, contours, contourIdx, color, thickness)

In findContours, src should be a binary image. The two return values shown are for OpenCV 4.x; OpenCV 3.x returns three values, with the image first. mode is the contour retrieval mode; for example, cv.RETR_TREE builds the complete hierarchy of contours. method is the contour approximation method; for example, cv.CHAIN_APPROX_SIMPLE stores as few points as possible to represent each contour.

In drawContours, contours is the list detected in the previous step, and contourIdx specifies which contour to draw; if it is -1, all contours are drawn. color is a tuple such as (255, 0, 0); OpenCV uses BGR order, so this value is blue, not red. thickness specifies the thickness of the contour line and is an optional parameter. Note that drawContours draws directly onto src, so pass a copy if you need to keep the original.

Contour detection is usually run on the image after edge detection, which generally gives good contours.

Hough transform HoughLines()#

Common function format:

lines = cv.HoughLines(src, rho, theta, thresh)

The output is the set of detected lines, each represented in polar coordinates as (rho, theta). In the arguments, rho is the distance resolution in pixels, theta is the angle resolution in radians, and thresh is the minimum number of accumulator votes a line needs in order to be returned. In practical use, unless there are special circumstances, I generally set them to 1, np.pi / 180, 0. A threshold of 0 returns every candidate line, so raise it if you only want strong lines.

This function is used for line detection. One application is straightening a skewed image: rotating it by the angle that corresponds to the slope of a detected line. The following code does this:

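import math
import cv2 as cv
import numpy as np
from scipy import ndimage

# img_canny: edge image from cv.Canny(); img_resize: the image to rotate
lines = cv.HoughLines(img_canny, 1, np.pi / 180, 0)
img_result = img_resize  # fall back to the unrotated image
if lines is not None:  # HoughLines returns None when nothing is found
    rho, theta = lines[0][0]  # first returned line, in polar form
    a = np.cos(theta)
    b = np.sin(theta)
    # (x0, y0) is the point on the line closest to the origin
    x0 = a * rho
    y0 = b * rho
    # Two points on the line, 1000 px either side of (x0, y0)
    x1 = int(x0 + 1000 * (-b))
    y1 = int(y0 + 1000 * a)
    x2 = int(x0 - 1000 * (-b))
    y2 = int(y0 - 1000 * a)
    # Vertical and horizontal lines are skipped, as in the original loop
    if x1 != x2 and y1 != y2:
        t = float(y2 - y1) / (x2 - x1)  # slope of the line
        rotate_angle = math.degrees(math.atan(t))
        img_result = ndimage.rotate(img_resize, rotate_angle)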
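As a side note on the arithmetic above: the slope computed from the two points works out to -cos(theta) / sin(theta), so the resulting angle is simply theta minus 90 degrees. The sketch below checks this numerically. Both helper names are hypothetical, and the check leaves out the int() rounding of the endpoints, so the real code can differ by a small fraction of a degree.

```python
import math

def angle_from_slope(theta):
    """Angle in degrees via the slope of the line, as in the code above (no int rounding)."""
    a, b = math.cos(theta), math.sin(theta)
    t = a / (-b)  # (y2 - y1) / (x2 - x1); undefined for vertical lines (b == 0)
    return math.degrees(math.atan(t))

def angle_from_theta(theta):
    """Shortcut: the line's direction is perpendicular to its normal angle theta."""
    return math.degrees(theta) - 90.0

# Compare the two for a few normal angles in the open interval (0, pi).
for theta in (0.3, 1.0, 2.0, 3.0):
    print(round(angle_from_slope(theta), 6), round(angle_from_theta(theta), 6))
```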