The Mat Structure

Mat is how OpenCV represents images. It is the underlying data structure that allows OpenCV to process and manipulate images and image sequences.

One upside of using OpenCV is that it provides memory management out of the box, so the programmer doesn’t have to worry about it. This is possible because of the way this data structure is designed.

The Mat structure has two parts. The first is its header. Think of it as a metadata store: it holds information such as the size of the matrix, the storage method used, and the address in memory at which the matrix is stored. The second part is a pointer to the matrix containing the pixel values.

How is that beneficial? Copying an image to another variable becomes cheap, because the pixel data is not duplicated:

Mat A = imread(argv[1], IMREAD_COLOR);
Mat B = A; // only the header is copied; A and B share the same pixel data

The following abridged class definition comes from the official OpenCV documentation:

class CV_EXPORTS Mat
{
public:
    // ... a lot of methods ...
    ...

    /*! includes several bit-fields:
        - the magic signature
        - continuity flag
        - depth
        - number of channels
    */
    int flags;
    //! the array dimensionality, >= 2
    int dims;
    //! the number of rows and columns or (-1, -1) when the array has more than 2 dimensions
    int rows, cols;
    //! pointer to the data
    uchar* data;

    //! pointer to the reference counter;
    // when array points to user-allocated data, the pointer is NULL
    int* refcount;

    // other members
    ...
};
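
The refcount member is what makes header-only copies safe: every header that shares a pixel buffer bumps the counter, and the buffer is freed only when the last header goes away. To get a feel for this without OpenCV, here is a toy model in plain C++. ToyMat and all of its members are hypothetical names made up for this sketch, and std::shared_ptr stands in for OpenCV's own counter, so treat it as an illustration of the idea and not as how Mat is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy type, NOT OpenCV's Mat: a small header plus a shared pixel buffer.
struct ToyMat {
    int rows = 0;
    int cols = 0;
    // Reference-counted buffer; copying a ToyMat copies this pointer, not the pixels.
    std::shared_ptr<std::vector<unsigned char>> data;

    ToyMat() = default;
    ToyMat(int r, int c, unsigned char fill)
        : rows(r), cols(c),
          data(std::make_shared<std::vector<unsigned char>>(
              static_cast<std::size_t>(r) * static_cast<std::size_t>(c), fill)) {}

    // Deep copy: allocates a fresh buffer, similar in spirit to Mat::clone().
    ToyMat clone() const {
        ToyMat out;
        out.rows = rows;
        out.cols = cols;
        if (data) {
            out.data = std::make_shared<std::vector<unsigned char>>(*data);
        }
        return out;
    }

    // Access element (i, j) of the single-channel toy image.
    unsigned char& at(int i, int j) {
        assert(data && i >= 0 && i < rows && j >= 0 && j < cols);
        return (*data)[static_cast<std::size_t>(i) * static_cast<std::size_t>(cols) + j];
    }

    // Number of headers currently sharing the buffer.
    long owners() const { return data.use_count(); }
};
```

With this model, `ToyMat b = a;` leaves both headers pointing at one buffer, so a write through b is visible through a. A `clone()` gets its own buffer and its own counter, so later writes to the clone do not affect the original.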

Mat can be used to store real or complex-valued vectors and matrices, grayscale or color images, voxel volumes, vector fields, point clouds, tensors, and histograms. A voxel (volumetric pixel) is the three-dimensional equivalent of a pixel, the smallest distinguishable element of a 3D object. A point cloud is a set of data points in space.

The data layout of an array M is defined by the array M.step[]. The memory address of each element is calculated by a simple expression that generalizes to higher dimensions. Here it is for the easier-to-understand case of a 2-dimensional array:

\[addr(M_{i,j}) = M.data + M.step[0]*i + M.step[1]*j\]
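
To see the formula with actual numbers, here is a small sketch in plain C++ (no OpenCV required). The names Steps2D, packedSteps and elementOffset are hypothetical helpers invented for this example. It assumes a densely packed array, where step[1] is the size of one element in bytes and step[0] is the size of one row in bytes. In a real Mat, rows can be padded or can belong to a larger parent matrix, so step[0] may be bigger than this.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container for the two step values of a 2-dimensional array, in bytes.
struct Steps2D {
    std::size_t step0; // bytes from one row to the next
    std::size_t step1; // bytes from one element to the next within a row
};

// Steps for a densely packed array (assumption: no padding at the end of a row).
Steps2D packedSteps(std::size_t cols, std::size_t channels, std::size_t bytesPerChannel) {
    assert(channels > 0 && bytesPerChannel > 0);
    const std::size_t step1 = channels * bytesPerChannel; // size of one element
    return Steps2D{cols * step1, step1};
}

// Byte offset of element (i, j) from the start of the data:
// offset = step[0]*i + step[1]*j, so addr(M_{i,j}) = data + offset.
std::size_t elementOffset(const Steps2D& s, std::size_t i, std::size_t j) {
    return s.step0 * i + s.step1 * j;
}
```

For a 640-column, 3-channel, 8-bit image, `packedSteps(640, 3, 1)` gives a row step of 1920 bytes and an element step of 3 bytes, so the pixel at row 2, column 5 starts 2*1920 + 5*3 = 3855 bytes past the start of the data.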

It is important to understand the basic constructs of this data structure, as it helps us write OpenCV code that avoids unnecessary copies.

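To explore this further, I wrote a small program that prints the process's memory usage after each step. It uses the Mach task_info API, so it builds only on macOS: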

#include <iostream>
#include <opencv2/opencv.hpp>
#include <stdio.h>
#include <mach/mach.h>


using namespace cv;

struct task_basic_info t_info;
mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;

void getMemUsage(){
    if (KERN_SUCCESS != task_info(mach_task_self(),
                                TASK_BASIC_INFO, (task_info_t)&t_info,
                                &t_info_count))
    {
        std::cout << "Memory Usage : -1" << std::endl;
    } else {
        std::cout << "Memory Usage : " << t_info.resident_size/1000000 << "MB. " << std::endl;
    }
}

int main(int argc, char** argv ) {
    std::cout << "Hello, World!" << std::endl;

    if ( argc != 2 )
    {
        printf("usage: DisplayImage.out <Image_Path>\n");
        return -1;
    }

    Mat image;
    getMemUsage(); // Memory Usage : 14MB. 

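    // imread() returns an empty Mat when the file cannot be read,
    // so check for that in addition to catching exceptions.
    try {
        image = imread( argv[1], IMREAD_COLOR );
    } catch( const cv::Exception& e ) {
        std::cout << "exception caught: " << e.what() << std::endl;
    }
    if ( image.empty() )
    {
        printf("could not read image: %s\n", argv[1]);
        return -1;
    }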

    getMemUsage(); // Memory Usage : 27MB.

    Mat another_image = image;

    getMemUsage(); // Memory Usage : 27MB. No change in memory utilization.

    Mat cloned_image = image.clone(); //Explicitly asking OpenCV to copy the data itself.

    getMemUsage(); // Memory Usage : 39MB. A jump of 12MB

    waitKey(0);
    return 0;
}
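
The size of the jump can be sanity-checked. clone() has to allocate a full pixel buffer, which for a densely packed image is roughly rows * cols * channels * bytes per channel. The dimensions of my test image are not part of the listing above, so the 2000 x 2000 figure used below is purely an assumed example. pixelBufferBytes and toDecimalMB are hypothetical helper names for this sketch. An image loaded with IMREAD_COLOR has 3 channels of 8 bits each.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed for the pixel buffer of a densely packed image.
std::size_t pixelBufferBytes(std::size_t rows, std::size_t cols,
                             std::size_t channels, std::size_t bytesPerChannel) {
    assert(channels > 0 && bytesPerChannel > 0);
    return rows * cols * channels * bytesPerChannel;
}

// Same unit the program above prints: decimal megabytes (bytes / 1000000).
std::size_t toDecimalMB(std::size_t bytes) {
    return bytes / 1000000;
}
```

For the assumed 2000 x 2000 color image, `pixelBufferBytes(2000, 2000, 3, 1)` is 12,000,000 bytes, which `toDecimalMB` reports as 12. That is the same order of magnitude as the jump seen after clone(), while the plain assignment costs only a few bytes of header.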