Wheat Varieties Identification Research Based on Sparse Representation Method of Dictionary Learning
Yanping Yang, Ruiguang Li, and Lijuan Feng

With the rapid development of computer vision technology, machine vision is widely used in place of manual inspection for product detection and classification. Conventional sparse representation methods need a large number of training samples to improve the representation ability of the dictionary. This results in a large dictionary and an immense memory requirement, which often leads to low efficiency in practical applications. In this paper, a novel method for the identification and classification of wheat varieties is presented, based on sparse representation with dictionary learning. In the proposed method, the K-SVD algorithm is used to train the feature dictionary, so that the number of atoms is effectively reduced compared with the method based on conventional sparse representation. A final test verifies the effectiveness and feasibility of the new method and compares it with the conventional identification and classification method for wheat varieties.


Introduction
Wheat is one of the largest crops in the world and is closely related to people's daily lives, so the identification and classification of different wheat varieties is of great significance. The earliest work on identifying and classifying wheat grain varieties relied mainly on the human visual system. This approach has several disadvantages, such as strong subjectivity, low efficiency and high uncertainty. With the rapid development of computer technology, machine vision is more and more widely used for the identification and classification of agricultural products. It effectively removes the uncertainty caused by subjective human factors and greatly improves the efficiency of variety identification for agricultural products [1] [2].
Researchers have proposed a variety of identification and classification methods for agricultural products such as rice, oatmeal, ryegrass and barley [3] [6], but the identification results for different varieties of wheat are not ideal. Sparse representation [7] [8], an optimization method based on L1-norm minimization, has been widely studied in the field of machine vision, and it is applied to the identification and classification of wheat varieties in [10]. The sparse representation method used in [10] needs a large number of training samples to construct the dictionary and so improve its sparse representation ability. However, a large dictionary makes the sparse coefficients harder to solve and requires a long computing time, so the method is inefficient in practical applications.
To overcome this issue, this paper improves the dictionary construction method and introduces the K-Singular Value Decomposition (K-SVD) algorithm to improve the effectiveness of the identification and classification of wheat varieties. The K-SVD algorithm updates the dictionary column by column, which avoids matrix inversion [9] and greatly reduces the amount of calculation.
In this paper, 4 kinds of wheat grains are used to present the identification and classification method. The color features, the morphological features and the texture parameters of the wheat grains are used as the three kinds of feature parameters of the different wheat varieties. The K-SVD dictionary learning method is introduced to generate a new feature dictionary, which not only guarantees the recognition rate but also significantly reduces the amount of calculation. A final test verifies the effectiveness and feasibility of the new method and compares it with the conventional identification and classification method for wheat varieties.

Wheat materials
Four kinds of wheat grain (Zhengmai103, Kaimai21, Zhoumai20, and Yubao1) are taken as the research objects. The sample images were acquired with an IDS (Germany) high-definition USB 3.0 industrial camera. For each kind of wheat, 160 grains were randomly selected as the training samples. Images of the four kinds of wheat grains are shown in Figure 1.

Color feature extraction
The color of the wheat grain is an important feature for distinguishing different varieties of wheat. The RGB color model is a common color model in digital image processing. The HIS model describes color by hue, intensity and saturation, which accords with the human visual system. In this paper, the RGB color model and the HIS color model are combined to identify the wheat grain. According to the color feature extraction formulas in [10], the mean values of the color features of the selected wheat grain samples are as follows:
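As an illustration of how such color features could be computed, the sketch below converts RGB pixels to hue, saturation and intensity with the commonly used geometric conversion and averages the six channels over the pixels of one grain. It is not the formula of [10], which is not reproduced in this text; the function names `rgb_to_hsi` and `mean_color_features` are hypothetical, and the hue is averaged arithmetically, which is a simplification that ignores the circular nature of hue.

```python
import math

def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel (components in [0, 1]) to (hue, saturation, intensity).

    Hue is returned in degrees in [0, 360); it is set to 0 for gray pixels,
    where it is undefined.
    """
    intensity = (r + g + b) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0.0:
        return 0.0, saturation, intensity  # gray pixel
    theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
    hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity

def mean_color_features(pixels):
    """Return the mean [R, G, B, H, S, I] over an iterable of 8-bit RGB pixels.

    `pixels` is assumed to hold only the pixels inside the grain region.
    """
    sums = [0.0] * 6
    count = 0
    for r8, g8, b8 in pixels:
        r, g, b = r8 / 255.0, g8 / 255.0, b8 / 255.0
        h, s, i = rgb_to_hsi(r, g, b)
        for k, value in enumerate((r, g, b, h, s, i)):
            sums[k] += value
        count += 1
    return [total / count for total in sums]

# Hypothetical grain region made of three brownish pixels.
print(mean_color_features([(180, 140, 90), (175, 138, 88), (182, 144, 95)]))
```

Six color means of this kind, together with the five morphological and four texture parameters described in the following sections, would be consistent with the 15 feature rows of the dictionary, although the exact color features used are an assumption here.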

Morphological features extraction
Shape difference is another discriminative feature of different wheat varieties. In this paper, the perimeter, area, roundness, rectangularity and elongation of the wheat grains are selected as five morphological feature parameters, denoted as L, A, C, R and E, respectively, in Table 2. The perimeter is obtained with the 8-connected chain code method. Denote the number of pixels in the vertical direction as $N_x$, the number in the horizontal direction as $N_y$, and the number of diagonal codes as $N_d$. Then the perimeter is

$$L = N_x + N_y + \sqrt{2}\,N_d .$$

The other morphological feature parameters of the wheat grains are given by formulas in which $a$ is the long axis and $b$ is the short axis of the grain.
As shown in Table 2, the wheat varieties differ in the morphological feature parameters extracted from the grains, such as area, roundness and elongation. For instance, the mean area of Kaimai21 is greater than that of the other wheat varieties. Therefore, the morphological features constitute the second important kind of feature parameter for classifying and identifying wheat varieties.
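The perimeter computation from a chain code can be sketched as follows. The sketch assumes the Freeman convention for 8-connected chain codes (codes 0 and 4 are horizontal moves, 2 and 6 are vertical moves, odd codes are diagonal moves) and the usual lengths of 1 for axis-aligned moves and √2 for diagonal moves; the function name `chain_code_perimeter` is hypothetical.

```python
import math

def chain_code_perimeter(chain):
    """Perimeter of a closed boundary given as an 8-connected Freeman chain code."""
    if any(code not in range(8) for code in chain):
        raise ValueError("chain codes must be integers from 0 to 7")
    n_horizontal = sum(1 for code in chain if code in (0, 4))
    n_vertical = sum(1 for code in chain if code in (2, 6))
    n_diagonal = sum(1 for code in chain if code % 2 == 1)
    # Axis-aligned moves have unit length, diagonal moves have length sqrt(2).
    return n_vertical + n_horizontal + math.sqrt(2.0) * n_diagonal

# A 2 x 2 square traced counter-clockwise, and a small diamond.
print(chain_code_perimeter([0, 0, 2, 2, 4, 4, 6, 6]))
print(chain_code_perimeter([1, 3, 5, 7]))
```

The square uses only axis-aligned moves and the diamond only diagonal moves, so the two calls exercise both terms of the sum separately.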

Texture feature extraction
The gray level co-occurrence matrix is used in this paper to describe texture [15]. According to the requirements of wheat variety identification, contrast (CON), correlation (COR), energy (ASM) and entropy (ENT) are taken as the four texture features. Table 3 shows that there are obvious differences between the mean texture feature values of different wheat varieties. As a result, the texture feature parameters are suitable for classifying and identifying wheat varieties.
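A minimal sketch of the four texture features is given below, using the standard textbook definitions on a symmetric, normalized co-occurrence matrix. The pixel offset, the number of gray levels and the base-2 logarithm in the entropy are assumptions, since the text does not state the settings used; the function names `glcm` and `texture_features` are hypothetical.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dx, dy).

    `image` must contain integer gray levels in the range [0, levels).
    """
    image = np.asarray(image)
    counts = np.zeros((levels, levels), dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r, c], image[r2, c2]
                counts[i, j] += 1
                counts[j, i] += 1  # count the pair in both directions
    return counts / counts.sum()

def texture_features(p):
    """Return (contrast, correlation, energy, entropy) of a normalized GLCM."""
    i, j = np.indices(p.shape)
    contrast = np.sum((i - j) ** 2 * p)
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    # Correlation is undefined for a constant image (zero standard deviation).
    correlation = np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j)
    energy = np.sum(p ** 2)  # angular second moment (ASM)
    nonzero = p[p > 0]
    entropy = -np.sum(nonzero * np.log2(nonzero))
    return contrast, correlation, energy, entropy

# Tiny 4-level example image.
example = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 2, 2, 2],
           [2, 2, 3, 3]]
print(texture_features(glcm(example, levels=4)))
```

In practice the features are often averaged over several offsets (for example 0°, 45°, 90° and 135°) to reduce the dependence on grain orientation.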

Construction of the wheat grain feature dictionary
Assume there are n classes of training samples, and each class has m training samples. Denote $a_1^i, a_2^i, \dots, a_m^i$ as the training samples that belong to the $i$-th class ($i = 1, 2, \dots, n$). In this paper, 160 grains of each kind of wheat are selected as the training samples, so the feature parameter dictionary for the 4 kinds of wheat grains is given by

$$A = [A_1, A_2, A_3, A_4] = [a_1^1, \dots, a_{160}^1, a_1^2, \dots, a_{160}^2, \dots, a_1^4, \dots, a_{160}^4],$$

where the number of rows is the number of feature parameters of each sample. In this paper, the size of A is 15 × 640.

Data normalization
Note that the feature parameters have different units and scales. Therefore, the data in the feature parameter dictionary should be normalized.
First, calculate the maximum of each feature over the training samples,

$$r_{i,\max} = \max\big(A(i,:)\big),$$

where $A(i,:)$ represents the $i$-th row of $A$ and $r_{i,\max}$ represents the maximum of the $i$-th row. The normalized $i$-th row is then

$$A^*(i,:) = A(i,:) / r_{i,\max}.$$

In the normalized matrix $A^*$ of $A$, the value of each feature parameter lies in [0, 1], so all features are on the same scale. Similarly, the test sample $y$ should be normalized by

$$y^*(i) = y(i) / r_{i,\max},$$

in which $y^*$ represents the normalized vector of $y$.
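The construction and normalization of the dictionary can be sketched as follows. The feature values are random placeholders rather than measured data, the function names are hypothetical, and the features are assumed to be non-negative with a positive row maximum so that the normalized values fall in [0, 1].

```python
import numpy as np

def build_dictionary(class_blocks):
    """Stack per-class feature matrices (features x samples) side by side."""
    return np.hstack(class_blocks)

def normalize_rows(A):
    """Divide every row (feature) of A by its maximum over the training samples."""
    row_max = A.max(axis=1, keepdims=True)
    return A / row_max, row_max

def normalize_sample(y, row_max):
    """Normalize a test sample with the row maxima of the training dictionary."""
    return np.asarray(y, dtype=float) / row_max.ravel()

# Placeholder data: 4 varieties, 160 grains each, 15 features per grain.
rng = np.random.default_rng(0)
blocks = [rng.random((15, 160)) + 0.1 for _ in range(4)]
A = build_dictionary(blocks)
A_norm, row_max = normalize_rows(A)
y_norm = normalize_sample(A[:, 0], row_max)
print(A.shape, A_norm.min(), A_norm.max())
```

Because the test sample is scaled by the training maxima, a test feature larger than any training value would map slightly above 1; the sketch leaves such values unclipped.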

Sparse representation based on the dictionary learning algorithm
In this paper, the K-SVD dictionary learning method is introduced to generate a new feature dictionary for the identification and classification of wheat varieties, which not only guarantees the recognition rate but also significantly reduces the amount of calculation.
Given a dictionary $A$ with $K$ atoms, a signal $y$ is represented sparsely as

$$y \approx Ax \quad \text{satisfying} \quad \|y - Ax\|_p \le \varepsilon, \qquad (14)$$

where $x \in R^K$ contains the coefficients of $y$, $\varepsilon$ is the allowed representation error, and the norm $p$ in the constraint can be the 1-norm, the 2-norm, or the infinity norm.
The sparse representation method based on dictionary learning is a two-step iteration. Based on (14), the following objective is built to obtain the sparse representation of the signals:

$$\min_{A, X} \|Y - AX\|_F^2 \quad \text{subject to} \quad \|x_i\|_0 \le T \ \text{for all } i,$$

where $Y = \{y_i\}$ is the set of signals to be processed, $X = \{x_i\}$ is a set of column vectors, each of which is the projection of the corresponding element of $Y$, $\|\cdot\|_F^2$ represents the square of the Frobenius norm, and $T$ is the threshold. In the first step, the dictionary $A$ is assumed to be known, and the sparse coefficient matrix $X$ is solved from

$$\min_{x_i} \|y_i - A x_i\|_2^2 \quad \text{subject to} \quad \|x_i\|_0 \le T.$$

In the second step, the coefficients are held fixed and the dictionary is updated column by column.
Due to noise and modeling error, the projection coefficients of a test sample $y$ are nonzero not only for its own class but also for some other classes. Therefore, the class of $y$ must be decided explicitly. For each wheat class $i$, calculate the projection of the test sample onto the atoms of that class in the dictionary; the class of the test sample is the one that minimizes the projection error.
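The two-step iteration can be illustrated with the compact K-SVD sketch below. It is a generic textbook variant, not the authors' MATLAB implementation: the initialization from random training columns, the iteration count, the simple greedy OMP routine and the names `omp` and `ksvd` are all assumptions, and the training data are random placeholders.

```python
import numpy as np

def omp(D, y, sparsity):
    """Greedy orthogonal matching pursuit: code y with at most `sparsity` atoms of D."""
    residual = y.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(sparsity):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Least-squares fit of y on the atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def ksvd(Y, n_atoms, sparsity, n_iter=10, seed=0):
    """Learn a dictionary D (features x n_atoms) and sparse codes X for the columns of Y."""
    rng = np.random.default_rng(seed)
    # Initialize with randomly chosen training columns scaled to unit norm.
    D = Y[:, rng.choice(Y.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    for _ in range(n_iter):
        # Step 1: sparse coding with the dictionary fixed.
        X = np.column_stack([omp(D, y, sparsity) for y in Y.T])
        # Step 2: update one atom (column) at a time.
        for k in range(n_atoms):
            users = np.flatnonzero(X[k])
            if users.size == 0:
                continue  # atom k is unused in this iteration
            X[k, users] = 0.0
            E = Y[:, users] - D @ X[:, users]  # error without atom k
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]            # best rank-1 fit gives the new atom
            X[k, users] = s[0] * Vt[0]   # and its coefficients
    return D, X

# Placeholder training data for one variety: 15 features, 160 grains.
Y = np.random.default_rng(1).random((15, 160))
D, X = ksvd(Y, n_atoms=50, sparsity=6)
print(D.shape, np.linalg.norm(Y - D @ X) / np.linalg.norm(Y))
```

Each atom update needs only the leading singular vectors of a small error matrix, so no matrix inverse of the full dictionary is formed. Training one such dictionary per variety and concatenating the four results would give a dictionary of the reduced size described in the experiments.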

Test and analysis
Parameter setting in the sparse representation method based on dictionary learning

Parameter setting: L
L represents the number of atoms used to encode each training sample in the sparse coding stage, that is, each sample is expressed as a linear combination of L atoms of the same class. In the K-SVD dictionary learning algorithm, the sparse coding stage uses the Orthogonal Matching Pursuit (OMP) algorithm. For a given signal, sparse coding has to decide how many atoms are needed to achieve the best effect, and there is no theory that prescribes how many atoms should be chosen from the dictionary to encode the training samples. If too few atoms are chosen, the training samples are not represented robustly: the coefficients are sparse, but the recognition result is not ideal. If too many atoms are chosen, the computational complexity increases and the coefficients are no longer sparse, which contradicts sparse representation theory. With the other parameters fixed, the parameter L is therefore obtained by validating candidate values one by one.
The relationship between the parameter L and the wheat variety recognition rate is shown in Figure 2. With the dictionary size unchanged, the recognition rate is highest when L = 6.
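The validation procedure for L can be sketched as follows. The sketch simplifies the setting in two ways that should be read as assumptions: L is varied in the OMP coding of validation samples against a fixed dictionary of raw training samples rather than inside K-SVD training, and OMP stands in for the L1-norm solver used in the experiments. The data are synthetic clusters, so the selected value carries no meaning for wheat; all function names are hypothetical.

```python
import numpy as np

def omp(D, y, sparsity):
    """Greedy orthogonal matching pursuit with at most `sparsity` atoms."""
    residual = y.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(sparsity):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def classify(D, labels, y, sparsity):
    """Assign y to the class whose atoms give the smallest reconstruction residual."""
    x = omp(D, y, sparsity)
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)  # keep only the coefficients of class c
        residuals[c] = np.linalg.norm(y - D @ x_c)
    return min(residuals, key=residuals.get)

def validation_accuracy(D, labels, Y_val, val_labels, sparsity):
    predictions = [classify(D, labels, y, sparsity) for y in Y_val.T]
    return float(np.mean(np.array(predictions) == val_labels))

# Synthetic stand-in data: 4 classes, 15 features, clustered around class means.
rng = np.random.default_rng(0)
n_classes, n_features = 4, 15
means = rng.random((n_classes, n_features))

def sample(n_per_class):
    cols = [means[c][:, None] + 0.05 * rng.standard_normal((n_features, n_per_class))
            for c in range(n_classes)]
    return np.hstack(cols), np.repeat(np.arange(n_classes), n_per_class)

D, labels = sample(50)
D /= np.linalg.norm(D, axis=0, keepdims=True)  # unit-norm atoms for OMP
Y_val, val_labels = sample(20)

candidates = range(1, 11)
accuracy = {L: validation_accuracy(D, labels, Y_val, val_labels, L) for L in candidates}
best_L = max(accuracy, key=accuracy.get)
print(accuracy, best_L)
```

Ties between candidate values are resolved in favor of the smallest L, which also gives the sparsest code.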

Parameter setting: the dictionary size of each class, d
Let d denote the number of atoms in the learned dictionary of each class, and let K denote the number of atoms in the dictionary after training. How to choose the dictionary size so that learning reaches an optimal effect is an open problem, and there is no mature theory for calculating it exactly. The optimal dictionary size d is therefore determined by repeated experiments. With L fixed at 6, the relationship between the parameter d and the wheat variety recognition rate of the sparse representation method based on dictionary learning is shown in Figure 3. The recognition rate is highest when the dictionary of each class has 50 atoms.

Identification and classification of wheat varieties by the sparse representation method based on dictionary learning
The tests were run in MATLAB 2010a, and the L1-norm minimization problem was solved with the MATLAB software package of [18]. Both the identification and classification method based on the traditional sparse representation method and the one based on the dictionary learning technique are used in this simulation.
As mentioned above, Zhengmai103, Kaimai21, Zhoumai20, and Yubao1 are the 4 selected wheat varieties. For each kind of wheat, 160 grains are randomly selected as the training samples, so the size of the training sample dictionary is 15 × 640.
For the proposed method based on the dictionary learning technique, the size of the learned dictionary for each kind of wheat is 15 × 50. Therefore, the size of the new dictionary for the four kinds of wheat is 15 × 200. As test samples, 50 Zhoumai20 grains are randomly selected. According to the sparse representation classification principle, the variety that gives the minimum residual for a test sample is taken as the variety of that sample. As shown in Figure 4, the smallest residuals of almost all test samples belong to Zhoumai20, which is consistent with the true variety of the test samples. This result verifies the effectiveness of the proposed sparse representation classification method, and the recognition rate is 94%. The other three varieties of wheat grains are tested in the same way, and the results are shown in Figure 5.
The method based on the dictionary learning technique (Method 1) and the method with regular features (Method 2) are also tested on the same wheat varieties. The results in Table 4 show that: (1) the recognition rate of the method based on the dictionary learning algorithm (50 grains) is higher than that of the conventional sparse representation method (50 grains), with a similar running time; (2) the recognition rate of the method based on the dictionary learning algorithm (50 grains) is very close to that of the conventional sparse representation method (160 grains), with a shorter running time.
Table 4. Comparison of the two methods on the same test samples for wheat variety identification
Following the above experimental steps, the conventional sparse representation method with regular features and the improved sparse representation method based on the dictionary learning technique are each used to identify the four wheat varieties. The test results are shown in Table 5.
From Table 5, the average recognition rate of the method based on the conventional sparse representation method is 92.8%, and that of the method based on the dictionary learning technique is 92%. The average recognition rates of both methods exceed 90%. The larger dictionary of Method 2 leads to a larger amount of calculation and a larger storage requirement, whereas the sparse representation method based on dictionary learning reduces the identification time while keeping the recognition rate essentially unchanged. This improves the efficiency of wheat variety identification, and the advantage becomes more significant as the number of samples grows.

Conclusions
In this paper, the sparse representation method is applied to the identification and classification of wheat varieties, and the conventional sparse representation method is improved by introducing the K-SVD algorithm. Compared with the method based on the conventional sparse representation algorithm, the method based on the dictionary learning technique not only meets the required recognition rate for wheat varieties but also greatly shortens the recognition and classification time, which improves efficiency. The experimental results show that, for the same dictionary size, the effectiveness of the new method for the identification of wheat varieties is significantly improved, and that, for the same or a similar recognition rate, the proposed method needs fewer dictionary atoms than the method based on the conventional sparse representation algorithm.