Prediction of viscosity, density and solids content in inks employed in printing industry production chain combining infrared and neural models

In this study, selected characteristics of the black and white inks used in the rotogravure process were evaluated in order to guarantee a good product in the printing process. To this end, an analytical method was developed that combines infrared spectroscopy with Artificial Neural Networks (ANN) to estimate the viscosity, density and solids content of inks, with the advantage of providing highly accurate results quickly and with little computational effort. The best models were those developed for density, with average percentage errors of 1% in training and validation and 2% in test for the black and white inks together; 1% in training and validation and 0.7% in test for the black ink; and 0.2% in training, 0.8% in validation and 0.7% in test for the white ink. The method can be applied in the printing industry to improve the production of high-quality rotogravure printed material.


Introduction
The quality control of inks is essential during and after their manufacture, and is also very important for the industries and users that consume them. The ink studied here is intended for the rotogravure printing process. This process is intended for long publications and prints at high speed, and is usually applied in the production of packaging and magazines. Bohan, Claypole and Gethin (2000) reported a spectrophotometry experiment in which they identified the parameters with the greatest impact on the quality of products printed by rotogravure. The experiments highlighted the sensitivity of the process to changes in ink viscosity and density, showing the importance of these parameters to the final result. Kader (2017) carried out an optical print quality analysis of rotogravure ink and concluded that, in addition to viscosity, density is a parameter of great importance for product quality; he also found that the ink density variation curve changes for different values of viscosity and density, and does not always increase with viscosity.
Some studies have related these inks and their parameters to neural networks. One of them addresses the classification of hyperspectral inks using a one-dimensional (1D) Convolutional Neural Network (CNN) (VERIKAS; MALMQVIST; BERGMAN, 1997), which is a novel approach, since this type of classification is usually done with the Spectral Angle Mapper (SAM) or Spectral Information Divergence (SID). There is also a study on the classification of color image segmentation using a Modular Neural Network (MNN) (DE-VASSY; GEORGE, 2019), a neural model that embodies the concepts and principles of modularity and is characterized by a series of independent neural networks moderated by an intermediary. Both articles show how neural networks can facilitate classification in a study of inks, even though the inks are not specifically those used in the gravure process.
Still on the subject of inks and Artificial Neural Networks (ANN), a recent study showed that it is possible to use ANN to determine time-dependent color changes during drying in inks applied to a surface by offset printing (KOSE, 2014). In that paper, the author also attempted the same determination using an experimental method, but concluded that ANN is more advantageous because of its speed, simplicity and capacity to learn from examples.
Given this motivation, there was an opportunity to develop neural models for determining the properties of solvent-based inks in the printing industry. The quality parameters estimated by the neural models were the viscosity (ELDRED, 2001), density (ASTM INTERNATIONAL, 2020) and solids content (ASTM INTERNATIONAL, 2018) of black and white inks. The input variables were the 1798 signals of the infrared spectrum in the range of 4000 to 650 cm⁻¹ for each ink sample.

Samples and data processing
The samples were black and white inks used in the gravure printing process, provided by School SENAI Foundation Zerrenner. Altogether, 80 ink samples were obtained, 40 black and 40 white, with variations in the proportions of resin, solvent and pigment.
Data acquisition was performed using an FTIR Cary 630 (Agilent Technologies) equipped with an attenuated total reflection (ATR) accessory. The spectral range was 4000 to 650 cm⁻¹, and a total of 32 scans were recorded for each sample at a resolution of 4 cm⁻¹. Figure 1 shows the infrared spectra for the black (Figure 1a) and white (Figure 1b) inks, from which the reader can verify the complexity of the data.

Neural network model
To implement the ANN, Matlab was used with the nnstart fitting app tool, and the following training algorithms were tested: scaled conjugate gradient (trainscg) (MOLLER, 1993), Levenberg-Marquardt (trainlm) (HAGAN; MENHAJ, 1994) and backpropagation (traingd) (FILLETTI; SILVA; FERREIRA, 2015; ORTEGA-ZAMORANO, 2017). The best results were obtained with the scaled conjugate gradient for all three characteristics of the inks.
From the experimental data, consisting of the 1798 infrared spectrum values for each of the 80 ink samples, the input matrices for the ANN were prepared, with dimensions of 1798x80 for the two inks together and 1798x40 for each ink separately. Table 1 illustrates the construction of the input matrix used as the dataset for developing the ANN for the 40 white ink samples (W1, W2, …, W40) and the 40 black ink samples (B1, B2, …, B40) together, with each sample represented by one column of the input matrix and the 1798 rows holding the infrared spectrum values of that sample. For the ANNs developed for the separate inks, the input matrix followed the same strategy, but with 40 columns, one for each sample of the given ink color. The output vectors had dimensions of 1x80 for the two inks together and 1x40 for the inks separately; in the latter case two different 1x40 vectors were assembled, one for white ink and one for black ink. For each case, three output vectors were created, one for each parameter analyzed by the neural models, that is, one vector for viscosity, one for density and one for the solids content of each sample, as illustrated in Table 2 for the two inks together.
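The matrix layout described above can be sketched in a few lines of Python. This is only an illustration, not the Matlab code used in the study: the function name `build_dataset`, the toy sample labels and all numeric values are hypothetical, and real spectra would contain 1798 values per sample.

```python
def build_dataset(spectra, targets):
    """Assemble a features-by-samples input matrix and a 1 x N target vector.

    spectra: dict mapping sample label -> list of spectrum values
    targets: dict mapping sample label -> measured property (e.g. density)
    """
    labels = list(spectra)
    n_points = len(spectra[labels[0]])
    # Every spectrum must have the same number of points (1798 in the study).
    for label in labels:
        if len(spectra[label]) != n_points:
            raise ValueError(f"spectrum {label} has a different length")
    # Row i holds spectral point i for all samples; column j is sample j.
    matrix = [[spectra[label][i] for label in labels] for i in range(n_points)]
    target_row = [targets[label] for label in labels]
    return labels, matrix, target_row


# Toy example: 3 hypothetical samples with 4 spectral points each.
toy_spectra = {
    "W1": [0.10, 0.20, 0.30, 0.40],
    "W2": [0.11, 0.21, 0.31, 0.41],
    "B1": [0.50, 0.60, 0.70, 0.80],
}
toy_density = {"W1": 1.02, "W2": 1.03, "B1": 0.95}
labels, X, y = build_dataset(toy_spectra, toy_density)
print(len(X), "rows x", len(X[0]), "columns")
```

One output vector of this kind would be built per property (viscosity, density, solids content), always in the same column order as the input matrix.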
For the implementation of the ANN, the experimental data set formed by the ink samples was randomly divided into three subsets, so that 70% of the data was used for training, 15% for validation and the remaining 15% for testing. Few epochs were carried out during training to avoid overtraining, which can leave the ANN test results with little or no reproducibility.
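A minimal sketch of such a random 70/15/15 division is given below. It is an illustration in Python rather than the Matlab routine used in the study; the function name `split_indices` and the fixed seed are assumptions made for reproducibility of the example.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Randomly divide sample indices into training, validation and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the example is repeatable
    n_train = round(train_frac * n_samples)
    n_val = round(val_frac * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test subset
    return train, val, test


train, val, test = split_indices(40)
print(len(train), len(val), len(test))
```

For 40 samples this yields 28 training, 6 validation and 6 test samples, which matches the subset sizes reported for the individual inks.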

The Scaled Conjugate Gradient algorithm
Before introducing the scaled conjugate gradient (SCG) algorithm, it is necessary to present the algorithm that precedes it, the conjugate gradient (CG) algorithm. The purpose of CG is to accelerate the normally slow convergence rate of backpropagation while avoiding computational costs such as the manipulation of the Hessian matrix required by Newton's method, which makes it an intermediate method between the two. Conjugate gradient algorithms require only a little more storage than other algorithms, so they are well suited to networks with many adjustable weights (ALMEIDA, 2007).
In CG, the weights are not adjusted along the negative gradient, as in backpropagation, but along conjugate directions, and the step size that minimizes the error function along each direction is determined. In addition, in the backpropagation algorithm the learning rate is fixed and determines the size of the adjustment applied to the weights (step size), whereas in CG this step is adjusted at each iteration, generating a sequence of estimates that ends only when a satisfactory solution is found (ALMEIDA, 2007).
All CG algorithms begin by searching in the steepest-descent direction (the negative of the gradient) in the first iteration, as in Equation 1 (ALMEIDA, 2007):

$$p_0 = -g_0 \tag{1}$$

where $g_0$ is the gradient of the error function at the initial weights and $p_0$ is the initial search direction. A line search is then performed to determine the learning rate parameter ($\eta$), that is, how far to move along the current search direction. The weights are updated according to Equation 2:

$$w_{k+1} = w_k + \eta_k p_k \tag{2}$$

Then the next search direction is determined so that it is conjugate to the previous search directions. The general procedure for determining the new search direction is to combine the new steepest-descent direction with the preceding search direction, as shown in Equation 3:

$$p_k = -g_k + \beta_k p_{k-1} \tag{3}$$

The versions of the CG algorithm are distinguished by the way the constant $\beta_k$ is calculated. In the scaled conjugate gradient (SCG), unlike other conjugate gradient training functions, a line search at each iteration is not necessary, because this algorithm combines the Levenberg-Marquardt approach with CG, using an approximation of the Hessian matrix that must be positive definite.
This mechanism makes the algorithm faster than other second-order algorithms. The trainscg function requires more iterations to converge than the other conjugate gradient algorithms, but the number of calculations in each iteration is significantly reduced because no line search is performed. SCG also has two parameters that must be defined for the algorithm to operate: σ, which is a weighting for the calculation of the second-order approximation, and λ, which helps regulate the indefiniteness of the Hessian matrix.
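To make the conjugate-direction idea concrete, the sketch below applies plain CG to a small quadratic error function. It is deliberately not SCG and not Matlab's trainscg: it assumes a quadratic error E(w) = 0.5 wᵀAw − bᵀw with a symmetric positive definite matrix A, for which the line-search step has a closed form, and it uses the Fletcher-Reeves formula as one common choice for the direction constant. The function name and the toy values of A, b and the starting weights are hypothetical.

```python
def conjugate_gradient_quadratic(A, b, w0, tol=1e-20, max_iter=50):
    """Minimize E(w) = 0.5 * w^T A w - b^T w with the conjugate gradient method.

    A must be symmetric positive definite. For this quadratic error the exact
    line-search step has a closed form, so no numerical line search is needed.
    """
    def matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    w = list(w0)
    g = [aw - bi for aw, bi in zip(matvec(A, w), b)]  # gradient of E at w
    p = [-gi for gi in g]                             # first direction: steepest descent
    n_steps = 0
    while n_steps < max_iter and dot(g, g) > tol:
        Ap = matvec(A, p)
        eta = -dot(g, p) / dot(p, Ap)                 # step minimizing E along p
        w = [wi + eta * pi for wi, pi in zip(w, p)]   # weight update
        g_new = [gi + eta * api for gi, api in zip(g, Ap)]
        beta = dot(g_new, g_new) / dot(g, g)          # Fletcher-Reeves constant
        p = [-gn + beta * pi for gn, pi in zip(g_new, p)]  # new conjugate direction
        g = g_new
        n_steps += 1
    return w, n_steps


# Toy 2-weight problem; the minimizer solves A w = b.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
w_min, n_steps = conjugate_gradient_quadratic(A, b, w0=[2.0, 1.0])
print(w_min, n_steps)
```

In a neural network the error is not quadratic and the Hessian is not available in closed form, which is where the approximations used by SCG come in; the sketch only shows how successive search directions are combined.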
Thus, the neural models developed in this study, trained with the SCG algorithm, directly return the estimated values of viscosity, density and solids content, using the values of the matrix in Table 1 as input variables for ANN training.

Results and discussion
This section discusses the results obtained by the ANN. For better understanding, the results are subdivided into those for the two inks together and those for the black and white inks separately. Our first attempt was to calculate univariate models using the area or height of selected peaks, but the results were not satisfactory, so a multivariate approach is highly recommended. Several bands were observed in the infrared spectra (at 1534 and 1639 cm⁻¹, for instance) that are related to the N-O bond of the resin. Figure 2a shows the linear model that best fits the viscosity data for the training set, y = 0.52x + 89.50; the corresponding models were y = 0.66x + 62.89 for the validation set (Figure 2b) and y = 0.43x + 74.74 for the test set (Figure 2c).

Black and white inks together
Figure 3 shows the performance of the ANN in estimating the viscosity of the two inks together; 40 epochs were performed during training. Training was stopped early to prevent overfitting, which can cause the neural model to generalize poorly. Figure 4 shows the Bland-Altman plot for the viscosity data of the black and white inks together. The x-axis shows the average of the real values and those estimated by the neural model, and the y-axis shows the difference between these values. In this case, although some data points show a large difference between the actual and estimated viscosity values, many still show a small difference. Ideally, all data points should have a difference around zero. Viscosity was the parameter that the neural model had the most difficulty estimating, because viscosity is mainly related to the inorganic composition of the samples.

Figures 5a, 5b and 5c show the relation between the density values estimated by the ANN and the real values in the training, validation and test sets, respectively. The line that best fits the density data was y = 0.97x + 0.03 for the training set (Figure 5a), y = 0.91x + 0.10 for the validation set (Figure 5b) and y = 0.81x + 0.17 for the test set (Figure 5c). Figure 6 shows the performance of the ANN in estimating the density of the two inks together, with 24 epochs during training. Again, training was stopped early to prevent overfitting. Figure 7 shows the Bland-Altman plot for the density data of the black and white inks together, showing that the difference between the [...]

Finally, Figures 8a, 8b and 8c show the relation between the solids content values estimated by the ANN and the real values in the training, validation and test sets, respectively. The line that best fits the solids content data was y = 0.97x + 1.09 for the training set (Figure 8a), y = 0.94x + 1.56 for the validation set (Figure 8b) and y = 1.08x - 3.26 for the test set (Figure 8c). Figure 9 shows the performance of the ANN in estimating the solids content of the two inks together, with 35 epochs during training. Again, training was stopped early to prevent overfitting. Figure 10 shows the Bland-Altman plot for the solids content data of the black and white inks together.

Table 3, for the black and white inks together, shows the correlation and determination coefficients, R and R² (the Pearson correlation coefficient and the coefficient of determination, respectively), the p-value and the average percentage error for the training, validation and test sets of the best neural model developed. The correlation coefficient determines the "quality of fit" between the target and predicted variables. A value of R equal to 1 means a perfect fit. It is worth mentioning that if the p-value is less than 0.05, the null hypothesis is rejected and the alternative hypothesis, that the means of the two sets (estimated and real) are different, is accepted; when the p-value is greater than 0.05, the null hypothesis is not rejected.
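The quantities plotted in a Bland-Altman diagram, as described in this subsection, can be computed with the short Python sketch below. Several choices here are assumptions rather than details given in the text: the difference is taken as real minus estimated, the 1.96 standard deviation limits of agreement are the conventional addition to such plots, and the function name and toy viscosity values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(real, estimated):
    """Return the points and summary statistics of a Bland-Altman plot."""
    # x-axis: average of the real and estimated value for each sample
    averages = [(r + e) / 2 for r, e in zip(real, estimated)]
    # y-axis: difference between real and estimated value (sign is a convention)
    differences = [r - e for r, e in zip(real, estimated)]
    bias = mean(differences)               # ideally close to zero
    spread = 1.96 * stdev(differences)     # conventional limits of agreement
    return averages, differences, bias, (bias - spread, bias + spread)


# Hypothetical real and estimated viscosity values for four samples.
real_values = [100, 120, 140, 160]
estimated_values = [98, 123, 139, 162]
averages, differences, bias, limits = bland_altman(real_values, estimated_values)
print(averages, differences, bias, limits)
```

A model that estimates the property well gives differences scattered closely around zero, with a small bias and narrow limits.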

Black ink
For the black inks there were 40 samples, 28 of which were used for training, 6 for validation and 6 for testing. The neural model that provided the best result for viscosity had 32 neurons in the intermediate layer and was trained for 30 epochs; for density, the ANN had 16 neurons in the intermediate layer and was trained for 22 epochs; and for solids content, 13 neurons were used in the intermediate layer, with 30 epochs.
In this case, the line that best fits the real viscosity values of the black inks and those obtained by the ANN was y = 1.06x - 3.92 for the test set, y = 1.32x - 92.64 for the validation set and y = 0.96x + 12.74 for the training set.
For density, the line that best fits the real values of the black inks and those obtained by the neural model was y = 0.40x + 0.58 for the test set, y = 0.22x + 0.75 for the validation set and y = 0.33x + 0.64 for the training set.
The line that best fits the real solids content values of the black inks and those obtained by the ANN was y = 1.42x - 12.82 for the test set, y = 1.79x - 24.38 for the validation set and y = 1.20x - 6.44 for the training set. Table 4, for the black inks, shows the correlation and determination coefficients, R and R², the p-value and the average percentage error for the training, validation and test sets of the best neural model developed.
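The fitted lines and summary statistics reported for each model can be reproduced from paired real and estimated values along the lines of the following Python sketch. It rests on assumptions not stated in the text: x is taken as the real value and y as the ANN estimate, R² is computed as the square of the Pearson coefficient, and the average percentage error is taken as the mean absolute percentage error. The function name and toy values are hypothetical.

```python
from statistics import mean


def fit_metrics(real, estimated):
    """Least-squares line y = slope * x + intercept plus R, R^2 and percentage error."""
    x_bar, y_bar = mean(real), mean(estimated)
    sxx = sum((x - x_bar) ** 2 for x in real)
    syy = sum((y - y_bar) ** 2 for y in estimated)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(real, estimated))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5  # Pearson correlation coefficient
    # Mean absolute percentage error, relative to the real values.
    errors = [abs(x - y) / abs(x) for x, y in zip(real, estimated)]
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r2": r ** 2,
        "avg_pct_error": 100 * mean(errors),
    }


# Hypothetical real and estimated values for four samples.
metrics = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(metrics)
```

A slope close to 1 and an intercept close to 0 indicate that the estimates track the real values without systematic scaling or offset.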

White ink
For the white inks there were also 40 samples, 28 of which were used for training, 6 for validation and 6 for testing. The neural model that provided the best result for viscosity had 7 neurons in the intermediate layer and was trained for 23 epochs; for density, the ANN had 12 neurons in the intermediate layer and was trained for 70 epochs; and for solids content, 8 neurons were used in the intermediate layer, with 18 epochs.
In this case, the line that best fits the real viscosity values of the white inks and those obtained by the ANN was y = 0.84x + 32.12 for the test set, y = 0.87x + 27.02 for the validation set and y = 0.84x + 33.83 for the training set. The line that best fits the real density values of the white inks and those obtained by the neural model was y = 0.97x + 0.04 for the test set, y = 0.95x + 0.05 for the validation set and y = 1.01x - 0.01 for the training set.
For solids content, the line that best fits the real values of the white inks and those obtained by the ANN was y = 0.98x + 2.76 for the test set, y = 0.87x + 6.91 for the validation set and y = 0.87x + 6.28 for the training set. Table 5, for the white inks, shows the correlation and determination coefficients, R and R², the p-value and the average percentage error for the training, validation and test sets of the best neural model developed.

Conclusion
Neural models were developed to estimate important parameters of graphic inks used in gravure printing.
The artificial neural network developed for the two inks together showed an average percentage error of approximately 30% for the test set for the viscosity parameter. For density and solids content the results were better, with average percentage errors of 2% and 4%, respectively, for the ANN test set.
In an attempt to improve the performance of the neural models, the ink samples were separated and ANNs were developed for each of the inks. For the black ink alone, average percentage errors of approximately 23% were observed in the viscosity test set, while for density and solids content the average percentage errors were 0.7% and 4%, respectively, in the test sets. For the white inks, the ANNs for the three parameters evaluated provided values closer to the real ones, proving to be a highly efficient tool for this type of problem. The results showed average percentage errors of 7%, 0.7% and 4% for viscosity, density and solids content, respectively, for the test sets.
It can be concluded that the neural models developed showed satisfactory results, especially for estimating the density and solids content of the black and white inks, both together and separately. It is also important to highlight that viscosity was the parameter with the highest percentage errors in all three cases, possibly because its values, for both black and white ink, vary greatly from one sample to another. This behavior is also attributed to the fact that viscosity is mainly related to the inorganic composition of the samples.