The chemical composition of wood determines the color development when applying chemical stains to the surface of wood. However, different species and individuals from the same species can show variations in the chemical composition, resulting in the risk of nonuniform color development in industrial staining processes between different batches of wood. In the present study, near-infrared (NIR) models were developed to predict wood specimen color development after applying three different concentrations of the chemical stains iron acetate and sodium bicarbonate. The modeling dataset included the NIR spectra of the untreated wood, stain treatment, concentration, and the International Commission on Illumination (CIE) L*a*b* color value before stain application for 210 specimens from five commercial wood species, including red oak (Quercus rubra), white oak (Quercus alba), yellow poplar (Liriodendron tulipifera), southern yellow pine (Pinus spp.), and western red cedar (Thuja plicata). The models were developed by partial least squares regression (PLSR), using 13 different mathematical transformations on the NIR spectra as well as the raw spectral data. Models with single stains and global-species/stain models were developed and compared. The models for iron acetate showed promising results in predicting the color development, with a coefficient of determination for cross-validation (R2cv) ≥ 0.92, while the models for sodium bicarbonate showed acceptable results, with R2cv of 0.71 to 0.89. However, a global model including both stains resulted in an unsatisfactory prediction of the CIE L*a*b* color values, with R2cv of 0.46 to 0.76. The NIR models can be useful for online predictions of color development in industrial staining processes of wood with chemical stains.
Chemical stains for wood react with polyphenolic constituents of the wood extractives, especially with tannins, resulting in a spectrum of colors from shades of gray and brown to red-brown (Flexner 2021). The concentration and composition of these wood extractives can vary between species, between individuals of the same species, and even across the cross-section of a log in a single tree (Yanchuk et al. 1988), with the consequence that the color development due to chemical stains is difficult to control. Hence, about a century ago, chemical stains were mainly replaced by synthetic aniline dyes and pigmented stains, which have the advantage of a greater range of colors, better resistance to fading, and ease of use (Flexner 2021). In more recent years, however, chemical stains, especially those that contain iron ions such as ferrous iron sulfate and iron acetate, have experienced increasing attention, as they can develop colors that imitate a natural weathered look on wood surfaces (Dagher et al. 2020, Hundhausen et al. 2020). Likewise, sodium bicarbonate has been found to develop aged brownish colors on wood (Kropat et al. 2020).
Since the wood extractives are not distributed evenly across the cell walls, chemical stains develop a spectrum of different color shades across the stained surface. In contrast, pigmented stains and synthetic aniline dyes can partially cover the wood texture and create uniform colors across the stained surface, producing an artificial appearance (Dagher et al. 2020). Tannic acid solution can be applied as a pretreatment to wood species with low extractive concentrations to intensify the color development due to chemical stains containing iron ions (Kielmann et al. 2018). However, a uniform color development similar to that of pigmented stains and synthetic aniline dyes must then be expected, because the tannic acid is distributed evenly across the surface. In addition to the benefits of creating a weathered appearance, iron acetate and sodium bicarbonate constitute environmentally friendly and nontoxic alternatives to conventional stains (Dagher et al. 2020).
To control the color development of chemical stains on wood, the chemical composition of the wood itself must be considered. NIR spectroscopy is a nondestructive method for rapid, indirect assessment of chemical properties for various materials, including organic matter such as wood, soil, and food products, and has been increasingly utilized in the forest and forest products industry (So et al. 2004). NIR spectroscopy has been used to assess the chemical properties of wood, including lignin content, cellulose content, and pulp yield (Garbutt 1989, Wright et al. 1990, Michell 1995, Raymond and Schimleck 2002, Hodge et al. 2018), for differentiation between wood species (Pastore et al. 2011), for differentiation between sap and heartwood (Haartveit and Flæte 2008, Sandberg and Sterley 2009), and the prediction of physical properties such as grain angle (Gindl and Teischinger 2002) and density distribution (Schimleck 2003, Via et al. 2005).
Depending on the NIR spectrometer, spectral data can be obtained from ground material or directly from the material's surface. Though grinding the material can result in better prediction diagnostics, it is a destructive process that requires additional preparation time. On the other hand, a direct measurement of the material's surface is fast, nondestructive, and thus preferable for inline measurements in industrial processes. However, the measurement of solid wood can result in different spectral patterns depending on tangential, radial, or cross-section surfaces (Gindl and Teischinger 2002). Hence, the anatomical direction of the measured specimen face must be considered in experiments.
Whittier et al. (2021) compared the performance of models predicting foliar nutrient levels in teak seedlings with NIR data gathered by destructive and nondestructive sampling. Measurements for destructive sampling were obtained by a benchtop device, while a handheld device was used to obtain measurements for nondestructive sampling. The results suggested that both destructive and nondestructive sampling resulted in useful models for predicting foliar nutrient levels. Likewise, Acosta et al. (2020) compared the predictive performance of models with NIR spectra gathered by a benchtop NIR device to models with NIR spectra collected by handheld devices to predict the nutritive value of forage. The authors concluded that the predictive performance of models developed with NIR spectra obtained by handheld devices is comparable with models developed with NIR spectra obtained by benchtop devices.
This research project aims at predicting the development of the International Commission on Illumination (CIE) L*a*b* color values on wood after applying the chemical stains iron acetate and sodium bicarbonate. For that purpose, partial least squares regression (PLSR) models were developed with NIR spectral data of the untreated solid wood as predictors. The overall goal was to assess a series of models for the inline prediction of color in industrial staining processes with chemical stains in order to generate a more uniform color outcome.
Materials and Methods
Preparation of wood specimens
Specimens from the heartwood of red oak (RO) (Quercus rubra), white oak (WO) (Quercus alba), yellow poplar (YP) (Liriodendron tulipifera), southern yellow pine (SYP) (Pinus spp.), and western red cedar (WRC) (Thuja plicata) were used in the study. The wood was purchased kiln dried from Capitol City Lumber, Raleigh, North Carolina, and conditioned in standard atmosphere (20°C, 65% relative humidity) according to ASTM D1037-12 (ASTM 2020a) until reaching the equilibrium moisture content of approximately 12 percent. The moisture content was determined according to ASTM D4442-20 (ASTM 2020b), Method A (primary ovendrying method). The wood species' ovendry density was calculated according to ASTM D2395-17 (ASTM 2017), Test Method A (volume by measurement). In total, 210 specimens (42/species) with dimensions of 150 × 70 × 5 mm (longitudinal × radial × tangential) were prepared by sawing, planing, and sanding in steps of 80, 100, 120, 150, and 180 grit. Care was taken that the stained and measured surfaces were clear and showed a radial cut direction with annual rings oriented perpendicular to the stained surface.
Measurement of pH value
The pH values for the wood species were measured according to the method introduced by Campbell and Bryant (1941). Three randomly selected specimens of each species were ground in a mill with rotating knives (Wiley, Model 4) using a screen width of 2 mm. One gram of ground wood was dispersed in 20 mL deionized water and stored in closed glass vials at 21°C. After 24 hours, pH values were obtained in the dispersion with a general-purpose electrode (REED, PE-03). All measurements were performed in triplicate.
Chemicals for stains and analytical techniques
Iron powder (97%), tannic acid powder (ACS reagent), glacial acetic acid (17.4 M), sodium bicarbonate powder (≥99.7%), and sodium hydroxide pellets (≥97%) were purchased from MilliporeSigma (St. Louis, Missouri, USA). 1,10-phenanthroline (≥99%), hydrochloric acid (12.178 M), sulfuric acid (0.995 to 1.005 N), and ferrous ammonium sulfate hexahydrate (≥98.5%) were purchased from Fisher Scientific (Waltham, Massachusetts, USA). Hydroxylamine HCl (≥99%) was purchased from Acros Organics (Morris Plains, New Jersey, USA).
The iron acetate stain was produced by reacting 17.5 g of iron powder with 200 mL of 25 percent acetic acid. The acetic acid was heated on a magnetic hotplate stirrer to 80°C. The solution was stirred for 6 hours and then filtered with a Buchner funnel under vacuum through a filter paper disc with a porosity of 3 microns. The iron acetate solution was stored in a cool and dark place. The iron concentration of the solution was determined spectrophotometrically with 1,10-phenanthroline (Harvey et al. 1955) using an ultraviolet–visible spectrophotometer (Perkin Elmer, Lambda XLS). Directly before application to the wood specimens, the stain was diluted to the target concentrations (0.05, 0.1, and 0.2 g/liter) using deionized water. Sodium bicarbonate (0.84, 4.2, or 8.4 g) was dissolved by stirring in 100 mL of deionized water to give the target concentrations of 0.1, 0.5, and 1 M. The solutions were filtered with a Buchner funnel and stored cool and dark until application. Tannic acid powder (0.03 or 0.06 g) was dissolved by stirring in 100 mL of deionized water to give the target concentrations of 300 and 600 mg/liter. The solutions were filtered with a Buchner funnel and stored cool and dark until application.
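The stated masses and target concentrations can be cross-checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the authors' workflow. The molar mass of sodium bicarbonate (84.007 g/mol) is a standard value that the text does not state, and the 10 g/liter stock concentration used in the dilution example is hypothetical, since the measured iron concentration of the stock solution is not reported here.

```python
# Illustrative check of the solution arithmetic; not the authors' code.
NAHCO3_MOLAR_MASS = 84.007  # g/mol, standard value (assumption: not given in the text)


def molarity(mass_g, molar_mass_g_per_mol, volume_ml):
    """Molar concentration (mol/L) of a solute made up to the given volume."""
    return mass_g / molar_mass_g_per_mol / (volume_ml / 1000.0)


def mass_concentration_mg_per_l(mass_g, volume_ml):
    """Mass concentration (mg/L) of a solute made up to the given volume."""
    return mass_g * 1000.0 / (volume_ml / 1000.0)


def stock_volume_ml(stock_g_per_l, target_g_per_l, final_volume_ml):
    """Volume of stock needed for a dilution, from C1 * V1 = C2 * V2."""
    if target_g_per_l > stock_g_per_l:
        raise ValueError("target concentration exceeds stock concentration")
    return target_g_per_l * final_volume_ml / stock_g_per_l


# Sodium bicarbonate masses from the text, each dissolved in 100 mL.
bicarbonate_molar = [round(molarity(m, NAHCO3_MOLAR_MASS, 100.0), 2)
                     for m in (0.84, 4.2, 8.4)]

# Tannic acid masses from the text, each dissolved in 100 mL.
tannic_mg_per_l = [round(mass_concentration_mg_per_l(m, 100.0))
                   for m in (0.03, 0.06)]

# Hypothetical dilution: 100 mL of 0.2 g/L stain from an assumed 10 g/L stock.
needed_ml = stock_volume_ml(10.0, 0.2, 100.0)
```

Under these assumptions the masses reproduce the nominal 0.1, 0.5, and 1 M bicarbonate solutions and the 300 and 600 mg/liter tannic acid solutions.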
Application of staining solutions
Staining solutions were sprayed onto the specimen using a handheld gravity feed high volume low pressure (HVLP) spray gun (Husky, No. H4840GHVSG). The average application rate was calculated from mass before and after application and surface area of randomly selected specimens. Tannic acid was applied in concentrations of 300 and 600 mg/liter as pretreatment to specimens of southern yellow pine (SYP) and yellow poplar (YP), which are naturally low in extractives/tannins. Application rates are presented in Table 1. The specimens were stored dark in standard atmosphere for 24 hours after stain application and before color measurement.
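The application rate described above is the mass gained during spraying divided by the sprayed surface area. A minimal Python sketch follows; the specimen face dimensions (150 × 70 mm) come from the specimen description, while the two masses are hypothetical values chosen only to illustrate the calculation.

```python
# Illustrative application-rate calculation; the masses below are hypothetical.

def face_area_m2(length_mm, width_mm):
    """Area of the sprayed specimen face in square meters."""
    return (length_mm / 1000.0) * (width_mm / 1000.0)


def application_rate_g_per_m2(mass_before_g, mass_after_g, area_m2):
    """Mass of solution applied per unit area (g/m^2)."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return (mass_after_g - mass_before_g) / area_m2


area = face_area_m2(150.0, 70.0)  # 150 x 70 mm face
rate = application_rate_g_per_m2(30.00, 31.05, area)  # hypothetical masses
```

Averaging this value over the randomly selected specimens would give the average rate per stain and species.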
NIR diffuse reflectance spectra in the range of 1600 to 2400 nm were obtained from the untreated wood specimen with a handheld NIR spectrometer (Thermo Fisher, microPHAZIR) at a resolution of 8.7 nm. The measurement area of 1 cm2 was illuminated by a tungsten lamp oriented at 30° above the specimen. Each scan was conducted with nine replicates. Every specimen was scanned three times, and an average spectrum was calculated after removing outliers.
PCA cluster analysis
Principal component analysis (PCA) was conducted to investigate clustering in the NIR data among the different species. Classical PCA was calculated for centered, unscaled data using the R package ChemoSpec (Hanson 2021).
NIR model development
The models for predicting the CIE L*a*b* color values were developed using a data analysis pipeline written in the R environment (R Core Team 2016). The pipeline has been successfully used for model development based on NIR spectra, such as for the prediction of chemical properties of wood (Hodge et al. 2018), the nutritive value of switchgrass (Bekewe et al. 2020), a mixture of native warm-season grasses (Castillo et al. 2020), and for prediction of forage nutritive values (Acosta et al. 2020). The pipeline is separated into two phases: (a) spectral transformation and outlier detection, (b) model training, cross-validation, and the prediction of a test dataset.
To summarize the NIR pipeline: first, different mathematical transformations, including scatter corrections, spectral derivatives, and pairwise combinations of the two, were applied to the untransformed spectra (log 1/R) to remove scattering associated with diffuse reflection and improve the subsequent regression analysis. Scatter correction transformations included multiplicative scatter correction (MSC), standard normal variate (SNV), and detrend (DT). Spectral derivative methods included the Savitzky-Golay (SG) transformation, calculated as a second derivative with a second-order polynomial at two window sizes of five and seven smoothing points (SG5 and SG7). Pairs of transformations included SNV + DT, MSC + DT, SNV + SG, MSC + SG, and DT + SG. Outliers in the NIR data were determined and removed after the spectral transformation by calculating local outlier factors (LOFs) (Breunig et al. 2000) based on the spectra's 20 nearest neighbors. Spectra with LOF values greater than two were excluded from the analysis (Acosta et al. 2020).
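Two of the transformations named above, SNV and the Savitzky-Golay second derivative, can be sketched in a few lines of Python. This is an illustration of the standard techniques, not the authors' R pipeline. It assumes a base-10 logarithm for the reflectance-to-absorbance conversion, the sample standard deviation for SNV, unit spacing between spectral points, and the textbook convolution weights for a quadratic fit; the edge points that lack a full window are simply dropped, and the reflectance values are hypothetical.

```python
import math
import statistics

# Textbook Savitzky-Golay weights for a second derivative from a quadratic fit:
# window size -> (integer weights, normalizing divisor).
SG_SECOND_DERIVATIVE = {
    5: ([2, -1, -2, -1, 2], 7),
    7: ([5, 0, -3, -4, -3, 0, 5], 42),
}


def to_absorbance(reflectance):
    """Convert reflectance R to log10(1/R) (base 10 is an assumption)."""
    return [math.log10(1.0 / r) for r in reflectance]


def snv(spectrum):
    """Standard normal variate: center and scale each spectrum by itself."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(v - mean) / sd for v in spectrum]


def sg_second_derivative(spectrum, window):
    """Second derivative per squared point spacing; edge points are dropped."""
    weights, norm = SG_SECOND_DERIVATIVE[window]
    half = window // 2
    result = []
    for i in range(half, len(spectrum) - half):
        acc = sum(w * spectrum[i + k - half] for k, w in enumerate(weights))
        result.append(acc / norm)
    return result


# Hypothetical reflectance values for one spectrum.
reflectance = [0.50, 0.48, 0.45, 0.47, 0.52, 0.55, 0.53, 0.49]
absorbance = to_absorbance(reflectance)
snv_sg5 = sg_second_derivative(snv(absorbance), 5)  # an "SNV + SG5" pair
```

Chaining the two functions, as in the last line, corresponds to one of the paired transformations; the order of operations within a pair is an assumption here.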
Second, for model screening, outlier-free full datasets for all transformations and the untransformed NIR spectra were used to fit NIR models relating the spectral data and the categorical predictors (stain type and concentration) to the CIE L*a*b* color values. Models were calculated with partial least squares regression (PLSR) using the R package pls (Mevik and Wehrens 2016). For screening purposes, models were calculated with different combinations of species, stains, and concentrations, as well as global models including all data. The model performance was evaluated by tenfold cross-validation. The optimum number of latent variables was selected using the “onesigma” approach (Hastie et al. 2009), by which the model with the lowest number of latent variables within one standard error of the model with the absolute minimum root mean squared error of prediction (RMSEP) is selected. Desirable models are those with a small number of latent variables, a high coefficient of determination for cross-validation (R2cv), and a low root mean squared error of prediction for cross-validation (RMSEPcv). A two-step selection algorithm was applied to avoid subjectivity in the selection of the best models. First, for each variable combination and each CIE L*a*b* color value, the model with the smallest RMSEPcv among the various transformations was selected. Then, the selected models were compared by their R2cv. The number of latent variables (factors) resulted from the “onesigma” selection process.
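The one-standard-error rule described above can be written compactly. The following Python sketch is a generic illustration, not the authors' implementation: it takes the cross-validated RMSEP and its standard error for each number of latent variables and returns the smallest number whose RMSEP lies within one standard error of the minimum. The numeric values are hypothetical.

```python
# Generic one-standard-error selection; input values below are hypothetical.

def select_one_sigma(rmsep, std_err):
    """Return the smallest number of latent variables (1-based) whose
    cross-validated RMSEP is within one standard error of the minimum."""
    if not rmsep or len(rmsep) != len(std_err):
        raise ValueError("rmsep and std_err must be non-empty and equal length")
    best = min(range(len(rmsep)), key=rmsep.__getitem__)
    threshold = rmsep[best] + std_err[best]
    for index, value in enumerate(rmsep):
        if value <= threshold:
            return index + 1
    return best + 1  # not reached: the minimum itself always qualifies


# Entry i holds the result for a model with i + 1 latent variables.
cv_rmsep = [6.0, 4.5, 3.6, 3.05, 2.9, 2.85, 2.8, 2.82]
cv_std_err = [0.5, 0.4, 0.35, 0.3, 0.3, 0.3, 0.3, 0.3]

n_factors = select_one_sigma(cv_rmsep, cv_std_err)
```

In this example the minimum RMSEP occurs at seven latent variables, but the rule selects four, trading a small loss in cross-validated error for a simpler model.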
For evaluation of the model performance, the complete outlier-free datasets were randomly divided into training (75%) and test (25%) data sets. The training dataset was used to develop the models that were determined to be suitable during the screening process, while the test dataset was used to evaluate the predictive performance of these models. Scatterplots for the CIE L*a*b* color values with actual values (x-axis) vs. predicted values (y-axis) were plotted. In addition, the coefficient of determination (R2), RMSEP, ratio of performance to deviation (RPD), which is the standard deviation divided by the standard error of prediction (Williams et al. 2017), and bias, which is the average difference between predicted and actual value, were calculated.
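The fit statistics listed above can be computed from paired actual and predicted values as in the following Python sketch. It is an illustration under stated assumptions rather than the pipeline's code: bias is taken as the mean of predicted minus actual (the opposite convention only flips the sign), the standard error of prediction (SEP) is the bias-corrected standard deviation of the residuals, R2 is computed as one minus the ratio of residual to total sum of squares, and RPD is the standard deviation of the actual values divided by SEP. The example values are hypothetical.

```python
import math
import statistics


def fit_statistics(actual, predicted):
    """Return R2, RMSEP, SEP, RPD, and bias for paired observations."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two or more paired observations")
    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    bias = sum(residuals) / n  # assumed sign: predicted minus actual
    ss_res = sum(r * r for r in residuals)
    rmsep = math.sqrt(ss_res / n)
    # Bias-corrected standard error of prediction.
    sep = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))
    mean_actual = statistics.fmean(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmsep": rmsep,
        "sep": sep,
        "rpd": statistics.stdev(actual) / sep,
        "bias": bias,
    }


# Hypothetical L* values for five test specimens.
actual_l = [40.0, 45.0, 50.0, 55.0, 60.0]
predicted_l = [41.0, 44.0, 51.0, 54.0, 61.0]
stats = fit_statistics(actual_l, predicted_l)
```

Because RPD scales the prediction error by the spread of the reference data, a model evaluated on a narrow range of color values can show a low RPD even when its RMSEP is small.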
Results and Discussion
Color development after staining
Staining with iron acetate resulted in shades of brown and gray to dark purple, while staining with sodium bicarbonate resulted in brownish and greenish colorations. Color values and intensity depended on wood species and the concentrations of stains and tannic acid solution applied (Fig. 1). Application rates for stains and tannic acid solutions are presented in Table 1. Wood species, density, pH value, and initial CIE L*a*b* color values are shown in Table 2. pH values were reported in a similar range for white oak, red oak, and yellow poplar by Campbell and Bryant (1941) and Johns and Niazi (1980). However, the reported pH values for western red cedar and southern yellow pine were lower than the measured results, with 2.46 reported and 3.92 measured for western red cedar and 4.66 reported and 5.54 measured for southern yellow pine. The differences in reported and measured pH values indicate that the extractive concentration, which has a major influence on the pH value, can vary significantly between individuals from the same species. Thus, the color development of wood after applying chemical stains must be predicted considering the chemical composition of the specific specimens.
The average CIE L*a*b* color values before and after staining and the average color changes (ΔE) are shown for iron acetate in Table 3 and sodium bicarbonate in Table 4. The CIE L*a*b* color values differed significantly between the different concentrations of iron acetate. However, the differences in the color values for the different sodium bicarbonate concentrations were not always significant, especially not for the values a* and b* as shown in Figure 2. ΔE was consistently the highest for white oak, followed closely by red oak and western red cedar. The species yellow poplar and southern yellow pine showed the smallest ΔE for both stains (Tables 3 and 4).
A direct comparison of color changes due to iron salts to the literature is impossible, since concentrations and application rates differ or are not consistently reported. Furthermore, the variations in extractive content among individual trees from the same species can result in differences in color development, as pointed out above. In the few publications on staining wood with iron salts, no initial color values of the untreated wood are reported, making a comparison of the development for single CIE L*a*b* color values difficult. However, Dagher et al. (2020) compared the color development of white oak, red oak, sugar maple, and yellow birch after applying a 1 percent weight/volume solution of ferric sulfate pentahydrate (97%). The authors reported the strongest ΔE for white oak, followed by red oak, consistent with the results presented in Table 3. No color changes have been reported in the literature for sodium bicarbonate on wood; hence no comparison is possible.
A negative trend was observed for the relationship between ΔE and the pH values, as the species with the higher ΔE measured at lower pH. However, no direct correlation can be assumed between ΔE and pH, since western red cedar has a pH of 3.92, similar to white oak (pH 3.84) but a significantly smaller ΔE for all stains and concentrations (Tables 2, 3, and 4).
Data analysis and modeling
The PCA cluster analysis for the raw NIR spectra of the untreated wood specimen resulted in an excellent differentiation between the wood species with three principal components (Fig. 3), indicating a good database for predictive modeling of the CIE L*a*b* color values after application of the chemical stains considering the wood chemistry. The ellipses drawn around the groups represent the 95 percent confidence intervals. Since all data, without outlier removal (LOF), were used for the cluster analysis, some data points can be located outside of the 95 percent confidence ellipses.
Summary tables with fit statistics as well as figures for all transformations and variable combinations were generated by the data analysis pipeline. An example table for the prediction of CIE color value L* for five wood species stained with three different concentrations of iron acetate is presented in Table 5. Predictors for the PLSR models in Table 5 were the NIR spectra of untreated wood and the concentrations of iron acetate and tannic acid solutions. Additionally, Figure 4 shows an example output of RMSEPcv and R2cv as a function of latent variables (factors) for the SG7 model presented in Table 5. The model SG7 was selected for further evaluation based on the lowest RMSEPcv among the model list for 14 different transformations (Table 5). It is apparent that the values for R2cv and RMSEPcv vary only slightly between the models for the different transformations. For operational use and to minimize overfitting, a researcher might want to choose a model based on the smallest number of factors instead of the smallest RMSEPcv and highest R2cv. For example, the model SNV_DT could be selected with six factors and R2cv = 0.92 instead of model SG7 with eight factors and only slightly higher R2cv = 0.94. However, the slight differences in the fit statistics do not change any conclusion for the general outcome of the CIE L*a*b* color value prediction drawn by this work.
Table 6 presents the models selected in the screening process that were further evaluated with a test data set as presented in Table 7. The model fit statistics calculated with the full dataset (Table 6) and the training/test dataset (Table 7) were very similar. However, it is noteworthy that the data analysis pipeline optimized the number of factors based on the training dataset, resulting in mostly smaller numbers of latent variables. These changes in the number of factors highlight the data analysis pipeline's flexible model building ability. Models that included the untreated wood's CIE L*a*b* color values as predictors were also built and are presented in Table 6, rows 2 to 4. However, including the CIE L*a*b* values of the untreated wood as predictors did not consistently increase R2cv or decrease RMSEPcv compared to models without them (Table 6, rows 5 to 7). Additionally, a jackknife analysis on the significance of predictors did not result in a consistent or high significance of the CIE L*a*b* color values of untreated wood as predictors. Hence, the color values were not used to build the models for further evaluation.
Generally, the best prediction was obtained by the models for iron acetate with R2cv of 0.92 to 0.95. Acceptable results were obtained by the models for sodium bicarbonate with R2cv of 0.71 to 0.89. However, a global model for both stains did not result in satisfactory predictions, with R2cv values of 0.46 to 0.76 (Table 6). The nonsignificant differences in the CIE L*a*b* color values between the different concentrations for the sodium bicarbonate stain (Fig. 2) are assumed to be responsible for the inferior predictive performance of models that include sodium bicarbonate.
Table 7 presents the fit statistics for the regression models built with the training/test datasets. In addition to R2cv and RMSEPcv, the values for RPD and bias, as well as 95 percent confidence intervals for the intercept (β0) and slope (β1), with upper limit (UL) and lower limit (LL) are given. Figure 5 presents the scatterplots for the CIE L*a*b* color values for model building (train dataset) and prediction (test dataset). Desirable models show small bias values, slopes close to 1, and intercepts close to 0. RPD values ranged from 2.35 to 3.75 for the iron acetate models, from 1.60 to 3.04 for the sodium bicarbonate models, and from 1.19 to 1.85 for the models including iron acetate and sodium bicarbonate. RPD values >2 were considered to indicate a good prediction ability for models by Chang et al. (2001). However, a higher threshold for RPD values was stated by Williams (2014). The RPD values reflect the conclusions drawn based on the R2cv, indicating that the models for iron acetate show excellent predictive performance while the models for sodium bicarbonate are acceptable but not optimal. In contrast, the global models, including iron acetate and sodium bicarbonate, did not satisfactorily predict the CIE L*a*b* color values, considering the low RPD values (Table 7).
The bias values, describing the average by which the actual values are greater than the predicted values, range from −0.32 to −0.82 for the iron acetate models, from 0.08 to 1.35 for the sodium bicarbonate models, and from −0.13 to −1.48 for the global models. It is noteworthy that the CIE L*a*b* color space has a very high resolution. A color difference (ΔE) of 0 to 0.5 cannot be perceived by the human eye. ΔE of 0.5 to 1 is only perceivable for experienced observers, while ΔE of 1 to 2 is perceived as a minimal color difference. ΔE > 2 is recognized as an apparent color difference (Wright 1929, Wyszecki and Fielder 1971, Witzel et al. 1973). The calculation of ΔE (Eq. 1) with the bias values results in an average ΔE = 1.16 for the iron acetate model, ΔE = 1.41 for the sodium bicarbonate model, and ΔE = 1.52 for the global model, indicating that the average color differences between actual and predicted color for all models are minimal as perceived by the human eye, and the smallest for the iron acetate model.
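The step from per-channel bias values to a single ΔE can be illustrated as follows. This Python sketch assumes the Euclidean (CIE 1976) color difference, which is the usual definition of ΔE in CIE L*a*b* space; whether it matches the paper's Eq. 1 exactly is an assumption. The perceptibility classes follow the thresholds quoted above, with each boundary value assigned to the higher class as an arbitrary choice, and the example colors are hypothetical.

```python
import math


def delta_e(lab_1, lab_2):
    """Euclidean color difference between two (L*, a*, b*) triples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab_1, lab_2)))


def perceptibility(de):
    """Classify a color difference using the thresholds quoted in the text."""
    if de < 0.5:
        return "not perceivable"
    if de < 1.0:
        return "perceivable for experienced observers"
    if de <= 2.0:
        return "minimal"
    return "apparent"


# Hypothetical actual and predicted colors for one specimen.
actual_lab = (50.0, 10.0, 20.0)
predicted_lab = (51.0, 10.5, 20.5)

difference = delta_e(actual_lab, predicted_lab)
label = perceptibility(difference)
```

Passing the three bias values of a model as one triple and (0, 0, 0) as the other gives the ΔE implied by the average prediction offset, which is how the figures above are described as being derived.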
The results at hand show a promising outlook for inline assessment of the wood chemistry by NIR spectroscopy in industrial processes that involve chemical stain application to wood surfaces. Future work needs to include more species and more individuals of the same species in the data set. Furthermore, different chemical stains and a broader range of different concentrations should be added to the data set to increase the range and accuracy of the predictions. Especially for the stain sodium bicarbonate, different concentrations have to be chosen, since the color development for the concentrations 0.5 M and 1 M was very similar.
CIE L*a*b* color values after applying chemical stains in three different concentrations to five wood species could be predicted successfully by PLSR models based on NIR spectra of the untreated wood specimens and categorical predictors describing the chemical stains. The best predictive performance was found in models for the stain iron acetate, which had the highest R2cv and RPD as well as the lowest RMSEPcv and bias. Models for sodium bicarbonate were acceptable but will need improvement through future work. Global models, including iron acetate and sodium bicarbonate, did not perform satisfactorily. The similarity in color development between the different sodium bicarbonate concentrations is assumed to be responsible for the inferior performance of models that include sodium bicarbonate. However, given that a color difference (ΔE) of 1 to 2 is perceived as minimal by the human eye, the predicted values are well within an acceptable range. The developed models will be useful for online prediction of color development in industrial staining processes of wood applying chemical stains.
The authors are, respectively, Ph.D. Researcher and Assistant Professor, Dept. of Forest Biomaterials, North Carolina State Univ., Raleigh, North Carolina (firstname.lastname@example.org [corresponding author] and email@example.com); and Research Assistant, Camcore, Dept. of Forestry & Environmental Resources, North Carolina State Univ., Raleigh, North Carolina (firstname.lastname@example.org). This paper was received for publication in March 2022. Article No. 22-00021.