31 jul. 2012

UHU Chemometrics Course: Topic 6

UHU Chemometrics Course: Topic 5

UHU Chemometrics Course: Topic 4

UHU Chemometrics Course: Topic 3

UHU Chemometrics Course: Topic 2

UHU Chemometrics Course: Topic 1

26 jul. 2012

Monitor: Using category labels

I have recently been checking the performance of a compound feed calibration with a set of 15 samples: 3 of hen feed, 3 of pig feed, 3 of chicken feed, 3 of ovine feed and 3 of cattle feed.
The idea is to check whether the calibration predicts the results correctly, but in this post I will just look at the plots in order to draw some conclusions.
The sample set has been imported into R with a column called "Category" (with the feed type labels: hen, pig, etc.).
I will check just the protein X-Y plots:

I will not go into the details of the statistics this time, just the interpretation.
Chicken feed seems quite well predicted, and the 3 samples fit well on the line with intercept 0 and slope 1. The same goes for pig feed.
Sheep feed is a little worse, but could be acceptable.
I have problems with hen feed (high residuals and low variability in the test set).
For cattle feed I have a bias problem (one of the methods predicts higher than the other); we need to check more samples to confirm this tendency.

Comments are welcome about how to improve this plot with colors for the categories.
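
One possible way to color the points by category is sketched below with base R graphics. The data frame `feed`, its columns `Lab` and `NIR`, and all the values in it are made up for illustration; only the "Category" column name comes from the post.

```r
# Hypothetical example data: 15 samples, 3 per feed type (all values are made up).
set.seed(1)
feed <- data.frame(
  Category = factor(rep(c("hen", "pig", "chicken", "ovine", "cattle"), each = 3)),
  Lab      = round(runif(15, min = 14, max = 22), 2)   # reference protein
)
feed$NIR <- feed$Lab + rnorm(15, sd = 0.4)             # simulated NIR predictions

# One colour per category, selected through the factor's integer codes.
cols      <- c("red", "blue", "darkgreen", "orange", "purple")
point_col <- cols[as.integer(feed$Category)]

plot(feed$NIR, feed$Lab, col = point_col, pch = 19,
     xlab = "NIR predicted protein", ylab = "Reference protein",
     main = "Protein: predicted vs. reference")
abline(a = 0, b = 1, lty = 2)                          # ideal line: intercept 0, slope 1
legend("topleft", legend = levels(feed$Category), col = cols, pch = 19)
```

Because the colours are indexed by the factor levels, the legend and the points stay consistent even if the samples are reordered.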

25 jul. 2012

Hierarchical Cluster Analysis (ChemoSpec) - 03

It is clear that we can discriminate between olive oil and sunflower oil, but let's look at the reason for the sub-clusters in the sunflower oil.
Samples sflw6da, sflw7da, sflw8da, sflw9da and sflw10da are refined sunflower oil, so they have been filtered and processed. That means differences in water content and some other physical properties, but apart from that these samples have spectra similar to those of samples sflw2da, sflw3da, sflw4da and sflw5da, because their oleic acid content is similar.
Samples sflw11da, sflw12da and sflw13da have a higher oleic acid content, so the absorption band at 2140 nm is lower than in the others, and the peak around 1720 nm is shifted to a higher wavelength. These features allow us to discriminate these samples from sflw6da, sflw7da and sflw8da, and together with the water content and other physical changes they also make it possible to discriminate them from sflw6da, sflw7da, sflw8da, sflw9da and sflw10da.

Previous posts, if you are interested:
Looking to the spectra (sunflower oil)
Hierarchical Cluster Analysis (ChemoSpec) - 01
Hierarchical Cluster Analysis (ChemoSpec) - 02  (interesting comment from Bryan Hanson)

Exporting from ISI Scan to LIMS

19 jul. 2012

Hierarchical Cluster Analysis (ChemoSpec) - 02

These are the second derivative spectra of the raw spectra we saw in the post "Hierarchical Cluster Analysis (ChemoSpec) - 01". In that post we saw some clusters, but the distance between them was not large, so it was clear that some math treatment should be applied to remove baseline shifts and to increase the differences between the clusters as much as possible.
Let's now see the HCA in this case:

Now it looks much better: the olive samples are in one cluster and the sunflower oil samples in another. We can also see two sub-clusters in the sunflower samples. Looking at the spectra, we can now see some reasons for that more clearly. That will be treated in the next post.
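
To illustrate why a second derivative removes baseline shifts, here is a minimal sketch in base R. It uses a plain finite difference on a synthetic band, which is an assumption for illustration and not necessarily the derivative treatment used for the spectra in this post.

```r
wl   <- seq(1100, 2200, by = 2)                  # wavelength axis in nm
band <- exp(-((wl - 1720) / 15)^2)               # synthetic absorption band (made up)

spec_a <- band                                   # spectrum without baseline
spec_b <- band + 0.3 + 0.0002 * (wl - 1100)      # same band plus offset and linear slope

# Simple second derivative by finite differences (output is 2 points shorter).
second_deriv <- function(x) diff(x, differences = 2)

d2_a <- second_deriv(spec_a)
d2_b <- second_deriv(spec_b)

max(abs(spec_a - spec_b))   # large difference between the raw spectra
max(abs(d2_a - d2_b))       # practically zero after the second derivative

# The band shows up as a negative peak centred at the same wavelength.
wl[which.min(d2_a) + 1]
```

An offset and a linear slope both vanish under a second difference on an evenly spaced axis, so only the band shape remains to drive the clustering.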

18 jul. 2012

Sample analysis – Is there a ‘right’ result?

AllAboutFeed - Nutrition: Sample analysis – Is there a ‘right’ result?
Another interesting article, by Simon Flanagan (Application Manager at Aunir, United Kingdom), that helps to clarify some concepts:
Accuracy versus precision
Repeatability versus reproducibility
Comparing results

Making stable and robust calibrations in NIR analysis

This is a link to an interesting article by Prof. Tom Fearn, Department of Statistical Science, University of London:
AllAboutFeed - Nutrition: Making stable and robust calibrations in NIR analysis
He gives us some interesting advice on how to develop robust calibrations.

Monitor with R: Moisture in Sunflower Seeds Intact

Today I had the opportunity to check the performance of a calibration (moisture in intact sunflower seed, in reflectance).
This is always an exciting moment:
Is the performance of the calibration on the new validation set what was expected during the calibration development?
Can I use the new set to add more variability to the calibration and improve it?
Intact sunflower seed is not an easy product to analyze by NIR, especially for fat. Much better results can be obtained by grinding the sample, but the option of eliminating the grinding step is of course very attractive.
I used some functions I have created in R for the validation (see previous posts and videos about the Monitor function) and to generate some comments about the results.
Nº Validation Samples  = 1298
Nº Calibration Samples = 2046
Nº Calibration Terms   = 12
Calibration SECV       = 0.6
RMSEP    : 0.6473
Bias     : 0.4687
SEP      : 0.4466
UECLs    : 0.6252
***SEP is bellow BCLs (O.K)***
Corr     : 0.8977
RSQ      : 0.8058
Slope    : 0.8975
Intercept: 0.186
RER      : 16.1   Fair
RPD      : 2.21   Very Poor
BCL(+/-): 0.02432
***Bias adjustment is recommended***
Residual Std Dev is : 0.4351
***Slope adjustment is recommended***

This is quite a large validation set; see the plots:

As we can see, the histogram of the residuals looks approximately normal.
The RMSEP is quite similar to the standard error of cross-validation (a little higher), and the SEP (the validation error corrected for the bias) is much better (an improvement of 0.2), so the bias adjustment is recommended. There is a deviation in the slope, but the plots show that this should be treated with caution.
Anyway, this is a large validation set, and the best option is to merge the validation data with the calibration data and recalibrate, in order to reduce the error and obtain a more robust calibration, which will probably improve the SECV and the RMSEP in the next validation.
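
For readers who want to reproduce this kind of report, here is a minimal sketch of the main statistics in base R. It is not the Monitor function itself: the function name `validation_stats`, the sign convention of the residuals, the regression of reference on predicted values, and the t-based bias limit are assumptions for illustration, and the data are simulated.

```r
validation_stats <- function(ref, pred) {
  res   <- pred - ref                              # residuals (sign convention assumed)
  n     <- length(res)
  bias  <- mean(res)
  rmsep <- sqrt(mean(res^2))
  sep   <- sqrt(sum((res - bias)^2) / (n - 1))     # error corrected for the bias
  fit   <- lm(ref ~ pred)                          # reference regressed on predicted (assumed)
  list(
    RMSEP     = rmsep,
    Bias      = bias,
    SEP       = sep,
    BCL       = qt(0.975, df = n - 1) * sep / sqrt(n),  # bias confidence limit (+/-)
    Slope     = unname(coef(fit)[2]),
    Intercept = unname(coef(fit)[1]),
    RPD       = sd(ref) / sep,
    RER       = diff(range(ref)) / sep
  )
}

# Simulated validation set with a deliberate bias (values are made up).
set.seed(123)
ref   <- rnorm(200, mean = 8, sd = 1)
pred  <- ref + 0.47 + rnorm(200, sd = 0.45)
stats <- validation_stats(ref, pred)
str(stats)
```

With the figures reported above, sqrt(0.4687^2 + 0.4466^2) is about 0.647, in line with the RMSEP of 0.6473, and t × SEP / sqrt(n) with SEP = 0.4466 and n = 1298 gives about 0.0243, in line with the BCL. A bias larger in absolute value than the BCL is what triggers the adjustment recommendation in this sketch.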

17 jul. 2012

Hierarchical Cluster Analysis (ChemoSpec) - 01

In previous posts I have been using the ChemoSpec package with some oil data (olive and sunflower). My spectra now cover the range from 1100 nm to 2200 nm and are raw (not treated mathematically). I want to start using the "Hierarchical Cluster Analysis" in ChemoSpec in order to see some clusters in my data. Of course I hope to see the olive oil in one cluster and the sunflower oil in the other, but other clusters will probably appear.
Anyway, this is just a quick test, and we will surely learn much more about the data by treating the spectra with derivatives (this will be done in another post).
So, after importing the "csv" files into R, we can plot the raw spectra (olive oil in red and sunflower oil in blue).

We can see some ranges of the spectra where there is a clear difference, and we explained in a previous post that these differences are related to the fatty acid concentrations.
Let's run the "Hierarchical Cluster Analysis" from ChemoSpec:
hcaSpectra(oils, title = "Raw Spectra / oils")

We can see that 3 samples of sunflower oil are quite different from the others (olive or sunflower), and that the rest of the samples form two clusters (olive and sunflower oil).
We could draw some other conclusions, but what I am going to do is treat the spectra with a second derivative and try to get more conclusions from this set of spectra, while practicing with R at the same time.
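
To show what a hierarchical cluster analysis does with spectra, here is a minimal sketch using base R's `dist` and `hclust` on simulated data. It is meant as an illustration of the technique, not as a description of how `hcaSpectra` works internally; the sample names, the band positions and the noise level are all made up.

```r
set.seed(42)
wl <- seq(1100, 2200, by = 10)                     # wavelength axis in nm

# Simulated spectrum: one Gaussian band plus a little noise (hypothetical shape).
make_spec <- function(centre) {
  exp(-((wl - centre) / 40)^2) + rnorm(length(wl), sd = 0.01)
}

# Two hypothetical groups of 4 samples with slightly different band positions.
spectra <- rbind(
  t(replicate(4, make_spec(1720))),
  t(replicate(4, make_spec(1760)))
)
rownames(spectra) <- c(paste0("grpA", 1:4), paste0("grpB", 1:4))

d  <- dist(spectra)                                # Euclidean distances between samples
hc <- hclust(d, method = "complete")               # agglomerative clustering
plot(hc, main = "HCA of simulated spectra")        # dendrogram

groups <- cutree(hc, k = 2)                        # cut the tree into 2 clusters
groups
```

The height at which two branches join in the dendrogram reflects the distance between the clusters, which is why small distances between clusters in raw spectra suggest trying a math treatment first.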

Related Posts:
Hierarchical Cluster Analysis (ChemoSpec) - 02

11 jul. 2012

Sugar Mill byproducts

I recently read an article in the NIR News magazine about the use of NIR to analyze some sugar mill byproducts. One of the products is boiler ash. Boiler ash is generated by burning the bagasse, which is itself one of the sugar mill byproducts. This boiler ash is rich in potassium and silicon.
See this nice video from India about the use of bagasse to generate electricity:
Boiler ash is used in combination with mill mud (another sugar mill byproduct) to fertilize the sugar cane fields. Mill mud is rich in calcium and phosphorus.
In this video you can see, at approximately minute 5:11, the stage where the mill mud is generated.

NIR can measure inorganic constituents because some organic molecules interact with the minerals, and also through indirect correlations or through the influence of the minerals on the hydrogen bonding within the sample matrix.

NIR News Vol3 No1 January/February 2012

NIR to monitor Wheat products processing

NIR technology can be applied to wheat product processing to monitor the characteristics of the dough during mixing. Apart from the rheological properties, chemical properties can be monitored to improve the quality of the dough through the correct formation of the gluten network.
This can of course be done on-line with appropriate probes.
Qualitative models could help to determine the optimum moment to stop the mixing.

At this link you can see applications to wheat product processing:
Dynamic NIR Spectroscopy to Monitor Wheat Product Processing: A Short Review

9 jul. 2012

Sunset in Asturias

Summer makes me take the blog posting a little easier, but more posts will come soon.
I was in Asturias this weekend and I was lucky enough to capture this beautiful sunset on the beach.

2 jul. 2012

Looking to the spectra (sunflower oil)

This figure shows the inverse correlation between oleic acid (C18:1) and linoleic acid (C18:2) for this data set:

Figure 1 shows the spectra of a sample set of sunflower oils. A second derivative (Fig 2) helps to define the spectral bands better, and with the zoom option we can look at certain areas of the spectra.
There is an inverse correlation between oleic acid and linoleic acid in the sunflower oils. In this sample set (all sunflower oils), we can see how the band at approximately 1720 nm changes (Fig 3). In these figures we can clearly see the high-oleic sunflower oil (lower intensity at 2144 nm in Fig 4, and a more defined peak at 1722 nm in Fig 3).
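
As a small illustration of how such an inverse correlation can be checked in R, the sketch below uses `cor` on hypothetical fatty acid percentages. The numbers are invented for the example and are not the values of this data set.

```r
# Hypothetical fatty acid contents (%) for six sunflower oils (made-up values).
oleic    <- c(22, 25, 30, 45, 60, 82)   # C18:1
linoleic <- c(66, 62, 59, 43, 29, 8)    # C18:2

r <- cor(oleic, linoleic)               # Pearson correlation coefficient
r                                       # strongly negative for these values

plot(oleic, linoleic, pch = 19,
     xlab = "Oleic acid C18:1 (%)", ylab = "Linoleic acid C18:2 (%)")
abline(lm(linoleic ~ oleic), lty = 2)   # fitted line with negative slope
```

A coefficient close to -1 means that when one acid increases the other decreases almost proportionally, which is why the bands related to both move together in the spectra.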

Olive vs. Sunflower oil Spectra - 002 (ChemoSpec)

I have added another data set of "sunflower oil" to import, together with the olive oil, into the ChemoSpec R package. As I showed earlier in a video (Preparing spectra to import into ChemoSpec), every sample has been acquired with a NIR instrument (in transmittance) and converted into a CSV file.

Olive samples are in red, so we can visually see some differences with respect to the sunflower oil (blue). Anyway, we will zoom into some areas of the spectra in order to see the differences better.
We can see one olive oil sample that is different from the rest. These are the raw spectra, so no math treatment or baseline correction has been applied.
Let's look at one of the areas (around 1720 nm). Oils with a high content of polyunsaturated fatty acids (in this case the sunflower oil) have the peak maximum at 1720 nm; olive oil (in red) is richer in monounsaturated fatty acids, so the band moves to the right, to a higher wavelength (approx. 1724 nm).

plotSpectra(oils, title = "(olive / sunflower) oils",
which = 1:19,xlim = c(1680,1800))
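
Besides zooming in on the plot, the position of the peak maximum can be read numerically. The following is a minimal base R sketch on two synthetic bands; the helper `peak_position` and the band shapes are hypothetical and only serve to illustrate the idea.

```r
wl <- seq(1680, 1800, by = 2)                     # wavelength axis in nm

# Synthetic bands (made up): one centred at 1720 nm, one shifted to 1724 nm.
spec_poly <- exp(-((wl - 1720) / 12)^2)           # stands for a polyunsaturated-rich oil
spec_mono <- exp(-((wl - 1724) / 12)^2)           # stands for a monounsaturated-rich oil

# Wavelength of the maximum absorbance inside the window [from, to].
peak_position <- function(wl, absorbance, from, to) {
  in_window <- wl >= from & wl <= to
  wl[in_window][which.max(absorbance[in_window])]
}

peak_position(wl, spec_poly, 1700, 1740)
peak_position(wl, spec_mono, 1700, 1740)
```

The resolution of the answer is limited by the wavelength step of the instrument, so a shift of a few nanometres is only visible if the data points are spaced closely enough.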

Another area where we can notice the difference is around 2140 nm, where the oils with polyunsaturated fatty acids have a higher intensity than the oils with more monounsaturated fatty acids. (Here, however, one of the olive oil samples is similar to the sunflower oil, so this sample could be considered an outlier.)
plotSpectra(oils, title = "(olive / sunflower) oils",
which = 1:19,xlim = c(2100,2200),yrange=c(1.0,1.8))

We can practice the option, described in the manual, to remove a problematic sample:
> oils$names
 [1] "oliv149" "oliv202" "oliv255" "oliv305" "oliv358" "oliv44"  "oliv96"
 [8] "sflw10"  "sflw11"  "sflw12"  "sflw13"  "sflw2"   "sflw3"   "sflw4" 
[15] "sflw5"   "sflw6"   "sflw7"   "sflw8"   "sflw9" 

oils1 <- removeSample(oils, rem.sam = c("oliv149"))

If we now plot the "oils1" set, the outlier sample no longer appears in the plots.

You may be interested in the post "Looking to the spectra (sunflower oil)", where you can see some differences between the sunflower oils according to their oleic / linoleic acid ratio.

Bibliography: Manual del aceite de oliva ( Ramón Aparicio / John Harwood)