Chemometrics

Near infrared (NIR) spectroscopy is widely used for compositional analysis of bulk materials because it is relatively inexpensive, fast and, most importantly, non-destructive. Quantitative analysis from NIR spectra requires calibration models: mathematical models based on statistical learning that make it possible to infer the composition of a sample indirectly from its spectrum.

NIR spectroscopy is today a mature technology, and a number of sophisticated commercial software applications can develop and implement the statistical learning required to deploy an NIR system in a process workflow. The advent of modern, miniaturised NIR spectrometers, however, has opened the field to a more democratised approach to spectroscopy, which can now be run with inexpensive devices in the field.

This, in turn, is changing chemometrics from a specialised, high-end laboratory tool into more distributed workflows capable of running in the cloud, on edge computing hardware or even on mobile devices.

We use the Python programming language for statistical learning, covering the full range of techniques from linear regression through to artificial neural networks. We specialise in developing analytics for field and portable NIR (or other spectroscopy) devices. Python is our language of choice because of its comprehensive standard library, high readability and useful external libraries.

We offer a personalised approach to data analysis and take care of all the steps needed to deal with instrumental noise (which is unavoidable when working in the field), scattering effects, and inconsistencies in the measurement setup and sample presentation. A well-designed pre-processing step can greatly improve the performance of the model.
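As a rough illustration of what such a pre-processing step can look like, the sketch below chains Savitzky-Golay smoothing (to reduce noise) with a Standard Normal Variate (SNV) transform (to reduce additive and multiplicative scatter effects). It is only one possible choice of methods, not a description of a fixed pipeline. The function names, the filter settings and the synthetic spectra are all assumptions made for the example.

```python
import numpy as np
from scipy.signal import savgol_filter


def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std


def preprocess(spectra, window_length=11, polyorder=2):
    """Smooth each spectrum along the wavelength axis, then apply SNV.

    window_length and polyorder are illustrative defaults; in practice they
    are tuned to the instrument resolution and noise level.
    """
    smoothed = savgol_filter(spectra, window_length=window_length,
                             polyorder=polyorder, axis=1)
    return snv(smoothed)


# Synthetic spectra (hypothetical data): one absorption band, distorted by a
# random gain, a random offset and some measurement noise.
rng = np.random.default_rng(0)
wavelengths = np.linspace(900, 1700, 200)            # nm, assumed range
band = np.exp(-((wavelengths - 1300) / 80) ** 2)
gain = rng.uniform(0.8, 1.2, size=(5, 1))            # multiplicative scatter
offset = rng.uniform(-0.1, 0.1, size=(5, 1))         # additive baseline shift
raw = gain * band + offset + rng.normal(0, 0.01, size=(5, 200))

processed = preprocess(raw)
print("spread between raw spectra:      ", raw.std(axis=0).mean())
print("spread between processed spectra:", processed.std(axis=0).mean())
```

After SNV every spectrum has zero mean and unit standard deviation, so differences caused purely by gain and offset are largely removed, while the shape of the band is preserved.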

Our capabilities

  • We can assist with deploying the entire traditional suite of linear models such as Principal Component Analysis (PCA) and Principal Component Regression, Partial Least Squares (PLS) regression and related methods, including variable selection techniques and model evaluation.
  • Similarly, our expertise covers classification methods, for instance those based on PCA, Partial Least Squares Discriminant Analysis (PLS-DA) and the k-Nearest-Neighbor (kNN) approach.
  • Non-linear models such as Support Vector Machines (SVM) and Random Forests or Boosting techniques can sometimes improve the performance over conventional linear systems.
  • Artificial Neural Networks (ANN) are all the rage today. We develop regression or classification deep learning pipelines using popular libraries such as NumPy and TensorFlow.

Our process

Connect

Get in touch explaining your objective, and share some information on the spectral data you want to analyse.

Feasibility/Quote

We will work with you on a free feasibility study and then finalise a quote.

Analysis

We perform the analysis and report the results to you in the format that best suits your needs. The model can be integrated into existing pipelines, or a new workflow can be developed according to your specifications.

Recent projects

Visit NirPy Research, an educational space dedicated to Python chemometrics, where we translate data science concepts into the language of spectroscopy.

Binary classification of spectra with a single perceptron

We will revisit a problem we have already encountered: the classification of spectra belonging to one of two classes. In doing so we'll demystify the concept of the perceptron and introduce the all-important gradient descent algorithm.
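
For readers who want a quick preview of the idea, the sketch below trains a single perceptron by gradient descent on a binary classification task. It is not the code from the post itself: the sigmoid activation, the cross-entropy loss, the learning rate and the synthetic two-class spectra are all assumptions chosen to keep the example short.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_perceptron(X, y, learning_rate=0.1, n_epochs=500):
    """Fit one sigmoid unit by batch gradient descent on the cross-entropy loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_epochs):
        p = sigmoid(X @ w + b)            # predicted probability of class 1
        error = p - y                     # gradient of the loss w.r.t. the logit
        w -= learning_rate * (X.T @ error) / n_samples
        b -= learning_rate * error.mean()
    return w, b


def predict(X, w, b):
    return (sigmoid(X @ w + b) >= 0.5).astype(int)


# Synthetic two-class spectra (hypothetical data): the classes differ only in
# the height of a single band.
rng = np.random.default_rng(1)
x_axis = np.linspace(0.0, 1.0, 50)
band = np.exp(-((x_axis - 0.5) / 0.1) ** 2)
class0 = 1.0 * band + rng.normal(0, 0.05, size=(100, 50))
class1 = 1.5 * band + rng.normal(0, 0.05, size=(100, 50))
X = np.vstack([class0, class1])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Shuffle, then hold out the last 50 spectra for testing.
order = rng.permutation(len(y))
X, y = X[order], y[order]
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Mean-centre using training statistics only.
centre = X_train.mean(axis=0)
w, b = train_perceptron(X_train - centre, y_train)
accuracy = (predict(X_test - centre, w, b) == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

Each epoch moves the weights a small step against the gradient of the loss, which is all that gradient descent amounts to for a single unit.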

Contact us to begin the quotation process or to gather insights into a potential project.