• Development of an open-source platform for multivariate analysis of FTIR data
Client: Javna agencija za raziskovalno dejavnost RS (Slovenian Research Agency)
Project type: Bilateral projects
Project duration: 2017 - 2018
  • Description

Fourier transform infrared (FTIR) spectroscopy is a technique for collecting absorption or emission spectra of a specimen. It is particularly useful for microscopy, because spectra can be measured in areas as small as 5 microns across, revealing the chemical composition of a sample at high spatial resolution. Applications range from tissue sample analysis (e.g. observing the effects of stroke on a rat's brain) to forensics, pharmaceutical analysis, quality control in general, materials science and many others.
Several programs are already widely used in FTIR (bio)spectroscopy and imaging, but all are rather limited. Cytospec (www.cytospec.com) is a good package for spectral imaging, but it requires workarounds to process other types of datasets that would benefit from the same or similar algorithms, such as time series measurements and non-spatially correlated datasets (for example, many individual cells, where the variation between spectra is of interest but the position of each cell is not tracked). Conversely, the Unscrambler is widely used statistical analysis software with many robust features, but it is not suited to spectroscopic imaging. Software from spectrometer manufacturers usually handles only the manufacturer's own file formats, and its features and performance are not always the best, as it is aimed mostly at performing the measurement itself. Such software is also expensive. Within our collaboration, we are developing an application that combines the features of the commonly used programs in open-source software that is free for everyone to use. The project will thus benefit many research groups and companies.
The primary users are synchrotrons with IR beamlines and their immediate user communities: Diamond (UK), Soleil (France), ALBA (Spain), BESSY II (Germany), ELETTRA (Italy), ANKA (Germany), Daphne (Italy), ESRF (France), MLS (Germany) and DESY (Germany). Each serves about 20-50 user groups per year, which amounts to hundreds of interested groups. In the second shell, since the technique itself is not limited to synchrotrons, the whole FTIR user community will be interested, which numbers in the thousands in Europe. Finally, providing file importers for other instrument manufacturers and techniques, together with the processing functions, will extend its use to techniques including near-IR imaging (from the food industry to satellites), Raman spectromicroscopy, etc.
The project is based on the existing open-source data mining platform Orange, developed at the University of Ljubljana. Its data-flow architecture is intuitive and easy to use, and it allows the specialized components developed within this project to be combined with existing components for data processing, visualization and data mining. The two groups are complementary and together cover the necessary expertise. The Slovenian partner has two decades of experience in developing similar data mining software, and the French partner is a user of such software with extensive expertise in the methods that need to be incorporated in it.
Project plan outline
- Implementation of import/export functions for single spectra and maps for OPUS, OMNIC, Txt and ENVI (May 2016)
- Plotting functions for spectra and maps (July 2016)
- Peak picking and pixel selection/extraction from maps (July 2016)
- Implementation of basic spectral correction algorithms (baseline, scattering, offset) and basic spectral manipulation functions (normalization, differentiation and smoothing) (October 2016)
- Implementation of advanced spectral manipulation tools, such as multiple types of baseline options, peak fitting, deconvolution, Mie scattering correction, water subtraction, etc. (January 2017)
- Modification and implementation of multivariate spectral analysis tools such as HCA, PCA and PCA-LDA, PLS regression and multivariate curve resolution (April 2017)
- Optimization for large data (September 2017)
- Parallelization of algorithms (December 2017)
- Adaptation of developed features for other spectroscopic mapping techniques (January 2018)
- Development of wrapper functions for spectroscopic tomography reconstruction algorithms (May 2018)
- Bug fixing and setting up long-term support (September 2018)
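To give a sense of the basic spectral manipulation functions named in the plan (baseline correction, smoothing, normalization and differentiation), here is a minimal sketch in plain Python. It is not the project's or Orange's implementation: all function names are hypothetical, the baseline is assumed to be a straight line through the two endpoints of the spectrum, and a simple moving average stands in for more capable smoothing filters.

```python
import math


def linear_baseline(wavenumbers, absorbance):
    """Subtract the straight line joining the first and last points (assumed baseline)."""
    x0, x1 = wavenumbers[0], wavenumbers[-1]
    y0, y1 = absorbance[0], absorbance[-1]
    slope = (y1 - y0) / (x1 - x0)
    return [y - (y0 + slope * (x - x0)) for x, y in zip(wavenumbers, absorbance)]


def moving_average(absorbance, window=3):
    """Centered moving average; the window shrinks at the edges of the spectrum."""
    half = window // 2
    n = len(absorbance)
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(sum(absorbance[lo:hi]) / (hi - lo))
    return smoothed


def vector_normalize(absorbance):
    """Scale the spectrum to unit Euclidean norm (all-zero spectra are returned unchanged)."""
    norm = math.sqrt(sum(y * y for y in absorbance))
    if norm == 0:
        return list(absorbance)
    return [y / norm for y in absorbance]


def first_derivative(wavenumbers, absorbance):
    """Central differences in the interior, one-sided differences at the two ends."""
    n = len(absorbance)
    derivative = []
    for i in range(n):
        lo, hi = max(0, i - 1), min(n - 1, i + 1)
        derivative.append(
            (absorbance[hi] - absorbance[lo]) / (wavenumbers[hi] - wavenumbers[lo])
        )
    return derivative


def preprocess(wavenumbers, absorbance):
    """Chain the steps: baseline correction, then smoothing, then normalization."""
    corrected = linear_baseline(wavenumbers, absorbance)
    smoothed = moving_average(corrected, window=3)
    return vector_normalize(smoothed)


# Synthetic example: one Gaussian band at 1050 cm^-1 on a sloping offset.
wavenumbers = [1000 + 10 * i for i in range(11)]
raw = [
    0.2 + 0.001 * (x - 1000) + math.exp(-(((x - 1050) / 20) ** 2))
    for x in wavenumbers
]
processed = preprocess(wavenumbers, raw)
```

The steps are kept as separate functions so that they can be reordered or swapped, which loosely mirrors how individual components are combined in a data-flow tool. A real implementation would operate on whole maps of spectra with array libraries rather than on Python lists.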