Threshold selection is crucial to noise elimination in spectroscopy data processing.
A nonparametric and unsupervised method of automatic threshold selection is proposed.
The new approach is effective when the noise distribution is not known in advance or the gap between signal and noise is not obvious.
Compared with previous methods, it is efficient, easy to implement, and widely applicable.
A nonparametric and unsupervised method of automatic threshold selection to eliminate noise for spectroscopy data processing is described in this paper.
A detection scheme, named the bi-trapezoid criterion, is devised in which the threshold is selected at the turning corner of the ascending intensity trace. The selection procedure is very simple, using only basic summation over the sorted intensity sequence to optimize the threshold that separates noise from signal.
This approach selects an appropriate noise-filtering threshold even when the noise distribution is not known in advance or the gap between signal and noise is not obvious. Tests on both artificial and real data under specific quantitative evaluation conditions show that the new method performs better than previous ones.
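Reading a threshold off the turning corner of the sorted intensity trace can be illustrated with a generic knee-finding heuristic. The sketch below is not the bi-trapezoid criterion itself, whose exact formula is not given in this abstract. It assumes instead that the corner is the point of the sorted curve lying farthest below the chord joining its endpoints. The function name `knee_threshold` and the synthetic data are hypothetical.

```python
import numpy as np

def knee_threshold(intensities):
    """Return the intensity at the turning corner of the sorted trace.

    Assumption (not from the paper): the corner is the point of the
    ascending curve farthest below the straight line joining its first
    and last points. This is a generic knee heuristic standing in for
    the bi-trapezoid criterion.
    """
    y = np.sort(np.asarray(intensities, dtype=float))
    n = y.size
    if n < 3 or y[-1] == y[0]:
        return y[-1]  # flat or too short: nothing to separate
    x = np.linspace(0.0, 1.0, n)
    y_norm = (y - y[0]) / (y[-1] - y[0])
    # In normalized coordinates the chord is y = x, so the gap below
    # the chord is simply x - y_norm.
    gap = x - y_norm
    return y[np.argmax(gap)]

# Synthetic trace: 900 low "noise" intensities and 100 high "signal" ones.
noise = np.linspace(0.0, 1.0, 900)
signal = np.linspace(50.0, 100.0, 100)
data = np.concatenate([noise, signal])

thr = knee_threshold(data)
kept = data[data > thr]  # values above the threshold are treated as signal
```

Values strictly above the returned threshold are treated as signal. On real spectra the corner is less sharp, which is where a purpose-built criterion would be expected to matter.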
A maximum correntropy criterion based regression model is proposed.
A nonlinear correntropy-based metric is used to replace the traditional least-squares metric.
A half-quadratic optimization technique is developed to solve the correntropy-based model.
The nonlinear Gaussian function in MCC leads to an accurate estimation of the regression relation.
It outperforms some modified PLS algorithms and robust regression methods.
The least-squares criterion is widely used in multivariate calibration models. Rather than using the conventional linear least-squares metric, we employ a nonlinear correntropy-based metric to describe the spectra-concentration relations and propose a maximum correntropy criterion based regression (MCCR) model.
To solve the correntropy-based model, a half-quadratic optimization technique is developed that converts the non-convex, nonlinear optimization problem into an iteratively re-weighted least-squares problem. MCCR then provides an accurate estimate of the regression relation by alternately updating an auxiliary vector, represented as a nonlinear Gaussian function of the fitted residuals, and a weight vector computed by a regularized weighted least-squares model.
The proposed method is compared with some modified PLS algorithms and robust regression methods on four real near-infrared (NIR) spectral data sets. Experimental results demonstrate the effectiveness of the proposed method.
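The alternating scheme described above can be sketched as an iteratively re-weighted ridge regression. This is a minimal illustration under stated assumptions, not the authors' implementation: the kernel width `sigma`, the ridge penalty `lam`, the ridge starting point, the stopping rule, and the function name `mccr_fit` are all assumptions, and no intercept or latent-variable step is included.

```python
import numpy as np

def mccr_fit(X, y, sigma=1.0, lam=1e-3, n_iter=50, tol=1e-8):
    """Half-quadratic sketch of maximum-correntropy regression.

    Alternates between
      (1) auxiliary vector p_i = exp(-r_i^2 / (2 sigma^2)) from residuals r,
      (2) regularized weighted least squares for the coefficients w.
    sigma, lam and the stopping rule are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    ridge = lam * np.eye(d)
    # Start from the ordinary ridge solution.
    w = np.linalg.solve(X.T @ X + ridge, X.T @ y)
    for _ in range(n_iter):
        r = y - X @ w
        p = np.exp(-r**2 / (2.0 * sigma**2))   # Gaussian function of residuals
        Xp = X * p[:, None]                    # rows of X scaled by p
        w_new = np.linalg.solve(Xp.T @ X + ridge, Xp.T @ y)
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w

# Hypothetical data: a linear relation with 10% gross outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=200)
y[:20] += 20.0                                 # contaminate 20 samples

w_ls = np.linalg.solve(X.T @ X + 1e-3 * np.eye(3), X.T @ y)
w_mcc = mccr_fit(X, y)
```

Samples with large residuals receive weights close to zero, so they barely influence the next weighted least-squares step. This is the mechanism by which the correntropy metric limits the effect of outliers.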
A weighted dynamic decentralized PCA method is proposed for dynamic process monitoring.
The dynamic feature is characterized for each measured variable through a weighting strategy.
Case studies demonstrate the superiority and the promise of the proposed WDDPCA model.
Based on the argument that some process variables can influence other process variables with time delays, dynamic decentralized principal component analysis (DDPCA) was recently proposed for modeling and monitoring dynamic processes, and it has achieved better monitoring performance than its counterparts, such as dynamic PCA and dynamic latent variables (DLV).
Although experimental results have demonstrated the promise of selecting a dynamic feature (i.e., auto-correlated and cross-correlated variables with time delays) for each measured variable when handling dynamic process data, the dynamic feature selection depends on properly determining a cutoff parameter, which is difficult in practice.
To tackle this issue, an alternative formulation of DDPCA based on a variable-weighting method is proposed. The dynamic feature is characterized individually by assigning different weights to different variables with time delays. The weighted variables are then used to form a block corresponding to each variable, and fault detection and diagnosis are implemented based on these block PCA models.
The superiority of the proposed weighted DDPCA (WDDPCA) method over dynamic PCA, DLV, and DDPCA is examined on two industrial processes. The comparisons clearly illustrate the monitoring performance that can be achieved by WDDPCA.
A novel ensemble method named subagging ELM was proposed.
Subagging strategy was introduced to improve the stability of ELM.
The performance of the method was tested with fuel oil and blood samples.
The proposed method can achieve much better stability and higher accuracy than a single ELM.
Extreme learning machine (ELM) has attracted increasing attention for its fast learning speed and excellent generalization performance. However, the prediction of a single ELM regression model is usually unstable because the input weights and hidden-layer biases are generated randomly.
To overcome this drawback, an ensemble form of ELM, termed subagging ELM, was proposed and used for spectral quantitative analysis of complex samples. In this approach, a series of ELM sub-models was built by randomly selecting a certain number of samples from the original training set without replacement, and the predictions of these sub-models were then combined by simple averaging to give the final ensemble prediction. The performance of the method was tested with fuel oil and blood samples.
The results confirm that subagging ELM achieves much better stability and higher accuracy than a single ELM model.
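The subagging procedure described above (sub-models trained on subsets drawn without replacement, predictions averaged) can be sketched as follows. This is a minimal illustration, not the authors' code: the sigmoid activation, the uniform weight range, the subsample fraction, and the names `ELMRegressor` and `subagging_elm_predict` are assumptions.

```python
import numpy as np

class ELMRegressor:
    """Single-hidden-layer ELM: random hidden layer, least-squares output."""

    def __init__(self, n_hidden=30, rng=None):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(rng)

    def _hidden(self, X):
        # Sigmoid activation (an assumption; other activations are possible).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        # Input weights and biases are random and never trained.
        self.W = self.rng.uniform(-1, 1, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1, 1, size=self.n_hidden)
        # Output weights from the pseudo-inverse of the hidden-layer matrix.
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


def subagging_elm_predict(X_train, y_train, X_test, n_models=50,
                          subsample=0.7, n_hidden=30, seed=0):
    """Average the predictions of ELMs trained on subsets drawn
    WITHOUT replacement (subsample aggregating)."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    m = max(1, int(subsample * n))
    preds = []
    for _ in range(n_models):
        idx = rng.choice(n, size=m, replace=False)
        model = ELMRegressor(n_hidden, rng=rng).fit(X_train[idx], y_train[idx])
        preds.append(model.predict(X_test))
    return np.mean(preds, axis=0)


# Hypothetical smooth regression problem standing in for spectral data.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(120, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2
y_pred = subagging_elm_predict(X[:100], y[:100], X[100:],
                               n_models=20, n_hidden=20)
rmse = np.sqrt(np.mean((y_pred - y[100:]) ** 2))
```

Averaging over sub-models smooths out the dependence on any one random draw of hidden-layer weights, which is the source of the instability noted for a single ELM.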
A new MSPCA-KECA-based method is developed for fault detection and diagnosis.
MSPCA is used to extract fault-symptom features for multi-scale problems.
Each KECA classifier is dedicated to a specific fault.
The Cauchy-Schwarz (CS) divergence is a measure of the similarity between two probability density functions.
Results show that the proposed method outperforms KPCA, KICA and KECA.
As the main concerns of abnormal event management in process engineering, fault detection and diagnosis have attracted more and more attention recently. A new monitoring method based on kernel entropy component analysis (KECA) is proposed for nonlinear chemical processes. An angle-based statistic is designed to express the distinct angular structure that KECA reveals; it measures the similarity between probability density functions.
Each KECA classifier is dedicated to a specific fault, which provides an expandable framework for incorporating new faults identified in the process. Because fault features can be submerged by the multi-scale nature of process data, an enhanced KECA method for fault detection and diagnosis is developed by adding multi-scale principal component analysis (MSPCA) for feature extraction, which improves the classification performance of KECA.
The effectiveness of the proposed approach is demonstrated by applying it to the Tennessee Eastman process. The MSPCA-based method essentially captures the fault-symptom correlation, while KECA can be an effective method for process fault diagnosis.
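The Cauchy-Schwarz divergence mentioned in the highlights, D_CS(p, q) = -log( ∫pq / sqrt(∫p² ∫q²) ), can be estimated from samples with Gaussian Parzen windows. It then equals minus the log-cosine of the angle between the two kernel mean vectors, which is one way to read the "angular structure" referred to above. The sketch below shows this standard estimator only; it is not the paper's monitoring statistic. The kernel width `sigma` and the function names are assumptions.

```python
import numpy as np

def _gauss_gram(A, B, sigma):
    """Gaussian kernel matrix between rows of A and B.

    Convolving two Parzen windows of width sigma gives a Gaussian of
    width sigma*sqrt(2), hence the 4*sigma^2 in the exponent. The
    normalizing constants cancel in the divergence and are omitted.
    """
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (4.0 * sigma**2))

def cs_divergence(X, Y, sigma=1.0):
    """Parzen-window estimate of the Cauchy-Schwarz divergence."""
    v_xy = _gauss_gram(X, Y, sigma).mean()   # estimates integral of p*q
    v_xx = _gauss_gram(X, X, sigma).mean()   # estimates integral of p^2
    v_yy = _gauss_gram(Y, Y, sigma).mean()   # estimates integral of q^2
    return -np.log(v_xy / np.sqrt(v_xx * v_yy))

# Hypothetical samples: two draws from the same density, one shifted draw.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
X2 = rng.normal(size=(100, 2))
Y = rng.normal(size=(100, 2)) + 3.0

d_same = cs_divergence(X, X2)
d_shift = cs_divergence(X, Y)
```

The divergence is zero when the two sample sets coincide and grows as the densities separate, so a new sample set can be compared against each fault class by this measure.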
A soft sensor calibration method is proposed that uses a just-in-time strategy based on data density estimation.
We make precise sampling possible during the deployment of the just-in-time method.
We propose a mechanism to partition the historical database into zones of different density.
A soft sensor is an effective way to predict a hard-to-measure target variable from process variables. In practice, however, the feedback cycle of the target variable is usually longer than that of the process variables, so too few prediction errors are available within one target feedback cycle.
Consequently, the soft sensor cannot be calibrated in time and its performance deteriorates. We propose an enhanced just-in-time (JIT) soft sensor calibration method using data density estimation. The core of the enhanced JIT method is an estimate of the data density of the historical database. First, the database is divided into many data blocks, and the center of each block is calculated for the process variables and the target variable, respectively.
For each center, we design a criterion that gives a preliminary estimate of the optimal sampling number, which indirectly represents the data density of the block, and then use a pooling strategy to partition the database into zones of different density. The resulting density estimate makes precise sampling feasible and improves the performance of the JIT-based method. The proposed calibration method is tested through comparative experiments on a pH neutralization facility in our laboratory and is shown to be feasible and effective.
A robust probability latent variable regression (RPLVR) is proposed.
A robust probability kernel latent variable regression (RPKLVR) is proposed.
RPLVR is extended to its nonlinear RPKLVR model.
Statistics of RPKLVR for fault detection are derived using the kernel trick.
RPLVR and RPKLVR based monitoring methods are applied to fault detection.
In most industries, process and quality measurements with outliers are often collected. The outliers would have negative influences on data-based modelling and process monitoring. In our previous work on probability latent variable regression (PLVR), the model is constructed under the assumption that the data quality of the process characteristics is good and the operation processes are linear. In this article, a robust PLVR (RPLVR) model is developed.
Then it is extended to its nonlinear form, called robust probability kernel latent variable regression (RPKLVR). Both models reduce the effects of outliers. RPLVR and RPKLVR are weighted probability models: the similarity of each sample to all the collected data is used as that sample's weighting factor, so the influence of outliers on modelling is weakened.
With the weighted training data, an expectation-maximization algorithm for training RPLVR and RPKLVR is derived. The corresponding statistics for fault detection are also systematically constructed. Two case studies are presented to illustrate the effectiveness of the proposed methods.
Development of shallow and deep ANNs for fault diagnosis.
A KDE-based control limit for fault diagnosis reduces false alarms and false fault declarations.
A minimization strategy was proposed to deal with the missing value estimation.
An ARMA model to make multi-step-ahead prediction for SPE.
This methodology was validated on both highly and lowly instrumented WWTPs.
The use of large numbers of on-line sensors in control and automation for optimized operation of WWTPs is increasingly popular, which makes manual expert-based evaluation impossible. Auto-associative Neural Networks (ANN) with shallow and deep structures are proposed for fault diagnosis in this paper.
The proposed methodology not only provides a recursive minimization strategy to deal with missing values but also uses Kernel Density Estimation (KDE) to relax the Gaussian assumption on the derived data. The resulting fault diagnosis statistic, the squared prediction error (SPE), i.e., the sum of squared residuals, can be predicted over a long horizon with a multi-step ARMA (Auto-Regressive Moving Average) model.
The proposed fault diagnosis framework has been validated by process data collected from two WWTPs with different dynamic characteristics. The results showed that the proposed methodology is capable of detecting sensor faults and process faults with good accuracy under different scenarios (highly and lowly instrumented WWTP).
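A KDE-based control limit for the SPE statistic can be sketched as follows: fit a Gaussian kernel density to SPE values from normal operation and take the value below which a chosen fraction of the estimated probability mass lies. This is an illustration under stated assumptions, not the authors' procedure. The rule-of-thumb bandwidth, the 99% confidence level, the bisection search, and the function name `kde_control_limit` are all assumptions.

```python
import math
import numpy as np

def kde_control_limit(spe, confidence=0.99, bandwidth=None):
    """Control limit as the `confidence` quantile of a Gaussian KDE.

    No Gaussian assumption is made on the SPE values themselves; only
    the smoothing kernel is Gaussian. The default bandwidth is the
    common 1.06 * std * n^(-1/5) rule of thumb (an assumption).
    """
    spe = np.asarray(spe, dtype=float)
    n = spe.size
    h = bandwidth if bandwidth is not None else 1.06 * spe.std(ddof=1) * n ** (-0.2)
    erf = np.vectorize(math.erf)

    def cdf(x):
        # The KDE's CDF is the average of Gaussian CDFs centred on each sample.
        return np.mean(0.5 * (1.0 + erf((x - spe) / (h * math.sqrt(2.0)))))

    lo, hi = spe.min() - 5.0 * h, spe.max() + 5.0 * h
    for _ in range(60):                      # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical SPE values from normal operation (skewed, non-Gaussian).
rng = np.random.default_rng(0)
spe_normal = rng.chisquare(3, size=2000)
limit = kde_control_limit(spe_normal, confidence=0.99)
alarm_rate = np.mean(spe_normal > limit)     # fraction flagged on training data
```

A new sample whose SPE exceeds `limit` would be flagged as a potential fault. Because the limit follows the empirical shape of the SPE distribution, it does not require the residuals to be Gaussian.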
An effective nonlinear FEPLS model is proposed.
Functional expansion is adopted to effectively expand the input space into a highly nonlinear space.
A good model is found between the expanded inputs and the outputs by using PLS.
FEPLS model is easy to construct and is applied to modeling complex chemical processes.
FEPLS could achieve good prediction performance.
A novel robust nonlinear partial least squares model is proposed to handle the nonlinearity and collinearity problems of process data. The proposed model integrates a nonlinear functional link artificial neural network (FLANN) with traditional partial least squares (PLS). There are two parts in the proposed model: a nonlinear mapping part and a linear regression part. In the nonlinear mapping part, the input space is effectively extended to a nonlinear space through the functional expansion block of FLANN.
PLS regression (PLSR) is adopted in the linear part. Thus, a novel robust nonlinear PLS integrated with functional expansion (FEPLS) is built. The proposed FEPLS model is very easy to construct. First, a traditional FLANN is selected. Second, the input space is expanded to a nonlinear space using the functional expansion block.
Third, the collinearity among the expanded variables and the expected outputs is eliminated by extracting input latent variables and output latent variables through PLS projection, respectively. Finally, an optimal regression model between the expanded variables and the expected outputs is established by using PLSR.
To evaluate the performance of the proposed model, case studies of modeling two complex chemical processes are provided. Four more models, FLANN, extreme learning machine based PLS (ELM-PLS), kernel PLS (KPLS), and PLSR, are also developed for comparison. Simulation results show that the proposed FEPLS model can improve the prediction performance.
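The construction steps above (expand the inputs, then regress with PLS) can be sketched as follows. This is a minimal illustration, not the authors' model: the trigonometric basis is one common FLANN expansion and is assumed here, the PLS step is a plain single-output NIPALS (PLS1), and the names `functional_expansion` and `pls1_fit` are hypothetical.

```python
import numpy as np

def functional_expansion(X, order=1):
    """Trigonometric functional expansion (assumed basis):
    [x, sin(k*pi*x), cos(k*pi*x)] for k = 1..order, per input variable."""
    X = np.asarray(X, dtype=float)
    blocks = [X]
    for k in range(1, order + 1):
        blocks.append(np.sin(k * np.pi * X))
        blocks.append(np.cos(k * np.pi * X))
    return np.hstack(blocks)

def pls1_fit(X, y, n_components):
    """Single-output PLS by NIPALS; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:                  # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                        # input latent variable (scores)
        tt = t @ t
        p = Xk.T @ t / tt                 # input loadings
        qk = (yk @ t) / tt                # output loading
        Xk = Xk - np.outer(t, p)          # deflate inputs
        yk = yk - qk * t                  # deflate output
        W.append(w); P.append(p); q.append(qk)
    if not W:
        return np.zeros(X.shape[1]), y_mean
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

# Hypothetical nonlinear process: y depends on sin and cos of the inputs.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(np.pi * X[:, 0]) + 0.5 * np.cos(np.pi * X[:, 1])

Phi = functional_expansion(X, order=1)    # 2 inputs -> 6 expanded variables
# All six latent variables are kept in this toy case; with collinear
# expanded variables fewer components would normally be retained.
coef, b = pls1_fit(Phi[:150], y[:150], n_components=6)
y_hat = Phi[150:] @ coef + b
rmse_fe = np.sqrt(np.mean((y_hat - y[150:]) ** 2))
```

For comparison, the same `pls1_fit` applied to the raw inputs `X` gives a purely linear PLSR baseline. The difference in test error between the two indicates how much the expansion contributes.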
Multivariate adulteration detection for sesame oil by one-class support vector machine.
Lowest adulteration levels of the OC-SVM model were calculated.
One-class modeling is a promising tool for verifying the authenticity of edible oils and foods.
Multivariate and untargeted adulterations are the real cases of oil adulteration encountered in practice. In this study, a one-class support vector machine (OC-SVM) was used to build a model for detecting multivariate and untargeted adulteration of sesame oil. The predictive model was subsequently validated on an independent test set. The results indicated that the OC-SVM model could detect all of the adulterated oils. Moreover, oils adulterated with different levels of mixed edible oils were simulated by the Monte Carlo method and used to determine the lowest adulteration level detectable by the predictive model. Compared with earlier studies, the OC-SVM model proposed for sesame oil in this study is more robust in detecting untargeted and multivariate adulteration.