Abstract: To guarantee the quality of sensed data, an online framework for detecting abnormal agricultural data is constructed based on a sliding window and prediction models, including support vector regression, K-nearest neighbor, gradient boosting regression, and random forest. A method for calculating the sliding window size from data features is proposed. The applicability of the prediction models is evaluated using entropy weight TOPSIS. Experiments on sheepfold monitoring data (air temperature, relative humidity, and CO2 and H2S volume fractions) show that the proposed method for calculating the sliding window size is superior to a calculation based only on the sampling interval and characteristic period. The prediction errors of the models are negatively correlated with anomaly detection performance and can significantly affect the false positive rate. The support vector regression model is the most appropriate candidate for detecting abnormal air temperature and relative humidity data, with a closeness degree greater than 0.8, whereas the most appropriate candidates for the CO2 and H2S volume fractions are the gradient boosting regression model and the K-nearest neighbor model, respectively, each with a closeness degree of 0.6.
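The model ranking step can be illustrated with a minimal sketch of entropy weight TOPSIS, which yields the closeness value that the abstract reports for each model. The abstract does not state which evaluation criteria or normalization were used, so the criteria below (prediction error, F1 score, false positive rate), the min-max normalization, and all numeric values are hypothetical and serve only to show the mechanics. The function name `entropy_weight_topsis` is likewise an illustrative choice.

```python
import numpy as np

def entropy_weight_topsis(scores, benefit):
    """Rank alternatives with entropy-weighted TOPSIS.

    scores  : (n_alternatives, n_criteria) array of metric values
    benefit : one bool per criterion, True if larger values are better
    Returns (weights, closeness); closeness lies in [0, 1], larger is better.
    """
    X = np.asarray(scores, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    lo, hi = X.min(axis=0), X.max(axis=0)
    if X.shape[0] < 2 or np.any(hi == lo):
        raise ValueError("need >= 2 alternatives and non-constant criteria")

    # Min-max normalization (an assumption): every criterion becomes
    # "larger is better" on the interval [0, 1].
    span = hi - lo
    Z = np.where(benefit, (X - lo) / span, (hi - X) / span)

    # Entropy weights: criteria whose values differ more across the
    # alternatives carry more information and receive larger weights.
    P = Z / Z.sum(axis=0)
    plogp = P * np.log(np.where(P > 0, P, 1.0))  # treats 0 * log(0) as 0
    entropy = -plogp.sum(axis=0) / np.log(X.shape[0])
    divergence = 1.0 - entropy
    weights = divergence / divergence.sum()

    # TOPSIS: distances to the ideal and anti-ideal weighted solutions.
    V = Z * weights
    d_plus = np.sqrt(((V - V.max(axis=0)) ** 2).sum(axis=1))
    d_minus = np.sqrt(((V - V.min(axis=0)) ** 2).sum(axis=1))
    closeness = d_minus / (d_plus + d_minus)
    return weights, closeness

# Hypothetical scores for four models; columns are
# prediction error (lower is better), F1 (higher), false positive rate (lower).
models = ["SVR", "KNN", "GBR", "RF"]
scores = [[0.10, 0.95, 0.02],
          [0.20, 0.90, 0.05],
          [0.30, 0.80, 0.08],
          [0.40, 0.70, 0.10]]
weights, closeness = entropy_weight_topsis(scores, [False, True, False])
best_model = models[int(np.argmax(closeness))]
```

A closeness near 1 means a model sits close to the ideal solution on all weighted criteria, so a reported value above 0.8 indicates a clearer preference than a value of 0.6.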