Using univariate and multivariate methods to detect outliers in sediment fingerprinting method, case study: Tange Bostanak Watershed

Nohegar, Ahmad; Kazemi, Mohammad; Ahmadi, Seyed Javad; Gholami, Hamid; Mahdavi, Rasool

doi:10.22092/ijwmse.2017.113460

Document Type: Research Paper

Authors

¹ Professor, Faculty of Environment Sciences, Tehran University, Iran

² PhD Student, Faculty of Natural Resources, Hormozgan University, Iran

³ Associate Professor, Fuel Cycle Research Institute of Atomic Energy Organization, Iran

⁴ Assistant Professor, Faculty of Natural Resources, Hormozgan University, Iran

Abstract

The efficiency of sediment fingerprinting, which uses tracers to determine the sources of sediment, has been demonstrated. Selecting a suitable subset of tracers capable of discriminating between sediment sources is the first and most important step in the method. Outliers affect this selection: they may prevent important tracers from being picked and reduce the accuracy of classification. Outliers must therefore be detected so that they can be corrected or, if there is enough evidence, omitted. The present study aims to detect outliers in the tracer data in order to identify the best tracer combination. To detect outliers, we used univariate methods (the Grubbs test, the Gauss test, the Dixon test, the box plot, median ± 3 MAD, and mean ± 3 standard deviations) and multivariate methods (squared Mahalanobis distance, separate box plots of squared Mahalanobis distance for each sediment source, principal component analysis, and a plot of the squared Mahalanobis distances against the quantiles of the chi-square distribution). We considered an observation an outlier if at least half of these methods detected it as one. The results showed that the median ± 3 MAD method flagged more observations as outliers than the other methods. The multivariate methods showed low agreement with each other, whereas the univariate methods showed higher agreement. When the univariate methods median ± 3 MAD, box plot, and Dixon test are used to detect outliers, testing their sensitivity is recommended. The results also showed that the maximum consensus was four samples (observations) for the univariate methods and two samples (observations) for the multivariate methods. Overall, no observation was identified as an outlier by at least half of the methods used.
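The voting rule described in the abstract can be illustrated with a minimal Python sketch. It covers only three of the univariate rules (median ± 3 MAD, mean ± 3 standard deviations, and the box plot) and is not the authors' implementation. The function names, the unscaled MAD, the 1.5 × IQR whisker length, and the tracer values are all assumptions made for illustration.

```python
import statistics


def mad_flags(values, k=3.0):
    """Flag values outside median ± k * MAD (raw, unscaled MAD: an assumption)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [abs(v - med) > k * mad for v in values]


def sd_flags(values, k=3.0):
    """Flag values outside mean ± k * sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) > k * sd for v in values]


def boxplot_flags(values, whisker=1.5):
    """Flag values beyond the box plot fences (1.5 * IQR is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v < low or v > high for v in values]


def consensus_outliers(values, methods, min_share=0.5):
    """Return indices flagged by at least `min_share` of the given methods."""
    votes = [method(values) for method in methods]
    needed = min_share * len(methods)
    return [
        i for i in range(len(values))
        if sum(flags[i] for flags in votes) >= needed
    ]


# Hypothetical concentrations of one tracer in one sediment source.
tracer = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 25.0]
methods = [mad_flags, sd_flags, boxplot_flags]
flagged = consensus_outliers(tracer, methods)
print(flagged)
```

On this made-up sample, the median ± 3 MAD rule and the box plot flag the last value, while the mean ± 3 standard deviations rule does not, because the extreme value inflates the standard deviation. The observation still passes the vote because two of the three rules agree.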

Keywords

20.1001.1.22519300.1396.9.4.3.5

Watershed Engineering and Management

Volume 9, Issue 4
January 2018
Pages 398-412
