Discovering behavioral patterns among air pollutants: A data mining approach

  • Diana Arce, Universidad del Azuay
  • Fernando Lima, Universidad del Azuay
  • Marcos Patricio Orellana Cordero, Universidad del Azuay
  • John Ortega, Universidad del Azuay
  • Chester Sellers, Universidad del Azuay
  • Patricia Ortega, Universidad del Azuay
Keywords: air pollutant; knowledge; data mining; correlation

Abstract

Air pollutants affect both human health and the environment. For this reason, environmental managers and urban planners focus their efforts on monitoring air pollution. Complete information is required to support decisions that improve the quality of life in urban zones. Hence, it is important to extract knowledge not only about concentration levels but also about associations between air pollutants. Based on the Cross-Industry Standard Process for Data Mining (CRISP-DM), this paper presents an approach for identifying correlations and incidence among the most harmful pollutants in the Andean region: ozone, carbon monoxide, sulfur dioxide, nitrogen dioxide, and particulate matter. The paper describes an experiment using a real dataset from a monitoring station in Cuenca, Ecuador, which is located in the Andean region. The results show that the proposed approach is effective at extracting knowledge that supports the evaluation of air quality in urban zones. In addition, the approach provides a starting point for future data mining applications that analyze air pollution in the Andean region.
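
As a rough illustration of what "identifying correlations between pollutants" can look like in practice, the sketch below computes pairwise Pearson correlation coefficients between pollutant time series. It is a minimal example under stated assumptions, not the authors' method: the abstract does not say which correlation measure or tool was used, the function names `pearson` and `pairwise_correlations` are hypothetical, and the readings are made-up values rather than data from the Cuenca station.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(readings):
    """Return {(name_a, name_b): r} for every pair of pollutant series."""
    return {
        (a, b): pearson(readings[a], readings[b])
        for a, b in combinations(sorted(readings), 2)
    }


# Made-up hourly values for illustration only; NOT data from the Cuenca station.
readings = {
    "O3": [10.0, 14.0, 22.0, 30.0, 26.0, 15.0],
    "NO2": [28.0, 25.0, 18.0, 12.0, 15.0, 24.0],
    "CO": [0.9, 0.8, 0.6, 0.5, 0.6, 0.8],
}

for (a, b), r in pairwise_correlations(readings).items():
    print(f"{a} vs {b}: r = {r:+.2f}")
```

A coefficient near +1 or -1 suggests a strong linear association between two pollutants, while a value near 0 suggests none. Correlation alone does not establish that one pollutant drives another.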

Published
2018-12-21
How to Cite
Arce, D., Lima, F., Orellana Cordero, M., Ortega, J., Sellers, C., & Ortega, P. (2018). Discovering behavioral patterns among air pollutants: A data mining approach. Enfoque UTE, 9(4), 168-179. https://doi.org/10.29019/enfoqueute.v9n4.411
Section
Computer Science, ICTs