Adaptability of regression algorithms to the behavior of protein plants
DOI: https://doi.org/10.29019/enfoqueute.861
Keywords: secondary metabolites; regression models; cell wall; nutritional value
Abstract
The behavior of the components of protein plants is of vital importance for the animals that consume them in their diet. The objective of this research is to evaluate regression algorithms in order to determine which models best adapt to the procedures of a traditional laboratory and best estimate the chemical components of protein plants. To this end, the Java library MULAN was used, which contains machine learning algorithms capable of adapting to dissimilar problems. Three datasets were created for each species treated in this study, each including the main elements to be evaluated in each experiment: secondary metabolites, cell wall components, and digestibility elements for training files one, two, and three, respectively. Each dataset was then evaluated through supervised learning with cross-validation to determine the best fit by aRMSE (average Root Mean Square Error). The learning results were compared with those of previous experiments, in which a single dataset contained all the components to be estimated in a single prediction. The comparison shows that the lazy, instance-based algorithms have better learning behavior than the other algorithms evaluated.
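As an illustration of the fit criterion named in the abstract, the following minimal sketch computes aRMSE under its usual multi-target definition: the RMSE is computed separately for each target and the results are averaged. The study itself used the MULAN library in Java; this Python function, its name `armse`, and the sample values are assumptions made only for illustration and are not taken from the paper's data.

```python
import math


def armse(y_true, y_pred):
    """Average RMSE over targets.

    y_true and y_pred are lists of rows; each row holds one value per target.
    Hypothetical helper for illustration, not part of MULAN.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    n_targets = len(y_true[0])
    per_target_rmse = []
    for j in range(n_targets):
        # Squared errors for target j across all instances
        squared = [(t[j] - p[j]) ** 2 for t, p in zip(y_true, y_pred)]
        per_target_rmse.append(math.sqrt(sum(squared) / len(squared)))
    # Unweighted mean of the per-target RMSE values
    return sum(per_target_rmse) / n_targets


# Made-up measurements for two targets over three samples
observed = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
predicted = [[1.0, 12.0], [2.0, 18.0], [3.0, 32.0]]
print(armse(observed, predicted))
```

In a cross-validated comparison, a value like this would be computed on each held-out fold and averaged, with lower values indicating a better fit. Because the per-target errors are averaged without rescaling, targets measured on larger scales weigh more heavily in the result.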
License
Copyright (c) 2023 The Authors
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
The articles and research published by the UTE University are carried out under the Open Access regime in electronic format. This means that all content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or the author. This is in accordance with the BOAI definition of open access. By submitting an article to any of the scientific journals of the UTE University, the author or authors accept these conditions.
The UTE applies the Creative Commons Attribution (CC-BY) license to articles in its scientific journals. Under this open access license, as an author you agree that anyone may reuse your article in whole or in part for any purpose, free of charge, including commercial purposes. Anyone can copy, distribute or reuse the content as long as the author and original source are correctly cited. This facilitates freedom of reuse and also ensures that content can be extracted without barriers for research needs.
The Enfoque UTE journal guarantees and declares that authors always retain all copyrights and full publishing rights without restrictions [© The Author(s)]. Attribution (BY): Any use of the work is allowed, including for commercial purposes, as is the creation of derivative works, which may also be distributed without restriction.