Adaptability of regression algorithms to the behavior of protein plants

DOI:

https://doi.org/10.29019/enfoqueute.861

Keywords:

Secondary metabolites; regression models; cell wall; nutritional value

Abstract

The behavior of the components of protein plants is of vital importance for the animals that consume them in their diet. The objective of this research is to evaluate regression algorithms in order to determine which expressions best adapt to the procedures of a traditional laboratory and to estimate the chemical components of protein plants. To this end, the MULAN Java library was used, which contains machine learning algorithms capable of adapting to dissimilar problems. Three datasets were created for each species treated in this study, each including the main elements to be evaluated in each experiment: secondary metabolites, cell wall components, and digestibility elements for training files one, two, and three, respectively. The datasets were then evaluated through supervised learning with cross-validation to determine the best fit by aRMSE (average Root Mean Square Error). The learning results were compared with previous experiments, in which a learning variant contained all the components to be evaluated in a single dataset for a single prediction. The comparison shows that the lazy, instance-based algorithms have better learning behavior than the other algorithms evaluated.
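As a rough illustration of the evaluation described in the abstract, the Python sketch below computes aRMSE (the mean, over all target variables, of each target's RMSE) and estimates it by cross-validation for a simple lazy, instance-based learner. It is not the MULAN implementation used in the study: the function names, the k-nearest-neighbour learner, the interleaved fold assignment, and the toy data are all assumptions made for illustration only.

```python
import math


def armse(y_true, y_pred):
    """Average RMSE: mean over targets of the per-target RMSE."""
    n = len(y_true)       # number of instances
    d = len(y_true[0])    # number of target variables
    total = 0.0
    for j in range(d):
        squared = sum((y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n))
        total += math.sqrt(squared / n)
    return total / d


def knn_predict(train_x, train_y, x, k=1):
    """Lazy learner: no model is built; stored instances are ranked at query time."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], x))[:k]
    d = len(train_y[0])
    # Predict every target as the mean of that target over the k neighbours.
    return [sum(train_y[i][j] for i in nearest) / k for j in range(d)]


def cross_validated_armse(xs, ys, folds=3, k=1):
    """Predict each instance from the other folds, then score all predictions."""
    n = len(xs)
    preds = [None] * n
    for f in range(folds):
        # Hypothetical fold assignment: instance i belongs to fold i % folds.
        train_idx = [i for i in range(n) if i % folds != f]
        test_idx = [i for i in range(n) if i % folds == f]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        for i in test_idx:
            preds[i] = knn_predict(train_x, train_y, xs[i], k)
    return armse(ys, preds)


# Toy data (invented, not from the study): one input feature, two targets.
toy_x = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
toy_y = [[2.0 * x[0], x[0] + 1.0] for x in toy_x]
print(cross_validated_armse(toy_x, toy_y, folds=3, k=1))
```

Under this metric a lower value indicates a better fit, so comparing algorithms amounts to comparing their cross-validated aRMSE on the same dataset.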

Published

2023-04-01

How to Cite

Uvidia, H., Estrada-Jiménez, P. M., Herrera-Herrera, R. del C., Hernández-Montiel, L. G., Verdecia-Acosta, D. M., Ramírez-de la Ribera, J. L., … Chacón-Marcheco, E. (2023). Adaptability of regression algorithms to the behavior of protein plants. Enfoque UTE, 14(2), pp. 20–34. https://doi.org/10.29019/enfoqueute.861

Section

Miscellaneous