Adaptability of regression algorithms to the behavior of protein plants




Secondary metabolites; regression models; cell wall; nutritional value


The behavior of the chemical components of protein plants is of vital importance for the animals that consume them in their diet. The objective of this research is to evaluate regression algorithms in order to determine which models best reproduce the procedures of a traditional laboratory and to estimate the chemical components of protein plants. To this end, the MULAN Java library was used, which contains machine learning algorithms capable of adapting to dissimilar problems. Three datasets were created for each species treated in this study; each includes the main elements to be evaluated in each experiment, delimited by secondary metabolites, cell wall components, and digestibility elements for training files one, two, and three, respectively. The datasets were then evaluated through supervised learning with cross-validation to determine the best fit by aRMSE (average Root Mean Square Error). The learning results were compared with previous experiments, in which a learning variant used a single dataset containing all the components to be evaluated in a single prediction. The comparison shows that the lazy, instance-based algorithms learn better than the other algorithms evaluated.
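As an illustration of the evaluation described above, the following is a minimal Python sketch of aRMSE (the mean, over all targets, of each target's root mean square error) scored under k-fold cross-validation with a simple instance-based (k-nearest-neighbour) multi-target regressor. It is not the MULAN implementation used in the study: the function names, the interleaved fold assignment, the Euclidean distance, and the pooling of held-out predictions before scoring are all assumptions made for the example.

```python
import math


def a_rmse(y_true, y_pred):
    """Average RMSE: compute the RMSE of each target column, then average them."""
    n = len(y_true)      # number of instances
    d = len(y_true[0])   # number of targets
    total = 0.0
    for j in range(d):
        squared_error = sum((y_true[i][j] - y_pred[i][j]) ** 2 for i in range(n))
        total += math.sqrt(squared_error / n)
    return total / d


def knn_predict(train_x, train_y, x, k=1):
    """Lazy learner (assumed for illustration): average the targets of the
    k training rows closest to x in Euclidean distance."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    nearest = order[:k]
    d = len(train_y[0])
    return [sum(train_y[i][j] for i in nearest) / len(nearest) for j in range(d)]


def cross_validated_a_rmse(xs, ys, folds=5, k=1):
    """Hold out each fold once, pool the held-out predictions, score with aRMSE.
    Folds are interleaved (instance i goes to fold i % folds), an assumption."""
    n = len(xs)
    preds = [None] * n
    for f in range(folds):
        test_idx = range(f, n, folds)
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        for i in test_idx:
            preds[i] = knn_predict(train_x, train_y, xs[i], k)
    return a_rmse(ys, preds)
```

Calling cross_validated_a_rmse(xs, ys) with a list of feature rows and a list of target rows returns one score per dataset, so the three per-component datasets and the single combined dataset could be compared on the same scale; lower values indicate a better fit.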








How to Cite

Uvidia-Cabadiana, H. A., Estrada-Jiménez, P. M., Herrera-Herrera, R. del C., Hernández-Montiel, L. G., Verdecia-Acosta, D. M., Ramírez-de la Ribera, J. L., Noguera-López, P. J., & Chacón-Marcheco, E. (2023). Adaptability of regression algorithms to the behavior of protein plants. Enfoque UTE, 14(2), pp. 20-34.