Predicting Methanol Space-Time Yield from CO2 Hydrogenation Using Machine Learning: Statistical Evaluation of Penalized Regression Techniques
DOI:
https://doi.org/10.59395/ijadis.v5i2.1341
Keywords:
Penalized Regression, Ridge Regression, Methanol Production, CO₂ Hydrogenation, Lasso Regression, Elastic Net Regression
Abstract
This study investigates the effectiveness of machine learning techniques, specifically the penalized regression models Ridge Regression, Lasso Regression, and Elastic Net Regression, in predicting methanol space-time yield (STY) from CO2 hydrogenation data. Using a dataset derived from Cu-based catalyst research, the study applied a preprocessing pipeline comprising data cleaning, imputation, outlier removal, and normalization. The models were evaluated through 10-fold cross-validation and tested on unseen data. Ridge Regression outperformed the other models, achieving the lowest Root Mean Squared Error (RMSE) of 0.7706, Mean Absolute Error (MAE) of 0.5627, and Mean Squared Error (MSE) of 0.5938. In comparison, the Lasso and Elastic Net Regression models exhibited higher error metrics. Feature importance analysis revealed that Gas Hourly Space Velocity (GHSV) and Molar Masses of Support significantly influence catalytic activity. These findings suggest that Ridge Regression is a promising tool for accurately predicting methanol production, providing valuable insights for optimizing catalytic processes and advancing sustainable practices in chemical engineering.
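The three error metrics reported in the abstract are related in a simple way: RMSE is the square root of MSE, while MAE averages absolute rather than squared residuals. The following is a minimal pure-Python sketch of those definitions, not the authors' evaluation code. The function names and the toy values are hypothetical and are not taken from the study's dataset; only the two reported Ridge figures (0.7706 and 0.5938) come from the abstract.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy values for illustration only (not data from the study).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 4.5]

toy_mse = mse(y_true, y_pred)
toy_rmse = rmse(y_true, y_pred)
toy_mae = mae(y_true, y_pred)

# Consistency check on the Ridge metrics reported in the abstract:
# squaring the reported RMSE should reproduce the reported MSE up to rounding.
reported_rmse = 0.7706
reported_mse = 0.5938
consistent = abs(reported_rmse ** 2 - reported_mse) < 1e-3

print(toy_mse, toy_rmse, toy_mae, consistent)
```

Because squaring weights large residuals more heavily, RMSE is never smaller than MAE on the same predictions, which matches the ordering of the reported Ridge values (0.7706 versus 0.5627).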
License
Copyright (c) 2024 Harun Al Azies, Muhamad Akrom, Setyo Budi, Gustina Alfa Trisnapradika, Aprilyani Nur Safitri
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.