Predicting Apparent Diffusion Coefficient of Re(Ⅶ) Using Machine Learning With Generative Adversarial Networks

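    • Abstract: The apparent diffusion coefficient (Da) is a key parameter in the safety assessment of high-level radioactive waste repositories. However, because of limited sample numbers and unclear diffusion mechanisms, it is difficult to meet the demand for high-accuracy prediction under complex geological conditions. In this work, machine learning algorithms were used to predict the Da of Re(Ⅶ) in bentonite. The dataset contains 1 073 experimental samples and 26 input features. Data augmentation by Gaussian noise injection and generative adversarial network (GAN) techniques expanded the dataset to 4 292 samples. The effect of sample number on the prediction accuracy of Da was examined, and the ensemble algorithm LGBM-XGBoost was compared with a deep neural network (DNN). The regression results show that the LGBM-XGBoost ensemble model outperforms the DNN model, and the best model reaches a coefficient of determination R² of 0.94. Shapley additive explanation (SHAP) analysis and feature importance evaluation indicate that total porosity and effective compacted density are the main factors influencing the Da prediction. To verify the generalization ability of the model, the Da of Re(Ⅶ) in compacted bentonite was measured by the through-diffusion method: as the compacted dry density decreased from 1 800 kg/m³ to 1 200 kg/m³, Da increased from 1.09×10⁻¹⁰ m²/s to 2.49×10⁻¹⁰ m²/s. The relative standard deviation of the Da predicted by the LGBM-XGBoost model is below 17%, indicating that the model maintains stable predictive performance on unseen samples. The method offers a potential predictive approach and mechanistic analysis tool for the safety assessment of geological disposal of high-level radioactive waste.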

       

      Abstract: Diffusion is the predominant transport mechanism of radionuclides in compacted bentonite, owing to the bentonite's low permeability, high swelling capacity, and strong adsorption characteristics. The apparent diffusion coefficient (Da) is a crucial parameter in the safety evaluation of high-level radioactive waste repositories. However, it remains challenging to predict Da accurately under complex geological conditions because experimental data are scarce and the diffusion mechanisms are unclear. In this study, machine learning models were employed to predict the Da values of Re(Ⅶ) in compacted bentonites. The dataset included 1 073 experimental instances with 26 input features. Feature engineering techniques were applied to standardize the data, including outlier removal, logarithmic transformation, and min-max normalization. Data augmentation was performed using both Gaussian noise injection and generative adversarial network (GAN) techniques, expanding the dataset to 4 292 instances with 26 input features. The influence of the number of instances on predictive accuracy was systematically analyzed, and the performance of an integrated light gradient boosting machine-extreme gradient boosting (LGBM-XGBoost) algorithm was compared with that of a deep neural network (DNN) architecture. The results show that predictive accuracy increases with the number of instances and improves significantly after Gaussian noise injection and GAN augmentation. However, Gaussian noise injection reduces model robustness. In addition, the LGBM-XGBoost model outperforms the DNN model in predictive accuracy, achieving a coefficient of determination (R²) of 0.99 for the training set, 0.94 for the validation set, and 0.94 for the test set. Of the predictions, 95% fall within a factor of 2 of the experimental values.
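The preprocessing, noise-based augmentation, and accuracy measures named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the noise level `sigma`, and the toy rows are all hypothetical, and the GAN step is omitted. Three noisy replicas per row would reproduce the fourfold ratio between 1 073 and 4 292 instances, but the abstract does not say how the added instances were split between Gaussian noise and the GAN.

```python
import math
import random


def log_minmax(values):
    """Log10-transform positive values, then rescale them to [0, 1]."""
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        return [0.0 for _ in logs]
    return [(x - lo) / (hi - lo) for x in logs]


def augment_gaussian(rows, copies, sigma, rng):
    """Return the original rows plus `copies` noisy replicas of each row."""
    out = [list(r) for r in rows]
    for _ in range(copies):
        for r in rows:
            out.append([x + rng.gauss(0.0, sigma) for x in r])
    return out


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fraction_within_factor(y_true, y_pred, factor=2.0):
    """Share of predictions whose ratio to the measured value lies in [1/factor, factor]."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if 1.0 / factor <= p / t <= factor)
    return hits / len(y_true)


# Toy, already-normalized feature rows (illustrative values, not from the study).
rng = random.Random(0)
rows = [[0.2, 0.5], [0.4, 0.1]]
augmented = augment_gaussian(rows, copies=3, sigma=0.01, rng=rng)
print(len(augmented))  # 8: two original rows plus three noisy replicas of each
```

The factor-of-2 measure is computed on untransformed Da values, so any log-transformed predictions would need to be converted back before calling `fraction_within_factor`.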
Shapley additive explanation (SHAP) and feature importance (FI) techniques were applied to the LGBM-XGBoost model to analyze the predictive contribution of the input features. Total porosity and compacted dry density are the top two contributors. To evaluate the model's generalization capability, through-diffusion experiments were conducted to measure the Da values of Re(Ⅶ) in saturated compacted bentonite. The Da values increase from 1.09×10⁻¹⁰ m²/s to 2.49×10⁻¹⁰ m²/s as the compacted dry density decreases from 1 800 kg/m³ to 1 200 kg/m³. This negative relationship between Da and compacted dry density is consistent with the SHAP and FI results, and it can be explained by the higher total porosity at lower density, which facilitates Re(Ⅶ) diffusion. The LGBM-XGBoost model generalizes well, with relative errors in the predicted Da below 17%. This study establishes a potential predictive approach and mechanistic analysis tool for the safety assessment of high-level radioactive waste repositories.
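The link between dry density and total porosity invoked above follows from the standard relation ε = 1 − ρd/ρs, where ρs is the grain density. The sketch below applies it to the two measured points from the abstract. The grain density of 2 760 kg/m³ is an assumed, typical value for bentonite that the abstract does not give, and the function names and the example prediction are hypothetical.

```python
def total_porosity(dry_density, grain_density=2760.0):
    """Total porosity from dry density: 1 - rho_d / rho_s (both in kg/m3).

    The default grain density is an assumption, not a value from the study.
    """
    if not 0.0 < dry_density < grain_density:
        raise ValueError("dry density must lie between 0 and the grain density")
    return 1.0 - dry_density / grain_density


def relative_error(predicted, measured):
    """Absolute relative error of a prediction against a measured value."""
    return abs(predicted - measured) / measured


# Measured Da values reported in the abstract, keyed by compacted dry density (kg/m3).
measured = {1800.0: 1.09e-10, 1200.0: 2.49e-10}

for rho_d, da in sorted(measured.items()):
    eps = total_porosity(rho_d)
    print(f"rho_d={rho_d:.0f} kg/m3  porosity={eps:.3f}  Da={da:.2e} m2/s")

# Hypothetical model output, used only to show how the error would be computed.
example_prediction = 1.20e-10
print(f"relative error: {relative_error(example_prediction, measured[1800.0]):.1%}")
```

Under this assumed grain density, the lower dry density corresponds to the higher porosity and the higher measured Da, which matches the trend described in the abstract.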

       
