Inversion of Chlorophyll-a Concentrations in Erhai Lake Based on Semi-supervised Learning

  • 1. School of Automation and Intelligent Sensing, Shanghai Jiao Tong University, Shanghai 200240, China; 2. School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; 3. National Observation and Research Station of Erhai Lake Ecosystem in Yunnan, Dali 671000, Yunnan, China
LIU Yao (b. 2001), graduate student, working on computer vision and machine learning
SHI Liangren, senior engineer, doctoral supervisor; E-mail: shiliangrenmail@126.com

Online published: 2026-02-03

Cite this article

LIU Yao1, XU Yujia1, SHI Liangren1, LI Yuanlong1, WANG Xinze2,3. Inversion of Chlorophyll-a Concentrations in Erhai Lake Based on Semi-supervised Learning[J]. Journal of Shanghai Jiao Tong University, 0: 1. DOI: 10.16183/j.cnki.jsjtu.2025.356

Abstract

Chlorophyll-a concentration is a key indicator for assessing the water quality of aquatic ecosystems, and its remote sensing inversion is crucial for large-scale dynamic monitoring of lakes. To address the limitations of existing methods, namely their reliance on large amounts of labeled data, high computational cost, and poor adaptation to multi-band remote sensing characteristics, this paper proposes ForestSimReg, a semi-supervised regression framework based on random forest. The framework improves the model's robustness to spectral interference through a spectral band masking augmentation strategy, and it integrates a dual mechanism that combines pseudo-label filtering based on out-of-bag estimation with calibration based on decision path similarity. This mechanism suppresses noise, improves pseudo-label quality, and enhances generalization with limited labeled samples. Experimental results on the Erhai Lake chlorophyll-a dataset show that: 1) ForestSimReg outperforms mainstream comparison models in terms of R², MAE, and RMSE, achieving an R² of 0.627 at an 80% labeling ratio; 2) ablation experiments verify the effectiveness and synergy of the dual pseudo-label optimization mechanism; 3) computational efficiency experiments show that ForestSimReg has few parameters and fast, stable inference, at the cost of relatively high memory usage, striking a balance between efficiency and practicality. The proposed ForestSimReg framework offers an innovative solution for small- and medium-sample remote sensing regression problems and shows good application potential for dynamic remote sensing monitoring of water environments.
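
As an illustration of two ideas named in the abstract, the sketch below shows one possible form of spectral band masking augmentation and the standard definitions of the R², MAE, and RMSE metrics. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `mask_bands` and `regression_metrics`, the parameters `mask_ratio` and `fill`, and the choice to replace masked bands with a constant are all hypothetical, and the reflectance values in the usage note are made up.

```python
import math
import random


def mask_bands(spectrum, mask_ratio=0.2, fill=0.0, rng=None):
    """Return a copy of `spectrum` with a random subset of bands set to `fill`.

    Assumption: masking replaces whole band values with a constant; the
    paper's actual masking scheme may differ.
    """
    if rng is None:
        rng = random.Random()
    n_masked = int(round(mask_ratio * len(spectrum)))
    masked_idx = set(rng.sample(range(len(spectrum)), n_masked))
    return [fill if i in masked_idx else v for i, v in enumerate(spectrum)]


def regression_metrics(y_true, y_pred):
    """Compute R^2, MAE, and RMSE for paired targets and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares used by the coefficient of determination.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "rmse": rmse}
```

For example, `mask_bands([0.11, 0.25, 0.31, 0.08], mask_ratio=0.5)` returns a four-band spectrum with two randomly chosen bands set to 0.0, which can be fed to a regressor as an augmented view of the same sample. `regression_metrics` can then score predictions from any model on held-out chlorophyll-a measurements.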
