Demand response in home energy systems (HES) helps improve quality of life and promotes energy conservation and emission reduction. Because an HES is subject to uncertainty and residents' living habits differ, a demand response optimization strategy must adapt quickly and respect each resident's preferences for using different household appliances. This paper therefore proposes an intelligent optimization algorithm that combines deep reinforcement learning from human preferences (DRLHP) with evolutionary computation. The algorithm first uses collected preference information to train a reward generator based on human preferences. The reward generator then replaces the reward function of a conventional deep reinforcement learning algorithm, and the agent interacts with the HES model to learn the model's complex patterns continuously. Finally, to further improve personalization, an evolutionary computation algorithm refines the scheduling plan produced by DRLHP. The case study indicates that the proposed algorithm solves quickly, shows strong optimization capability and robustness, and achieves energy conservation and emission reduction while satisfying human preferences.
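To make the pipeline described above more concrete, the following is a minimal, illustrative sketch and not the paper's implementation. It assumes a single on/off appliance over 24 hours, a hypothetical tariff (`PRICE`), a hypothetical habit profile (`PREFERRED`), and a simulated resident in place of real preference queries. The reward generator is taken to be a linear model fitted with the Bradley-Terry (logistic) preference model, which is one common choice for learning rewards from pairwise preferences. The deep reinforcement learning stage is not implemented: a fixed placeholder schedule stands in for the agent's output. The evolutionary stage is a simple (1+lambda) evolution strategy. All names and numeric values are assumptions.

```python
import math
import random

HOURS = 24
# Hypothetical tariff: peak price from 17:00 to 21:59, off-peak otherwise.
PRICE = [0.3 if 17 <= h <= 21 else 0.1 for h in range(HOURS)]
# Hypothetical habit: the resident likes the appliance on from 18:00 to 21:59.
PREFERRED = [1 if 18 <= h <= 21 else 0 for h in range(HOURS)]


def features(schedule):
    """Map an on/off schedule to (energy cost, deviation from the habit)."""
    cost = sum(p * s for p, s in zip(PRICE, schedule))
    discomfort = sum(abs(s - q) for s, q in zip(schedule, PREFERRED))
    return (cost, discomfort)


def simulated_human_prefers_first(a, b):
    """Stand-in for a real preference query; the hidden utility is an assumption."""
    def utility(s):
        cost, discomfort = features(s)
        return -cost - 0.5 * discomfort
    return utility(a) > utility(b)


def train_reward_generator(pairs, epochs=50, lr=0.05):
    """Fit linear reward weights with the Bradley-Terry (logistic) model."""
    # Precompute feature differences f(a) - f(b) and the 0/1 preference label.
    data = [([fa - fb for fa, fb in zip(features(a), features(b))],
             1.0 if a_preferred else 0.0) for a, b, a_preferred in pairs]
    w = [0.0, 0.0]
    for _ in range(epochs):
        for diff, label in data:
            z = sum(wi * di for wi, di in zip(w, diff))
            z = max(-60.0, min(60.0, z))  # clamp to keep exp() in range
            p_a = 1.0 / (1.0 + math.exp(-z))  # P(a preferred over b)
            w = [wi + lr * (label - p_a) * di for wi, di in zip(w, diff)]
    return w


def learned_reward(w, schedule):
    """Reward assigned to a schedule by the learned linear model."""
    return sum(wi * fi for wi, fi in zip(w, features(schedule)))


def evolve(w, start, rng, generations=200, offspring=8):
    """(1+lambda) evolution strategy: each child flips one hour of the parent."""
    parent, parent_r = list(start), learned_reward(w, start)
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            child = list(parent)
            h = rng.randrange(HOURS)
            child[h] = 1 - child[h]
            children.append(child)
        best_child = max(children, key=lambda c: learned_reward(w, c))
        best_r = learned_reward(w, best_child)
        if best_r > parent_r:  # elitist: the parent is replaced only on improvement
            parent, parent_r = best_child, best_r
    return parent


rng = random.Random(0)


def random_schedule():
    return [rng.randint(0, 1) for _ in range(HOURS)]


# Stage 1: collect pairwise preferences and train the reward generator.
pairs = []
for _ in range(200):
    a, b = random_schedule(), random_schedule()
    pairs.append((a, b, simulated_human_prefers_first(a, b)))
weights = train_reward_generator(pairs)

# Stage 2 (placeholder): a trained DRL agent would propose this schedule.
initial = [1] * HOURS

# Stage 3: evolutionary refinement under the learned reward.
final = evolve(weights, initial, rng)
print("learned weights:", weights)
print("refined schedule:", final)
```

Because the evolution strategy is elitist, the refined schedule never scores lower than the starting schedule under the learned reward. Whether it also matches the resident's true preferences depends on how well the reward generator was fitted, which is why the quality and quantity of the preference data matter in this kind of pipeline.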