Coupling User Preference with External Rewards to Enable Driver-centered and Resource-aware EV Charging Recommendation

Chengyin Li, Zheng Dong, Nathan Fisher, and Dongxiao Zhu
Department of Computer Science, Wayne State University, Detroit MI 48201, USA
{cyli, dong, fishern, dzhu}@wayne.edu
Abstract. Electric Vehicle (EV) charging recommendation that both
accommodates user preference and adapts to the ever-changing external
environment arises as a cost-effective strategy to alleviate the range anxi-
ety of private EV drivers. Previous studies focus on centralized strategies
to achieve optimized resource allocation, particularly useful for privacy-
indifferent taxi fleets and fixed-route public transits. However, a private EV driver seeks a more personalized and resource-aware charging recommendation that is tailor-made to accommodate the user preference (when and where to charge) yet sufficiently adaptive to the spatiotemporal mismatch between charging supply and demand. Here we propose
a novel Regularized Actor-Critic (RAC) charging recommendation ap-
proach that would allow each EV driver to strike an optimal balance
between the user preference (historical charging pattern) and the exter-
nal reward (driving distance and wait time). Experimental results on
two real-world datasets demonstrate the unique features and superior performance of our approach over competing methods.
Keywords: Actor critic · Charging recommendation · Electric vehicle (EV) · User preference · External reward.
1 Introduction
Electric Vehicles (EVs) are becoming popular personal transportation tools due to their reduced carbon footprint and more intelligent driving experience compared with conventional internal combustion vehicles [9]. Meanwhile, the miles per charge of an EV is limited by its battery capacity; together with the sparse allocation of charging stations (CSs) and excessive wait/charge time, this is a major driver of the so-called range anxiety, especially for private EV drivers. Recently, developing intelligent driver-centered charging recommendation algorithms has emerged as a cost-effective strategy to ensure sufficient utilization of the existing charging infrastructure and a satisfactory user experience [17, 19].
Existing charging recommendation studies mainly focus on public EVs (e.g., electric taxis and buses) [2, 17]. With relatively fixed schedule routines and no privacy or user preference considerations, charging recommendation for public transits can be made entirely to optimize CS resource utilization.
[Figure 1 graphic: panels (A) and (B); panel (B) shows (i) a driver's spatiotemporal charging behavior (day, hour, SOC, visited CSs) feeding the next-CS recommendation, and (ii) per-CS inherent user preference and external reward (e.g., occupancy information).]
Fig. 1. Driver-centered and resource-aware charging recommendation. (A) Centralized charging recommendation enables optimized resource allocation, where bi-directional information sharing between the server and EVs is assumed. (B) Driver-centered charging recommendation considers user preference and external reward, where only mono-directional information sharing (e.g., the occupancy information of all CSs) from the server to each EV is required (green dotted line). Therefore, private information of an EV, such as its GPS location, is not uploaded to the server (pink dotted line).
In general, these algorithms leverage a global server that monitors all the CSs in a city (Fig. 1A). Charging recommendation can be fulfilled upon request from public EVs, which send their GPS locations and state of charge (SOC) to the server. This kind of recommendation gives each EV an optimal driving and wait time before charging. Instead of using one single global server, many servers can also be distributed across a city [1, 2] to reduce the recommendation latency for public EVs.
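As an illustration of this server-side selection, the sketch below picks the CS with the smallest estimated driving-plus-wait time among stations reachable with the remaining charge. It is a minimal sketch only: the cost model, function names, and the full-range assumption are our own simplifications, not the cited systems' implementations.

```python
import math
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ChargingStation:
    cs_id: int
    lat: float
    lon: float
    expected_wait_min: float  # current estimated wait time at this CS


def recommend_cs_centralized(ev_lat: float, ev_lon: float, soc: float,
                             stations: List[ChargingStation],
                             avg_speed_kmh: float = 40.0,
                             full_range_km: float = 300.0) -> Optional[ChargingStation]:
    """Pick the CS minimizing estimated driving time plus expected wait
    time, among CSs reachable with the EV's remaining state of charge."""
    def drive(cs: ChargingStation):
        # Great-circle (haversine) distance in km, then minutes at avg speed.
        dlat = math.radians(cs.lat - ev_lat)
        dlon = math.radians(cs.lon - ev_lon)
        a = (math.sin(dlat / 2) ** 2
             + math.cos(math.radians(ev_lat)) * math.cos(math.radians(cs.lat))
             * math.sin(dlon / 2) ** 2)
        dist_km = 2 * 6371.0 * math.asin(math.sqrt(a))
        return dist_km, 60.0 * dist_km / avg_speed_kmh

    best, best_cost = None, float("inf")
    for cs in stations:
        dist_km, t_drive_min = drive(cs)
        if dist_km > soc * full_range_km:
            continue  # unreachable with the remaining charge
        cost = t_drive_min + cs.expected_wait_min
        if cost < best_cost:
            best, best_cost = cs, cost
    return best
```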
Although server-centralized methods are highly resource-aware with respect to the charging availability of CSs, they rarely accommodate individual user preferences for charging and even carry the risk of private data leakage (e.g., GPS locations) for private EVs. Thus, the centralized strategy would also impair the trustworthiness [6, 11, 12] of the charging recommendation. A driver-centered, rather than server-centralized, charging recommendation strategy would be preferred for a private EV to follow its user preference without leaking private information. In this setting (Fig. 1B), a sequence of on-EV charging event records (when and which CS) reflects the personal charging preference of a private EV driver. To make a driver-centered charging recommendation also resource-aware, a public platform for sharing the availability of CSs is needed.
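To make the information flow of Fig. 1B concrete, the following minimal sketch separates the data that stays on the EV from the single piece of information pulled from the shared platform; all class and field names are illustrative assumptions rather than part of the proposed system.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class ChargingEvent:
    """One on-EV record of a past charging session (stays on the EV)."""
    cs_id: int        # which CS was used
    day_of_week: int  # 0-6: the "when" part of the user preference
    hour_of_day: int  # 0-23


@dataclass
class DriverState:
    history: List[ChargingEvent]   # private: never uploaded to the server
    soc: float                     # private: current state of charge
    cs_occupancy: Dict[int, float] = field(default_factory=dict)  # public feed

    def refresh_occupancy(self, platform_feed: Dict[int, float]) -> None:
        # Mono-directional sharing: the EV only *reads* the public occupancy
        # feed; no GPS location or charging history leaves the vehicle.
        self.cs_occupancy = dict(platform_feed)
```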
Recent research on next POI (Point Of Interest) recommendation, which is centered on each user, can be adapted to solve the charging recommendation problem for private EVs by viewing each CS as a POI. Different from collaborative filtering-based general recommendation, which learns similarities between users and items [10], next POI recommendation algorithms attempt to predict the most likely next POI a user will visit based on the historical trajectory [4, 13, 23, 24, 26]. Although
these methods indeed model user preferences, they are neither resource-aware
nor adapted to the ever-changing external environment.
As such, a desirable charging recommender for a private EV requires: (1) learning the user preference from its historical charging patterns to achieve a driver-centered recommendation, and (2) obtaining a good external reward (optimal driving and wait time before charging) to achieve a resource-aware recommendation (Fig. 1B). By treating private EV charging recommendation as a next POI recommendation problem, reinforcement learning can be utilized to maximize the external reward (a shorter driving and wait time before charging) by exploring possible CSs for each recommendation. To couple user preference with the external reward, we propose a novel charging recommendation framework, Regularized Actor-Critic (RAC), for private EVs. The critic evaluates the actor's predicted action with a resource-aware value over all CSs, representing the external reward, while the actor is reinforced by this reward and simultaneously regularized by the driver's user preference. Both the actor and the critic are based on deep neural networks (DNNs).
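As a rough illustration of how such a regularized actor update could look, the sketch below combines a policy-gradient term weighted by the critic's value with a preference-regularization term. This is only a sketch: the cross-entropy form of the regularizer and the trade-off weight `lam` are assumptions for exposition, not the exact RAC objective.

```python
import torch
import torch.nn.functional as F

def rac_actor_loss(action_logits: torch.Tensor, action_taken: int,
                   critic_value: torch.Tensor, preference_dist: torch.Tensor,
                   lam: float = 0.5) -> torch.Tensor:
    """Sketch of a regularized actor objective for one transition.

    action_logits:   actor scores over all CSs, shape (num_cs,)
    critic_value:    critic's evaluation of the taken action (scalar tensor)
    preference_dist: driver's historical charging distribution over CSs,
                     shape (num_cs,), summing to 1
    lam:             trade-off between external reward and user preference
    """
    log_probs = F.log_softmax(action_logits, dim=-1)

    # Policy-gradient term: reinforce actions the critic values highly.
    pg_term = -log_probs[action_taken] * critic_value.detach()

    # Regularization term: keep the policy close to the driver's
    # historical charging pattern (cross-entropy against preference_dist).
    reg_term = -(preference_dist * log_probs).sum()

    return pg_term + lam * reg_term
```

In this sketch, the critic would be trained separately toward the observed driving-plus-wait-time reward, and `lam` would control how driver-centered versus resource-aware the resulting recommendation is.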
We summarize the main contributions of this work as follows: (1) we design and develop a novel framework, RAC, to give driver-centered and resource-aware on-EV charging recommendations; (2) RAC is tailor-made for each driver, accommodating the driver's inherent user preference while adapting to the ever-changing external reward; and (3) we propose a warm-up training technique to solve the cold-start recommendation problem for new EV drivers.
2 Related Work
Next POI recommendation has attracted much attention recently in location-based analysis. There are two lines of POI recommendation methods: (1) following user preference learned from the regularities of sequentially visited POIs, and (2) exploiting external incentives by maximizing the utility (reward) of recommendations.
For the first line of research, earlier works primarily attempt to solve the sequential next-item recommendation problem using temporal features. For example, [13] introduces the Factorizing Personalized Markov Chain (FPMC), which captures the sequential dependency between the recent and next items, as well as the general taste of a user, using a combination of matrix factorization and Markov chains for next-basket recommendation. [26] proposes a time-related Long Short-Term Memory (LSTM) network to capture both long- and short-term sequential influence for next-item recommendation. [5] attempts to model users' preference drift over time to achieve a better user experience in next-item recommendation. These next-item recommendation approaches only use temporal features, whereas next POI recommendation needs both temporal and geospatial features.
More recent studies of next POI recommendation not only model temporal relations but also consider geospatial context, such as ST-RNN [7] and ATST-LSTM [3]. [4] proposes a hierarchical extension of LSTM to encode spatial and temporal contexts for general location recommendation. [24] introduces a spatiotemporal gated network model that leverages a time gate and a distance gate to control the effect of the last visited POI on next POI recommendation. [23] extends these gates with a power-law attention mechanism that puts more attention on nearby POIs and explores subsequence patterns for next POI recommendation. [21] develops a long- and short-term preference learning model that considers sequential and context information for next POI recommendation. User preference-based methods can achieve strong performance by following users' previous experience; however, they are restricted from making novel recommendations beyond that experience.
Although only a few studies exploit external incentives, such methods can help explore new possibilities for next POI recommendation. Charging recommendation with multi-agent reinforcement learning has been applied to public EVs [16, 22], where private information from each EV is inevitably required. [8] proposes an inverse reinforcement learning method for next-visit action recommendation by maximizing the reward a user gains when discovering new, relevant, and non-popular POIs. This study uses an optimal POI selection policy (the POI visit trajectory of a similar group of users) as the guidance. As such, it is only applicable to centralized charging recommendation for privacy-indifferent public transit fleets, where charging events are aggregated at the central server to learn the user group; it is not applicable to the driver-centered EV charging recommendation problem we are tackling, in which each individual charging pattern must be learned without data sharing across drivers. Besides the inverse reinforcement learning approach, [25] introduces deep reinforcement learning for news recommendation, and [18] proposes supervised reinforcement learning for treatment recommendation. These methods are also based on learning similar user groups and thus are not directly applicable to the driver-centered EV charging recommendation task, which is further subject to resource and geospatial constraints.
Although existing approaches utilize spatiotemporal, social-network, and/or contextual information for effective next POI recommendation, they do not possess the desirable features for CS recommendation, which are (1) driver-centered: the trade-off between the driver's charging preference and the external reward is tuned for each driver, particularly for new drivers; and (2) resource-aware: there is usually a capacity constraint on a CS but not on a social check-in POI.
3 Problem Formulation
Each EV driver is considered as an agent, and the trustworthy server that collects occupancy information of all the CSs represents the external, ever-changing environment. We consider charging recommendation as a finite-horizon MDP problem with a state space $\mathcal{S}$, an action space $\mathcal{A}$, and a reward function $r: \mathcal{S} \times \mathcal{A} \rightarrow \mathbb{R}$. At each time point $t$, an EV driver with the current state $s_t \in \mathcal{S}$ chooses an action $a_t$, i.e., the one-hot encoding of a CS, based on a stochastic policy $\pi_\theta(a \mid s)$, where $\theta$ is the set of parameters, and receives a reward $r_t$ from the spatiotemporal environment. Our objective is to learn a stochastic policy $\pi_\theta$ that maximizes the expected cumulative reward.
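Formally, writing $T$ for the horizon and $\gamma \in (0, 1]$ for a discount factor (the standard MDP ingredients), this amounts to finding
\[
\theta^{*} = \arg\max_{\theta} J(\theta), \qquad
J(\theta) = \mathbb{E}_{\pi_\theta}\Big[\sum_{t=1}^{T} \gamma^{\,t-1}\, r_t\Big],
\]
where larger rewards $r_t$ correspond to shorter driving and wait time before charging.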