Abstract
To improve the reliability and economy of dynamic scheduling for the decentralized trade economy on e-Commerce platforms, and to shorten its running time, this paper proposes a dynamic scheduling method based on reinforcement learning. We review the basic theory of reinforcement learning, study the Q-learning algorithm, build a neural network to fit the value model, and initialize the learning algorithm. With a Markov decision process as the framework model, the optimal state-action value function is updated by value iteration using Q-learning, a model-free, discounted-reward reinforcement learning algorithm. A Gibbs distribution is used to construct an exploratory stochastic policy that selects actions probabilistically. Using a three-layer feedforward neural network as the approximator of the state-action value function, we address the value-function generalization problem that arises in this scheduling task and realize dynamic scheduling of the decentralized trade economy on e-Commerce platforms. Experimental results show that the proposed method effectively improves the reliability and economy of this scheduling.
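The two core ingredients named above, the Q-learning backup and Gibbs (Boltzmann) action selection, can be illustrated with a minimal sketch. It is not the paper's implementation: for brevity it assumes a small tabular value function in place of the three-layer feedforward approximator, and the function names, the temperature parameter, the learning rate `alpha`, and the discount factor `gamma` are all hypothetical choices made for illustration.

```python
import math
import random


def gibbs_probabilities(q_values, temperature):
    """Gibbs (Boltzmann) distribution over the actions of one state.

    `temperature` is an assumed exploration parameter: higher values
    give a more uniform policy, lower values a greedier one.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, rng):
    """Sample an action index according to the Gibbs probabilities."""
    probs = gibbs_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_learning_update(q_table, state, action, reward, next_state, alpha, gamma):
    """One model-free, discounted-reward Q-learning backup (tabular stand-in).

    Moves Q(state, action) toward reward + gamma * max_a Q(next_state, a)
    and returns the updated value.
    """
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

With a neural approximator, the same temporal-difference target would serve as the regression target for the network output instead of being written into a table entry.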