Customisable Control Policy Learning for Robotics

File: PID5964581.pdf
Description: File embargoed until 01 January 10000
Size: 743.66 kB
Format: Adobe PDF
Title: Customisable Control Policy Learning for Robotics
Authors: Guo, C; Luk, W; Warren, A; Loh, QS; Levine, J
Item Type: Conference Paper
Abstract: Deep reinforcement learning algorithms integrate deep neural networks with traditional reinforcement learning methodologies. These techniques have been developed and used for various applications to produce exciting results in many fields, including robotics. However, physical robots require a large number of training episodes, which can damage the robot if directed by immature policies. Training using simulations can serve as a viable alternative before a robot is deployed in the field. This study addresses a computational challenge of deep reinforcement learning by developing a hardware architecture for the Deep Deterministic Policy Gradient (DDPG) algorithm. Additionally, we identify the customisation opportunities for a full-stack development framework with reinforcement learning to discover control policies for robotic arms. Finally, we transfer policies encoded in fixed-point numbers from our FPGA DDPG implementation to a robotic arm to evaluate the feasibility of our learning platform.
Issue Date: 15-Jul-2019
Date of Acceptance: 11-May-2019
URI: http://hdl.handle.net/10044/1/70933
Publisher: IEEE
Journal / Book Title: Proceedings of International Conference on Application-Specific Systems, Architectures and Processors
Copyright Statement: This paper is embargoed until publication.
Sponsor/Funder: Engineering & Physical Science Research Council (EPSRC)
Funder's Grant Number: 516075101 (EP/N031768/1); EP/P010040/1
Conference Name: The 30th IEEE International Conference on Application-Specific Systems, Architectures and Processors
Publication Status: Accepted
Start Date: 2019-07-15
Finish Date: 2019-07-17
Conference Place: New York, USA
Embargo Date: publication subject to indefinite embargo
Appears in Collections: Computing