# Deep Reinforcement Learning

- Accelerated Methods for Deep Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1803.02811)
- A Deep Reinforcement Learning Chatbot (Short Version). [`arxiv`](https://arxiv.org/abs/1801.06700)
- AlphaX: eXploring Neural Architectures with Deep Neural Networks and Monte Carlo Tree Search. [`arxiv`](https://arxiv.org/abs/1805.07440) :star:
- A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress. [`arxiv`](https://arxiv.org/abs/1806.06877)
- Composable Deep Reinforcement Learning for Robotic Manipulation. [`arxiv`](https://arxiv.org/abs/1803.06773)
- Cooperative Multi-Agent Reinforcement Learning for Low-Level Wireless Communication. [`arxiv`](https://arxiv.org/abs/1801.04541)
- Deep Q learning for fooling neural networks. [`arxiv`](https://arxiv.org/abs/1811.05521)
- Deep Reinforcement Fuzzing. [`arxiv`](https://arxiv.org/abs/1801.04589)
- Deep Reinforcement Learning of Cell Movement in the Early Stage of C. elegans Embryogenesis. [`arxiv`](https://arxiv.org/abs/1801.04600)
- Deep Reinforcement Learning For Sequence to Sequence Models. [`arxiv`](https://arxiv.org/abs/1805.09461) [`code`](https://github.com/yaserkl/RLSeq2Seq/)
- Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods. [`arxiv`](https://arxiv.org/abs/1802.10264)
- Deep Reinforcement Learning in Portfolio Management. [`arxiv`](https://arxiv.org/abs/1808.09940) [`code`](https://github.com/qq303067814/Reinforcement-learning-in-portfolio-management-)
- Deep Reinforcement Learning using Capsules in Advanced Game Environments. [`arxiv`](https://arxiv.org/abs/1801.09597)
- Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft. [`arxiv`](https://arxiv.org/abs/1803.08456)
- Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes. [`arxiv`](https://arxiv.org/abs/1801.02852) [`code`](https://github.com/anonymous-author1/DDRL)
- Diversity is All You Need: Learning Skills without a Reward Function. [`arxiv`](https://arxiv.org/abs/1802.06070)
- Faster Deep Q-learning using Neural Episodic Control. [`arxiv`](https://arxiv.org/abs/1801.01968)
- Feedback-Based Tree Search for Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1805.05935)
- Feudal Reinforcement Learning for Dialogue Management in Large Domains. [`arxiv`](https://arxiv.org/abs/1803.03232)
- Forward-Backward Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1803.10227)
- Graph Convolutional Reinforcement Learning for Multi-Agent Cooperation. [`arxiv`](https://arxiv.org/abs/1810.09202)
- Hierarchical Reinforcement Learning: Approximating Optimal Discounted TSP Using Local Policies. [`arxiv`](https://arxiv.org/abs/1803.04674)
- IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. [`arxiv`](https://arxiv.org/abs/1802.01561)
- Kickstarting Deep Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1803.03835)
- Learning a Prior over Intent via Meta-Inverse Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1805.12573)
- Meta Reinforcement Learning with Distribution of Exploration Parameters Learned by Evolution Strategies. [`arxiv`](https://arxiv.org/abs/1812.11314)
- Meta Reinforcement Learning with Latent Variable Gaussian Processes. [`arxiv`](https://arxiv.org/abs/1803.07551)
- Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches. [`arxiv`](https://arxiv.org/abs/1807.09427)
- Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations. [`arxiv`](https://arxiv.org/abs/1801.10459)
- Psychlab: A Psychology Laboratory for Deep Reinforcement Learning Agents. [`arxiv`](https://arxiv.org/abs/1801.08116)
- Recommendations with Negative Feedback via Pairwise Deep Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1802.06501)
- Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. [`arxiv`](https://arxiv.org/abs/1805.00909)
- Reinforcement Learning from Imperfect Demonstrations. [`arxiv`](https://arxiv.org/abs/1802.05313)
- Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application. [`arxiv`](https://arxiv.org/abs/1803.00710)
- RUDDER: Return Decomposition for Delayed Rewards. [`arxiv`](https://arxiv.org/abs/1806.07857) [`code`](https://github.com/ml-jku/baselines-rudder)
- Semi-parametric Topological Memory for Navigation. [`arxiv`](https://arxiv.org/abs/1803.00653) [`tensorflow`](https://github.com/nsavinov/SPTM)
- Shared Autonomy via Deep Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1802.01744)
- Setting up a Reinforcement Learning Task with a Real-World Robot. [`arxiv`](https://arxiv.org/abs/1803.07067)
- Simple random search provides a competitive approach to reinforcement learning. [`arxiv`](https://arxiv.org/abs/1803.07055) [`code`](https://github.com/modestyachts/ARS)
- Unsupervised Meta-Learning for Reinforcement Learning. [`arxiv`](https://arxiv.org/abs/1806.04640)
- Using reinforcement learning to learn how to play text-based games. [`arxiv`](https://arxiv.org/abs/1801.01999)