Python Reinforcement Learning

Name: Python Reinforcement Learning | Solve complex real-world problems by mastering reinforcement learning algorithms using OpenAI Gym and TensorFlow
Brand: Packt Publishing
Price: 37.99 EUR
Availability: OnlineOnly

Sudharsan Ravichandiran (Author)

Published on 18 April 2019

496 pages

E-Book

ePUB with Adobe-DRM

978-1-83864-014-9 (ISBN)

€37.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Apply modern reinforcement learning and deep reinforcement learning methods using Python and its powerful libraries

Key Features

Your entry point into the world of artificial intelligence using the power of Python
An example-rich guide to master various RL and DRL algorithms
Explore the power of modern Python libraries to gain confidence in building self-trained applications

Book Description

Reinforcement Learning (RL) is a trending and highly promising branch of artificial intelligence. This Learning Path will help you master not only the basic reinforcement learning algorithms but also the advanced deep reinforcement learning algorithms.

The Learning Path starts with an introduction to RL, followed by OpenAI Gym and TensorFlow. You will then explore core RL concepts and algorithms, such as the Markov Decision Process, Monte Carlo methods, and dynamic programming, including value and policy iteration. You'll also work on various datasets including image, text, and video. This example-rich guide will introduce you to deep RL algorithms, such as Dueling DQN, DRQN, A3C, PPO, and TRPO. You will gain experience in several domains, including gaming, image processing, and physical simulations. You'll explore TensorFlow and OpenAI Gym to implement algorithms that also predict stock prices, generate natural language, and even build other neural networks. You will also learn about imagination-augmented agents, learning from human preference, DQfD, HER, and many of the recent advancements in RL.

By the end of the Learning Path, you will have all the knowledge and experience needed to implement RL and deep RL in your projects, and you will be ready to enter the world of artificial intelligence to solve various real-life problems.

This Learning Path includes content from the following Packt products:

Hands-On Reinforcement Learning with Python by Sudharsan Ravichandiran
Python Reinforcement Learning Projects by Sean Saito, Yang Wenzhuo, and Rajalingappaa Shanmugamani

What you will learn

Train an agent to walk using OpenAI Gym and TensorFlow
Solve multi-armed-bandit problems using various algorithms
Build intelligent agents using the DRQN algorithm to play the Doom game
Teach your agent to play Connect4 using AlphaGo Zero
Defeat Atari arcade games using the value iteration method
Discover how to deal with discrete and continuous action spaces in various environments

Who this book is for

If you're an ML/DL enthusiast interested in AI and want to explore RL and deep RL from scratch, this Learning Path is for you. Prior knowledge of linear algebra is expected.

Content

Cover
Title Page
Copyright and Credits
About Packt
Contributors
Table of Contents
Preface
Chapter 1: Introduction to Reinforcement Learning
What is RL?
RL algorithm
How RL differs from other ML paradigms
Elements of RL
Agent
Policy function
Value function
Model
Agent environment interface
Types of RL environment
Deterministic environment
Stochastic environment
Fully observable environment
Partially observable environment
Discrete environment
Continuous environment
Episodic and non-episodic environment
Single and multi-agent environment
RL platforms
OpenAI Gym and Universe
DeepMind Lab
RL-Glue
Project Malmo
ViZDoom
Applications of RL
Education
Medicine and healthcare
Manufacturing
Inventory management
Finance
Natural Language Processing and Computer Vision
Summary
Questions
Further reading
Chapter 2: Getting Started with OpenAI and TensorFlow
Setting up your machine
Installing Anaconda
Installing Docker
Installing OpenAI Gym and Universe
Common error fixes
OpenAI Gym
Basic simulations
Training a robot to walk
OpenAI Universe
Building a video game bot
TensorFlow
Variables, constants, and placeholders
Variables
Constants
Placeholders
Computation graph
Sessions
TensorBoard
Adding scope
Summary
Questions
Further reading
Chapter 3: The Markov Decision Process and Dynamic Programming
The Markov chain and Markov process
Markov Decision Process
Rewards and returns
Episodic and continuous tasks
Discount factor
The policy function
State value function
State-action value function (Q function)
The Bellman equation and optimality
Deriving the Bellman equation for value and Q functions
Solving the Bellman equation
Dynamic programming
Value iteration
Policy iteration
Solving the frozen lake problem
Value iteration
Policy iteration
Summary
Questions
Further reading
Chapter 4: Gaming with Monte Carlo Methods
Monte Carlo methods
Estimating the value of pi using Monte Carlo
Monte Carlo prediction
First visit Monte Carlo
Every visit Monte Carlo
Let's play Blackjack with Monte Carlo
Monte Carlo control
Monte Carlo exploration starts
On-policy Monte Carlo control
Off-policy Monte Carlo control
Summary
Questions
Further reading
Chapter 5: Temporal Difference Learning
TD learning
TD prediction
TD control
Q learning
Solving the taxi problem using Q learning
SARSA
Solving the taxi problem using SARSA
The difference between Q learning and SARSA
Summary
Questions
Further reading
Chapter 6: Multi-Armed Bandit Problem
The MAB problem
The epsilon-greedy policy
The softmax exploration algorithm
The upper confidence bound algorithm
The Thompson sampling algorithm
Applications of MAB
Identifying the right advertisement banner using MAB
Contextual bandits
Summary
Questions
Further reading
Chapter 7: Playing Atari Games
Introduction to Atari games
Building an Atari emulator
Getting started
Implementation of the Atari emulator
Atari simulator using gym
Data preparation
Deep Q-learning
Basic elements of reinforcement learning
Demonstrating basic Q-learning algorithm
Implementation of DQN
Experiments
Summary
Chapter 8: Atari Games with Deep Q Network
What is a Deep Q Network?
Architecture of DQN
Convolutional network
Experience replay
Target network
Clipping rewards
Understanding the algorithm
Building an agent to play Atari games
Double DQN
Prioritized experience replay
Dueling network architecture
Summary
Questions
Further reading
Chapter 9: Playing Doom with a Deep Recurrent Q Network
DRQN
Architecture of DRQN
Training an agent to play Doom
Basic Doom game
Doom with DRQN
DARQN
Architecture of DARQN
Summary
Questions
Further reading
Chapter 10: The Asynchronous Advantage Actor Critic Network
The Asynchronous Advantage Actor Critic
The three As
The architecture of A3C
How A3C works
Driving up a mountain with A3C
Visualization in TensorBoard
Summary
Questions
Further reading
Chapter 11: Policy Gradients and Optimization
Policy gradient
Lunar Lander using policy gradients
Deep deterministic policy gradient
Swinging a pendulum
Trust Region Policy Optimization
Proximal Policy Optimization
Summary
Questions
Further reading
Chapter 12: Balancing CartPole
OpenAI Gym
Gym
Installation
Running an environment
Atari
Algorithmic tasks
MuJoCo
Robotics
Markov models
CartPole
Summary
Chapter 13: Simulating Control Tasks
Introduction to control tasks
Getting started
The classic control tasks
Deterministic policy gradient
The theory behind policy gradient
DPG algorithm
Implementation of DDPG
Experiments
Trust region policy optimization
Theory behind TRPO
TRPO algorithm
Experiments on MuJoCo tasks
Summary
Chapter 14: Building Virtual Worlds in Minecraft
Introduction to the Minecraft environment
Data preparation
Asynchronous advantage actor-critic algorithm
Implementation of A3C
Experiments
Summary
Chapter 15: Learning to Play Go
A brief introduction to Go
Go and other board games
Go and AI research
Monte Carlo tree search
Selection
Expansion
Simulation
Update
AlphaGo
Supervised learning policy networks
Reinforcement learning policy networks
Value network
Combining neural networks and MCTS
AlphaGo Zero
Training AlphaGo Zero
Comparison with AlphaGo
Implementing AlphaGo Zero
Policy and value networks
preprocessing.py
features.py
network.py
Monte Carlo tree search
mcts.py
Combining PolicyValueNetwork and MCTS
alphagozero_agent.py
Putting everything together
controller.py
train.py
Summary
References
Chapter 16: Creating a Chatbot
The background problem
Dataset
Step-by-step guide
Data parser
Data reader
Helper methods
Chatbot model
Training the data
Testing and results
Summary
Chapter 17: Generating a Deep Learning Image Classifier
Neural Architecture Search
Generating and training child networks
Training the Controller
Training algorithm
Implementing NAS
child_network.py
cifar10_processor.py
controller.py
Method for generating the Controller
Generating a child network using the Controller
train_controller method
Testing ChildCNN
config.py
train.py
Additional exercises
Advantages of NAS
Summary
Chapter 18: Predicting Future Stock Prices
Background problem
Data used
Step-by-step guide
Actor script
Critic script
Agent script
Helper script
Training the data
Final result
Summary
Chapter 19: Capstone Project - Car Racing Using DQN
Environment wrapper functions
Dueling network
Replay memory
Training the network
Car racing
Summary
Questions
Further reading
Chapter 20: Looking Ahead
The shortcomings of reinforcement learning
Resource efficiency
Reproducibility
Explainability/accountability
Susceptibility to attacks
Upcoming developments in reinforcement learning
Addressing the limitations
Transfer learning
Multi-agent reinforcement learning
Summary
References
Assessments
Other Books You May Enjoy
Index

Schweitzer Fachinformationen
