Dr. Mark Humphrys

School of Computing. Dublin City University.



CA425 - Artificial Intelligence



Reinforcement Learning

State-space control
  1. Continuum of Autonomy
  2. State-space control
  3. RL as Pattern Classification
  4. Reinforcement Learning - Reference

Reinforcement Learning - Intro

  1. RL - The world
  2. RL - The task
  3. Exercise - long-term reward
  4. Q-learning
  5. Building up a running average
  6. How Q-learning works

Movie demo

  1. The movie demo of W-learning includes a demo of basic Q-learning.

Program code in C++

  1. Coding the state-space as a lookup-table
  2. Sample code for lookup-table Q-learning (Includes Boltzmann "soft max" option)

Reinforcement Learning - More

  1. Convergence

  2. The control policy
  3. Boltzmann "soft max" distribution
  4. How to make a decision probabilistically

  5. Building a model of Pxa(r)
  6. Building a model of Pxa(y)
  7. Learning rate that does not start at 1

Reinforcement Learning with Neural Networks

  1. Neural Networks course

  2. Using a Neural Network as a generalisation in RL
  3. Q-learning with a Neural Network
  4. Using a Neural Network with RL

Multiple Minds

  1. Multi-Module Reinforcement Learning
  2. Multiple Minds in the same body - Test of Hierarchical Q-learning
  3. The general form of a Society of Mind based on Reinforcement Learning
  4. Open Issues in AI
  5. Architectures of Autonomous Agents
  6. The World-Wide-Mind

Notes on Assignment Notation

I often use := for assignment, to distinguish it from = for equality.
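As a minimal sketch of how this notation maps onto C++, where = is assignment and == is the equality test: the helper name blend and the numbers used are illustrative assumptions, not taken from the course's sample code. The update shown is a generic running-average step, written in the notes' notation as Q := (1 - alpha) Q + alpha * sample.

```cpp
#include <cassert>

// Hypothetical helper (not from the course code): the new value of a
// running average Q after seeing one sample, with learning rate alpha.
double blend(double q, double sample, double alpha) {
    assert(alpha >= 0.0 && alpha <= 1.0);   // alpha is assumed to lie in [0,1]
    return (1.0 - alpha) * q + alpha * sample;
}

// Shows the two meanings side by side. Returns true if the value
// stored by the assignment equals the value we expect.
bool assignment_then_equality() {
    double q = 0.0;              // initialise Q
    q = blend(q, 10.0, 0.5);     // notes:  Q := (1 - alpha) Q + alpha * sample   (assignment)
    return q == 5.0;             // notes:  Q = 5                                 (equality)
}
```

Read as an equality, a statement such as x = x + 1 can never be true. Read as the assignment x := x + 1, it simply stores a new value in x, which is why the two symbols are kept apart in these notes.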

I will hold one or two labs for the practical.


Practical - Play "X's and O's" with RL


Experiments in Adaptive State-Space Robotics, Clocksin and Moore, 1989. A simple introduction to the very idea of state-space robotic or agent control.

How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning, Singh et al, 1996. A simple introduction to the idea of RL.

Reinforcement Learning: A Survey, Kaelbling et al, Journal of Artificial Intelligence Research, 4:237-285, 1996. A survey.

Action Selection methods using Reinforcement Learning. My PhD thesis, 1997, has an intro to RL.


Reinforcement Learning: An Introduction, Sutton and Barto, 1998.

Reinforcement Learning: State-of-the-Art, Marco Wiering and Martijn Otterlo (Editors), 2012.


On the Internet since 1987.
