Dr. Mark Humphrys

School of Computing. Dublin City University.


My big idea: Ancient Brain



CA425 - Artificial Intelligence

Quickest way to find this web page:
Google ca425 dcu
Think about it: all other ways (starting at the CA site, Googling my name, etc.) take much longer to get here.
Just Google the module code plus DCU and this page is the first hit.

Course Descriptor

How to contact me



My notes contain many hyperlinks to background material. Some students get confused about what the core course is. The core course is anything linked to directly from this front page; all other links are just background material.


  1. Background

    1. Introduction to AI
    2. Survey of AI
    3. History of AI

    4. AI Links
    5. Robotics Links

  2. State-space control
    1. Continuum of Autonomy
    2. State-space control
    3. RL as Pattern Classification
    4. Reinforcement Learning - Reference

  3. Reinforcement Learning - Intro
    1. RL - The world
    2. RL - The task
    3. Exercise - long-term reward
    4. Q-learning
    5. Building up a running average
    6. How Q-learning works
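
The "running average" item above is at the heart of the Q-learning update: a mean can be maintained incrementally, without storing past samples. A minimal sketch in Python (mine, not the course code):

```python
# Incremental running average: after the n-th sample x,
#   avg := avg + (1/n)(x - avg)
# which equals the ordinary mean of the first n samples.

def running_average(samples):
    avg = 0.0
    for n, x in enumerate(samples, start=1):
        avg += (x - avg) / n
    return avg

print(running_average([2.0, 4.0, 6.0]))  # mean of 2, 4, 6 -> 4.0
```

Replacing 1/n with a learning rate gives the form of update used in Q-learning.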

  4. Movie demo
    1. The movie demo of W-learning includes a demo of basic Q-learning.

  5. Program code (for practical)
    1. Coding the state-space as a lookup-table
    2. Sample code for lookup-table Q-learning (Includes Boltzmann "soft max" option)
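
To give the flavour of what lookup-table Q-learning looks like, here is a minimal sketch (illustrative only; not the course's sample code). It uses a toy 5-state corridor world of my own invention, with pure random exploration:

```python
import random

# Toy problem: a 5-state corridor. Action 0 moves left, action 1 moves
# right, and the agent gets reward 1 on reaching the rightmost state.

N_STATES, N_ACTIONS = 5, 2
GAMMA, ALPHA = 0.9, 0.5
GOAL = N_STATES - 1

# The "lookup-table": one Q-value per (state, action) pair.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(x, a):
    y = max(0, x - 1) if a == 0 else min(GOAL, x + 1)
    r = 1.0 if y == GOAL else 0.0
    return y, r

random.seed(0)
for _ in range(200):
    x = 0
    while x != GOAL:
        a = random.randrange(N_ACTIONS)  # pure exploration for simplicity
        y, r = step(x, a)
        # Q(x,a) := (1-alpha) Q(x,a) + alpha (r + gamma max_b Q(y,b))
        Q[x][a] += ALPHA * (r + GAMMA * max(Q[y]) - Q[x][a])
        x = y

print(round(Q[GOAL - 1][1], 2))  # learned value of stepping into the goal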

  6. Reinforcement Learning - More
    1. Convergence

    2. The control policy
    3. Boltzmann "soft max" distribution
    4. How to make a decision probabilistically

    5. Building a model of Pxa(r)
    6. Building a model of Pxa(y)
    7. Learning rate that does not start at 1
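
The Boltzmann "soft max" and the probabilistic decision above can be sketched together in Python (an illustrative sketch; function names are mine):

```python
import math
import random

# Boltzmann "soft max" action selection: action a is chosen with
# probability proportional to e^(Q(x,a)/T), where T is a "temperature".
# High T -> near-uniform exploration; low T -> near-greedy exploitation.

def boltzmann_probs(q_values, temperature):
    weights = [math.exp(q / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def choose_action(q_values, temperature):
    # To make a decision probabilistically: draw r in [0,1) and walk
    # along the cumulative distribution until it exceeds r.
    r = random.random()
    cumulative = 0.0
    for a, p in enumerate(boltzmann_probs(q_values, temperature)):
        cumulative += p
        if r < cumulative:
            return a
    return len(q_values) - 1  # guard against floating-point round-off

q = [1.0, 2.0, 0.5]
print([round(p, 3) for p in boltzmann_probs(q, 1.0)])
```

Note the probabilities always sum to 1, and the highest-valued action is the most likely but not the only possible choice.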

  7. Reinforcement Learning with Neural Networks (Pre-requisite needed.) - NOT ON COURSE THIS YEAR

    1. Neural Networks (Revision)
    2. Using a Neural Network as a generalisation in RL
    3. Q-learning with a Neural Network
    4. Using a Neural Network with RL

  8. Multiple Minds

    1. Multi-Module Reinforcement Learning
    2. Multiple Minds in the same body - Test of Hierarchical Q-learning
    3. The general form of a Society of Mind based on Reinforcement Learning
    4. Open Issues in AI
    5. Architectures of Autonomous Agents
    6. The World-Wide-Mind

Notes on Assignment Notation

I often use   :=   for assignment, to distinguish it from   =   for equality.
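
For example, the standard one-step Q-learning update, written in this notation (α is the learning rate, γ the discount factor; x is the current state, a the action taken, r the reward, y the next state):

```
Q(x,a) := (1-α) Q(x,a) + α ( r + γ max_b Q(y,b) )
```

whereas "Q(x,a) = 0" is a statement of equality, not an update.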


Not WWM-based this year:

If the practical is based on the WWM server, I will hold one or two hands-on labs. Dates will be announced.


Practical - Play "X's and O's" with RL

Deadline - last lecture in week 12.
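
One common way to code the board as a lookup-table index (an assumption for illustration, not the required method) is to read the nine squares as base-3 digits:

```python
# Index noughts-and-crosses boards for a lookup-table: treat the 9
# squares as base-3 digits (0 = empty, 1 = X, 2 = O), row by row,
# giving 3^9 = 19683 possible states.

def board_to_state(board):
    """board: list of 9 ints in {0, 1, 2}, row by row."""
    state = 0
    for square in board:
        state = state * 3 + square
    return state

empty = [0] * 9
x_centre = [0, 0, 0, 0, 1, 0, 0, 0, 0]
print(board_to_state(empty), board_to_state(x_centre))  # 0 and 81
```

Not all 19683 indices are reachable in a legal game, but a lookup-table over all of them is still small enough to be practical.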


Experiments in Adaptive State-Space Robotics, Clocksin and Moore, 1989. A simple introduction to the very idea of state-space robotic or agent control.

How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning, Singh et al, 1996. A simple introduction to the idea of RL.

"Reinforcement Learning: A Survey", Kaelbling et al, Journal of Artificial Intelligence Research, 4:237-285, 1996. A survey.

Action Selection methods using Reinforcement Learning. My PhD thesis, 1997, has an intro to RL.


Reinforcement Learning: An Introduction, Sutton and Barto, 1998.

Reinforcement Learning: State-of-the-Art, Marco Wiering and Martijn van Otterlo (Editors), 2012.

Library categories


Sometimes I link to Wikipedia. I need to write something in defence of this.

On the one hand, Wikipedia is deeply flawed, so you should use all links to Wikipedia with extreme caution. Many people refuse to link to it.

On the other hand, it is often clearly the best thing to link to on a topic. I say: Link to it, but use with caution and scepticism.

Mark calculator

The exam counts for 70 marks and the practical for 30, out of a total of 100.
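
The weighting works out as follows (a trivial sketch, assuming both components are marked out of 100):

```python
# Combine component marks with the 70/30 weighting.
def total_mark(exam, practical):
    return 0.7 * exam + 0.3 * practical

print(total_mark(60, 80))  # 0.7*60 + 0.3*80 = 66.0
```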

Exam results

The notes are online, but you need to go to every lecture. You will not understand this course from the notes alone.

2014 exam results summary.


On the Internet since 1987.