Dr. Mark Humphrys

School of Computing. Dublin City University.


Mark Humphrys - Teaching - CA425


Artificial Intelligence



Course Descriptor




How to contact me

See How to contact me.


Notes

My notes contain many hyperlinks to background material. Some students are unsure what counts as the core course. The core course is anything linked to directly from this front page. All other links are just background material.


THE SECTIONS IN YELLOW ARE NOT ON THE COURSE THIS YEAR.


  1. Background

    1. Introduction to AI
    2. Survey of AI
    3. History of AI

    4. AI Links
    5. Robotics Links


  2. State-space control
    1. Continuum of Autonomy
    2. State-space control
    3. RL as Pattern Classification
    4. Reinforcement Learning - Reference


  3. Reinforcement Learning - Intro
    1. RL - The world
    2. RL - The task
    3. Exercise - long-term reward
    4. Q-learning
    5. Building up a running average
    6. How Q-learning works


  4. Movie demo
    1. The movie demo of W-learning includes a demo of basic Q-learning.

  5. Program code (for practical)
    1. Coding the state-space as a lookup-table
    2. Sample code for lookup-table Q-learning (Includes Boltzmann "soft max" option)

    
    
    
    [PRACTICAL CAN NOW BE LAUNCHED]
    
    
    
  6. Reinforcement Learning - More
    1. Convergence

    2. The control policy
    3. Boltzmann "soft max" distribution
    4. How to make a decision probabilistically

    5. Building a model of Pxa(r)
    6. Building a model of Pxa(y)
    7. Learning rate that does not start at 1


  7. Reinforcement Learning with Neural Networks (Pre-requisite needed.) - NOT ON COURSE THIS YEAR

    1. Neural Networks (Revision)
    2. Using a Neural Network as a generalisation in RL
    3. Q-learning with a Neural Network
    4. Using a Neural Network with RL


  8. Multiple Minds

    1. Multi-Module Reinforcement Learning
    2. Multiple Minds in the same body - Test of Hierarchical Q-learning
    3. The general form of a Society of Mind based on Reinforcement Learning
    4. Open Issues in AI
    5. Architectures of Autonomous Agents
    6. The World-Wide-Mind



Notes on Assignment Notation

I often use   :=   for assignment to distinguish from   =   for equality.
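
As a rough illustration of the notation (a sketch, not taken from the course notes): in Python the assignment written   :=   in the notes is spelled   =   and the equality written   =   in the notes is spelled   ==  . The function name and the running-average update used below are assumptions chosen only to show the two symbols side by side.

```python
# Hypothetical example: the name running_average_step and the update rule
# are illustrative assumptions, not code from the course.

def running_average_step(q: float, sample: float, alpha: float) -> float:
    """One update, written in the notes' style as  q := (1 - alpha) q + alpha sample."""
    q = (1 - alpha) * q + alpha * sample  # assignment: ":=" in the notes
    return q


q = 0.0
q = running_average_step(q, 10.0, 0.5)  # assignment again: q := ...

is_five = (q == 5.0)  # equality test: "=" in the notes
```

Reading   x := x + 1   as an assignment makes sense, whereas   x = x + 1   read as an equality is never true, which is why the two symbols are kept apart.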



Labs

The practical is not WWM-based this year.

In a year when the practical is based on the WWM server, I hold one or two hands-on labs. Dates are announced.



Practical

Practical - Play "X's and O's" with RL

Deadline - last lecture in week 12.




Reading

Experiments in Adaptive State-Space Robotics, Clocksin and Moore, 1989. A simple introduction to the very idea of state-space robotic or agent control.

How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning, Singh et al, 1996. A simple introduction to the idea of RL.

Reinforcement Learning: A Survey, Kaelbling et al, Journal of Artificial Intelligence Research, 4:237-285, 1996. A survey.

Action Selection methods using Reinforcement Learning. My PhD thesis, 1997, has an intro to RL.


Books

Reinforcement Learning: An Introduction, Sutton and Barto, 1998. Also here.

Reinforcement Learning: State-of-the-Art, Marco Wiering and Martijn Otterlo (Editors), 2012.




Wikipedia

Sometimes I link to Wikipedia. I need to write something in defence of this.

On the one hand, Wikipedia is deeply flawed, so you should use all links to Wikipedia with extreme caution. Many people refuse to link to it.

On the other hand, it is often clearly the best thing to link to on a topic. I say: Link to it, but use with caution and scepticism.



Mark breakdown

Exam (70). Practical (30).




Exam results

The notes are online, but you need to go to every lecture. You will not understand this course from the notes alone.


