How to contact me
My notes contain many hyperlinks to background material.
Some students get confused about what the core course is.
The core course is anything that is linked to directly on this front page.
All other links are just background material.
THE SECTIONS IN YELLOW ARE NOT ON THE COURSE THIS YEAR.
Continuum of Autonomy
RL as Pattern Classification
Reinforcement Learning - Reference
Reinforcement Learning - Intro
RL - The world
RL - The task
Exercise - long-term reward
Building up a running average
How Q-learning works
The movie demo of W-learning contains within it a demo of basic Q-learning.
Program code (for practical)
Coding the state-space as a lookup-table
Sample code for lookup-table Q-learning
(Includes Boltzmann "soft max" option)
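For orientation, here is a minimal sketch of lookup-table Q-learning with a Boltzmann "soft max" option. It is not the linked sample code: the function names, the list-of-lists table layout and the parameter values are all assumptions made for illustration. It uses x for the current state, a for the action, r for the reward and y for the next state.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Soft-max probabilities over the actions of one state.

    temperature must be > 0. High temperature gives near-uniform
    probabilities; low temperature favours the highest Q-value.
    """
    m = max(q_values)  # subtract the max to avoid overflow in exp()
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def choose_action(q_values, temperature, rng):
    """Pick an action index at random, weighted by the soft-max."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(Q, x, a, r, y, alpha, gamma):
    """One Q-learning step on the lookup table Q[state][action].

    Q(x,a) := (1 - alpha) * Q(x,a) + alpha * (r + gamma * max_b Q(y,b))
    """
    target = r + gamma * max(Q[y])
    Q[x][a] = (1 - alpha) * Q[x][a] + alpha * target


def make_table(n_states, n_actions):
    """Lookup table with one row per state and one entry per action."""
    return [[0.0] * n_actions for _ in range(n_states)]


if __name__ == "__main__":
    # Hypothetical toy sizes and parameters, chosen only for the demo.
    rng = random.Random(0)
    Q = make_table(n_states=3, n_actions=2)
    a = choose_action(Q[0], temperature=1.0, rng=rng)
    q_update(Q, x=0, a=a, r=1.0, y=1, alpha=0.5, gamma=0.9)
```

The update is written as a weighted average of the old estimate and the new sample, so that a learning rate alpha of 1 simply overwrites the old value.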
[PRACTICAL CAN NOW BE LAUNCHED]
Reinforcement Learning - More
The control policy
Boltzmann "soft max" distribution
How to make a decision probabilistically
Building a model of Pxa(r)
Building a model of Pxa(y)
Learning rate that does not start at 1
Reinforcement Learning with Neural Networks
- NOT ON COURSE THIS YEAR
I often use
to distinguish from
Notes on Assignment Notation
Not WWM-based this year:
If the practical is based on the
I will hold one or two
Dates will be announced.
Practical - Play "X's and O's" with RL
Deadline - last lecture in week 12.
Same project again.
Leave it for me at school or faculty office.
Deadline 15 Aug 2014.
Experiments in Adaptive State-Space Robotics
Clocksin and Moore,
A simple introduction to the very idea of state-space robotic
or agent control.
Singh et al,
A simple introduction to the idea of RL.
How to Make Software Agents Do the Right Thing: An Introduction to Reinforcement Learning
"Reinforcement Learning: A Survey",
Kaelbling et al,
Journal of Artificial Intelligence Research,
My PhD thesis, 1997, has an intro to RL.
Action Selection methods using Reinforcement Learning
Reinforcement Learning: An Introduction
Sutton and Barto, 1998.
Marco Wiering and Martijn van Otterlo (Editors),
Reinforcement Learning: State-of-the-Art
006 - Special computer methods
006.3 - Artificial Intelligence
519 - Probabilities & applied mathematics
Sometimes I link to Wikipedia.
I need to write something in defence of this.
On the one hand, Wikipedia is deeply flawed,
so you should use all links to Wikipedia with caution.
Many people refuse to link to it.
On the other hand, it is often clearly the best thing to link to
on a topic.
I say: Link to it, but use with caution and scepticism.
The notes are online, but you need to go to every lecture.
You will not understand this course from the notes alone.
2014 exam results summary:
0 to 10 percent: 0
10 to 20 percent: 0
20 to 30 percent: 2
30 to 40 percent: 0
40 to 50 percent: 3
50 to 60 percent: 2
60 to 70 percent: 3
70 to 80 percent: 2
80 to 90 percent: 0
90 to 100 percent: 0