Dr. Mark Humphrys

School of Computing. Dublin City University.





Part 2 - The World-Wide-Mind


2. The World-Wide-Mind

The proposed scheme to address these issues is called the "World-Wide-Mind" (WWM). The name has a number of meanings, as will be discussed. In this scheme, it is proposed that researchers construct their agent minds and their agent worlds as servers on the Internet.

Types of servers

In the basic scheme, there are the following types of server:

  1. A World and Body server together. This server can be queried for the current state of the world:   x   as detected by the body, and can be sent actions:   a   for the body to perform in the world.

  2. A Mind server, which is a behavior-producing system, capable of suggesting an action:   a   given a particular input state:   x.   Note this does not mean it is stimulus-response. It may remember all previous states. It may take actions independent of the current state, according to its own independent plans. The Mind server may work by any AI methodology or algorithm and may contain within itself any degree of complexity. It may itself call other Mind servers. In the latter case we call it a MindM server to reflect the fact that it communicates with another server or servers (which may themselves communicate with many more servers) unknown to the client.

  3. A special type of Mind server (in fact a special type of MindM server):

    1. An Action Selection or AS server (or MindAS server), which resolves competition among multiple Mind servers. Each Mind server   i   suggests an action   ai   to execute. The AS server produces a winning action   ak.   This may be one of the suggested actions ("winner-take-all") or it may be a new, compromise action [PhD, §14.2].

      The client talks to the AS server to get an action, and the AS server talks to its list of Minds, using whatever queries it needs to resolve competition (perhaps repeated queries). Because it can produce an action   a   given a particular input state   x,   the AS server is in fact a type of Mind server itself. This is why we call it a "MindAS" server.

      But it may be a very different thing from the other Mind servers. The other servers actually try to solve some problem or pursue some goal. Whereas the MindAS server may be simply a generic competition-resolution device, that can be applied to any collection of Minds without needing any actual understanding of what the Minds are doing. We have 2 possibilities:

      1. The list of Minds is hard-coded into the MindAS server.
      2. The list of Minds is provided as an argument to the MindAS server at startup.
      In either case, typically all the actual problem-solving intelligence (suggestions of actions to execute) resides in the Minds.

It is imagined that each of these types of server may be customised with a number of parameters or arguments at startup, so that a client may ask for slightly different versions of what is essentially the same server. We have already seen possible arguments for a MindAS server (the list of Minds). Other possible server arguments will be discussed later.
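
The server types above can be sketched as plain Python interfaces. This is only an illustration under stated assumptions: the class and method names (`WorldServer`, `MindServer`, `get_state`, `take_action`, `get_action`) are hypothetical, everything runs in one process rather than across the Internet, and the "first suggestion wins" rule in the toy MindAS class is a placeholder rather than a scheme proposed in this paper.

```python
from abc import ABC, abstractmethod
from typing import Any, List

State = Any   # assumption: the format of the state x is left open
Action = Any  # assumption: the format of the action a is left open


class WorldServer(ABC):
    """World and Body server: reports the state x, accepts an action a."""

    @abstractmethod
    def get_state(self) -> State: ...

    @abstractmethod
    def take_action(self, a: Action) -> None: ...


class MindServer(ABC):
    """Mind server: suggests an action a given an input state x."""

    @abstractmethod
    def get_action(self, x: State) -> Action: ...


class FixedMind(MindServer):
    """Toy Mind that always suggests the same action."""

    def __init__(self, action: Action) -> None:
        self.action = action

    def get_action(self, x: State) -> Action:
        return self.action


class FirstSuggestionAS(MindServer):
    """Toy MindAS server: the list of Minds is a startup argument.

    Placeholder rule (an assumption): the first Mind in the list wins.
    """

    def __init__(self, minds: List[MindServer]) -> None:
        self.minds = minds

    def get_action(self, x: State) -> Action:
        suggestions = [m.get_action(x) for m in self.minds]
        return suggestions[0]


# To the client, the MindAS server looks like any other Mind server.
society = FirstSuggestionAS([FixedMind("left"), FixedMind("right")])
print(society.get_action(x=None))
```

Because the MindAS class implements the same interface as an ordinary Mind, it can itself be passed in the list given to another MindAS server, which is how the nesting described next becomes possible.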

Types of Societies

By allowing Mind servers to call each other we can incrementally build up more and more complex hierarchies, networks or societies of mind. We will call any collection of more than one Mind server acting together a Society. A Society is built up in stages. At each stage, there is a single Mind server that serves as the interface to the whole Society:

  1. A MindM server calls other Mind servers. To run this Society you talk to the MindM server.
  2. A MindAS server adjudicates among multiple Mind servers. To run this Society you talk to the MindAS server.

and so on, recursively. At each stage, there is a single Mind server which serves as the interface to the Society, to which we send the state, and from which we receive back an action. For example, in the simplest type of Society, the client talks to a single Mind server and a single World server, which talk to no one else:

  1. client talks to:
    1. Mind
    2. World

In a more complex society, the Mind server that the client talks to is itself talking to other Mind servers:

  1. client talks to:
    1. MindM, which talks to:
      1. Mind
      2. Mind
    2. World

or:

  1. client talks to:
    1. MindAS, which talks to:
      1. Mind
      2. Mind
      3. Mind
    2. World

and so on, with an endless number of combinations, e.g.:

  1. client talks to:
    1. MindM, which talks to:
      1. Mind
      2. MindM, which talks to:
        1. Mind
      3. MindAS, which talks to:
        1. Mind
        2. MindM, which talks to:
          1. Mind
        3. Mind
    2. WorldW, which talks to:
      1. World

(We will discuss "WorldW" servers later.)

Each Society has precisely one Mind server at the top level. This makes sense because the competition must get resolved somewhere, so that an action is produced. And the client can't resolve it. So it must be resolved by the time it gets to the client.

Types of users

There are the following types of users of this system:

  1. A non-technical Client user - essentially any user on the Internet. Basically, the client will run other people's minds in other people's worlds. Without needing any technical ability, the client will be able to do the following:

    1. Pick one Mind server to run in one World. Even this apparently simple choice may be the product of a lot of hard work - in picking 2 suitable servers that work together, and choosing suitable arguments or parameters for each server. So it is suggested that the client can present the results of this work for others to use at some URL. In this case, no new server is created, but rather a "link" to 2 existing servers with particular arguments is set up. And at that link, the client may promote it, explain why it works, etc. User-friendly software will make it as easy as possible for the non-technical to both experiment with different combinations, and link to the result.

    2. At a more advanced level, even a non-technical client may be able to construct a Society. For instance a client may select a combination of remote Mind servers, a remote Action Selection server to resolve the (inevitable) competition between these multiple minds, and finally select a remote Body/World server to run this Society in. To be precise, what the client does is: Pick a MindAS server, pass it a list of Mind servers to adjudicate, and then simply pick a World to run the MindAS server in. The particular combination of Mind servers chosen may be the product of a lot of hard work (in searching for good combinations). So again it is suggested that the client can present the results of this work for others to use at some URL. Again, no new server is created, but rather a "link" to 2 existing servers with particular arguments is set up. User-friendly software will make it as easy as possible for the non-technical to both experiment with different combinations, and link to the result.

      This is like constructing a new Mind server: When a client constructs a new combination of Mind servers like this, it looks very much as if a new Mind server has been created, with totally new behaviour. But in fact it has been done by supplying new parameters to an existing MindAS server. Admittedly some of these parameters (the addresses of other Mind servers) may not have existed when the MindAS server was written, and the choice of the combination may be an extremely creative act.

    In the above, the client does not necessarily need to know anything about how the servers work, or even anything much about AI. The client just observes that certain combinations work very well, and others don't. The role that large numbers of clients acting over long periods may play in experimentation and artificial selection may be very important, as will be discussed in detail later.

  2. A technically-proficient server author - again any user on the Internet, if they have the technical ability (and also access to a machine to host their new server). They will need to understand how to construct a server, but their understanding of AI does not necessarily have to be profound.

    As we have seen above, just specifying an existing server with new parameters can be very much like creating a totally new Mind server with new behaviour. The server author, however, will be writing an actually new server. For example:

    1. A technically-proficient server author could write a wrapper around an existing, working Mind server, i.e. write a new MindM server. The simplest form of wrapper would not provide any actions itself, but would just selectively call other servers. For instance, the server author observes that one Mind server tends to perform better in one area of the input space, and another server performs better in a different area. The server author could then experiment with writing a wrapper of the form: "If input is a particular x then do whatever the Mind server M1 does - otherwise do whatever the Mind server M2 does." The author needs little to no understanding of how either server works, yet still might be able to create a MindM server that is better than either Mind server itself. For a discussion of such "Nested" systems see [PhD, §18].

    2. An AI-proficient server author might try writing a MindM server that attempts to provide some actions itself. For example, a server that only modifies the existing behaviour in some area of the inputs, such as: "If input is a particular x then take this action a - otherwise do whatever the old server does." The author may need little understanding of how the existing Mind server works. If overriding it in one area of the input space doesn't work (doesn't perform better) he may try overriding it in a different area.

    3. At the most advanced level, AI researchers would write their own servers from scratch. But it is envisaged that even AI researchers will find it useful (in fact, probably essential) to write limited wrappers around other people's servers whose insides they don't fully understand (or want to understand).
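
The two kinds of wrapper described in items 1 and 2 above might be sketched as follows. This is a minimal illustration, assuming a Mind can be modelled as a local function from state to action and that states are integers; the names `switching_mind` and `overriding_mind` are hypothetical.

```python
from typing import Callable

State = int                       # assumption: states are integers in this toy
Action = str
Mind = Callable[[State], Action]  # assumption: a Mind is modelled as a function x -> a
InRegion = Callable[[State], bool]


def switching_mind(in_region: InRegion, m1: Mind, m2: Mind) -> Mind:
    """Wrapper that provides no actions itself.

    "If input is a particular x then do whatever M1 does,
    otherwise do whatever M2 does."
    """
    def get_action(x: State) -> Action:
        return m1(x) if in_region(x) else m2(x)
    return get_action


def overriding_mind(in_region: InRegion, action: Action, old: Mind) -> Mind:
    """Wrapper that supplies its own action in one area of the input space.

    "If input is a particular x then take this action a,
    otherwise do whatever the old server does."
    """
    def get_action(x: State) -> Action:
        return action if in_region(x) else old(x)
    return get_action


def m1(x: State) -> Action:
    return "forward"


def m2(x: State) -> Action:
    return "turn"


# The wrapper author never looks inside m1 or m2.
switched = switching_mind(lambda x: x < 10, m1, m2)
patched = overriding_mind(lambda x: x == 0, "stop", switched)
print(patched(0), patched(5), patched(50))
```

Note that `patched` wraps `switched`, which itself wraps two other Minds: wrappers compose without any of the wrapped Minds being aware of it.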

That is the basic WWM idea. How this scheme will work, and what it means, will now be discussed in detail.


Using other people's agent worlds

First, we note that, among many other things, the scheme is trying to define a protocol whereby researchers can use each other's agent worlds. Not by installing the world at your own site, but by leaving the world running on a server at its author's site, and using it remotely. One issue we will have to solve is how the client sees what is going on in the remote world where he is running his selected mind combination.

No user interface

Note the client may not wish to see what is happening - he may only want a report back at the end as to how well his agent did - for example, if he is running an automated evolutionary search of different combinations. Instead of watching his robot picking up cans, the client may just want to know how many cans it picked up over the course of a run. In fact, in this case the client may be able to calculate how many cans his robot picked up by examining the state of the world after every action, and keeping his own running total as he goes along, in which case no extra report back from the World server would be required. For an example of an automated search with no user interface see [PhD, §4.1.1].
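
A run with no user interface, where the client keeps its own running total, could look like the sketch below. The corridor world, the state format `(position, can_here)` and the action names are all assumptions invented for this example.

```python
from typing import Iterable, Tuple

State = Tuple[int, bool]  # assumption: (robot position, is there a can here?)


class CorridorWorld:
    """Toy World: a robot in a corridor with cans at some positions."""

    def __init__(self, cans: Iterable[int]) -> None:
        self.pos = 0
        self.cans = set(cans)

    def get_state(self) -> State:
        return (self.pos, self.pos in self.cans)

    def take_action(self, a: str) -> None:
        if a == "pickup":
            self.cans.discard(self.pos)
        elif a == "right":
            self.pos += 1


def greedy_mind(x: State) -> str:
    """Toy Mind: pick up a can if there is one here, otherwise move right."""
    _, can_here = x
    return "pickup" if can_here else "right"


def run_headless(world: CorridorWorld, mind, steps: int) -> int:
    """Run without any display; the client computes the score itself."""
    picked = 0
    x = world.get_state()
    for _ in range(steps):
        world.take_action(mind(x))
        new_x = world.get_state()
        # Same position, and the can that was here is now gone: it was picked up.
        if new_x[0] == x[0] and x[1] and not new_x[1]:
            picked += 1
        x = new_x
    return picked


print(run_headless(CorridorWorld(cans=[1, 3, 4]), greedy_mind, steps=10))
```

The World server here reports nothing beyond the ordinary state; the score is derived entirely on the client side.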

Perhaps the World server provides a URL where the client can see in real time what is going on, if he wants to. If this URL is not connected to, no user interface is displayed.

Using other people's agent minds

The very definition of this system may seem strange to the reader. We are taking for granted that simply being able to match a single Mind server with a World server would be of limited use. Instead we are building a system where clients can put multiple minds in the same body. The problem with these multiple minds is that none of them are aware of the others' existence, and each is designed to control the body on its own. Instead of allowing them to do that, we simply take their wishes into account when running the Action Selection server. For an introduction to these kinds of models see [Humphrys, 1997]. The Adaptive Behavior audience will be long familiar with these kinds of multiple-mind models. Other readers may find them very wasteful and unusual.

Yet, we argue, if 3rd parties are to construct societies of mind using others' components, it will be impossible to prevent massive overlap in function and conflict between these components. The realistic approach is to accept overlap as inevitable and work on conflict-resolution methods (the AS servers). The AS server will try to allow the expression at different times of most or all of the goals pursued by each mind. With some totally-conflicting goals this may not be possible, so the human clients will finish the job, by looking for specific combinations of minds whose conflicts, when resolved by the AS algorithm, result in a useful overall multi-goal creature.

The alternative to a multiple-conflicting-minds model would be to get the components to agree on their function in advance, which is totally impractical.

This is not to say that a group of Mind servers cannot agree to divide up function in advance, and the agent mind can consist of a single headquarters MindM server that knows about this division, and that calls each Mind server, carefully switching control from Mind to Mind. This is possible, but may be very rare if Societies are to be constructed by large numbers of researchers. The model cannot assume in general that minds make any allowance for the existence of other minds. Users will construct combinations of new minds and old minds, where the new minds did not even exist when the old minds were being written. For instance, even with a perfect division of function like the collection we just described, no sooner will it be put online than users will start constructing societies involving it and some new Minds. And then we are back to an Action Selection problem again.

In fact, as has been discussed elsewhere - see [PhD, §18.3], and also the "brittleness" criticism of classic AI [Holland, 1986] - wastefulness and multiple unexpressed ideas are generally a sign of intelligence rather than the opposite. When it comes to intelligence, parsimony may be the enemy, rather than wastefulness.




3. Further issues on agent minds

MindAS server queries the Mind servers (not Client)

One question then is how complex does the top-level client algorithm have to be? How many servers does it have to talk to? For instance, when there is a collection of Mind servers and an AS server, the client could talk to the Mind servers itself, gather the results and repeatedly present them to the AS server for resolution.

However, when we consider the vast number of possible algorithms for resolving competition, some involving multiple queries of each Mind server with different suggestions, it seems more logical for the client to pass the list of Mind servers to the AS server at startup, and then let the AS server query them itself, i.e. let the complexity of the competition-resolution algorithm be buried in the AS server rather than in the client. In our model, the client manages a single Mind server and a single World server. Which leads to the next question:

Client talks to the World (not Mind server)

Perhaps the client should pass the World server address as an argument to the top-level Mind server, and let the Mind talk to the World directly.

One reason we do not do this is that - unlike the situation with the AS server talking to the Mind servers - the interaction between the Mind server and World server is not bounded - the run of this Mind in this World may go on indefinitely. It seems better to have this logic in the client rather than in the server. The servers respond to short queries, and the client is responsible for how many such queries there are, and what the overall control algorithm is, e.g. implementing time-outs and repeated queries if servers do not respond. Or imagine a client where a user is watching the user interface of an infinite run, and deciding, by clicking a button, when to issue the "End run" command.
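
The division of labour argued for here, where servers answer short queries and the client owns the overall control algorithm, might be sketched like this. The helper `query_with_retries`, the toy servers and the use of `TimeoutError` to stand for an unresponsive server are assumptions made for illustration; a real client would be talking to remote servers.

```python
from typing import Callable


def query_with_retries(query: Callable[[], object], retries: int = 3):
    """Repeat a short query until it answers or the retries run out."""
    last_error = None
    for _ in range(retries):
        try:
            return query()
        except TimeoutError as err:  # assumption: stands for "server did not respond"
            last_error = err
    raise RuntimeError("server did not respond") from last_error


def run(mind, world, should_stop: Callable[[], bool], max_steps: int) -> int:
    """Client-side control loop: the client decides how many queries there are."""
    steps = 0
    while steps < max_steps and not should_stop():  # should_stop models "End run"
        x = query_with_retries(world.get_state)
        a = query_with_retries(lambda: mind.get_action(x))
        world.take_action(a)
        steps += 1
    return steps


class CounterWorld:
    """Toy World whose state is a running total of the actions taken."""

    def __init__(self) -> None:
        self.total = 0

    def get_state(self) -> int:
        return self.total

    def take_action(self, a: int) -> None:
        self.total += a


class FlakyMind:
    """Toy Mind that fails to respond on every odd-numbered call."""

    def __init__(self) -> None:
        self.calls = 0

    def get_action(self, x: int) -> int:
        self.calls += 1
        if self.calls % 2 == 1:
            raise TimeoutError("no response")
        return 1


world = CounterWorld()
mind = FlakyMind()
steps = run(mind, world, should_stop=lambda: False, max_steps=5)
print(steps, world.get_state(), mind.calls)
```

Neither server contains any loop: the length of the run, the retry policy and the stopping condition all live in the client.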

Another reason we may prefer to define the server queries, and then let a separate algorithm control how many queries take place, is that we may like two Minds to communicate with each other in a conversation, in which case each Mind serves as the World for the other. Instead of redefining the Mind server so it can respond to questions such as "Get state", the client algorithm manages this, querying the Mind server for an action, and then sending that action as the state for another Mind server. Imagine two chat programs connected together. Or a "tit-for-tat" online competition [Axelrod and Hamilton, 1981, Axelrod, 1984].
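
A client that connects two Minds so that each serves as the World for the other could be sketched as below. The function `converse` and the move names "C" and "D" are assumptions for this example, and the two Minds here take turns (B sees A's move before replying) rather than moving simultaneously.

```python
from typing import Callable, List, Tuple

Mind = Callable[[str], str]  # assumption: a Mind is modelled as a function x -> a


def converse(mind_a: Mind, mind_b: Mind, opening: str,
             rounds: int) -> List[Tuple[str, str]]:
    """Client algorithm: the action of one Mind is sent as the state of the other."""
    transcript = []
    x_for_a = opening
    for _ in range(rounds):
        a = mind_a(x_for_a)  # "Get action" from A
        b = mind_b(a)        # A's action is B's state
        transcript.append((a, b))
        x_for_a = b          # B's action is A's next state
    return transcript


def tit_for_tat(x: str) -> str:
    """Copy whatever the other side did last."""
    return x


def always_defect(x: str) -> str:
    return "D"


print(converse(tit_for_tat, always_defect, opening="C", rounds=3))
```

Neither Mind needed a "Get state" query added to it; the pairing is done entirely by the client.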

Low-bandwidth communication

Crucial to the whole scheme of using minds from diverse sources is that we do not impose any restrictions on the type of Mind servers that can be written. Minds can be written in any language and according to any methodology. Minds do not have to explain themselves or how they work. Minds are hidden binaries on remote servers. Minds cannot know about each other's goals or insides. Or, to be precise, some minds may know how other Mind servers work (and act accordingly), but we cannot in general demand this.

If Mind servers do not understand each other, then they can only communicate by low-bandwidth attempts. That is, there is a limit to how much information they can usefully communicate to each other or to the AS server. If, as has already been argued, it will be impossible to prevent large-scale conflicts between Minds, it is the AS server that has to resolve the competition between these strangers who speak no common language. The central question is: What information does the AS server need from the Mind servers to resolve the competition? For example, if it just gets a simple list of their suggested actions:   ai   it seems it could do little more than just pick the most popular one (if any appears twice). If none appears twice, it seems it could only pick a random one. Perhaps the AS server rotates around the Minds, allowing each free rein for a time in rotation. Any such time-based Action Selection scheme will be very crude [PhD, §15.1].
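
The two crude schemes just mentioned, picking the most popular suggestion and rotating control between Minds, can be written down directly. This is a sketch only; the function names and the `turn_length` parameter are hypothetical.

```python
import random
from collections import Counter
from typing import List


def most_popular_action(suggestions: List[str], rng: random.Random) -> str:
    """Pick the most popular suggested action if any appears twice, else a random one."""
    action, votes = Counter(suggestions).most_common(1)[0]
    if votes >= 2:
        return action  # ties are broken by order of first appearance
    return rng.choice(suggestions)


def rotating_action(suggestions: List[str], timestep: int, turn_length: int) -> str:
    """Give each Mind control for `turn_length` timesteps in rotation."""
    winner = (timestep // turn_length) % len(suggestions)
    return suggestions[winner]


rng = random.Random(0)
print(most_popular_action(["left", "right", "left"], rng))
print(rotating_action(["left", "right"], timestep=3, turn_length=3))
```

Neither function uses anything except the list of suggested actions, which is exactly why neither can tell an urgent suggestion from a trivial one.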

Numeric communication - Q-values and W-values

For any more sophisticated Action Selection than the above, it seems that the Mind server needs to provide more information. We will first consider schemes where the servers pass simple numeric quantities around to resolve the competition, yet still do not have to understand each other's goals.

For example, Mind server   i   may tell the AS server what action   ai   it wants to take, plus a weight   Wi   expressing how important it is for it to win the competition on this timestep. This can be seen as a "payment" in a common currency to try to win the Action Selection (see the "Economy of Mind" ideas of [Baum, 1996]). Or perhaps the AS server, in a bid to resolve conflict, could suggest to all the Mind servers a compromise action   ak   and each Mind server could reply with another weight   Wi   illustrating how much it would dislike taking that action (assuming it is not the action it was originally suggesting). We may define the following weights:

  1. The "Q-value" defines how good this action is in pursuit of the Mind server's goal, i.e. expected reward or benefit from this action. Mind server   i   might build up a table   Qi(x,a)   showing the expected value for each action in each state.

  2. The "W-value" defines how bad it would be for this Mind server to lose the competition on this timestep, and have another action taken. This rather depends on what action will be taken if it loses. Mind server   i   may maintain a table   Wi(x)   defining how bad it is to lose (or how much it will "pay" to win) in each state. Or it may judge the badness of a specific action   a   by the quantity:   Qi(x,ai) - Qi(x,a).  

The usage of Q and W comes from [Humphrys, 1997]. High Q does not imply high W. Q could be high and yet W = 0. For the differences between Q and W see [PhD, §5.5, §6.1, §16.2]. Given a set of Q-values, there are many possible schemes for deriving W-values (discussed throughout [Humphrys, 1997]). The WWM server queries defined in this paper will allow all of these numeric schemes to be implemented.
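
As a concrete illustration of Q-values and W-values, the sketch below has each Mind judge the badness of an action   a   by   Qi(x,ai) - Qi(x,a)   as defined above. The AS rule used (choose the suggested action whose worst W across all Minds is smallest) is only one illustrative choice among the many possible schemes; the class `QMind`, the function `select_action` and all the numbers are assumptions.

```python
from typing import Dict, List, Tuple

QTable = Dict[Tuple[str, str], float]  # (x, a) -> expected value of a in x

ACTIONS = ["eat", "flee"]  # assumption: a tiny shared action set


class QMind:
    """Toy Mind that holds a table Q(x, a)."""

    def __init__(self, q: QTable) -> None:
        self.q = q

    def get_action(self, x: str) -> str:
        """The action this Mind wants: the one with the highest Q-value."""
        return max(ACTIONS, key=lambda a: self.q[(x, a)])

    def get_w(self, x: str, a: str) -> float:
        """How bad it is for this Mind if action a is taken: Q(x, a_i) - Q(x, a)."""
        a_i = self.get_action(x)
        return self.q[(x, a_i)] - self.q[(x, a)]


def select_action(minds: List[QMind], x: str) -> str:
    """Illustrative AS rule: pick the suggestion whose worst W is smallest."""
    candidates = [m.get_action(x) for m in minds]
    return min(candidates, key=lambda a: max(m.get_w(x, a) for m in minds))


x = "predator_near"
hungry = QMind({(x, "eat"): 1.0, (x, "flee"): 0.8})    # mild preference for eating
fearful = QMind({(x, "eat"): -5.0, (x, "flee"): 2.0})  # losing would be very bad
content = QMind({(x, "eat"): 10.0, (x, "flee"): 10.0}) # high Q, but W is 0

print(select_action([hungry, fearful, content], x))
print(content.get_w(x, "flee"))
```

The `content` Mind shows that high Q does not imply high W: it expects a large reward whichever action is taken, so it loses nothing by losing the competition.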

Higher-bandwidth communications than this would seem difficult if we are not to impose some structure on what is inside each Mind server. Hence I will begin the WWM implementation with a sub-symbolic, numeric Society of Mind, rather than a symbolic one.

The role of MindM servers

Competition resolution, however, does not all have to be done by AS servers looking at Q-values and W-values. Much of the work of combining multiple minds will also be done by hand-coded MindM servers, which state explicitly who wins in which state:

"As long as the input state is in the region   X   do what Mind server no. 4006 wants to do,
otherwise, if the input state is in the region   Y   do what Mind 33000 wants to do,
otherwise (for all other states) give me a "strong" Mind 8000 (i.e. it has lots of currency to spend) and a "weak" Mind 11005, and let them compete under AS server 300."

In this case the knowledge that Mind server no. 4006 should always "win" the competition when the input state is in the region   X   is something that the server author has hard-coded. The Mind server did not have to convince the other competing minds (or the AS server) of this fact. In general, a Mind server can implement any general-purpose algorithm that interprets the incoming state, and can call another Mind server at any point in the algorithm. For a discussion of such "Nested" systems see [PhD, §18]. One issue in nested servers will be the possibility of a circular call, leading to an infinite loop. It is not proposed to build anything into the protocol to prevent this. It is up to the server authors to prevent it. [PhD, §18.1] discusses infinite loops in a society of mind.

To simplify, the above algorithm would be written as a MindM server whose "New run" command would be:

 send "New run" command to M1
 send "New run" command to M2
 send "New run" command to M3 with arguments (strong M4, weak M5) 

where the server   M3   is a MindAS server that can be passed its list of servers as arguments at startup. Then the MindM server's "Get action" command (with argument   x)   would be:

      if x in region X1 send "Get action" command to M1 with argument x
 else if x in region X2 send "Get action" command to M2 with argument x
                   else send "Get action" command to M3 with argument x

M3's "Get action" command sends a "Get action" command with argument   x   to both M4 and M5, and then makes a decision based on what they return, perhaps querying them further before deciding.
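
The "New run" and "Get action" commands above could be realised along the following lines. This is a sketch under assumptions: the method names `new_run` and `get_action`, the representation of regions as sets of integer states, and the toy rule that the Mind with the larger weight wins inside M3 are all invented for illustration, as are the weights 10.0 ("strong") and 1.0 ("weak").

```python
from typing import List, Set, Tuple


class ConstantMind:
    """Toy Mind that always suggests the same action."""

    def __init__(self, action: str) -> None:
        self.action = action

    def new_run(self) -> None:
        pass

    def get_action(self, x: int) -> str:
        return self.action


class WeightedAS:
    """Toy MindAS server: its list of (weight, Mind) pairs is passed at "New run"."""

    def new_run(self, weighted_minds: List[Tuple[float, ConstantMind]]) -> None:
        self.weighted_minds = weighted_minds
        for _, mind in weighted_minds:
            mind.new_run()

    def get_action(self, x: int) -> str:
        # Query every Mind, then let the one with the most "currency" win.
        suggestions = [(w, m.get_action(x)) for w, m in self.weighted_minds]
        return max(suggestions, key=lambda s: s[0])[1]


class RegionMindM:
    """Hand-coded MindM server stating explicitly who wins in which region."""

    def __init__(self, m1, m2, m3, m4, m5, x1: Set[int], x2: Set[int]) -> None:
        self.m1, self.m2, self.m3, self.m4, self.m5 = m1, m2, m3, m4, m5
        self.x1, self.x2 = x1, x2

    def new_run(self) -> None:
        self.m1.new_run()
        self.m2.new_run()
        self.m3.new_run([(10.0, self.m4), (1.0, self.m5)])  # strong M4, weak M5

    def get_action(self, x: int) -> str:
        if x in self.x1:
            return self.m1.get_action(x)
        elif x in self.x2:
            return self.m2.get_action(x)
        else:
            return self.m3.get_action(x)


top = RegionMindM(ConstantMind("a1"), ConstantMind("a2"), WeightedAS(),
                  ConstantMind("a4"), ConstantMind("a5"),
                  x1={0, 1, 2}, x2={5, 6})
top.new_run()
print(top.get_action(1), top.get_action(5), top.get_action(9))
```

The client only ever calls `top`; it does not know that three levels of servers stand behind it.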

What is the definition of state and action?

We have so far avoided the question of what is the exact data structure that is being passed back and forth as the state or action. It seems that this definition will be different in different domains. This scheme proposes that we allow different definitions to co-exist. Each server will explain the format of the state and action they generate and expect (most of the time this will just involve linking to another server's definition).




4. Further issues on agent worlds

Why not separate World and Body servers?

The above model simplified things by having a joint World-Body server. Why not separate further into World servers and Body servers?

The fact is that a World server is a Body server. The world only "exists" in so far as it is perceived by the senses. Imagine if we did split the model into World servers and Body servers. The World server would respond to requests for the state of the world with an output   x   and would be sent inputs   a   for execution. The Body server - would do the same thing. It would respond to requests for the state of the world as perceived by the senses with an output   x   and would be sent inputs   a   for execution. From the point of view of the client, the two servers are the same type of thing. It's a matter of style which they advertise themselves as.

But there is still a problem with this model of joint World-Body servers. It seems to indicate that only the World author can write Bodies - which goes against the philosophy of the WWM, where 3rd parties can write add-ons without permission. The question is: How do we write new Bodies for the same World (if we are not the World author)?

The answer is that it depends what we mean by writing "new Bodies" for the same World. There is a limit to what we can do with someone else's World without writing our own. The world may have a fixed number of actors - say, one (e.g. a robotic world) - and what we are meant to do is provide the mind for the single actor. We cannot will another actor into existence, so all we can do is give the single actor a different type of Body than the Body that the World server has given it by default.

And even then, we cannot give it senses that the World has not already given it. All we can do is write Bodies that sense a sub-space of the original state   x   provided by the World.

Changing the Body for the World

So when inventing new Bodies, we write them as new World-Body servers, which are wrappers around the old World-Body server. We will call such a World server a WorldW server to reflect the fact that it communicates with another server (which may itself communicate in a chain of World servers) unknown to the client.

The client only sees the senses as presented by the wrapper WorldW server, and sends its actions to the WorldW server too. The WorldW server talks to the "real" World server. Presumably, to the real World server, the WorldW server seems like just another client, requesting   x   and reporting   a.   And to the client, the WorldW server looks like just another World server. The senses that the WorldW server presents to the client are strictly a subset of the raw senses presented by the original World server.

Working with only a sub-space of the original state   x   may sound restrictive, but remember that in general, a "World-Body" server just provides a stream of output   x   when queried about the state of the world. We have not really defined what this output should be, and in fact people may write World servers whose outputs   x   are deliberately designed not to be used as the raw senses of any agent, but rather are designed to be filtered by many different species of 3rd-party Body, in wrapper WorldW servers. Note that even having a long chain of WorldW servers does not break our basic model that the client deals with a single top-level Mind server and a single top-level World server.
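
A wrapper WorldW server that exposes only a sub-space of the original state might be sketched as follows. The representation of the state as a dictionary of named senses, and the names `RawWorld` and `SubspaceWorldW`, are assumptions made for this example.

```python
from typing import Dict, List


class RawWorld:
    """Toy "real" World server with a rich raw state."""

    def __init__(self) -> None:
        self.state = {"x": 0, "y": 0, "temperature": 20.0, "light": 0.5}

    def get_state(self) -> Dict[str, float]:
        return dict(self.state)

    def take_action(self, a: str) -> None:
        if a == "east":
            self.state["x"] += 1


class SubspaceWorldW:
    """WorldW wrapper: a new Body that senses only part of the inner state."""

    def __init__(self, inner, visible: List[str]) -> None:
        self.inner = inner      # may itself be another WorldW server
        self.visible = visible  # the senses this Body has

    def get_state(self) -> Dict[str, float]:
        full = self.inner.get_state()  # to the inner server we are just a client
        return {name: full[name] for name in self.visible}

    def take_action(self, a: str) -> None:
        self.inner.take_action(a)      # actions are passed through unchanged


raw = RawWorld()
body = SubspaceWorldW(raw, visible=["x", "y", "light"])
blind_body = SubspaceWorldW(body, visible=["x"])  # a chain of WorldW servers

blind_body.take_action("east")
print(blind_body.get_state(), body.get_state())
```

The client of `blind_body` still sees a single top-level World server with the usual two queries, however long the chain behind it.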

Multiple Bodies in the same World

Having accepted that a 3rd party cannot necessarily control the number of actors in a World, how would the World server itself deal with multiple actors?

The joint World-Body model is no restriction

In conclusion, our joint World-Body model is no (or at least, little) restriction for a 3rd party:

  1. New Bodies: We can change the Body for the World (within limits) by writing a wrapper WorldW server.
  2. Heterogeneous Bodies: We can write many different versions (within limits) of these WorldW wrapper servers, thus creating many different possible Bodies for the same World.
  3. Multiple Bodies in Same World: We can add many Bodies to the same instance of the World, if the World permits it, by using multiple clients.
  4. Multiple Heterogeneous Bodies in Same World: And since our clients can connect each to different wrapper WorldW servers, we can add many Bodies of different types (within limits) to the World.

What if the Mind cannot make sense of the World?

If we are to allow 3rd parties to place Minds in any World-Body they like, and run the result, we must accept that the Mind server may not be able to make sense of the chosen World. For instance the format of the incoming state   x   may be different to the format the Mind server was expecting. In fact, far from being a special case, this will probably be true of 99 percent of all combinations that a client could choose.

The basic scheme is that we allow this. The 3rd parties need to be given total freedom in their experiments in artificial selection. Combinations that don't work are not a problem. The 3rd parties will only of course advertise the combinations that work. World authors will explain the structure of the state   x   that they generate, and Mind authors will document the structure of the   x   they expect, and further servers and clients will act accordingly.

Real robots

As mentioned, this model could even be used to control real physical robots or other real hardware. There are already a number of robots that can be controlled over the Internet. For an introductory list see the [Yahoo list of robots online]. "Internet tele-robotics" raises some special issues:

  1. We may want a scheme where only one client can control the robot at a time. Whereas with a software-only world one can always allow multiple clients (e.g. by creating a new instance of the world and body for each client). If the robot allows only one client at a time, we need WWM server commands to start a run (block all other clients) and stop a run (mark free for another client).

  2. The robot owner may want to restrict who is able to run a mind on his machine, since some control programs may cause damage. Methods of security and payment may therefore need to be integrated.

  3. A virtual world server requires little or no maintenance. The author can put the virtual world up on the server and then forget about it. It runs forever, servicing clients, even perhaps after the author has left the site.

    A robotic world server, however, demands much more of a commitment. You need to maintain the hardware, recharge batteries, supervise use of it, repair damaged parts of the robot or things in the world, tidy up objects in the world that have been scattered or put out of place, perhaps only allow client access when someone is watching, and so on. It is a lot more of a commitment. You need to fund it and set aside a room for it. As a result most of the Internet-controlled robots so far have run for a limited time only. Longer projects will presumably be possible with payment.

These issues have already been encountered in the first experimental Internet robots. For a discussion of the issues see [Taylor and Dalton, 1997]. For example, [Stein, 1998] allows remote control of the robot until the client gives it up, or until a timeout has passed (for clients that never disconnect). [Paulos and Canny, 1996] operate in a special type of problem space where each action represents the completion of an entire goal, and so actions of different clients can be interleaved.
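
The one-client-at-a-time scheme, with commands to start and stop a run and a timeout for clients that never disconnect, could be sketched as follows. The class `ExclusiveRobotServer`, its method names and the injected `clock` function are assumptions for illustration and do not describe how any of the cited systems is implemented.

```python
from typing import Callable, Optional


class ExclusiveRobotServer:
    """Toy robot World server that only one client may control at a time."""

    def __init__(self, timeout: float, clock: Callable[[], float]) -> None:
        self.timeout = timeout
        self.clock = clock  # injected so the example does not depend on real time
        self.owner: Optional[str] = None
        self.last_seen = 0.0

    def _expire_if_idle(self) -> None:
        """Free the robot if its current client has been silent for too long."""
        if self.owner is not None and self.clock() - self.last_seen > self.timeout:
            self.owner = None

    def start_run(self, client: str) -> bool:
        """Start a run: blocks all other clients. Returns False if the robot is busy."""
        self._expire_if_idle()
        if self.owner is not None and self.owner != client:
            return False
        self.owner = client
        self.last_seen = self.clock()
        return True

    def take_action(self, client: str, a: str) -> bool:
        """Accept an action only from the client that holds the robot."""
        self._expire_if_idle()
        if self.owner != client:
            return False
        self.last_seen = self.clock()
        return True  # a real server would drive the hardware here

    def stop_run(self, client: str) -> None:
        """Stop a run: marks the robot free for another client."""
        if self.owner == client:
            self.owner = None


now = [0.0]  # fake clock, advanced by hand
robot = ExclusiveRobotServer(timeout=60.0, clock=lambda: now[0])

first = robot.start_run("alice")             # alice gets the robot
second = robot.start_run("bob")              # bob is blocked
now[0] = 100.0                               # alice goes silent past the timeout
third = robot.start_run("bob")               # bob now gets the robot
late = robot.take_action("alice", "forward") # alice has lost control
print(first, second, third, late)
```

Queued or interleaved access, as in some of the systems described here, would need a different server; this sketch covers only the simple exclusive case.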

Ken Goldberg and colleagues have operated a number of Internet telerobotics projects. In the first "tele-excavation" project, clients queued for access to the robot. In the robotic tele-garden [Goldberg et al., 1996] users could submit discrete requests at any time, which were executed later by the robot according to its own scheduling algorithm. [Simmons et al., 1997] do something similar. The robotic Ouija board [Goldberg et al., 2000] is a special type of problem where the actions of multiple clients can be aggregated. We will define the WWM server commands in this paper with a view to being able to implement all of these systems.

Time

The nature of time in this system is interesting. The agent senses a state of the world   x   and then talks to perhaps a large number of servers in order to find an action   a   to execute. Some of these servers may be down. Others may take a long time to respond. The question is: What happens to the world while the agent is waiting to act? It seems there are 2 possibilities:

  1. Synchronous World - The world waits for the agent's next action before it makes the transition to the next state.
  2. Asynchronous World - The world changes state according to its own timetable, independent of the agent. If the agent wants to behave adaptively, it must sample the state of the world often enough. In this case, what the world server provides is a window onto an independently-existing, changing world.
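The difference between the two possibilities can be sketched in a few lines of Python. This is only an illustrative toy, not part of the WWM protocol: the class names, the integer state, and the `elapse` method (which stands in for wall-clock time passing while the agent waits on its Mind servers) are all assumptions made for the example.

```python
class SynchronousWorld:
    """Toy world that changes state only when an action arrives."""

    def __init__(self):
        self.state = 0

    def get_state(self):
        return self.state

    def send_action(self, action):
        # The only place a synchronous world makes a transition.
        self.state += action

    def elapse(self, ticks):
        # Time passing while the agent deliberates changes nothing.
        pass


class AsynchronousWorld(SynchronousWorld):
    """Toy world that also drifts on its own timetable."""

    def elapse(self, ticks):
        # The world moves on whether or not the agent has acted.
        self.state += ticks


def staleness(world, delay_ticks):
    """How far the world has moved between sensing x and being ready to act."""
    sensed = world.get_state()
    world.elapse(delay_ticks)  # agent is waiting on slow or absent servers
    return world.get_state() - sensed


print(staleness(SynchronousWorld(), 5))   # state sensed is still current
print(staleness(AsynchronousWorld(), 5))  # state sensed is out of date
```

In the synchronous case the agent can take as long as it likes; in the asynchronous case the sensed state goes stale in proportion to the delay, which is why the agent must sample often enough.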

We consider which of these would be used in each type of world:


The name "The World-Wide-Mind"

The name "The World-Wide-Mind" makes a number of important points:

  1. The mind stays at the server: The name highlights the fact that the mind is not copied but rather stays at the server. We believe that the Web, by allowing documents to remain at the remote server and be accessed remotely, provides an outstanding example of reuse of data that is applicable to reuse of software as well. Under the "Web-like" model of software reuse, instead of the complexity of installing your own copy, upgrading version 4.0 run-time libraries to version 5.0 libraries, and so on, you link to a remote service. In a Society of Mind constructed according to this principle, the mind will be literally decentralised across the world, with parts of the mind at different remote servers. Hence the name.

  2. Parts of the mind are separate from each other: The name also highlights that the important thing is not the separation of mind from world, but the separation of different parts of the mind from each other, so that, for example, they can be written and maintained by different authors.

  3. This is separate from the Web: The name also indicates that this is a different thing to the World-Wide-Web. During the recent rise of the Internet, many people have talked about seeing some sort of "global intelligence" emerge. For a survey see [Brooks, 2000]. But these writers are in fact talking about the intelligence being embodied in the humans using the network, plus the pages they create [Numao, 2000], or at most perhaps the intelligence being embodied implicitly in the hyperlinks from page to page [Heylighen, 1997, Goertzel, 1996]. Claims that the network itself might be intelligent are at best vague and unconvincing analogies between the network and the brain [Russell, 2000]. Indeed the whole idea of "global intelligence" has had a bad press (rightly, I believe) since the days of Teilhard de Chardin. It is simply incorrect that a large number of individuals in communication will simply "emerge" as a mind. Not every society is a mind. Minds are highly structured things, and need to be deliberately constructed, as this paper will attempt to do.

    For a real society of mind or network mind, we need a network of AI programs rather than a network of pages and links. We may actually be able to implement this on the existing network of Web servers, running over HTTP, but it must be designed as such. It will not simply "emerge" from schemes for linking data on the Web.

  4. This may not even interact with the Web: By separating this from the Web, the name also separates this from existing work that might go under the name of "AI on the Web", namely, AI systems learning from the Web. There are many such systems, the most impressive perhaps being the citation indices CiteSeer [Lawrence et al., 1999] and Cora [McCallum et al., 2000]. The "global intelligence" researchers who have concrete models as opposed to just metaphors [Heylighen, 1997, Goertzel, 1996] have also started by looking at the existing Web.

    But a WWM system is not necessarily interested in learning from or interacting with the current Web or its users. We are embedding the WWM in the network not so much because of the prior existence of the Web (though we may make use of that), but mainly because of the future potential of other WWM servers.




5. How the WWM will be used in AI

We envisage of course that progress in AI will continue to be driven mainly by professional researchers in AI laboratories. But if these researchers write their AI minds and worlds as servers, the AI project could be massively decentralised across different AI labs. As well as allowing AI labs to share work, and specialise, there would also be a world-wide experiment always on, with constant re-combination and testing (in public) of everything that has been put online so far.

Dividing up the work in AI

First of all, AI researchers can more easily specialise. Minds are separate from Worlds, so that researchers can reuse each other's Worlds. Anyone doing, say, AI experiments in learning, can, by writing to this protocol, use as a test bed someone else's World server, and does not have to write his own. He can concentrate purely on devising new learning algorithms. Re-using other people's Worlds [Bryson et al., 2000] will probably be the most common use initially. Learning how to reuse other people's Minds may take time.

Making AI Science - 3rd party experimentation

As discussed above, having to invent your own test environment to test your new algorithm is not only a lot of extra work - it also makes it harder to objectively evaluate the new algorithm. Even if you take care to do a run of the old algorithms in your new test world, the test world might (however unintentionally) be designed to illustrate the good points of the new algorithm. Clearly, being able to compare different algorithms on the same pre-built World server goes a long way towards helping AI run objective comparisons of algorithms and models of mind.

But the WWM goes much further than that. By its emphasis on 3rd party experimentation, algorithms will be subjected to constant examination by populations of testers with no vested interest in any outcome. 3rd parties will ensure that the results can be repeated again and again. They will compare many more different combinations of servers under controlled conditions. (Currently we are still assuming that 3rd parties will be other AI researchers. Whether the general public will carry out useful tests will be discussed shortly.)

The whole question of how to prove one autonomous agent architecture is better than another has become an important issue recently. [Bryson, 2000] points out that, essentially, no one uses each other's architectures: "There have been complaints .. about the over-generation of architectures" and (among behavior-based models): "no single architecture is used by even 10 percent of .. researchers." Most architectures have performance statistics to support them, but these statistics have had little success in convincing rival researchers to abandon their own favourite models. In particular, if the tester invented the test world to show off his new algorithm, there are simply too many uncontrolled variables for the test results to be totally conclusive. Test results can only become conclusive when there is repeated objective 3rd party evaluation. Currently 3rd parties have to go to great effort to recreate the test situation, and this rarely happens. For instance, Maes' model [Maes, 1989, Maes, 1989a] waited years for Tyrrell to re-implement it in a performance comparison [Tyrrell, 1993]. Tyrrell does not get impressive performance for it, but reading his thesis one might argue there are still uncontrolled variables in Tyrrell's implementation. Re-implementing Tyrrell itself is a difficult job [Bryson, 2000a] which rather adds to the lack of conclusion about his results.

In any branch of AI, the existence of objective tests that cannot be argued with tends to provide a major impetus to research. This has been one of the main reasons for the popularity of rule-based games in AI [Pell, 1993]. Robotic soccer has also taken off in recent years for this reason. Noda's soccer server [Noda et al., 1998] is probably the closest in spirit to the WWM, though the user must still download and install the agent world.

On the WWM, mind and world must be presented publicly, and 3rd parties will drive objective evolution of the best solutions. They will test servers in environments their authors never thought of, and combine them with other servers that their authors did not write. The experiments will be outside of the server authors' control. They will do this in public and, if properly written up, this will implement an ongoing objective programme of artificial selection (i.e. selection by hand) and "natural selection" (machine-automated selection) of agent solutions.

Artificial Selection

I imagine that many will see the ability of any 3rd party on the Internet to choose and run their own combination of servers as a "cute" but inessential feature: an essentially patronising scheme that allows the public to think they are helping with science.

But this misses a couple of points. First, the only way of allowing any professional AI researcher to experiment with servers without permission is to allow every user on the Internet to experiment with the servers.

Secondly, in fact, I believe it will be essential to the success of the idea that more than just AI researchers can work this system. AI researchers are focused on specific projects, with deadlines and many responsibilities. There are not many of them, and they don't have much time. With 3rd parties, there are millions of them, and they have a vast amount of free time. Many schemes have harnessed the power of the millions of idle and curious Net users, e.g. "metacomputing" projects such as large-scale cryptography cracks or SETI data analysis. Within AI, there have been some evolutionary experiments attempting to recruit large numbers of users. See the [Yahoo list of ALife programs online]. Perhaps the most impressive is the "Talking Heads" language evolution project [Steels and Kaplan, 1999] in which 4000 agents have been constructed and tested by online users [Steels, 2000]. Such ideas are in fact fundamental to the Internet. Even the idea of linking itself in the Web is a classic example of harnessing the power of large numbers of people. Other people do some of the work for you, by tracking down sites and presenting you with a pre-built list of links.

Here with the WWM, the idea of a client presenting at a URL a pre-built combination of servers is deliberately modelled on the Web idea of a pre-built selection of links. Working the system will not be as easy as making a link on the Web. It will require some ability and interest - though perhaps no more so than the interest ordinary people have in raising animals and infants. People will write user-friendly software to make it easy to be a non-technical client, experimenting with combinations of pre-existing servers, taking part in ongoing competitions to beat the highest score in a particular World. Obviously the amateurs will report their successes in what will probably be a somewhat haphazard fashion. It will be up to the AI professionals to make sense of what they see, and investigate promising leads and write them up in a scientific manner.

But the power of artificial selection by large numbers of amateurs should not be underestimated by science. Putting unseen-before minds into unseen-before worlds, years after the research groups that made both minds and worlds have vanished, may be of real benefit to science. 3rd parties will run old minds in new worlds, and new minds in old worlds. They will drop new minds into old collections, and run combinations that make no sense. It is certain that they will run combinations of servers that the scientists never thought of.

The professionals may be sceptical of the value of this, but millions of people experimenting with different mind/body/world combinations year after year on the network would represent a richer experimental milieu than anything AI has ever yet built. Thousands of years of artificial selection by farmers and breeders across the world (non-scientists) is now recognised as one of the most thorough scientific experiments in history. Since modern science arose, hardly any new animals or plants have been successfully domesticated, which indicates that pre-scientists really did do all the major experiments [Diamond, 1997]. And of course the scope of what artificial selection has been able to produce, from fruit and vegetables to dogs or racehorses, is breathtaking, and was one of the central inspirations for Darwin's theory [Darwin, 1859, §1, "Variation under Domestication"]. With AI, large-scale artificial selection projects cannot begin unless the AI work comes online.

How 3rd party AI researchers will use the scheme

To continue our theme in defence of 3rd parties, it is often forgotten that many AI researchers are excluded from the AI project as well. The WWM scheme will allow AI researchers in poorly-funded labs, in distant and poor countries, isolated postgraduate students, and so on, to participate in the great AI adventure. Not for charitable reasons, but because their exclusion is a loss to the science.

They will make a more sophisticated use of it than the amateurs. They will write new servers themselves, either from scratch, or by partial reuse of existing servers, by writing MindAS servers, MindM servers and WorldW servers. They will re-use others' work in controlled experiments. They could take an existing world, body, problem, and basic collection of minds, and just work on simply adding one more mind to the seething collection. Things like:

  1. 1st party makes World.
  2. 2nd party makes Mind for World.
  3. 3rd party makes MindM which in state   x   does something, otherwise does what 2nd party Mind does.
  4. 4th party makes different Mind for World.
  5. 5th party makes MindM which in state   y   does what 4th party Mind does, otherwise does what 3rd party MindM does.

And so on, with people modifying and cautiously overriding what already works. Of course they are not actually modifying what already exists in the sense of changing it for other users and removing the old version. Making a wrapper server simply means that 2 servers now exist instead of one. The wrapper author cannot force anyone to use the new wrapper in preference to the old server.
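The chain of wrappers above can be sketched as follows. This is a minimal illustration in which local Python functions stand in for remote Mind servers; the helper names (`make_override`, `make_switch`) and the states and actions ("wall", "food", "forward", and so on) are hypothetical and are not part of any WWM specification.

```python
from typing import Callable, Hashable

# A Mind maps an input state x to a suggested action a.
Mind = Callable[[Hashable], str]


def make_override(base: Mind, trigger_state: Hashable, action: str) -> Mind:
    """MindM: in trigger_state do `action`, otherwise do what `base` does."""
    def mind_m(x: Hashable) -> str:
        return action if x == trigger_state else base(x)
    return mind_m


def make_switch(preferred: Mind, fallback: Mind, trigger_state: Hashable) -> Mind:
    """MindM: in trigger_state defer to `preferred`, otherwise to `fallback`."""
    def mind_m(x: Hashable) -> str:
        return preferred(x) if x == trigger_state else fallback(x)
    return mind_m


def mind2(x: Hashable) -> str:
    """2nd party Mind for the World (always walks forward)."""
    return "forward"


# 3rd party wraps the 2nd party Mind, overriding it in one state only.
mind3 = make_override(mind2, trigger_state="wall", action="turn")


def mind4(x: Hashable) -> str:
    """4th party writes a different Mind for the same World."""
    return "eat"


# 5th party uses the 4th party Mind in one state, else the 3rd party MindM.
mind5 = make_switch(mind4, mind3, trigger_state="food")

for x in ("food", "wall", "open"):
    print(x, "->", mind5(x))
```

Note that `mind2` is untouched by any of this: after wrapping, both the old server and the new wrapper exist side by side, and a client may call either.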

By setting up a system whereby many authors, acting over different times, will contribute to constructing a Society, the WWM provides AI researchers with the ability to do a lot more than just reuse other people's test worlds. In every field, the Internet has already allowed marginalised, distant and poorly-funded researchers to participate in international research like never before, from access to primary literature that may not exist in any library in the client's country, to access to remote databases and software libraries. The WWM is simply continuing this trend.

Bring every agent online

Part of this proposal is a plea for recognition of the untapped potential in AI - the vast number of minds and worlds that are offline. Some of this comes from my own experience with putting an agent mind online. For I was one of the first people to put an AI mind on the network, an "Eliza"-type chat program in 1989 [Humphrys, 1989].

The original "Eliza" was introduced in [Weizenbaum, 1966]. Now I must say upfront that Eliza-type programs have little to do with serious AI - their understanding of conversation is an illusion. But this does not affect the argument. They are behaviour-producing systems, like our postulated Mind servers. My point is that many people conversed with the original Eliza (not online of course, for there was no network), but it did not stay accessible. Soon, the experiment was over, written up, and the original version of Eliza remained largely inactive until the modern era. In 1989 I put my own Eliza-type program, "MGonz", online on BITNET. Many people talked to it, but soon (in fact, by 1990), MGonz had ceased to interact with the world.

A brief, finite interaction with the world, seen by only a few people, and normally not even online, is the norm in autonomous agents research. In this field it has become acceptable not to have direct access to many of the major systems under discussion. How many action selection researchers have ever seen Tyrrell's world running, for example? [Tyrrell, 1993] How many robotics researchers have ever seen Cog move (not in a movie)? [Brooks et al., 1998] Due to incompatibilities of software and expense of hardware, we accept that we will never see many of these things ourselves, but only read papers on them, and sometimes watch movies. This situation seems normal, but if we ever get used to direct access to remote agent minds and worlds, it may come to seem like bad science not to allow it, and to only report offline experiments that were seen only by the creator of the agent.

But it is not just watching the agents that this is aimed at, it is interacting with them. The problem with inactive agents is that the only experiments run with them were the ones their creator thought of doing. But as we have argued, AI researchers are often too limited in time and resources to explore fully the possibilities of their creations. It is as if animal species only got to live through one individual and one lifespan. As AI develops, we should begin to regard the inactivity of our growing list of old creations as a loss, like the silence of extinct species. The WWM aims to put an end to this inactivity.

The invention of CGI and other technologies has recently resurrected some of the old agent minds of AI, including the Eliza-type programs [see Yahoo list of AI programs online]. The WWM will vastly accelerate this process, by bringing many of the recent autonomous agents minds online in re-usable forms where they can be driven by remote programs. We aim to take all of the Minds and Worlds that human ingenuity can create, and get them all online and interacting with the world indefinitely. The goal is to get AI to move away from isolated experiments, and instead develop its own rich, world-wide, always-on ecosystem to parallel the natural one.




6. Objections to the model

It is important to ask, if this scheme is going to be so useful, why AI has not taken this direction in the past. The following may be reasons why (or possible objections):

  1. Co-operating is too much trouble. - In the past researchers have not seen the benefits of dividing up the work. As discussed above, many researchers still have the impossible dream of doing everything themselves, such as the CYC project [cyc.com]. Two similar projects, GAC [mindpixel.com] and Open Mind Commonsense [commonsense.media.mit.edu and now openmind.org] are online, but are attempting to use the Net to get people to teach the centralised agent mind, rather than having the agent mind distributed on the Net.

    Indeed many presentations in the Animats or evolutionary fields still seem to assume that one lab can do it all. Some of them recognise the immensity of the problem as we scale up, but when faced with the complexity of dividing up the work, defining communication protocols, and coordinating the results, most have retreated back into either designing whole agents (but saying this is alright for simple agents) or else producing specialist components which may or may not ever be used as part of a larger system.

  2. How do we divide up the work? - Part of the problem, I believe, is in the mental model of reuse that is being used. It is imagined perhaps that the components being reused need to be understood. That their source code needs to be merged with other source codes. That all binaries need to be installed at one location. That components will engage in high-level reasoning and negotiation with each other (rather than simply be mutually-incomprehending minds). And finally, that components will not overlap - that each will have its own well-defined function.

    These are all reasons why re-use is difficult in the software world in general. But in the model of reuse proposed here, none of these things are necessary. It is not necessary to have a clear definition of how the work should be divided up. It is not necessary for components to understand each other. It is not necessary to install anything. Components being re-used can remain at the remote server, are used as a service from there, and are not fully under the client's control. Which leads to the next reason:

  3. Researchers do not want to be dependent on other people's work. - What if the remote server is down? Or the author has made changes to it without telling us? Or removed it permanently?

    Part of the problem, I believe, is models of mind in which the loss of a single server would be a serious issue. Instead of models of mind where hundreds of similar servers compete to do the same job, researchers have been assuming the use of parsimonious minds where each component does a particular task that is not done by others. Certainly, in the early stages of the WWM, with few servers online, clients may feel that their constructed minds are very fragile and dependent on the servers. But some clients will continue to add more and more "unnecessary" duplicated minds to their societies. In a model of mind with enough duplication, the temporary network failure (or even permanent deletion) of what were once seen as key servers may never even be noticed.

  4. But some servers will be indispensable. - Yes, this is true. While duplicate models of mind can take us a long way, some servers will be indispensable. We could have a MindAS server that collects suggested actions from many Minds, and if some of them are gone it will run with whatever suggestions are left. But the MindAS server itself is essential, as is the World server. Like pages that we link to disappearing from the Web [Humphrys, 1999], how can we cope with the disappearance of a server that we need?

    The basic answer is that if it is important to us, we will copy it (if it is free) or buy it or rent it. Then we either set up our own server, or continue to use the remote server and just keep our copy offline as backup. Here's how in practice one might be running a large, complex Society of Mind with actually very little risk: Imagine that our top-level AS server is a well-known, standard type that we can get our own copy of. (Not that we actually use it. We continue to use a remote server. But we have our own copy just in case.) Say the World server is a popular test world that is implemented at multiple sites (we just use our favourite one). And we use hundreds of remote minds in a complex society. For all of these we will take the risk of some of them vanishing, and see little reason to buy or copy any of them ourselves. After all, other new and interesting Mind servers are coming online all the time.

  5. Models of Broken links and Brain Damage - Broken hyperlinks are a problem with the Web model of remote data, and the equivalent of broken links will happen with any scheme that uses remote servers.

    As discussed above, one way of making a Society more robust would be to add "unnecessary" duplication. For this to work smoothly, we need an Action Selection scheme where a Society with   n   identical Mind servers trying to do the same thing, plus other servers, will behave more or less the same as a Society with   1   Mind server trying to do that thing, plus other servers. Then we can add extra copies of the Mind server located at different sites, and won't even notice if some of those sites are down. This property is not true of all AS schemes, though. [Humphrys, 1997] shows that it is true of individual-driven AS schemes such as Minimize the Worst Unhappiness, but is not true of the more common collective AS schemes.

    So using a model like the above, Societies will degrade gradually as the number of broken links increases. In the above work I explicitly addressed the issue of brain damage in a large society of mind [PhD, §17.2.2, §18]. The reader might have wondered what is the point of a model of AI that can survive brain damage. After all, if the AI is damaged, you just fix it or reinstall it surely? Here is the point - a model of AI that can survive broken links. This leads to the more general reason why this whole approach to remote re-use has not been used:

  6. Models of duplicated mind are poorly developed. - We have argued throughout that if the work is to be divided up in AI it will be impossible to avoid massive overlap and duplication of function, and resulting conflict. We need models of mind in which a state of conflict is totally expected. Unable to get server authors to agree, we will instead selectively override, censor, and subsume old servers instead of re-writing them (or vainly trying to get the server author to re-write them). Such duplicated models have been argued for [Brooks, 1986, Brooks, 1991, Minsky, 1986] but parsimony is still popular. We will also need Action Selection that can resolve competition between minds that barely understand each other [Humphrys, 1997]. In a traditional system where a single designer writes the whole system, he can make deliberate global-level decisions in the interests of the whole creature, and there is no need for the decision to emerge from local rules. But with mind servers from diverse sources, the need for Action Selection based on local rules re-emerges.

    It is clear enough to see how sub-symbolic conflict resolution can occur via numeric weights. But if Minds are strangers written by different authors, in what symbolic language could they communicate? Which leads to the following objection:

  7. It is premature at symbolic level to attempt to define mind network protocols. - This is probably true. Since even before the Web, researchers have debated the possibility of standardised symbolic-AI knowledge-sharing protocols, with [Ginsberg, 1991] arguing that it is premature to define such protocols. Recently this debate has continued in the Agents community as the debate over defining standardised agent communication languages. See a recent survey of many approaches in [Martin et al., 2000], who then define their own approach. Agreement is weak, and it may be that the whole endeavour is still premature. For example, some of Minsky's students [Porter et al.] attempt to implement a Society of Mind on the Internet, but insist on a symbolic model, with which they make limited progress. Indeed, Minsky's work may have had little impact in the sub-symbolic world because of his hostility to that world [Pollack, 1989].

    We argue, though, that it is not totally premature to start defining mind network protocols at the sub-symbolic level. There are already many schemes of numeric weights, and Action Selection based on weights, in the literature. The sub-symbolic WWM in this paper has been designed so that all current numeric agent architectures (that the author is aware of) can be implemented under the scheme.

    There will no doubt be further sub-symbolic (and later, symbolic) protocols. But designing the early ones for simple numeric weights will give us an idea of how to do it in the future for more heterogeneous agents with different representations [Minsky, 1986, Minsky, 1991]. This will be the first in a long family of protocols.

  8. "Agents" researchers (or other branches of AI) have already done this. - No they haven't. Consider how the field of Distributed AI has developed. For surveys see [Stone and Veloso, 2000, Nwana, 1996]. DAI has split into two camps:

    1. Distributed Problem Solving (DPS) - where the Minds are cooperating to solve the same problem in one Body.

    2. Multi-Agent Systems (MAS) - where the Minds are in different Bodies. We have 1 mind - 1 body actors, and then coordination of multiple actors. This is what the field of "Agents" has come to mean. Indeed, [Nwana, 1996, §4.3] makes explicitly clear that our servers are not Agents.

    This is neither of these two, but rather is multiple minds solving multiple problems in one body. If anything, it is closer to the field of Adaptive Behavior and its interest in whole, multi-goal creatures whose goals may simply conflict. Agents researchers also tend to work at the symbolic level only, rather than the sub-symbolic as we do here (and as many people do in Adaptive Behavior).

    Artificial Life and evolutionary researchers are certainly interested in collective, and even collective network-based models [Ray, 1995], but again the minds are localised, as in the MAS approach. In [Ray, 1995] it is a society of agents that is distributed across the network, not a single agent mind.

    Machine Learning (e.g. Reinforcement Learning) researchers have tended to focus on solving a dedicated problem, rather than juggling many partially-solved conflicting problems. That is, they tend to take the DPS approach.

  9. Virtual-world researchers have already done this. - No they haven't. They have tried to establish standard technology for displaying worlds, such as VRML. Here we establish a framework within which state and action data may be sent back and forth, but the format of the state is left to be defined by each server. These researchers also concentrate on user-driven avatars, with a graphical UI. This considers AI-driven actors, with possibly no UI at all.

  10. Tele-robotic researchers have already done this. - No they haven't. They have concentrated on user-driven control from web pages, rather than remote machine-driven control.

  11. The network is not up to this yet. - Possibly true. Simple WWM societies can certainly run on today's network. It may be that a Society with a large number of servers in multiple layers will operate very slowly on today's network. But that will change.

  12. There is a chicken-and-egg problem. - It is true that until other people put up their worlds, minds and AS mechanisms as servers, there is not much attraction in converting to using servers oneself. One thing that may speed the adoption of this scheme, though, is that currently there is no easy alternative method in many areas. For example, how would one make Tyrrell's agent world [Tyrrell, 1993] easily available to researchers? Going the server route seems almost as easy as any other.
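The duplication property discussed under objections 4 and 5 above can be illustrated with a toy example. The two functions below are deliberately simplified stand-ins, not the Minimize the Worst Unhappiness scheme or any other algorithm from [Humphrys, 1997]: here an "individual-driven" scheme just lets the single strongest suggestion win, and a "collective" scheme sums weights per action. The (action, weight) tuples, the use of `None` for a server that is down, and all names and numbers are assumptions made for the sketch.

```python
from collections import defaultdict


def individual_as(suggestions):
    """Winner-take-all: the single strongest live suggestion wins."""
    live = [s for s in suggestions if s is not None]  # skip servers that are down
    action, _weight = max(live, key=lambda s: s[1])
    return action


def collective_as(suggestions):
    """Sum the weights behind each action; the largest total wins."""
    totals = defaultdict(float)
    for s in suggestions:
        if s is not None:
            action, weight = s
            totals[action] += weight
    return max(totals, key=totals.get)


feed = ("eat", 0.6)   # suggestion from a Mind that wants to feed
flee = ("run", 0.9)   # suggestion from a Mind that wants to escape

one_copy = [feed, flee]
three_copies = [feed, feed, feed, flee]      # same Mind hosted at three sites
two_sites_down = [feed, None, None, flee]    # two of the three sites unreachable

for name, society in [("one copy", one_copy),
                      ("three copies", three_copies),
                      ("two sites down", two_sites_down)]:
    print(name, individual_as(society), collective_as(society))
```

In this toy, the strongest-suggestion scheme picks the same action however many copies of the feeding Mind are present or missing, whereas the summing scheme changes its decision when duplicates are added. That is the sense in which only some AS schemes let a Society add "unnecessary" copies for robustness without changing its behaviour.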




7. Miscellaneous issues

Hidden server insides

The internal server workings do not have to be made public. The only demand is that the server reply to external requests according to the protocol. The Mind server can be a symbolic mind, a neural network, a genetically evolved program, or anything else. It does not have to tell us. This will be important for commercial servers who want to protect their investment, or sell access to their server. It may also be important for academic projects, although it could be argued that all academic projects should publish their source code (and there is no excuse not to, now that the Web exists) so that experiments can be replicated.

An interesting side effect of hidden server internals may be a more level playing field between different types of AI. Currently people tend to look for algorithms only within a particular sub-field - neural net researchers look for other types of neural net, symbolic AI researchers look for symbolic routines, and so on. Here, each algorithm will stand on its merits, and it may be better science if we do not know, at least initially, what is inside the server. The doctrinaire neural net researcher may be embarrassed to discover that the excellent server he has been using for years is in fact a symbolic AI server. Such objective symbolic v. non-symbolic competition has in fact already occurred in robot soccer [Stone and Veloso, 2000].

It will be interesting to see how this develops. Some researchers may refuse to use servers unless they know how they work. They will argue that it is bad science, since any systems built using such servers can only be replicated as long as the server owner continues to provide the secret and unknown service. Others may argue the above case - that it is a breath of fresh air (and good science) to be able to judge algorithms entirely on performance without, at least initially, knowing anything else about them. The answer really is that it will be good science if, while server authors do explain in some detail what is inside, such information is routinely ignored as different combinations of systems are constructed and tested on merit.

Credit

Allowing servers to be used as components in larger systems is central to the WWM idea. One issue though is: Can you track how your server is being used? Imagine that your server is being called by a popular MindM server. To the outside world, does the MindM server get all the credit? Can a 3rd party look at a successful Society, and see all of the servers involved in it at all levels? Or do they just have to assign all the credit to the top-level server, not knowing what is behind it? We suggest the following:

  1. Every server has a URL.
  2. At that URL they link to the URLs of every server they are calling.
  3. When they call another server, they provide it with their URL as part of the query.
  4. So each server, at its URL, can link to all servers it calls and all servers that call it.
  5. 3rd parties may read these lists, and follow chains of credit through the Society, even if the server authors are not involved.

The last point is important. On the WWM we expect that thousands of server authors will leave their servers to be run in the research community long after they have gone. Hence we want these lists to be published online, rather than simply known internally by the programs or recorded in logfiles.
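The five points above can be sketched in a few lines of Python. This is only an illustration, not part of any WWM protocol: the class name CreditRegistry, its methods, and the example.org URLs are all hypothetical, and a single in-memory object stands in for the lists that each server would publish at its own URL.

```python
from collections import defaultdict


class CreditRegistry:
    """Toy stand-in for the link lists that servers publish at their URLs."""

    def __init__(self):
        self.calls = defaultdict(set)      # URL -> URLs of servers it calls
        self.called_by = defaultdict(set)  # URL -> URLs of servers that call it

    def record_call(self, caller_url, callee_url):
        # Point 3: the caller supplies its own URL with the query,
        # so both ends can publish the link (points 2 and 4).
        self.calls[caller_url].add(callee_url)
        self.called_by[callee_url].add(caller_url)

    def credit_chain(self, top_url):
        """Point 5: a 3rd party follows the published links downwards."""
        seen, stack = [], [top_url]
        while stack:
            url = stack.pop()
            if url in seen:
                continue  # a server may be reached by more than one path
            seen.append(url)
            # Reverse-sorted push so servers are visited in sorted order.
            stack.extend(sorted(self.calls.get(url, ()), reverse=True))
        return seen


# Hypothetical Society: an AS server calling two Minds, one of them a MindM.
registry = CreditRegistry()
registry.record_call("http://example.org/as", "http://example.org/mind1")
registry.record_call("http://example.org/as", "http://example.org/mind2")
registry.record_call("http://example.org/mind1", "http://example.org/sub")
chain = registry.credit_chain("http://example.org/as")
```

Because the traversal reads only the published lists, it works without any cooperation from the server authors, which is the property the scheme is after.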

An interesting question is whether you could write a malicious wrapper server, where the MindM server does not acknowledge that it is calling another Mind server, and tries to take the sole credit for the functionality. There are many possible answers to this (e.g. servers publish their usage logs online), but we doubt it will be an important issue. More likely the AI community will simply ignore servers unless they (i) come from serious and respectable sources and (ii) explain in exhaustive detail how they work and what other servers they are calling.

Learning servers

If Mind servers learn from interacting with a World, while part of a Society, where is that new knowledge stored? One might say it should be stored at the Mind server, and that seems reasonable in many cases (if the clients can tolerate the fact that the server may change). There are, however, a couple of problems:

  1. First, the server may learn erroneous things. For instance, a client uses a Mind server, and gets a suggested action   a.   The client reports the new state   y   and the Mind server learns that action   a   led to state   y.   But the client forgot to tell the Mind server that its action was not actually obeyed (it lost the competition), so it was some other action that led to state   y.   The client may even be malicious. AI programs that learn from user input online have found that many users input nonsense or misleading information [Hutchens, undated]. The integrity of the Mind server must not be compromised by a buggy or malicious client. So we suggest that the Mind server learns relative to a particular client only, i.e. it stores a file of knowledge that is only used with that client. What you teach it may only change how it interacts with you, not how it interacts with others.

  2. Some of the learning may only make sense relative to a particular Society (e.g. the W-values), and so again we suggest it learns relative to this client only.

This does not mean that other clients cannot access the new knowledge. It just means they have to do so explicitly: "Give me the Mind that you learnt with client   c".  
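A minimal sketch of learning relative to a particular client might look as follows. The class LearningMindServer, its method names, and the learnt_with parameter are hypothetical, and a dictionary keyed by client stands in for the per-client knowledge file described above.

```python
class LearningMindServer:
    """Toy Mind server that keeps a separate store of knowledge per client."""

    def __init__(self):
        # client id -> {(state, action): observed next state}
        self._knowledge = {}

    def learn(self, client_id, state, action, next_state):
        # Whatever a client teaches is filed under that client only,
        # so a buggy or malicious client cannot corrupt anyone else's Mind.
        store = self._knowledge.setdefault(client_id, {})
        store[(state, action)] = next_state

    def predict(self, client_id, state, action, learnt_with=None):
        # By default a client sees only what it taught the server itself.
        # Passing learnt_with=c is the explicit request
        # "Give me the Mind that you learnt with client c".
        source = client_id if learnt_with is None else learnt_with
        return self._knowledge.get(source, {}).get((state, action))


server = LearningMindServer()
server.learn("c1", "x", "a", "y")

own = server.predict("c1", "x", "a")                         # c1's own knowledge
isolated = server.predict("c2", "x", "a")                    # c2 has learnt nothing
borrowed = server.predict("c2", "x", "a", learnt_with="c1")  # explicit access
```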

Learning Temperature

A Mind server that learns also raises the question of which of these we want:

  1. A pre-built server that has already learnt what it wants to do [pure exploitation].
  2. A new version to start learning from scratch, i.e. whose initial actions may be completely random [pure exploration].
  3. A server that has already learnt some preferences, but still engages in some new exploration [some exploitation, some exploration].

We may pick one of these at the start of the run, or even half-way through the run we may decide to get the server to go back and re-learn. We can control all this by passing a "Temperature" parameter to the server. Temperature = 0 means we want the server to exploit its current knowledge with no exploration. High Temperature means we want the server to do a lot of exploration. As Temperature   -> infinity   the server tends to engage in maximum exploration. The use of the word "Temperature" is explained in [PhD, §2.2.3]. Exactly what counts as a "high" Temperature for this server will be explained by the server at its URL. Rather than just provide the Temperature at the start, we might provide it on a step-by-step basis, so that at any point the client may send Temperature = 0, which means "Send me your best action, based on your learning so far". Generally, the temperature will decline over the course of the learning run. We have two basic strategies:

  1. Server maintains temperature - The temperature is initialised, either explicitly by the client, or the server is left to pick a reasonably high temperature. The server maintains an internal value for temperature, and decreases it every step so that by the end of the learning run it will have reached the minimum temperature (it stops learning). The client will need to tell the server how long the run will be. The client makes requests of the form   Get action (x)  
  2. Client maintains temperature - The client maintains the temperature value, and passes it to the server with every request. The client makes requests of the form   Get action (x,temperature)  
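The two strategies can be sketched as follows. Several things here are assumptions rather than part of the scheme: actions are chosen from a Boltzmann (softmax) distribution over learnt action values, the server-maintained temperature declines linearly to zero, and all class and function names are hypothetical. Each server may define its own temperature scale and schedule.

```python
import math
import random


def boltzmann_action(values, temperature, rng=random):
    """Choose an action index from learnt values at a given temperature."""
    if temperature <= 0:
        # Temperature = 0: pure exploitation, return the best-valued action.
        return max(range(len(values)), key=lambda i: values[i])
    best = max(values)
    # Subtracting the best value keeps the exponentials from overflowing.
    weights = [math.exp((v - best) / temperature) for v in values]
    return rng.choices(range(len(values)), weights=weights)[0]


class ServerMaintainedMind:
    """Strategy 1: the server holds the temperature. Request: Get action (x)."""

    def __init__(self, values_by_state, start_temperature, run_length):
        self.values_by_state = values_by_state
        self.start = start_temperature
        self.run_length = run_length  # the client must say how long the run is
        self.step = 0

    def temperature(self):
        # Assumed schedule: linear decline, reaching 0 on the last step.
        remaining = max(self.run_length - 1 - self.step, 0)
        return self.start * remaining / max(self.run_length - 1, 1)

    def get_action(self, x):
        t = self.temperature()
        self.step += 1
        return boltzmann_action(self.values_by_state[x], t)


class ClientMaintainedMind:
    """Strategy 2: the client sends it. Request: Get action (x,temperature)."""

    def __init__(self, values_by_state):
        self.values_by_state = values_by_state

    def get_action(self, x, temperature):
        return boltzmann_action(self.values_by_state[x], temperature)


values = {"x": [0.1, 0.9, 0.3]}  # hypothetical learnt values for state x
stateless = ClientMaintainedMind(values)
best_action = stateless.get_action("x", 0)  # "Send me your best action"
```

With the second strategy the server keeps no per-run temperature state, so the client can ask for the best action at any step simply by sending Temperature = 0.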

Q-Temperature and W-Temperature

If the world changes, we may ask the Mind server to re-learn its Q-values from scratch, i.e. increase the Q-Temperature. If the collection of minds changes (i.e. the competition changes), we may ask the Mind server to re-learn its W-values from scratch, i.e. increase the W-Temperature.



