How Creatures Sense Their Environment

To a creature, the world is divided into 40 categories (food, fungi, machines, lifts and the like), and all objects within a category behave in roughly the same way as far as the creature is concerned.

At any point in time the creature only 'knows about' 40 objects at most, one for each category.

There are two main ways these 'category representative' objects are chosen for the creature. For things like fungi, which all look the same (homogeneous, you could call them), it makes sense for the creature to pick the nearest one, whereas for things like machinery, where each has a different purpose (heterogeneous), it makes more sense for the creature to pick a random one in visual range.
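The two selection rules can be sketched in a few lines of Python. This is only an illustration: the names `WorldObject`, `HOMOGENEOUS` and `pick_representative` are hypothetical, and which categories count as homogeneous (only fungi here) is an assumption, not the engine's actual table.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class WorldObject:
    name: str
    category: str
    x: float
    y: float


# Assumption: only "fungi" is listed; the real engine's list is not given here.
HOMOGENEOUS = {"fungi"}


def pick_representative(creature_pos, objects, category, visual_range, rng=random):
    """Return the object standing for `category`, or None if there is none."""
    cx, cy = creature_pos

    def distance(obj):
        return math.hypot(obj.x - cx, obj.y - cy)

    candidates = [o for o in objects if o.category == category]
    if category in HOMOGENEOUS:
        # Interchangeable objects: the nearest one is as good as any other.
        return min(candidates, key=distance, default=None)
    # Distinct objects: pick any one within visual range at random.
    visible = [o for o in candidates if distance(o) <= visual_range]
    return rng.choice(visible) if visible else None
```

Calling `pick_representative` once per category gives the (at most) 40 objects the creature knows about at that moment.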

Sensory inputs to the brain are in these 40 categories, e.g. the "vision" input says how close each category rep is, the "smell" input says how much you can smell it, the "noun" lobe says how much you can hear it.

Creature decisions are of the form do X to Y, e.g. "eat food", "push toy". These actions actually set off a script which makes the creature navigate to the object, pick it up (if necessary) and then eat it, push it or whatever.

So the creature doesn't itself decide to approach the object, pick it up etc. It decides what to do in a more general way and the script handles all the navigational details.

For objects which the creature can only smell, not see, the navigational details include moving in the direction of the smell gradient. For about 7 key categories the engine makes a smell gradient through the map's "rooms" from the objects through to the far reaches of the ship so the creature can navigate towards objects more easily.
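As a rough illustration of following a smell gradient through rooms, here is a small Python sketch. How the engine actually spreads smell is not described here, so the breadth-first spread with a fixed per-room `decay` is an assumption, as are the names `smell_field`, `next_room` and `follow_smell`.

```python
from collections import deque


def smell_field(rooms, source, decay=0.5):
    """Spread smell outwards from `source`; strength shrinks by `decay` per room."""
    strength = {source: 1.0}
    queue = deque([source])
    while queue:
        room = queue.popleft()
        for neighbour in rooms[room]:
            if neighbour not in strength:
                strength[neighbour] = strength[room] * decay
                queue.append(neighbour)
    return strength


def next_room(rooms, strength, current):
    """Move to the adjacent room that smells strongest, or stay if none is better."""
    best = max(rooms[current], key=lambda r: strength.get(r, 0.0), default=current)
    return best if strength.get(best, 0.0) > strength.get(current, 0.0) else current


def follow_smell(rooms, source, start, decay=0.5):
    """Return the list of rooms visited while climbing the gradient from `start`."""
    strength = smell_field(rooms, source, decay)
    path = [start]
    while path[-1] != source:
        step = next_room(rooms, strength, path[-1])
        if step == path[-1]:
            break  # no stronger-smelling neighbour: the smell does not reach here
        path.append(step)
    return path
```

Here `rooms` is an adjacency mapping such as `{"a": ["b"], "b": ["a", "c"], "c": ["b"]}`; the creature never needs to see the object, only to step towards the stronger smell.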

The motor output in the brain is given by the winning neuron (i.e. the highest firing one) in the attention ("attn") and decision ("decn") lobes.

Biochemistry
The brain's function is to choose actions which make its drives (e.g. hunger or tiredness) go down.

On the other hand the most important function the biochemistry has as far as the brain is concerned is making drives go up, e.g. sex drive going up when in contact with members of the opposite sex, hunger going up when active, sleepiness all the time. Without at least one high drive the creature won't feel motivated to do anything.

Drives are input through the "driv" (drive) lobe.

The world also interacts with the creature in the form of feedback from the agents to the creature to help it learn, e.g. "eating food" reduces your "hunger for carbohydrate".

Specifically, the agent reduces (or increases) drives when it gets eaten, touched or whatever, and the code passes this drive change through to the brain, which rewards or punishes itself accordingly.

These training drive changes go through the "resp" or response lobe. Note that the brain has certain genetically specified instincts (e.g. "eat food when hungry") which get trained in when the creature is hatching or dreaming.

Brain Model
The 40 neurons in the "stim" (stimulus) lobe measure how interested the creature is in each of the 40 categories, food, toys etc. They fire from 0.0 to 1.0 where 0.0 means not at all interested in that category (no object available) and 1.0 means maximally interested in that category.

The stim lobe is not an input lobe. Rather, it depends on visual (proximity and movement), auditory and olfactory sensory input via a small chain of lobes and tracts, as we shall see later. In practice, to aid action selection, a stim neuron fires in proportion to how easy it is to get to the object representing that category. Thus the brain can make informed decisions about what to do given its drive levels, e.g. "the food is far away but I am hungry" versus "I'm not too bored but the toy is nearby".

Input lobes
As mentioned above, the stim source lobe takes inputs from auditory (the "noun" lobe), visual (the "visn" lobe) and olfactory (the "smel" lobe) stimuli. These three inputs come in from the engine as follows:

Although the Creatures 3 engine reads the output of the brain from the decision ("decn") and attention ("attn") lobes, the action (push button or whatever) has actually already been decided in the combination ("comb") lobe.

The combination lobe is where all possible actions are weighed up. This lobe is essentially an array of objects versus actions, with one neuron representing each possible (transitive) action like "push gadget" or "hit machinery". Each action is given a weight in proportion to three things:
1. How close the object involved in this action is.
2. How high the drive suggesting this action is.
3. How much reward the creature expects from completing this action.

These come from the following sources: 1 is stored in the incoming stimulus lobe, 2 comes from the drive lobe and 3 comes from the weights on the dendrites from the drive lobe to the combination lobe.
Therefore the creature biases its action selection towards those actions which are:
1. Easier to do (take less time)
2. Satisfying drives which are currently high
3. Most rewarding (to the extent they satisfy the drives)

These three criteria are weighed up in each neuron in the combination lobe. The kinds of weighing up the brain does are: "I'm hungry and a steak is better than an apple but the steak is further away", or "I'm hungry and slightly bored but I'll play with the toy because that's nearer than the food".
This style of action selection allows for automatic persistence of action, as all actions make the creature go closer to the object, which in turn makes it easier (and thus more desirable) to do that action, i.e. the more the creature does the action the more it is attracted to continue doing it.

This style of action selection also allows for automatic opportunism. If a creature is heading off for food far away (through the smell system) and chances upon a toy to play with, then because actions are biased towards those easiest to do it will play with the toy, reduce its boredom drive, and then continue on with finding food.
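The weighing up described above can be sketched in Python. This is a sketch under stated assumptions: the exact formula is not given here, so multiplying the stim level by a drive-weighted sum of dendrite weights is just one plausible reading of "in proportion to three things", and the names `score` and `choose` are hypothetical.

```python
def score(action, stim, drives, weights):
    """Closeness of the object times the reward expected given current drives."""
    verb, category = action
    # Expected reward: each drive level times what we believe the action does to it.
    expected = sum(
        level * weights.get((drive, action), 0.0)
        for drive, level in drives.items()
    )
    return stim.get(category, 0.0) * expected


def choose(actions, stim, drives, weights):
    """Winner-takes-all: return the action with the highest score."""
    return max(actions, key=lambda a: score(a, stim, drives, weights))


actions = [("eat", "food"), ("push", "toy")]
stim = {"food": 0.2, "toy": 0.9}           # food is far away, toy is nearby
weights = {
    ("hunger", ("eat", "food")): 0.9,      # eating food reduces hunger a lot
    ("boredom", ("push", "toy")): 0.8,     # pushing toys reduces boredom
}

# Hungry and slightly bored, but the toy is much nearer: play first.
first = choose(actions, stim, {"hunger": 0.8, "boredom": 0.3}, weights)
# Boredom satisfied: the distant food wins again.
second = choose(actions, stim, {"hunger": 0.8, "boredom": 0.0}, weights)
```

With these made-up numbers `first` is `("push", "toy")` and `second` is `("eat", "food")`, which is the opportunism case: the nearby toy briefly outbids the distant food.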

There are two kinds of learning in the Creatures 3 Brain: one that affects individual drives and one that affects all drives. An example of the former is eating an apple and getting Hunger for Starch Decrease. An example of the latter is tickling your creature when he's going to eat an apple and his learned concepts being accentuated because of this.

Both types of learning act on dendrites from the drive lobe to the combination lobe. A dendrite, say, from Boredom (in the drive lobe) to Push Toy (in the combination lobe) means "I know something about what pushing toys does to boredom". A positive weight means we believe push toy decreases boredom, whereas a negative weight means we believe it increases boredom. In general, the weight will tend over time to (the negation of) the average change in that drive level when doing that action.
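A simple way to picture a weight drifting towards the negated average drive change is a running average. The update rule below is an assumption chosen for illustration (the real rule lives in the genome's SV-Rules and is not reproduced here); `update_weight` and `rate` are hypothetical names.

```python
def update_weight(weight, drive_change, rate=0.1):
    """Nudge `weight` a fraction `rate` of the way towards -drive_change."""
    return weight + rate * (-drive_change - weight)


# Pushing the toy repeatedly lowers boredom by 0.4 each time.
weight = 0.0
for _ in range(200):
    weight = update_weight(weight, drive_change=-0.4)
```

After enough repetitions `weight` settles close to 0.4, i.e. a positive weight for an action that decreases the drive. An action that raised the drive would settle on a negative weight in the same way.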

Also, there is general reward/punishment (used for slapping and tickling your creatures) which through two chemicals referenced in SV-Rules serves to increase/decrease the weights on all the dendrites that happen to be connected to the winning neuron. So this type of learning serves to accentuate or attenuate (de-accentuate) what is already known.
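General reward and punishment can be sketched as scaling whatever is already on the winning neuron's dendrites. Whether the engine scales or shifts the weights is not stated here, so the multiplicative form, the `rate` value and the name `reinforce` are all assumptions.

```python
def reinforce(weights, winner, signal, rate=0.2):
    """Scale every dendrite into `winner` by (1 + rate * signal).

    `signal` is positive for a tickle and negative for a slap. Dendrites
    into other neurons are returned unchanged.
    """
    return {
        (drive, action): w * (1.0 + rate * signal) if action == winner else w
        for (drive, action), w in weights.items()
    }


weights = {
    ("hunger", "eat food"): 0.5,
    ("boredom", "eat food"): -0.2,
    ("boredom", "push toy"): 0.4,
}
tickled = reinforce(weights, "eat food", signal=+1.0)
slapped = reinforce(weights, "eat food", signal=-1.0)
```

Under this assumption a tickle pushes both "eat food" weights further from zero and a slap pulls them back towards it, while "push toy" is left alone because it was not the winning neuron.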

The two types of learning exist because individual drive learning alone leaves some problems unsolved. The general problem is that an input of zero into individual drive learning means (to the brain) "don't change the weight on this drive". There's no way of saying "this action actually gave me no reward".

Short / Long Term Weights
Short term and long term weights are a way of making sure the creature's knowledge isn't thrown away without sufficient evidence. It's not short and long term memory really, more of a way of coping with actions which are temporarily not viable (e.g. you're waiting for apples to fall off a tree before you can reach them). Essentially, a disappointment stim sends punishment in (implemented as described above).
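One way to picture the idea is a pair of weights per dendrite, where punishment hits the short term weight and the long term weight only follows slowly. The mechanism below is an assumption made for illustration, not a statement of the engine's rules; `step`, `relax` and `consolidate` are hypothetical names and values.

```python
def step(short, long, punishment=0.0, relax=0.2, consolidate=0.02):
    """Advance one tick and return the new (short, long) weights."""
    short -= punishment                     # disappointment hits the short term weight
    short += relax * (long - short)         # short term weight drifts back quickly
    long += consolidate * (short - long)    # long term weight follows only slowly
    return short, long


short, long = 0.8, 0.8
short, long = step(short, long, punishment=0.5)   # one disappointment
dip = short
for _ in range(50):                               # then nothing happens for a while
    short, long = step(short, long)
```

With these numbers the short term weight dips to 0.4 straight after the disappointment, then recovers to roughly 0.76 while the long term weight barely moves. A single failed attempt (the apple hasn't fallen yet) puts the creature off briefly without erasing what it knows.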

The essence of the C3 navigation system is extra drives which represent wanting to go in a particular direction, up, down, in etc. These are set high by the engine when trying to navigate a norn along a smell gradient.

The navigational drives are:
- Need to Go Down
- Need to Go Up
- Need to Go In
- Need to Go Out
The advantage of mapping these onto regular drives is that the brain can then learn things like "push lift reduces need to go down" or "pull lift reduces need to go up". In fact, in C3 these were all specified as instincts, turned on when the creature reached its "child" life-stage.

There were two other navigation style drives: Homesickness and Need to Wait. Homesickness smell was emitted by the creature's home and so provided an easy way to navigate back there. Need to Wait was set high after calling a lift or pressing a door, to make the creature stay still (and stop repeatedly calling the lift or pressing the door) until the lift appeared or the door sucked the creature through respectively.

Reactive Planning
The only other thing to mention is the "half-strength instinct trick". This is a kind of reactive planning: you want action A followed by action B to occur, where A results in the appearance of a new object, O say, which B acts on. An example: A is push cheese machine (producing O, the cheese) and B is eat cheese.

Another example: A is push call button (producing O, the lift) and B is push lift. Anyway, the trick is to have a half-strength instinct for A and a full-strength one for B. At the start the object for B doesn't exist, so A is invoked (push cheese machine), producing the object in question (the cheese), which allows the stronger instinct for B to overwhelm A and do B (eat the cheese).
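The trick can be shown with a tiny Python sketch. It is a simplification: instinct strengths stand in for the learned dendrite weights, and an action only competes if its object currently exists. The name `pick_action` and the 0.5 / 1.0 values are illustrative assumptions.

```python
def pick_action(instincts, available):
    """Return the strongest instinct whose target object currently exists."""
    viable = {action: s for action, s in instincts.items() if action[1] in available}
    return max(viable, key=viable.get) if viable else None


instincts = {
    ("push", "cheese machine"): 0.5,   # half-strength instinct for A
    ("eat", "cheese"): 1.0,            # full-strength instinct for B
}

available = {"cheese machine"}         # no cheese yet, so only A can fire
first = pick_action(instincts, available)

available.add("cheese")                # A produced the cheese
second = pick_action(instincts, available)
```

Here `first` is `("push", "cheese machine")` and `second` is `("eat", "cheese")`: once the cheese exists, the stronger instinct for B wins even though the machine is still there to be pushed.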