- chapter 8 Interacting with Physical Objects
- section 8.1 Affordance revisited – what we can do and what we think we can do
- section 8.2 affordances of the artificial
- figure 8.1 Glued coins and door handles
- section 8.3 adapted for new actions
- figure 8.2 John Locke’s “An Essay Concerning Humane Understanding” [Lo90].
- figure 8.3 Screen icons imitate physical controls
- section 8.4 action as investigation
- figure 8.4 Embodiment in Tetris (top) mirror image pieces especially hard to rotate mentally (bottom) where does it fit? rotating the piece makes it easier to tell
- section 8.5 letting the world help
Objects in the world may have all sorts of properties and impact on our senses in many ways, but when thinking about design, the most critical is what we can do with them (e.g. pick up, throw), or what they may do to us (hit us!). When we interact with objects they become part of our own activities and lives.
We briefly mentioned James Gibson’s affordance theory in Chapter 5. Recall that affordance is about the potential for action of an object or the world in general, that is, what it is possible to do with the thing [Gi79]. Note that what the environment affords depends on the person or animal looking at it. Whether or not a rock shelf affords sitting upon depends upon your leg length; whether a stone affords throwing depends on whether you are an Olympic shot putter. The differences between humans are relatively small, but certainly what the environment affords a song thrush is very different to what it affords a snow tiger. Affordance is fundamentally relational: it can only be understood by considering both the environment and the agent who can act (or be acted upon) in the environment together. For this reason, Gibson’s approach is often called ecological psychology: it is about the creature in an environment.
However, we are creatures adapted to our (natural) environment, and so Gibson argued further that our whole perceptual systems are attuned to seeking out the action potential of the environment. That is, our eyes, our ears, indeed all our senses and the brain processing that goes with them, are all designed in order to act. This is precisely the emphasis of embodied mind philosophy introduced in Chapter 5.
Because of this adaptation to the environment, Gibson argues that the perception of affordance is immediate. We do not need to go through abstract reasoning steps:
rock is about 4 inches across and rock is in reach,
therefore rock can be picked up
Instead we are immediately aware of the pertinent qualities of the object itself, of our own bodies in the environment, and of the available affordances. Even that awareness may itself be subconscious: we do not explicitly think, “that cup can be picked up”, but just do it and drink.
Gibson’s illustrations of affordance include natural phenomena and objects as well as artificial ones. However, there is a crucial difference. It is reasonable to believe that the human species has adapted over millennia to the natural world and hence its affordances are directly available to us. The low-level discrimination of our eyes, the range of frequencies that our ears pick up, the turning of our eyes and head in response to movement at the edge of our vision, the ways in which our brains process these and react to them – they are all there to support acting in this natural environment.
It is not so clear why we are able to ascertain even that a cup can be picked up, let alone that a television can be switched on using a remote control. Indeed in the human-crafted world things often go awry. Norman highlighted many of these problems in his book “The Design of Everyday Things” (see footnote 1), which popularised the word ‘affordance’ within HCI and design communities, albeit with a somewhat different meaning from Gibson’s [No98].
Many building doors are symmetric: the handles look the same from whichever side you approach them, but the door in fact opens only one way. So you see the door in the distance. As you get closer you can see that the door has a sign taped to it. The sign says “Push”. You read the sign, acknowledge it and reach for the handle. Then you pull the door and nearly jar your arm out of its socket. If you HAVE done this, the good news is that it’s not your fault; essentially, the door carries two signs: the taped one telling you to push and the handle itself, which tells you to pull. The physical feel of the handle triggers your response to it, which is to pull. It is possible to overrule this urge, but only if you are concentrating, and this is not the type of task we generally concentrate on!
Fig. 8.1 Glued coins and door handles
Sometimes these disparities between appearance and action are deliberate, as in the joke of gluing a coin to the pavement in order to see people struggle to pick it up. However, more often, as in the case of the door handles, it is an accident or poor design. (See Figure 8.1)
While our senses are tuned for the affordances of the natural world, there is no a priori reason why the affordances of constructed and hence unnatural objects should be apparent unless they are designed to be so. That is, we have a design heuristic:
design so that the affordances suggested by appearance match those actually afforded by the artefact
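The heuristic can be read as a simple comparison between two sets of actions. The following is a minimal sketch, assuming affordances can be written down as plain action names; the function name, the labels 'misleading' and 'hidden', and the well-designed door are illustrative assumptions, not terms from the text.

```python
def affordance_mismatch(perceived, actual):
    """Compare what an object appears to afford with what it really affords.

    'misleading' and 'hidden' are illustrative labels (an assumption of this
    sketch): actions suggested but not possible, and actions possible but
    not suggested.
    """
    return {
        "misleading": perceived - actual,
        "hidden": actual - perceived,
    }


# The symmetric door: the handle suggests pulling, the door only pushes.
confusing_door = affordance_mismatch(perceived={"pull"}, actual={"push"})

# The joke coin: it looks as if it can be picked up, but it is glued down.
glued_coin = affordance_mismatch(perceived={"pick up"}, actual=set())

# A hypothetical door whose appearance suggests exactly what it does.
well_designed_door = affordance_mismatch(perceived={"push"}, actual={"push"})

print(confusing_door)
print(glued_coin)
print(well_designed_door)
```

A design satisfies the heuristic when both sets come back empty, as for the last door.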
Perhaps the surprising thing is not that there are problems like confusing door handles in the constructed environment, but that there are not more. How is it that we manage at all?
Some of this is to do with the objects themselves, some the skill of the designer:
- necessary physical correlates – The purpose and use of an artificial object, that is what it affords, is constrained by the properties of the physical world (including people) in which it will be used. If you design a cup to hold drinks then it will need to have a suitable hollow in it, and this hollow will be similar to those in natural objects that afford holding fluid. Likewise, if the cup is designed to be picked up, it will tend to have the size and shape that is similar to natural pick-up-able objects.
- design selection – Successful designs are ones that can be used easily. Therefore it is natural to suppose that over time traditional designs will end up with perceived affordances that match their actual affordances. The pace of mass-manufacture means that this is not necessarily the case for more explicitly designed goods, although it is still likely that those with successful sales (whether deliberately or accidentally) match appearance and affordance.
- by design – A good designer may either explicitly realise that they need to match perceived and actual affordance, or simply ‘know’ what is right.
- imitation and artefact ancestry – Often artificial objects mimic natural objects, or at least previous artificial objects that themselves mimicked natural ones or were subject to physical correlates, design selection or plain good design.
It is not just that objects are adapted to people; we also adapt to the artificial world. Some creatures operate almost purely by instinctive reactions. Frogs have dedicated visual areas that respond to a small moving dot (that is, a fly) and trigger the tongue to dart in the right direction to catch it. If you try to feed a frog dead flies, no matter how fresh, it will never notice them and will starve with food in front of it.
Higher animals and especially humans are much more adaptable and learn both during infancy and later as adults. Rather than being born pre-programmed for the affordances of a particular natural environment, we are instead able to learn the affordances. That is, the immediate perception of (natural) affordances is not entirely innate; rather, we are born with the ability to form new associations between perception and action. This has given us the ability to adapt ourselves to different natural environments from savannah to tundra and also to artificial environments of our own making.
Note this adaptability and ability to learn does not mean we are born without innate responses. The idea of the child’s mind being a tabula rasa, a blank slate ready to be written upon by experience and education, has been present in some of the earliest philosophy, but is most closely associated with John Locke’s “An Essay Concerning Humane Understanding” [Lo90]. However, more recent philosophers and psychologists, especially those taking a more embodied view of the mind, suggest that there is far more ‘inbuilt’.
Fig. 8.2 John Locke’s “An Essay Concerning Humane Understanding” [Lo90].
Evidence for this includes the fact that babies from an early age can perform mirror actions, imitating facial expressions and movements, before there has been any chance to learn the association [Ga05, p.159;MM77]. Similarly they can move their heads in the direction of sounds and in other ways that suggest a basic model of the physical self and an egocentric ‘model’ space are there from birth or develop spontaneously soon after.
Brain imaging has shown the existence of so-called ‘mirror neurons’. When we see another person performing an action, it fires particular parts of our brain (the mirror neurons), and these in turn connect to exactly the same parts of the brain that are active when we perform the same action ourselves [Ga05, p.220;Ra04, p.37,107]. Seeing someone else doing something is nearly the same as doing it oneself!
It is possible that the relationship between our proprioceptive sense of body position and the actions of muscle movement is effectively ‘learnt’ through association, maybe even in the womb. However, the ability to see another person and mirror their actions, that is the development of the mirror neurons, cannot be learnt before we see others and so appears to be born in us, part of our initial neural wiring.
Looking at other animals, it is clear that they too have an innate sense of body presence and movement and indeed their ‘initial wiring’ seems if anything far stronger than our own. The young of most animals, including those that exhibit learning, can function to some extent either from birth or at a young age; think of those Bambi-like pictures of young fawns standing and running with the herd moments after birth. However, animals with more complex social behaviour (dogs, elephants) often have infants who are more dependent for longer, suggesting that a stronger role for life-long learning necessitates weaker innate instincts.
Humans are the extreme and indeed Bromhall’s book “The Eternal Child” [Br03] suggests that we are effectively always infants; not so much naked apes as ones that never mature enough to have hair! He argues that the infantile features and behaviours in some species allowed them to be more social, because they avoid the extremes of territoriality that come with adulthood. In humans this effect may have been amplified by runaway sexual selection. Following this argument, our cognitive flexibility is simply a side-effect of our perpetual neonate status; we retain the neural plasticity of a pre-born, the curiosity and sociability of a child, whilst having the physical size and strength of an adult.
Others would perhaps swop these around and see the cognitive advantages coming first, but, in both views, it would be a big step from weaker instincts to no instinctive understanding. While the price of sociability and cognitive flexibility may be that human infants do not have the ability to run like a deer, it seems likely both from empirical evidence and common sense that we all start with a level of innate grasp of the body and world that acts as a bootstrap for later learnt understanding.
The difficulty of distinguishing innate understanding from understanding developed in early infancy arises precisely from our rich ability to learn, especially if we are ‘hard wired’ to learn certain kinds of things easily. As tiny infants we reach out, touch things, push beads and blocks. Are we born with the understanding that if we push things they move, or is it that we pick this up in our first few months of life? Whichever way, our grasp of the physical world is a form of birthright, available to every child brought up in normal circumstances, anywhere in the world and at any time in history.
The end-point of this ability to cope with savannah and tundra is that we are also able to learn about light-switches and TV remotes. We are natural born affordance-seekers, learning new ways to act in the world and then adding these to our repertoire. The conventions that we learn whether turning a water tap or switching on a light are cultural affordances, which then become part of the way that future action-potential is perceived.
However, while the basic properties of the physical world do not change, these conventions do. Cultures vary geographically and so travellers find it difficult to re-learn how to turn on a shower, or even a light. Cultures also change over time. Sometimes old conventions are maintained long after the physical reasons for them have faded; so that the play/pause/fast-forward/rewind buttons of the old analogue tape recorder are now preserved in digital controls. However, changing conventions can disenfranchise; for example, elderly people may have difficulty in grasping the action potential of on-screen menus that seem second nature to those who have grown up with computers.
Fig. 8.3 Screen icons imitate physical controls
Our cognitive flexibility also allows us to reason, even when the appearance of things does not immediately suggest use:
1. There is a water tap, but no handle to twist or press.
2. There is a foot pedal and it is attached to the water pipe.
3. Therefore try pressing the foot pedal.
This reasoning of course itself makes use of a combination of basic properties of the physical world, learnt associations, and culturally specific clues. There is a hierarchy here, from those things that are either innate or learnt as a tiny infant, to those that we learn culturally, to those that we have to work out in the situation. As we move up this hierarchy to more and more complex behaviours, our reactions tend to be slower, require more mental effort and potentially become more error prone.
It is perhaps the times when errors occur ‘lower down’ the hierarchy that things go most dramatically wrong. A short while ago, Alan was in a stairwell at night. There was no natural light and in the dark he felt around close to the door that he had just emerged from. Eventually he found a small square pad, rather like (cultural affordance) the timed light switches that you press and that then give you some minutes of light. He gently pressed; it did not instantly move, so he pressed a little harder (physical understanding: if it doesn’t move, press harder). A moment later he felt and heard a crack beneath his fingers and the fire alarm rang!
Earlier we suggested that a good heuristic for affordances was to design things so that the perceptual affordances suggest the actual action potential, but perhaps even more important as a design rule:
make sure that the perceptual affordances (natural or cultural) do not suggest erroneous and harmful behaviour
If you can’t get it right don’t make it too wrong!
Situation 1: You are in your own home and wake early in the morning. You go to your kitchen, open a cupboard and take out the coffee. A few minutes later you return to bed with a mug of coffee and a book.
Situation 2: A few days later you are visiting a friend and wake early in the morning, so go to the kitchen to make a coffee. In front of you is an array of identical cupboards, but which contains the coffee? You have various ways of finding out. You could go and wake your friend to ask. You could reason it out, maybe selecting the cupboard nearest the stove. However, what you are likely to do is open a few doors until you find it.
In Situation 1 you know where the coffee is, as it is your own kitchen (the knowledge is in your head), but in Situation 2 the knowledge is in a sense “out there” in the kitchen and hence you need to explore to find out. In both situations you open a cupboard, but in your own kitchen the action is pragmatic, it is achieving a concrete purpose. In contrast opening the cupboards in Situation 2 is an epistemic action, an action intended to give you knowledge.
You can probably think of many epistemic actions: looking in the fridge to see what you need to replenish, turning a tomato in a supermarket to check it is not damaged on the hidden side, tugging the leaves of a pineapple to see if it is ripe, craning your neck to check the road is clear before driving off. Many of these are incidental to the main purpose of your activity: shopping, getting from A to B, drinking a mug of coffee, but in some cases, such as turning the pages of a book, the primary purpose of the activity is purely informational.
We have already discussed how perception and thinking are not disembodied, but instead intimately tied to our being creatures who act in the world: we perceive in order to act. These epistemic actions are in a sense the other side of this, in that we act in order to perceive. Indeed, Gibson argues that the boundaries are largely artificial, so that seeing is not just about a single image on our retina, nor even the flow of images as our eyes scan in front and our head rotates to see behind, but even our hands and bodies are part and parcel of an integrated sensing-acting system.
Just as we saw when discussing the embodied mind in Chapter 5, Clark’s ‘007 principle’ of parsimony applies [Cl98]. In general we do not bother to hold information in our heads that can be more quickly or easily seen, heard or felt in the environment.
One aspect of this that has been studied in detail is mental rotation; that is, the ability to turn an object round in one’s head in two or three dimensions. This is often tested by asking people to decide whether two shapes shown at different angles are the same. The time taken to make the decision is directly proportional to the angle through which one has to ‘turn’ the objects in one’s head in order to match them. Players of Tetris typically turn the fresh pieces as soon as they appear: this on-screen rotation takes only a few hundred milliseconds (taking into account visual and motor delays), whereas the equivalent mental rotation may take over a second [KM94], so it is quicker to turn the actual pieces than to think about it. Similarly, with physical puzzles, if you are completing a jigsaw puzzle, you will not just stare at pieces trying to decide if they will fit, but instead spin them round and try them in position.
Fig. 8.4 Embodiment in Tetris (top) mirror image pieces especially hard to rotate mentally (bottom) where does it fit? rotating the piece makes it easier to tell
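The trade-off between rotating a Tetris piece in the head and rotating it on screen can be sketched as a comparison of two time estimates. This is a minimal sketch only: the rates below (12 ms per degree of mental rotation, 300 ms per 90° on-screen turn) are hypothetical values chosen to match the orders of magnitude mentioned in the text, not figures taken from [KM94], and the function names are illustrative.

```python
import math


def mental_rotation_ms(angle_deg, ms_per_degree=12.0):
    """Time to rotate a shape in the head: proportional to the angle.

    ms_per_degree is a hypothetical rate, not a measured value.
    """
    return ms_per_degree * abs(angle_deg)


def screen_rotation_ms(angle_deg, ms_per_turn=300.0, step_deg=90):
    """Time to rotate a piece on screen in fixed 90-degree turns.

    ms_per_turn is a hypothetical figure covering visual and motor delays.
    """
    turns = math.ceil(abs(angle_deg) / step_deg)
    return turns * ms_per_turn


def cheaper_strategy(angle_deg):
    """Pick whichever way of rotating is estimated to be quicker."""
    if screen_rotation_ms(angle_deg) < mental_rotation_ms(angle_deg):
        return "rotate on screen"
    return "rotate in head"


for angle in (90, 180):
    print(angle, mental_rotation_ms(angle), screen_rotation_ms(angle),
          cheaper_strategy(angle))
```

Under these assumed rates the on-screen turn wins for every quarter turn, which is the pattern the Tetris players' behaviour suggests; with different rates the balance could of course shift.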
It may seem odd that we are not better at mental rotation, as this is effectively about the physical world that we are part of, but in fact drawing is, in evolutionary terms, a fairly new skill. Previously we could always ‘try things out’ or move to see an object from a different angle, which probably means that we have not needed to develop better mental skills. Note this is a cost–benefit trade-off again, but at a longer timescale. The ‘choice’ in Tetris or the jigsaw puzzle to manipulate objects in the environment instead of in the head is something you do moment to moment and largely unconsciously; the fact that we are not better at mental rotation is about our development as a species.
Of course, acting to perceive is sometimes costly or dangerous, especially if the act of investigating has irreversible consequences, as was the case with Pandora’s box. In such circumstances, we often have to consciously stop ourselves from acting, for example, reaching out to open a door if there is smoke coming through, or looking over your shoulder instead of at the road when driving past a car accident.
The boundaries between action and perception blur further as many actions that are about ‘getting things done’ also give us fresh information as a side effect. As you drive down the road you can see further round the corner; as you pick up a box you feel how robust and how heavy it is. Equally, epistemic actions often become the effective action they are getting information for. You turn the jigsaw puzzle piece round, try it in position and find it fits, then just leave it there; finding whether it fits and putting it in place are one and the same action. Similarly in the friend’s kitchen, if you ask your friend, the acts of finding out (asking) and getting the coffee are separate, but when you investigate, the point at which you have opened a cupboard to see what is inside is coincident with the opening to get it out.
Our bodies are covered in soft skin, our tendons and muscles are elastic; even our bones have some flexibility. This flexibility is important for many reasons. It protects us from injury: we are less likely to break limbs when we fall, or strain muscles when we exercise. It is also efficient; when we run, the foot that lands absorbs the energy of your body and leg hitting the ground and then ‘bounces back’ helping you on your next stride and meaning you can run faster with less energy.
This flexibility also helps when we interact with objects in the world. Imagine a classic science fiction robot with a pincer grip picking up a crystal vase. It needs to exert a very precise grip: too tight and it will shatter the glass, too loose and the vase drops. However, when you grip the vase the flexibility of the pads of your fingers gives a small margin.
The objects that we manipulate also make a difference to how easy it is to work with them. Natural materials typically have more flexibility: think about putting on a woollen jumper vs. a suit of armour! Tight, precision fit is often important in mechanical items, but is typically harder to assemble.
While humans cope, robots do not, and those designing automated production lines have to design some of this natural flexibility back into both the robots and the items being assembled. One example is in locating a screw in a hole. If the screw is a tight fit, then it may be very difficult to place it in precisely the right position. However, if the screw is slightly pointed at its end or the hole is drilled with a slight countersink, then the screw will find the hole even if the location and angle are not entirely perfect. The robot arm also has to ‘give’ a little so that the screw can guide itself into position; those in robotics call this ‘compliant action’.
Fig. 8.5 Compliant action: (i) screw slightly misplaced, but naturally (ii) slides, and (iii) rotates, (iv) into place.
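The effect of a pointed tip and a countersink on how precisely the screw must be placed can be illustrated with a deliberately idealised calculation. The sketch below assumes a flat cross-section, ignores the angle of approach, and assumes the arm is compliant enough for the tip to slide once it lands inside the mouth of the hole; the function names, parameters and millimetre values are all hypothetical.

```python
def capture_tolerance(hole_radius, tip_radius, countersink=0.0):
    """Largest sideways misplacement (same units as the inputs) at which
    the screw tip still lands inside the mouth of the hole.

    Idealised assumption: the mouth is the hole widened by the countersink
    on each side, and a compliant arm lets the tip slide down into place.
    """
    return max(0.0, hole_radius + countersink - tip_radius)


def finds_hole(offset, hole_radius, tip_radius, countersink=0.0):
    """True if a screw misplaced by `offset` would still guide itself in."""
    return abs(offset) <= capture_tolerance(hole_radius, tip_radius, countersink)


# Hypothetical tight fit: blunt screw as wide as the hole, no countersink.
tight = finds_hole(offset=0.5, hole_radius=2.0, tip_radius=2.0)

# Same hole with a pointed screw tip and a slight countersink.
forgiving = finds_hole(offset=0.5, hole_radius=2.0, tip_radius=0.5,
                       countersink=0.5)

print(tight, forgiving)
```

In this toy model the tight fit leaves no margin at all, while the pointed tip and countersink together tolerate a misplacement as large as the hole radius itself.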
Box 8.1 tight screws
If you are putting in a small screw yourself, you might have similar problems to the robot. If you tighten the screw when it is in the wrong position you might ‘cross thread’ the screw, spoiling either it or the work piece or both. Instead try holding the screw in the hole and with slight pressure turn it backwards (counterclockwise for a normal screw). At some point you feel a slight click into position. When you feel this you can then start to turn the right way and (usually!) it fits easily.
In Chapter 5, we noted how one solves mathematical problems using paper and in Chapter 4 how match puzzles usually require playing with the matches, not just thinking about them. However, in both these cases the problems are rather discrete (unless one breaks matches!) and cerebral (even if enacted with physical materials). Often, though, we use the more analogue properties of materials to help us solve problems or ‘compute’ things.
It is laundry day and you have just taken the freshly cleaned sheets off the washing line. You want to fold the sheet in half. Do you carefully measure the mid-point? No. If there are two of you each take opposite corners and then each bring the corners together, hold them with one hand, put your finger in the fold and then pull tight. Your finger naturally slides into the middle position. To fold it end-to-end one of you brings their end to the other and the weight of the sheet again neatly halves it. Even thirds are quite easy using variants of the finger and slide techniques.
Fig. 8.6 Folding clean linen in perfect thirds
As well as helping us to do things, physical properties can prevent us doing things. Think of the time-set button on a wristwatch: it is usually small and recessed so that you have to use a pen point or something similar to hold it in position. While this makes it rather difficult to deliberately set the time, it also makes it almost impossible to do so accidentally. In a similar way, flip phones had a flap over the keyboard so that you could not accidentally ring a number. Phones with physical keys but no keyboard flap need to have an explicit keyboard-lock function, usually engaged by pressing several keys in combination. Notice the choice: the flip phone works by exploiting the physicality of the phone, while the keyboard-lock uses digital interaction.
There are clearly many issues in play when making design decisions, but certainly when a physical constraint can be used, such as the keyboard flap or the recessed time set button, it is often more intuitive and robust.
The flip phone is not just preventing accidental phone calls, but as you open the flap new opportunities for interaction are made apparent. In terms of affordances, the flap affords opening and the buttons afford pressing. When using a digital key-lock for a phone with physical buttons, the buttons are still visible, and so it is possible to start to enter a number before one realises the phone is locked. In this case the perceptual affordance of the keys is there whether or not they actually afford entering a phone number. In contrast, with the flap, the perceptual affordances of the buttons are only revealed when they can be used.
Gaver calls this process sequential affordance, where performing an action corresponding to one affordance makes new affordances available and apparent. He gives the example of a twist door handle. The handle itself (if designed well) is hand shaped and sized and affords grasping. Once you have grasped the door handle, it then affords twisting, and finally having twisted the door handle, the door affords opening.
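Gaver’s door handle example can be sketched as a small state machine in which each state lists the actions it affords. This is a minimal illustration only; the state names, action names and helper functions are assumptions made for the sketch, not Gaver’s notation.

```python
# Hypothetical model of the twist door handle: each state maps the actions
# it affords to the state that action leads to.
DOOR = {
    "idle": {"grasp": "grasped"},
    "grasped": {"twist": "twisted"},
    "twisted": {"open": "opened"},
    "opened": {},
}


def affordances(state):
    """Actions available (and ideally apparent) in the given state."""
    return sorted(DOOR[state])


def act(state, action):
    """Perform one action, failing if the current state does not afford it."""
    if action not in DOOR[state]:
        raise ValueError(f"{action!r} is not afforded in state {state!r}")
    return DOOR[state][action]


def perform(actions, state="idle"):
    """Perform a sequence of actions, returning the final state."""
    for action in actions:
        state = act(state, action)
    return state


print(affordances("idle"))
print(perform(["grasp", "twist", "open"]))
```

Each action both changes the state and reveals the next affordance; trying to twist before grasping raises an error, since the handle does not yet afford twisting.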
Note that the actual affordances have a sequence to them: you cannot twist the handle unless you are grasping it, you cannot pull open the door unless you have twisted the handle. The door handle also has perceptual affordances: the hand size and shape is visually apparent. But what about the twisting and opening affordances? How do we know what to do?
Obviously there are cultural issues in play: we know door handles often twist and that doors open, and furthermore door handles typically (but not always) twist downwards. There may also be subtle visible signs in the shape of the door surround, or the visibility of the hinges, that tell us whether the door opens inwards or outwards. Yet we do cope remarkably well when we do not know and indeed if we get it initially wrong.
Various factors are at work, but one is that the door handle ‘gives’ a little if you put pressure in the right direction; that is, you can work out what the affordances are by feel. This ‘give’ is common in physical situations. You are not sure whether a box is full and too heavy to lift or empty, so you give it a little trial pull, not fully lifting it, but with enough pressure to see if it would lift if you really tried. You are not sure whether a door is locked, so you give it a little trial pull to see if it starts to move, but not enough to fully open it.
In contrast many digital controls lack this ‘try it out’ nature; for example, with many touch screens if you put your finger on an area and it happens to be active, then you have already selected it. Sometimes additional visual clues are given. For example the mouse cursor may change shape, a screen area may change colour, or a tool tip may appear, or when you depress the mouse button over an active area the screen object may change colour or highlight. All of these give similar information to the small ‘give’ of the door handle, but certainly lack the intuitive and immediate feel of the physical feedback and rely more on the visual senses.
1. [Br03] Bromhall, C. (2003). The Eternal Child: Staying Young and the Secret of Human Success. Ebury Press. ISBN-10: 0091885744 [8.3]
2. [Cl98] Clark, A. (1998). Being There: Putting Brain, Body and the World Together Again. MIT Press. [8.4]
4. [Gi79] Gibson, J. (1979). The Ecological Approach to Visual Perception. Lawrence Erlbaum Associates, New Jersey, USA. [8.1]
5. [KM94] Kirsh, D., & Maglio, P. (1994). On Distinguishing Epistemic from Pragmatic Action. Cognitive Science: A Multidisciplinary Journal, 18(4), 513-549. [8.4]
7. [MM77] Meltzoff, A. N., & Moore, M. K. (1977). Imitation of Facial and Manual Gestures by Human Neonates. Science, New Series, 198(4312), 75-78. DOI: 10.1126/science.897687 [8.3]
8. [No98] Norman, D. A. (1998). The Design of Everyday Things. MIT Press. ISBN-10: 0-262-64037-6 [8.2]
9. [Ra04] Ramachandran, V. (2004). A Brief Tour of Human Consciousness. Pi Press, New York. [8.3]
Footnote 1: Originally entitled “The Psychology of Everyday Things”, but when it moved into paperback the publishers obviously thought ‘design’ was a better draw than ‘psychology’.