


The SFP axis notation code was considered to be adequate. The orientation towards the objects was also coded with regard to the orientation of the user's shoulders.

Based on these annotations, the body orientation of human and robot could be determined at all times during the interaction. Of interest here is what orientation the users chose, whether the orientation depends on certain tasks, and what switches in orientation tell us about the importance of spacing in the interaction. With respect to the last point, the question of how much effort the participants invest in reaching a certain position needs to be considered (how much time do they spend on spatial adaptation, how often do they change their orientation, and what does this imply for their need to attain the robot's attention?). These questions are analyzed in the following without taking proxemics (the distance between the user and the robot) into account.
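To make these questions concrete, the following minimal Python sketch shows how the number of orientation switches and the time spent on spatial adaptation could be derived from an annotation track. The segment format (start time, end time, orientation label) and the label names are hypothetical and only illustrate the idea; they are not the actual annotation scheme used in the study.

    # Illustrative only: segment format and labels are assumptions,
    # not the coding scheme used in the home tour data.
    segments = [
        (0.0, 4.2, "frontal"),    # (start_s, end_s, orientation label)
        (4.2, 5.0, "turning"),    # transition phase, counted as spatial adaptation
        (5.0, 11.3, "sideways"),
        (11.3, 12.1, "turning"),
        (12.1, 20.0, "frontal"),
    ]

    # Number of orientation switches: changes between successive stable labels.
    stable = [s for s in segments if s[2] != "turning"]
    switches = sum(1 for a, b in zip(stable, stable[1:]) if a[2] != b[2])

    # Time spent on spatial adaptation: total duration of transition segments.
    adaptation_time = sum(end - start for start, end, label in segments
                          if label == "turning")

    total_time = segments[-1][1] - segments[0][0]
    print(f"switches: {switches}")
    print(f"adaptation time: {adaptation_time:.1f} s "
          f"({100 * adaptation_time / total_time:.0f}% of the interaction)")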

There is surely a correlation between the two measures: for example, the closer a person stands to another person, the more indirect the body orientation usually is. However, there are two main reasons to focus on body orientation here. First, as has been mentioned before, much work has already been conducted on proxemics in HRI, whereas body orientation has not played such an important role. Yet, body orientation is considered here as an important factor because it structures the interaction and, hence, carries a great deal of information that could help the robot to better understand what the user wants to do. Second, the robot is not yet able to adapt its requirements concerning the human body model to the situation, i.e., in all situations it needs the same percepts to represent the user. However, it can be assumed that body orientation differs between tasks. That is why research into body orientation can help to develop a more flexible human body model. Moreover, it can also help to improve the robot's behavior model and enable the robot to adapt its behavior to the situation and, hence, to improve the interaction. For this, the robot also needs knowledge about the social function of spatial behavior.

To analyze the data with these aspects in mind, the body orientation was annotated in the data of the second iteration of the home tour study in the apartment. To test the reliability of the coding scheme, interrater agreement was calculated (see Section 5.1.3).
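As a rough illustration of how such an interrater check can be computed, the Python sketch below derives Cohen's kappa from two annotators' frame-wise orientation labels. The labels and data are hypothetical; the actual agreement analysis is the one reported in Section 5.1.3.

    from collections import Counter

    # Hypothetical frame-wise orientation codes from two annotators.
    rater_a = ["frontal", "frontal", "sideways", "frontal", "averted", "sideways"]
    rater_b = ["frontal", "frontal", "sideways", "sideways", "averted", "sideways"]

    n = len(rater_a)
    # Observed agreement p_o: proportion of frames with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement p_e from the marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                   for label in set(rater_a) | set(rater_b))

    kappa = (observed - expected) / (1 - expected)
    print(f"Cohen's kappa = {kappa:.2f}")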

Table 3-6. Overview of analyses of body orientation

Object-teaching 1   no analysis of body orientation
Object-teaching 2   no analysis of body orientation
Home tour 1         no analysis of body orientation
Home tour 2         statistical analysis of body orientation (Section 5.1.3)

3.2.4   Analysis of gaze

Gaze differs from other modalities because it is a non-verbal signal and at the same time a way of perceiving others, their faces, and their expressions. Gaze is both a signal and a channel. People use the direction of their gaze to designate persons or objects that they are attending to. Hence, gaze is one kind of directing-to action (Clark, 2003, see Section 3.2.2). However, it is not effective unless it is registered by the other interactor. Therefore, it is usually grounded by mutual gaze. Goodwin's (1981) conversational rule states that a speaker has to obtain the gaze of a listener in order to produce a coherent sentence. To attract the listener's gaze, the speaker can either make a restart, i.e., utter a phrase before uttering the coherent phrase, or pause. Moreover, gaze is often used along with face direction, torso direction, and pointing (Clark, 2003).

Even though gaze is an important signal of attention, speakers only look intermittently at their listeners (Argyle, 1988). Reasons for frequent gaze shifts are that looking at the listener all the time could lead to cognitive overload, or that the arousal caused by the gaze could interfere with cognitive planning. Argyle (1988) provides basic statistics of people's gaze in emotionally neutral conversations at a distance of about two meters (see Table 3-7).

Table 3-7. Statistics of gaze behavior (Argyle, 1988)

individual gaze           60% of the time
while listening           75% of the time
while talking             40% of the time
length of glance          3 seconds
mutual glance             30% of the time
length of mutual glance   1.5 seconds

The table shows that the amount of gaze varies (Argyle, 1988). For example, interactors look less at each other when there are other things to look at, especially when there is an object of joint attention. This indicates that in an object-teaching task, where the object is in the focus of attention, the users will look less at the robot (see Sections 4.2.5 and 5.1.4). Moreover, the relationship between the interactors influences the amount of gaze. Argyle (1988) found that strangers who were two meters apart looked at each other 40% of the time. He also reported that this number might be higher for couples, and it is higher in general if the individuals like each other. Also, the spatial situation is of importance: greater distances lead to more gaze.

Finally, the personalities of the interactors determine the amount of looking. Dominant individuals were found to look more while they talk (Argyle, 1988).

In addition to these situational constraints, the conversation itself also structures gaze behavior. The main reason that people gaze is to obtain additional information, especially about what was just said. More gaze is therefore necessary at the end of an utterance, because that is when feedback is needed. Hence, speakers look more at the end of utterances and look away at the beginning of utterances, especially if they have been asked a question. Listeners, in turn, typically look 70% to 75% of the time in quite long glances of about seven to eight seconds, because they try to pick up the non-verbal signals of the speaker. However, they too look less than 100% of the time to decrease cognitive load and arousal (Argyle, 1988).

At the end of a turn, a terminal gaze often occurs (in 62% of cases; Kendon, 1967). A terminal gaze is a prolonged gaze at the other person just before the end of a long utterance. If this gaze is not exchanged, the transition to the next speaker takes longer.

According to Argyle (1988), the usual measure of gaze is mutual gaze, i.e., the percentage of time that is spent looking at the other person in the area of the face. Moreover, one can measure looking rates while talking and while listening, the average length of glances, patterns of fixation, pupil dilation, eye expression, direction of gaze-breaking, and blink rate. Usually, the measures are chosen according to the research question at hand.
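To illustrate two of these measures, the following Python sketch computes the percentage of mutual gaze and the average glance length from annotated gaze intervals. The interval lists and the conversation length are hypothetical and only stand in for coded "looking at the partner's face" periods.

    # Hypothetical gaze-at-partner intervals (start_s, end_s) for two interactors.
    gaze_a = [(0.0, 3.0), (5.0, 9.0), (12.0, 13.5)]
    gaze_b = [(1.0, 4.0), (6.0, 8.0), (12.5, 14.0)]
    conversation_length = 15.0  # seconds

    def overlap(intervals_a, intervals_b):
        """Total time during which both interactors look at each other."""
        total = 0.0
        for a_start, a_end in intervals_a:
            for b_start, b_end in intervals_b:
                total += max(0.0, min(a_end, b_end) - max(a_start, b_start))
        return total

    mutual = overlap(gaze_a, gaze_b)
    print(f"mutual gaze: {100 * mutual / conversation_length:.0f}% of the time")

    glances = gaze_a + gaze_b
    mean_glance = sum(end - start for start, end in glances) / len(glances)
    print(f"average glance length: {mean_glance:.1f} s")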

Brand, Shallcross, Sabatos, and Massie (2007) use eye gaze as a measure of interactiveness in child-directed demonstration. They count gaze bouts (gaze shifts from elsewhere to the face of the partner) per minute, measure gaze duration (the percentage of the demonstration spent gazing at the partner), and compute average gaze length (average length of each gaze bout).

Based on this work, Vollmer et al. (2009) evaluated eye gaze in studies with the virtual agent Babyface as a measure of contingency. The results were then compared to adult-child and adult-adult interaction. Comparable to the data presented here, their study was also based on a teaching scenario; however, the participants did not teach objects but actions. For their data, Vollmer et al. (2009) computed the frequency of eye-gaze bouts to the interaction partner and to the object (eye-gaze bouts per minute), the average length of eye-gaze bouts to the interaction partner and to the object, and the total length of eye-gaze bouts to the interaction partner and to the object (percentage of time spent gazing at the agent or the object).
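A minimal Python sketch of these three bout-based measures is given below; the bout lists and the demonstration length are hypothetical and merely illustrate how gaze bouts per minute, the percentage of time spent gazing, and the average bout length could be derived from coded gaze intervals.

    # Hypothetical gaze bouts (start_s, end_s) towards the partner and the object.
    bouts = {
        "partner": [(2.0, 4.5), (10.0, 11.0), (30.0, 33.0)],
        "object":  [(0.0, 2.0), (4.5, 10.0), (12.0, 28.0)],
    }
    demonstration_length = 40.0  # seconds

    for target, intervals in bouts.items():
        durations = [end - start for start, end in intervals]
        per_minute = len(intervals) / (demonstration_length / 60.0)  # bouts per minute
        share = 100.0 * sum(durations) / demonstration_length        # % of time gazing
        mean_len = sum(durations) / len(durations)                   # average bout length
        print(f"{target}: {per_minute:.1f} bouts/min, "
              f"{share:.0f}% of the time, mean bout {mean_len:.1f} s")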

Gaze in HRI

Some work has also been conducted on gaze in HRI. Staudte and Crocker (2008) carried out an experiment focusing on the production of robot gaze in a scenario in which the robot pointed out objects to the participants either correctly or falsely. They measured the reaction time of the human depending on the gaze direction and the speech of the robot and showed that, when the robot's gaze or speech behavior was inappropriate, the interaction slowed down measurably in terms of response times and fixation distribution.

Sidner, Kidd, Lee, and Lesh (2004) report a study in which the robot Mel gazed at the participants. They coded shared looking (mutual gaze and both interactors looking at the same object) and found a positive relationship between shared looking and the engagement of the interactors.

Moreover, they showed that looking and gestures are more powerful than speech alone in achieving engagement: they get people to pay more attention to the robot, and they may also cause people to adjust their gaze behavior based on the robot's gaze.

Green (2009) reports gaze as one means to establish contact in a home tour scenario. Users “look for” feedback in order to find out whether a command has been received correctly by the robot.

Therefore, gazing at the robot indicates that some kind of feedback is necessary. Green (2009) concludes that the system should provide continuous information on its status of availability for communication. This status can be conveyed with the gaze of a camera because, just like a human eye, the robot's camera can be used to grab the user's attention and to make contact. This assumption is also followed in the work with BIRON.

To conclude, gaze has been shown to have several functions in HRI such as communicating status, engaging the user, and making interaction more efficient. These functions have been evaluated with different measures.

Coding of gaze

The general question regarding gaze in the following analysis is how the situation and the users' expectations influence human gazing behavior. Therefore, it will be evaluated where the users gazed in general and in the different phases of the interaction. Moreover, it will be determined how gaze relates to other modalities (speech, gesture), since strong interrelations between all modalities can be expected. The users' gaze behavior was annotated using three distinct directions: gaze at the robot, gaze at the object of interest, and gaze somewhere else. Since only three gaze directions were differentiated, and many studies have found interrater agreement of over 90% for coded gazing behavior with more diversified schemes (Argyle, 1988), interrater reliability was not calculated.
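As an illustration of how the three coded directions can be evaluated per interaction phase, the Python sketch below aggregates the proportion of time spent gazing at the robot, at the object, and elsewhere. The annotation format, phase names, and durations are hypothetical.

    from collections import defaultdict

    # Hypothetical annotation: (phase, gaze target, duration in seconds).
    annotations = [
        ("greeting", "robot", 6.0), ("greeting", "elsewhere", 2.0),
        ("object-teaching", "object", 14.0), ("object-teaching", "robot", 4.0),
        ("object-teaching", "elsewhere", 2.0),
    ]

    totals = defaultdict(float)
    by_phase = defaultdict(lambda: defaultdict(float))
    for phase, target, duration in annotations:
        by_phase[phase][target] += duration
        totals[phase] += duration

    # Proportion of time per gaze direction within each interaction phase.
    for phase, targets in by_phase.items():
        shares = ", ".join(f"{t}: {100 * d / totals[phase]:.0f}%"
                           for t, d in targets.items())
        print(f"{phase}: {shares}")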

Table 3-8. Overview of analyses of gaze

Object-teaching 1   no analysis of gaze behavior
Object-teaching 2   statistical analysis of gaze direction (Section 4.2.5)
Home tour 1         no analysis of gaze behavior
Home tour 2         statistical analysis of gaze direction (Section 5.1.4)