
on modern methods of recording and analysis of eye movement. To fully perceive a complex environment, we have to shift our gaze from one object to the next.

This pattern, at first glance rather simple, not only involves the fastest muscular movement the human body can perform, the saccade, one of only two voluntary eye movements, reaching angular velocities of up to 1000° s⁻¹ [19], but is also a highly complex task that involves more than just a single precise movement. In this section, general eye movement and gaze behaviour will be discussed and the points relevant for gaze tracking will be highlighted.

The simplest task for the human eye one can think of is looking at a stationary target.

The task of maintaining the visual gaze on a single target is called a fixation. "Fixations are defined as a spatially stable gaze lasting for approximately 200 ms to 300 ms, during which visual attention is directed to a specific area of the visual display." [20].

When recording eye movements during fixations, at least three different movement patterns are recognizable [21–23]:

the drift: a slow, unintentional angular movement of the gaze direction that gradually moves the gaze object out of the fovea,

the tremor: an uncontrolled trembling of the eye that, on average, does not move the gaze target away from the fovea,

micro saccades: short, unintentional eye movements that counter the effect of the drift and bring the gaze object back onto the fovea.

Depending on the size of the object, a full saccade can be induced as well.

If the gaze object starts to move slowly, at about the speed of the natural drift (above 0.1° s⁻¹, below 30° s⁻¹), the so-called smooth pursuit, the second voluntary eye movement, is used to follow the object's path and keep it fixated in the fovea [24]. This smooth pursuit is still interrupted or distorted by the movements mentioned above. If the object speed exceeds 30° s⁻¹, micro saccades are used to catch up with the object again, up to a speed of 100° s⁻¹ [25].

If the object velocity exceeds 100° s⁻¹, or multiple objects of interest appear inside the field of view, saccades are triggered again [26]. During this movement, or if the object speed is at least three times larger than the eye movement, all visual information processing by the eye is suppressed [27].
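The velocity bands described above can be summarised in a short sketch. The thresholds (0.1, 30 and 100° s⁻¹) are taken directly from the text; treating them as hard cut-offs and ignoring duration criteria is a simplification for illustration only.

```python
def classify_eye_movement(angular_velocity_deg_s):
    """Map an angular gaze velocity (deg/s) onto the movement classes
    described in the text. Hard thresholds are a simplification; real
    classifiers also use duration and dispersion criteria."""
    v = abs(angular_velocity_deg_s)
    if v < 0.1:
        return "fixation (drift/tremor range)"
    if v < 30:
        return "smooth pursuit"
    if v < 100:
        return "catch-up micro saccades"
    return "saccade"
```

For example, a tracked velocity of 10° s⁻¹ would fall into the smooth-pursuit band, while 400° s⁻¹ is classified as a saccade.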

2.2 eye and gaze tracking

The first time external hardware was used to monitor the described eye movements was reported by Delabarre in 1898, who proposed to use a contact lens formed from plaster to transfer the eye movement to a lever [28]. While the eye had to be numbed using cocaine, this enabled Delabarre to record eye movements for the first time. Further improvements to this method were made in the consecutive years. Until the work of Judd in 1905, others were able to use similar means to extract eye movements but were limited by their experimental setups to either one dimension (Dodge) or not being able to measure timing (Stratton) [28]. In 1905 the first non-invasive eye tracking methodology was introduced by placing a white particle on the eye [29] and tracking its movement. While other methods were developed to monitor eye movement, like Electro-OculoGraphy (EOG), Photo-OculoGraphy (POG), Video-OculoGraphy (VOG), and further improvements of the above-mentioned Scleral Contact Lens methods were made, those methods are not applicable to this work. For example, EOG measures the difference in skin potential due to the muscular movements needed for eye movement and is applicable for the detection of eye movement in a more general case [23, 30]. Since EOG does not record what a subject is looking at, it cannot be used to measure the gaze behaviour of drivers in automotive use cases to the extent needed in this thesis. While Scleral Contact Lenses or Search Coils attached to the subject's eye deliver by far the most accurate results of less than 5′′ to 10′′ [23, 31], their usable range of about 5° is too narrow for the use case at hand. POG and VOG describe different techniques to record eye movement. They often do not include the measurement of the absolute angle but focus on measuring the eye movement relative to the test subject's head [23]. Only the combination of eye movement measurement with a head tracker allows for a detailed identification of points of regard [23].

2.2.1 video-based gaze tracking

Video-based gaze tracking has gained popularity over the last couple of years [32, 33]. New camera and computer technology has enabled smaller, faster and more accurate gaze tracking [23, 34, 35]. This development has led to a broad application spectrum for gaze tracking, as collected by Duchowski, ranging from medical applications over aviation and driving tasks to marketing and advertisement analysis [36], or, since as early as 1989, as a computer input interface [32, 34, 37–39].

To do this, the motion of a light reflection on the cornea, the so-called Purkinje Reflection (PR) of an external (infrared) light source, is recorded relative to the movement of the pupil [40]. Due to the structure of the eye as described in 2.1, four Purkinje images are visible when using one external light source. Figure 2.5 shows the four different reflections PR 1-4.

Figure 2.5 – Schematic eye setup with the four Purkinje reflections PR 1-4, the incoming light (IL), the aqueous (A), the cornea (C), the sclera (S), the lens (L) and the iris (I) [40]

The different reflections occur due to multiple transitions between layers with different refractive indices. PR 1 originates at the front surface of the cornea, PR 2 at the back surface of the cornea, PR 3 at the front of the lens and PR 4 at the back surface of the lens. Due to the small change in refractive index at the back of the cornea, PR 2 is almost exactly coincident with PR 1. PR 3 is a virtual image and much larger and more diffuse than the other reflections. PR 4 is a real image again and is formed at almost the same plane as PR 1 and PR 2; however, due to the much lower difference in refractive index, it has only about 1 % of the intensity of PR 1. When the eye undergoes a rotation, PR 1 and PR 4 separate, and from this physical splitting of the images it is possible to calculate the angular orientation of the eye [40].
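To first order, a pure translation of the eye shifts PR 1 and PR 4 together, while a rotation separates them, so the PR 1–PR 4 difference vector is roughly proportional to the rotation angle. The following toy sketch illustrates that idea; the linear model and the per-subject gain factor are assumptions for illustration, not the calibration used by real dual-Purkinje trackers.

```python
def eye_rotation_from_purkinje(pr1, pr4, gain_deg_per_mm):
    """Toy dual-Purkinje model: estimate (yaw, pitch) in degrees from
    the separation of the first and fourth Purkinje images (image-plane
    coordinates in mm). `gain_deg_per_mm` is a hypothetical per-subject
    calibration constant; real systems use a full geometric model."""
    dx = pr4[0] - pr1[0]
    dy = pr4[1] - pr1[1]
    return (gain_deg_per_mm * dx, gain_deg_per_mm * dy)
```

With coincident reflections the estimated rotation is zero, matching the text: the images only split once the eye rotates.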

New approaches to video-based gaze tracking propose conventional image processing paired with special neural networks to estimate the gaze direction from stable eye images without any additional light sources [41]. While Stiefelhagen achieves an accuracy of about 2° with a relatively inexpensive overall system, this approach is not feasible for this thesis, since accurate gaze tracking is required under different lighting situations without disturbing the test subjects. For this reason, an eye tracker paired with infrared light sources is chosen as the most suitable method.

After the angular movement of the eye is extracted from the image using the PR images, the next step is to transfer the angular motion into a gaze direction. For this, an eyeball model is fitted to each test subject. This model is based on the PR, the known position of the light sources, and the known distance from the test subject to the camera. Finding the centre of the pupil makes it possible to calculate a vector from the middle of the eyeball to the centre of the pupil, thus defining the gaze vector. To estimate the absolute gaze direction, the test subject's head needs to be fixed in position, or head tracking is used to calculate the eye's linear movement. Since camera technology has advanced in recent years, the high resolution of eye tracking cameras enables the tracker not only to focus on the subject's eyes but to monitor the whole head and thereby track the head movement using points of interest.
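The gaze vector described above can be sketched directly: it is the normalised vector from the fitted eyeball centre to the detected pupil centre. The 3-D camera coordinate frame used here is an assumption for illustration.

```python
import math

def gaze_vector(eyeball_centre, pupil_centre):
    """Unit gaze vector from the fitted eyeball centre to the detected
    pupil centre, both given as (x, y, z) tuples in the same (assumed)
    camera coordinate frame."""
    v = [p - e for e, p in zip(eyeball_centre, pupil_centre)]
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0:
        raise ValueError("pupil and eyeball centre coincide")
    return [c / norm for c in v]
```

An eye looking straight along the camera axis, e.g. `gaze_vector((0, 0, 0), (0, 0, 12))`, yields the unit vector `[0.0, 0.0, 1.0]`; absolute gaze then still requires the head position, as the text notes.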

These points of interest are significant points on the human head; typically the ears, the nose, the mouth corners and the eyes themselves are used. Further detail about the functionality of and the different approaches to video-based gaze tracking goes beyond the scope of this thesis but can be found, for example, in the work by Duchowski [23].

2.2.2 video-based pupil tracking

As mentioned above, tracking the pupil and estimating its centre point is crucial for gaze tracking. Additionally, this offers the benefit of being able to read out and explore the pupil behaviour in the investigated situations. Furthermore, pupil dilation can be used as a measure for attention, focus and more, and blinking can be used to measure fatigue [42, 43].

For automotive use, those metrics are highly relevant since emotions, attention and fatigue are known to influence driving and gaze behaviour. This was already reported by Charles Darwin as early as 1872, who monitored muscular movement in animal and human faces and thereby also recorded pupillary movement [44].

Monitoring the pupil size on film has been done over the last couple of decades. In 1960, for example, Hess recorded videos of human eyes on 16 mm film and then measured the pupil diameter as a response to different images intended to evoke different emotions in his participants [42]. To estimate the pupil size from the recorded video data, the region around the test subject's eye is extracted and enlarged. In this area, an algorithm is used to find, depending on the use case, up to two bright/dark transitions. This has already been proposed and successfully applied by Ebisawa in 1970, who used his approach for a human-machine interface with a video-based gaze tracker that monitored the pupil diameter as well [45]. The first dark/bright transition is the border between the pupil and the iris, the second transition marks the border between the iris and the sclera. To track the pupil size, the first border, between pupil and iris, is sufficient. The size of the pupil can either be estimated by using a stereo camera system, which, due to the calibration of the cameras to each other, can also measure the distance to the subject, or by a marker of known size which is attached to the eye, as shown in figure 2.6.
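The transition search described above can be illustrated on a single scan line through the eye region: wherever the grey value jumps by more than a threshold, a boundary candidate is recorded. This is a deliberately crude stand-in for the actual detection algorithms; the threshold and the one-dimensional scan are assumptions for illustration.

```python
def find_transitions(scan_line, threshold, max_transitions=2):
    """Return the indices in a 1-D line of grey values where the
    intensity step between neighbouring pixels exceeds `threshold`.
    With two transitions these would correspond to the pupil/iris and
    iris/sclera borders described in the text."""
    transitions = []
    for i in range(1, len(scan_line)):
        if abs(scan_line[i] - scan_line[i - 1]) >= threshold:
            transitions.append(i)
            if len(transitions) == max_transitions:
                break
    return transitions
```

On a synthetic scan line running sclera → iris → pupil, e.g. `[200, 200, 90, 90, 20, 20, 90, 200]` with a threshold of 50, the first two detected indices mark the iris and pupil borders.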

Figure 2.6 – Measurement of the pupil size with an additional measuring strip added below the eye for accuracy calibration [46].

Adding the second transition increases the stability and robustness of the algorithm and is therefore considered a useful feature. In particular, the algorithm performs better when using infrared light sources and test subjects with different iris colours, since the contrast between the pupil and very bright eyes would otherwise be reduced. This is due to the fact that the reflectance of the human iris under infrared light is roughly inversely proportional to its reflectance under visible light, so a very light iris appears dark under infrared light.

Another way to change the contrast between the iris and the pupil is given by the position of the infrared light sources. When setting up the infrared light source close to the optical axis of the camera, the light is reflected back to the camera by the retina and the pupil appears white (bright). When setting the light source further apart from the optical axis, the light is reflected away from the camera by the cornea, due to the inverted geometry of the cornea compared to the retina, and the pupil therefore appears black (dark). Morimoto used both setups to get a clearer image of the pupil for all subjects [47].
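A common way to exploit the two illumination setups just described is to subtract the off-axis (dark pupil) frame from the on-axis (bright pupil) frame: the rest of the face looks similar in both, so the per-pixel difference peaks inside the pupil. The following sketch assumes two synchronised, aligned grey-value frames given as nested lists; the threshold is a hypothetical tuning parameter.

```python
def pupil_difference_mask(bright_frame, dark_frame, threshold):
    """Binary pupil mask from a bright-pupil and a dark-pupil frame of
    equal size: 1 where the brightness difference exceeds `threshold`
    (likely pupil), 0 elsewhere. A minimal sketch of the differencing
    idea, not a complete segmentation pipeline."""
    return [[1 if b - d >= threshold else 0
             for b, d in zip(brow, drow)]
            for brow, drow in zip(bright_frame, dark_frame)]
```

Pixels outside the pupil, which change little between the two lighting setups, are suppressed, while pupil pixels survive the threshold.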

Tracking the pupil and calculating its size is not a difficult task when the eye is directly in front of the camera and the pupil plane is parallel to the camera sensor. One of the challenges of tracking the pupil accurately is the distortion of the pupil under different angles towards the video cameras. This effect, the so-called pupil foreshortening error, is a technical error that can easily be avoided or corrected by re-mapping. However, many factors that influence the pupil, such as fatigue, certain foods and emotions, cannot be recorded or controlled without a great amount of effort. On the other hand, physiological factors can easily be recorded and limited to a certain degree.
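The re-mapping of the pupil foreshortening error can be sketched to first order: viewed off-axis, the circular pupil projects to an ellipse whose minor axis shrinks roughly with the cosine of the viewing angle, so dividing the observed minor-axis diameter by that cosine recovers an estimate of the true diameter. This cosine model is a simplification of published correction methods, shown here only to make the geometry concrete.

```python
import math

def correct_foreshortening(observed_diameter, gaze_angle_deg):
    """First-order correction of the pupil foreshortening error:
    observed (minor-axis) diameter divided by cos(viewing angle).
    Assumes a flat circular pupil; real corrections account for
    corneal refraction as well."""
    return observed_diameter / math.cos(math.radians(gaze_angle_deg))
```

At 0° the measurement is unchanged, while at 60° off-axis the observed diameter is only half the true one and is scaled back up accordingly.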

Winn investigated different physiological parameters and their influence on the pupil diameter under different, constant light conditions ranging from 9 cd m⁻² up to 4400 cd m⁻². He found that the pupil diameter under different light conditions is independent of sex, refraction errors in the subjects' lenses or iris colour, but declines linearly with age [43]. Fotiou found similar results for the dependence on age when investigating the pupil diameter under dark adaptation [48]. While no difference in latency was recorded between the two age groups, he measured a significantly lower pupil diameter as well as a lower pupil velocity.