Datum Informatik II
Adrian Kündig
adkuendi@student.ethz.ch
Saturday, 27 April 2013
The beginning of gesture-based interfaces
§ 1970 Myron W. Krueger and VideoPlace
http://www.inventinginteractive.com/2010/03/22/myron-krueger/
http://sofa23.net/index.php?m=1&sm=&t=23&sp=18&spic=43&me=show%20all&s=
One of the first virtual reality prototypes
Using cameras for recognition
Simple ideas
Gesture Recognition
(Baudel and Beaudouin-Lafon, 1993)
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
First formal definition of gestures
Used to control a presentation (PowerPoint)
DataGlove
4 lines = fingers, 1 line = thumb
Gesture Recognition
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
Selection of gestures
Gesture Recognition
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
§ 2002 Minority Report
http://7thperbmmrblog.blogspot.ch/2011/01/william-bermudez.html
http://thomaspmbarnett.com/globlogization/2013/2/5/times-battleland-terrorism-minority-report-has-finally-arriv.html
Hollywood movie by Steven Spielberg
Rooted in research by John Underkoffler
“like conducting an orchestra”
Tom Cruise
§ 1970 Myron W. Krueger and VideoPlace
§ 1993 Charade
§ 2002 Minority Report
§ 2009 Oblong Industries
Last step in our history of gesture-based interfaces: a commercial company founded by John Underkoffler that developed g-speak
Intended for big data analysis
Requires specialized applications
Oblong Industries - Demo
http://oblong.com/g-speak/
Orientation in 3D
Selection
Segmentation
Common Factor
http://www.5dt.com/DataGloveImages.html
Most of the systems shown have a data glove in common
Hand tracking
Hand reconstruction
Feedback
Free up hands
Remove instrumentation
Muscle Computer Interface
§ Hands-free gestures while holding an object
§ Armband-like design
§ Senses muscle activity
(Saponas et al, 2009)
Hands-free
Muscle sensing
Muscle Computer Interface - Technology
http://painmd.tv/wp-content/uploads/2011/04/emg-muscle-configuration.gif
EMG, or electromyography
Used primarily in medical therapy (muscle function assessment, controlling prosthetics)
An action potential is generated by the muscle when a signal arrives from a motor neuron
Measured invasively by inserting a needle into the muscle
Or non-invasively by sensing on the skin
Muscle Computer Interface - Technology
http://www.emgsrus.com/graphics/emg_trial_rect_page.png
(Saponas et al, 2009)
Here: measured activity of 6 different muscles
The peaks are action potentials
Muscle Computer Interface - Technology
http://www.nature.com/gimo/contents/pt1/fig_tab/gimo32_F2.html
Support Vector Machine
§ Root mean square
§ Frequency energy
§ Phase Coherence
6 sensors and 2 ground electrodes
Features extracted from each 31 ms sample:
- Root mean square of the amplitude per channel, and the ratio for each pair of channels: RMS = sqrt((x_1^2 + x_2^2 + ... + x_n^2) / n)
- Frequency energy via FFT
- Phase coherence: the relationship between channels
Classified into gestures by an SVM
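The feature set above can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the window shape, the band count for the frequency-energy features, and the pairing scheme are assumptions.

```python
import numpy as np

def emg_features(window):
    """Sketch of Saponas-style features from one multi-channel EMG window.
    window: (n_channels, n_samples) array covering roughly 31 ms.
    Band count and channel-pairing scheme are illustrative assumptions."""
    # root mean square of the amplitude, per channel
    rms = np.sqrt(np.mean(window ** 2, axis=1))
    # ratio of RMS values for each ordered pair of channels
    n = len(rms)
    ratios = [rms[i] / rms[j] for i in range(n) for j in range(n) if i != j]
    # frequency energy: magnitude spectrum summed over a few coarse bands
    spectrum = np.abs(np.fft.rfft(window, axis=1))
    band_energy = [band.sum() for band in np.array_split(spectrum, 4, axis=1)]
    return np.concatenate([rms, ratios, band_energy])
```

For 6 channels this yields 6 RMS values, 30 pairwise ratios, and 4 band energies per window; the concatenated vector is what the classifier sees.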
Support Vector Machines
§ Binary Linear Classifier
§ Extended to multiple classes
https://en.wikipedia.org/wiki/File:Kernel_Machine.png
A function phi transforms the feature space so that a hyperplane can be laid between the two classes
The separator is laid so that the separation is clearest (maximum margin)
Multiple classes by one-vs-rest or pairwise (one-vs-one)
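The one-vs-rest scheme can be sketched with a tiny linear SVM trained by sub-gradient descent on the hinge loss (Pegasos-style); the real systems use more capable SVM implementations, and the toy data below is invented.

```python
import numpy as np

def train_linear_svm(X, y, epochs=300, lam=0.01):
    # Pegasos-style sub-gradient descent on the hinge loss; y must be in {-1, +1}.
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, epochs + 1):
        lr = 1.0 / (lam * t)
        violating = y * (X @ w) < 1                      # margin violations
        grad = lam * w - (y[violating, None] * X[violating]).sum(axis=0) / n
        w -= lr * grad
    return w

def one_vs_rest_fit(X, labels):
    # one binary SVM per class: class k vs. everything else
    return {k: train_linear_svm(X, np.where(labels == k, 1.0, -1.0))
            for k in np.unique(labels)}

def one_vs_rest_predict(models, X):
    # the class whose separator scores highest wins
    classes = list(models)
    scores = np.stack([X @ models[k] for k in classes], axis=1)
    return np.array([classes[i] for i in np.argmax(scores, axis=1)])
```

Toy usage: three well-separated clusters, one per gesture class, with a constant bias feature appended to each sample.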
Muscle Computer Interface - Demo
(Saponas et al, 2009)
Guitar Hero
Input is sent as soon as the user touches both fingers
Muscle Computer Interface
§ Pro
§ No instrumentation of hand
§ Hidden near elbow
§ Contra
§ Inaccurate compared to some following papers
§ Muscle activity required
79 % accuracy
Gesture Wrist
§ Hands free gestures
§ Embed sensing device in wrist watch
§ Feedback on gesture
(Rekimoto, 2001)
Gesture Wrist - Technology
Figure 2: GestureWrist: Wristband-type input device. (Labeled parts: receiver electrodes, transmitter electrode, acceleration/tilt sensor (ADXL202), piezo actuator, original wristwatch dial, wrist.)
3.2 On-body networking
Based also on capacitive sensing, a technique that transmits data through the human body has been proposed [14, 5]. Here, both a transmitter and a receiver are capacitively coupled to the human body. When a transmission signal is modulated by data (by using amplitude shift keying (ASK) or frequency shift keying (FSK)), this affects the modified signal that is received at the receiver side. Using this technology, wearable devices can communicate with each other [14], or they can automatically authenticate digital devices that are touched [5]. We also use this technique for distinguishing a wearer from other people while interacting with GesturePad.
4 GestureWrist: A wristband-type input device
GestureWrist is a wristwatch-type input device that recognizes human hand gestures by capacitively measuring wrist-shape changes and also measuring forearm movements.
Figure 3: Sensing arm-shape change based on capacitive sensing.
Figure 2 shows the current GestureWrist prototype. This device consists of two input sensors (capacitance and acceleration sensors), and one tactile feedback actuator. The prototype is fabricated by attaching the sensors and actuators to a conventional wristwatch. We expect that embedding all the sensing elements within the wristwatch and the wristwatch band is technically possible, so a wearer can use this system in any social situation. Sensed information is processed at an external signal-processing board connected by a cable.
4.1 Hand-gesture recognition
GestureWrist recognizes hand gestures by measuring the changes of the arm shape on the inside of the wristband. To do this, a combination of transmitter and receiver electrodes are attached to the back of the watch dial and inside of the wristband. As described in the previous section, this combination acts as a capacitance sensor.
The principle of gesture sensing is shown in Figure 3.
When a wearer opens and closes his or her hand, the cross-sectional shape of the wrist changes accordingly; particularly, the left and right parts around the forearm sinew slightly bulge or cave in. A transmitter behind the wristband dial transmits a square wave signal (at approximately 160 kHz). This signal goes through the wrist, and is received by the receiver electrodes on the wristband. The amplitude of the receiving signal is determined by the capacitance between the transmitter electrode and the wrist, the resistance of the wrist, and the capacitance between the wrist and the receiver electrode. Since the first two values are mostly stable, the received signal strength is mainly determined by the last parameter (capacitance between the wrist and the receiver).
To calibrate the displacement of receiving electrodes, more than one electrode is installed on the wristband. The current prototype has three receivers. Each transmitter-receiver pair produces sensed values. The values conform
The first device, GestureWrist, is a wristwatch-type input device that recognizes human hand gestures by capacitively measuring changes in wrist shape. Combined with an acceleration sensor, which is also mounted to the wristband, the GestureWrist can be used as a command-input device, with a physical appearance almost identical to today's wristwatches.
The latter device, GesturePad, is a layer of sensor electrodes that transforms conventional clothes into interaction devices, or "interactive clothing". This module can be attached to an area of clothes such as a sleeve or a lapel. Also based on capacitive sensing, it can detect and read finger motions applied to the outside of the clothing fabric, while shielding the capacitive influence from the human body.
2 Related work
Some wearable computers use physical dials, buttons, or touch-pads as input devices [10]. These devices are used to select menus or control nearby ubiquitous computers or appliances. We are aiming at similar applications by using more unobtrusive devices.
Baudel and Beaudouin-Lafon demonstrated a hand-gesture input system that is used as a remote control method [1].
A wearer can control a presentation system by using hand-gestures. Since this system is based on "DataGlove" and an attached position sensor, a user has to first put on a glove to use it. In contrast, our solution aims to be more seamless; using wearable input devices requires no particular preparation.
GesturePendant is a camera-based gesture recognition system that can be worn like a pendant [9]. A user can hand gesture in front of it while it is worn around the neck. The current prototype is still noticeably bigger than an ideal one, and a user would presumably always wear it over their clothes.
Wireless FingerRing is a hand-worn input device consisting of acceleration-sensitive finger rings and a wristband-type receiver [3]. A user puts on four rings, and taps on a flat surface with one finger. This is detected by the ring's sensor, and the information is transmitted to the wristband receiver through an on-body network. Acceleration Sensing Glove also uses an acceleration sensor on each fingertip [6]. While wearing one finger ring is common and socially accepted, putting on four rings is unusual and thus it is unlikely all of us would do it. Supplying sufficient power to operate all the finger rings is an additional unsolved technical problem.
Measuring muscle tension (electromyogram, or EMG) and using the information as computer inputs has been widely studied [12]. This method is important for people with physical disabilities. However, it also involves some difficulties. One problem is placing the electrode. To correctly measure electricity, electrodes must have direct contact to the skin, often requiring wet-conductive gel. At least two (and often at least three) electrodes need to be attached to the skin, and maintain certain distances. These requirements make it difficult to configure a simple wristband-type EMG sensor that can be easily worn. Our method measures the cross-sectional shape of the wrist, instead of using an EMG, to detect hand motions.
Figure 1: A capacitive sensor is used to measure distance between sensor electrodes and an object. (Diagram parts: transmitter and receiver electrodes, wave signal source, analog switch, LPF, A/D converter.)
3 Technological background
Before describing our proposed input devices, we briefly introduce their sensing technologies.
3.1 Capacitance sensing
“Capacitance sensing” is a technique for measuring distances of nearby conductive objects by measuring the capacitance between the sensor and the object, using a transmitter and a receiver electrode (Figure 1). When the transmitter is excited by a wave signal (of typically several hundred kilohertz), the receiver receives this wave. The magnitude of the receiving signal is proportional to the frequency and voltage of the transmitted signal, as well as to the capacitance between the two electrodes.
When a conductive object is close to both electrodes, it also capacitively couples to the electrode and strengthens the receiving wave signal amplitude. When a conductive and grounded object is close to both electrodes, it capacitively couples to the electrodes, drains the wave signal, and thus weakens the received signal amplitude. By measuring these effects, it is possible to detect the proximity of conductive objects.
The received signal often contains noise from nearby electric circuits and inverters of fluorescent lamps. To accurately measure signals from the transmitter electrode only, a technique called "lock-in amplifier" can be used. This technique uses an analogue switch as a phase-sensitive detector. A control signal is used to switch it on and off, to select signals that have the synchronized frequency and phase of the transmitted signal. Normally, a control signal needs to be created by phase-locking the incoming signal, but for capacitive sensing, the system can simply use the transmitted signal, because the transmitter and the receiver are both on the same circuit board.
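The phase-sensitive detection idea can be illustrated numerically. The sample rate, amplitudes, and interference below are invented for illustration; only the 160 kHz reference matches the GestureWrist transmitter.

```python
import numpy as np

fs = 1_000_000                     # sample rate in Hz (illustrative)
f0 = 160_000                       # transmitter frequency, as in GestureWrist
t = np.arange(0, 0.01, 1 / fs)

reference = np.sign(np.sin(2 * np.pi * f0 * t))     # square-wave control signal
amplitude = 0.3                    # stands in for the capacitance-dependent gain
received = amplitude * np.sin(2 * np.pi * f0 * t)
noise = 0.5 * np.sin(2 * np.pi * 50_000 * t)        # asynchronous interference

# Gating by the reference and averaging keeps only the component that is
# synchronized in frequency and phase with the transmitter; the asynchronous
# noise averages out, even though it is larger than the signal.
demodulated = np.mean((received + noise) * reference)
```

Here `demodulated` comes out close to `amplitude * 2 / pi`, the average of a rectified sine, while the 50 kHz interference contributes almost nothing.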
This capacitive sensing technique is mainly used for proximity and position sensors [15]. In our work, capacitive sensing is used for measuring the arm shape by placing both the transmitter and the receiver electrodes on a wristband, and for measuring finger positions by attaching electrodes on the inside of clothes.
§ Wave signal is transmitted
§ The receivers are synchronized
§ The received strength is proportional to the distance
The actuator vibrates as feedback
Measures the capacitance between the wrist and the receiver electrodes
i.e., the distance between wristband and wrist
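As a purely hypothetical illustration of how the receiver values could map to the two poses, a nearest-template rule suffices; the template amplitudes below are invented, not measured values from the paper.

```python
import numpy as np

# Invented per-pose templates for the three receiver amplitudes.
TEMPLATES = {
    "point": np.array([0.8, 0.4, 0.6]),
    "fist":  np.array([0.5, 0.7, 0.3]),
}

def classify_pose(amplitudes):
    # nearest template (Euclidean distance) wins
    return min(TEMPLATES, key=lambda k: np.linalg.norm(TEMPLATES[k] - amplitudes))
```

With only three sensed values and two poses, such a simple comparison is plausible, which matches the slide's later point that the recognition method is simple but the gesture set small.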
Gesture Wrist
§ Distinguish ‘Point’ and ‘Fist’ pose
(Rekimoto, 2001)
Gesture Wrist - Technology
Clear difference between point and fist
Only two gestures are differentiated
Gesture Wrist - Examples
§ Distinguish ‘Point’ and ‘Fist’ pose
§ Combined with an accelerometer
§ Rotation also recognizable
Only two gestures are differentiated
Use rotation to control slider or knob
Gesture Wrist
§ Pro
§ Small, watch like design
§ Sensor embedded inside accessory
§ Simple recognition method
§ Contra
§ Only a small set of gestures can be recognized
(Rekimoto, 2001)
Hand Shape with Wrist Contour
§ Hands free gestures
§ Wrist watch like design
Hand Shape with Wrist Contour - Technology
§ Static wrist band
§ Photo reflectors
§ Senses distance between band and skin
Figure 2. Wrist contour basis. (Panels: hand shape, wrist cross section, and wrist contour; labeled muscles: flexor and extensor carpi, flexor and extensor pollicis, flexor and extensor digitorum.)
Figure 3. Data flow block diagram. (Sensor device: measurement of wrist contour by photo reflector array, then data collection and transfer via RF; PC: output-to-distance conversion, feature extraction, hand shape classification.)
shows examples of hand shapes and wrist contour sets. Muscles and tendons for finger movements are compacted near the elbow. Around the wrist, however, tendons and muscles are separated to some extent, so they are comparatively observable. We observed the variation of their thicknesses and positions, which vary with finger movements. For example, to bend a finger, a flexor contracts and the nearby wrist surface dents. To straighten a finger, a flexor relaxes and the nearby wrist surface becomes as before. Our approach is to recognize hand shapes from these variations.
WRIST CONTOUR MEASURING SYSTEM
Figure 3 shows our system configuration and data flow diagram. We developed a wrist watch type sensor device (Figure 4) and a recognition system.
Required specification
Human constraints and our design are as follows.
• Human constraints:
(1a) Muscles and tendons for finger movements are approximately 5 mm in diameter. (1b) Radial variation of wrist contour is approximately 5 mm at maximum.
(2a) Wrist circumference is approximately 150 ∼ 170 mm.
(2b) Human arm motions should not be interrupted.
• Design:
(1a) Sensor pitch is 2.5 mm around circumference. (1b) Radial resolution of the sensors is 0.1 mm.
(2a) Measurement area is at least 170 mm in circumference.
(2b) The band is narrower than 30 mm.
To achieve the design requirements, we adopted photo reflector sensors and a shift register switching method.
Photo reflector as distance sensor
A photo reflector is a combination of an infrared LED and a phototransistor. The LED transmits an infrared signal and the phototransistor detects the intensity of the signal reflected at the surface of the object, as shown in Figure 5. We selected a small photo reflector sensor "NJL5901AR-1" (produced by New Japan Radio Co.) to achieve the measurement density of 2.5 mm.
Figure 4. Wrist contour measuring device. (Parts: measurement band with photo reflectors at 2.5 mm pitch, control board, battery, ZigBee module, micro controller, spacer, fixing band; cross section about 25 mm × 12 mm.)
Figure 5. Mechanism of photo reflector. (Infrared signal; 2.5 mm pitch.)
Because the output of a photo reflector is non-linear with distance, and sensors have individual differences, raw outputs cannot be used for measuring distances as they are. Therefore, we calibrated the outputs by prior measurement. We measured a range of 0 ∼ 10 mm with 0.05 mm pitch with a 1-axis automatic stage to achieve 0.1 mm radial resolution. As a result, we achieved 0.1 mm resolution in 0 ∼ 3.5 mm. As Figure 6 indicates, the smooth surface of an inclined flat board can be recognized in the range of 0 ∼ 3.5 mm.
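The calibration step can be sketched as a per-sensor lookup table inverted by interpolation. The exponential response curve below is a mock, not the NJL5901AR-1's actual characteristic; a real sensor would be swept on the stage and its outputs recorded.

```python
import numpy as np

# Calibration sweep: distances on the automatic stage (0-10 mm, 0.05 mm pitch)
# against the sensor's raw A/D output. The response curve is invented.
distances = np.arange(0.0, 10.0 + 0.05, 0.05)
raw = 250.0 * np.exp(-distances / 2.0)          # mock monotone response curve

def raw_to_distance(sample):
    # np.interp needs increasing x values, so flip the decreasing curve
    return np.interp(sample, raw[::-1], distances[::-1])
```

Each of the 150 sensors would carry its own table, absorbing the individual differences the text mentions.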
Figure 6. Measuring an inclined board. (Plots: raw data as A/D converter output (8-bit, 0-250) and converted distance (0-10 mm) against sensor number 1-71.)
Figure 7. Shift register switching method. (Circuit: clock and control signals driving a chain of D flip-flops with outputs Q1, Q2, Q3 switching the sensors.)
Shift register switching method
To measure the whole circumference of wrist contours, we arranged photo reflector sensors in rows. We mounted them
Paper Session: Home and Away UbiComp'11 / Beijing, China
(Fukui et al, 2011)
150 sensors
Hand Shape with Wrist Contour - Demo
(Fukui et al, 2011)
Static image representing the gesture
Hand Shape with Wrist Contour - Examples
The recognized gesture set
Some gestures are quite similar
Hand Shape with Wrist Contour - Accuracy
(Fukui et al, 2011)
The confusion matrix is widely spread
The boosting and k-NN classification methods are rather simple
The diagonal contains the correctly recognized gestures
Hand Shape with Wrist Contour
§ Pro
§ Small, watch like design
§ Can be hidden inside accessory
§ New approach to gesture recognition
§ Contra
§ Bad recognition rate
§ Limited set of gestures
Digits
§ Recover full 3D hand model
§ Cheap hardware
§ Low power
(Kim et al, 2012)
Already partly presented by Professor Hilliges in the introduction of the seminar
More sophisticated
Imitates a data glove
Digits - Technology
3D Laser Triangulation
Background Subtraction, CCL & Tracking
Hand Pose Recovery
We use a number of image processing techniques to segment and track five discrete points on the fingers
Knowing the camera and laser position, we can triangulate 3D positions from this information
And finally use a kinematics model to recover the full hand configuration
Digits - Examples
(Kim et al, 2012)
accurate
Digits - Demo
(Kim et al, 2012)
Shooting
Grabbing
Pulling
Digits
§ Pro
§ Portable
§ Internal processing
§ Accurate replacement for data glove
§ Contra
§ As obtrusive as a data glove
§ Occlusion is a major problem
Towards bimanual gestures
The previous papers all tried to reconstruct a model of the hand in a more or less accurate fashion. The next paper moves away from reconstruction, towards using the second hand for input and the first hand as a trigger.
Gesture Watch
§ Contact free interface
§ Unobtrusive
The device recognizes the other hand
The wearing arm is used to initiate the gesture
Gesture Watch - Technology
Sensor signal
Recognized gesture
(Kim et al, 2007)
4 proximity sensors arranged in a cross, plus 1 pointing towards the hand for initiating
Binary 0/1 sensors
Gesture Watch - Examples
proposed gestures
Gesture Watch
§ Pro
§ Unobtrusive design
§ Sensors embedded
§ Contact free
§ Private
§ Contra
§ Requires action from second hand to start gesture
(Kim et al, 2007)
Private: the gesture is hidden from other people
But still, instrumentation of the user is required
To get truly hands-free
To be cheaper
Sound Wave
§ No instrumentation of user
§ Reusing existing hardware
(Gupta et al, 2012)
Reuses speakers and microphone from an existing laptop
Sound Wave - Technology
formed and sensed [4]. While these projects show the potential of low-cost sonic gesture sensing, they require custom hardware, which is a significant barrier to widespread adoption. In our work, we focus on a solution that works across a wide range of existing hardware to facilitate immediate application development and adoption.
THE SOUNDWAVE SYSTEM
SoundWave uses existing speakers on commodity devices to generate tones between 18-22 kHz, which are inaudible.
We then use the existing microphones on these same devices to pick up the reflected signal and estimate motion and gesture through the observed frequency shifts.
Theory of Operation
The phenomenon SoundWave uses to sense motion is the shift in frequency of a sound wave in response to a moving object, an effect called the Doppler effect. This frequency shift is proportional to source frequency and to the velocity with which the object moves. In our approach, the original source (the speakers) and listener (the microphone) are stationary, thus in absence of any motion, there is no frequency change. When a user moves his hand, however, it reflects the waves, causing a shift in frequency. This frequency is measured by the microphone (f_r) and can be described by the following equation, which is used for Doppler radar as well as for estimating frequency changes in reflection of light by a moving mirror [2]: f_r = f_t (c + v) / (c - v), where f_t is the emitted frequency, c is the speed of sound, and v is the velocity of the reflecting object.
Figure 2 shows the frequency of the signal (a) when no motion is present and when a hand is moved (b) away from or (c) closer to the laptop. This change in frequency as a hand moves farther or closer is one of the many characteristic properties of the received signal that we leverage in detecting motion and constructing gestures.
Algorithm & Implementation Details
SoundWave generates a continuous pilot tone, played through the device's speakers at the highest possible frequency (typically in the range of 18-22 kHz on commodity audio systems). Although we have verified that SoundWave can operate on audio down to 6 kHz, we favor tones above 18 kHz since they are generally inaudible [1]. Additionally, the higher the frequency, the greater the shift for a given velocity, which makes it computationally easier to estimate motion at a given resolution. The upper bound is largely a function of most laptop and phone speaker systems only being capable of producing audio at up to 22 kHz. Fortunately, we do not need much higher frequencies to sense the relatively coarse gestures we are targeting.
Due to variations in hardware as well as filtering in sound and microphone systems, SoundWave requires an initial calibration to find the optimal tone frequency (no user intervention is required). It performs a 500 ms frequency sweep, and keeps track of peak amplitude measurements as well as the number of candidate motion events detected (i.e., potential false positives). SoundWave selects the highest frequency at which minimum false events are detected and the peak is most isolated (i.e., the amplitude is at least 3 dB greater than the next-highest peak in the sweep range). The system consistently favors the 18-19 kHz range.
With the high-frequency tone being emitted, any motion in proximity (around 1 m depending on speed) of the laptop will cause Doppler-shifted reflections to be picked up by the microphone, which is continuously sampled at 44.1 kHz. We buffer the incoming time-domain signal from the microphone and compute the Fast Fourier Transform (FFT) with 2048-point Hamming window vectors. This yields 1024-point magnitude vectors that are spread equally over the spectral width of 22.05 kHz. After each FFT vector is computed, it is further processed by our pipeline: signal conditioning, bandwidth extraction, motion detection, and feature extraction.
Signal Conditioning: Informal tests with multiple people indicated that the fastest speed at which they could move their hands in front of a laptop was about 3.9 m/sec. Hence, we conservatively bound signals of interest at 6 m/sec. Given our sampling rate and FFT size, this yields about 33 frequency bins on either side of the emitted peak.
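The bin arithmetic in this paragraph can be checked directly. The pilot-tone frequency and speed of sound below are illustrative values consistent with the text, not exact parameters from the paper.

```python
# Two-way Doppler: a reflector moving at speed v shifts an emitted tone f0 by
# roughly delta_f = 2 * v * f0 / c (for v much smaller than c).
c = 343.0        # speed of sound in air, m/s (room temperature)
f0 = 20_000.0    # pilot tone in the 18-22 kHz range (illustrative)
v = 6.0          # conservative bound on hand speed, m/s

delta_f = 2 * v * f0 / c             # about 700 Hz
bin_width = 44_100 / 2048            # FFT bin width, about 21.5 Hz
bins_per_side = delta_f / bin_width  # about 33 bins, matching the text
```

The 2048-point FFT at 44.1 kHz gives bins of roughly 21.5 Hz, so a 700 Hz shift spans about 33 bins on each side of the pilot tone.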
Bandwidth Extraction: As seen in Figure 2, motion around the device creates a shifted frequency that effectively increases the bandwidth of the pilot tone (i.e., window averaging and spectral leakage blur the movement of the peak). To detect this, SoundWave computes the bandwidth of the pilot tone by scanning the frequency bins on both sides in-
Figure 2: (a) Pilot tone with no motion. (b and c) Increase in bandwidth on left and right due to motion away from and towards the laptop respectively. (d) Shift in frequency large enough for a separate peak. A single scan would not capture the true shift in frequency and would terminate at the local minima. A second scan compensates for the bandwidth of the shifted peak.
(Gupta et al, 2012)
Doppler effect
Emitted sound: 18-22 kHz; the input is sampled and transformed via FFT
22.05 kHz spectrum divided into bins; 33 bins around the pilot tone are scanned until the amplitude drops below 10%
A second scan continues until 30% away from the pilot tone
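The two-pass scan in these notes can be sketched as follows. This is a hypothetical reading of the procedure: the thresholds, the lookahead, and the toy spectrum are invented for illustration, not the paper's exact algorithm.

```python
def scan_bandwidth(mag, peak_idx, drop=0.1, lookahead=5, direction=1):
    """Walk outward from the pilot-tone bin until the magnitude drops below a
    fraction of the peak; a second pass tolerates a dip before a shifted peak."""
    threshold = drop * mag[peak_idx]
    i = peak_idx
    # first scan: stop where the amplitude falls below the threshold
    while 0 <= i + direction < len(mag) and mag[i + direction] > threshold:
        i += direction
    # second scan: extend past a local minimum if the signal recovers nearby,
    # so a Doppler-shifted peak separated from the pilot tone is still covered
    base = i
    for step in range(1, lookahead + 1):
        j = base + direction * step
        if 0 <= j < len(mag) and mag[j] > threshold:
            i = j
    return abs(i - peak_idx)        # bandwidth on this side, in bins
```

On a toy spectrum with a shifted secondary peak past a dip (case (d) in Figure 2), the second pass extends the measured bandwidth to include it.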
Sound Wave - Technology
formed and sensed [4]. While these projects show the po- tential of low-cost sonic gesture sensing, they require cus- tom hardware, which is a significant barrier to widespread adoption. In our work, we focus on a solution that works across a wide range of existing hardware to facilitate im- mediate application development and adoption.
THE SOUNDWAVE SYSTEM
SoundWave uses existing speakers on commodity devices to generate tones between 18-22 kHz, which are inaudible.
We then use the existing microphones on these same devic- es to pick up the reflected signal and estimate motion and gesture through the observed frequency shifts.
Theory of Operation
The phenomenon SoundWave uses to sense motion is the shift in frequency of a sound wave in response to a moving object, an effect called the Doppler effect. This frequency shift is proportional to source frequency and to the velocity with which the object moves. In our approach, the original source (the speakers) and listener (the microphone) are sta- tionary, thus in absence of any motion, there is no frequen- cy change. When a user moves his hand, however, it re- flects the waves, causing a shift in frequency. This frequency is measured by the microphone ( ) and can be described by the following equation, which is used for Doppler radar as well as for estimating frequency changes in reflection of light by a moving mirror [2]:
Figure 2 shows the frequency of the signal (a) when no mo- tion is present and when a hand is moved (b) away from or (c) closer to the laptop. This change in frequency as a hand moves farther or closer is one of the many characteristic properties of the received signal that we leverage in detect- ing motion and constructing gestures.
Algorithm & Implementation Details
SoundWave generates a continuous pilot tone, played through the device's speakers at the highest possible frequency (typically in the range of 18-22 kHz on commodity audio systems). Although we have verified that SoundWave can operate on audio down to 6 kHz, we favor tones above 18 kHz since they are generally inaudible [1]. Additionally, the higher the frequency, the greater the shift for a given velocity, which makes it computationally easier to estimate motion at a given resolution. The upper bound is largely a function of most laptop and phone speaker systems only being capable of producing audio at up to 22 kHz. Fortunately, we do not need much higher frequencies to sense the relatively coarse gestures we are targeting.
Due to variations in hardware as well as filtering in sound and microphone systems, SoundWave requires an initial calibration to find the optimal tone frequency (no user intervention is required). It performs a 500 ms frequency sweep, and keeps track of peak amplitude measurements as well as the number of candidate motion events detected (i.e., potential false positives). SoundWave selects the highest frequency at which minimum false events are detected and the peak is most isolated (i.e., the amplitude is at least 3 dB greater than the next-highest peak in the sweep range).
The system consistently favors the 18-19 kHz range.
With the high-frequency tone being emitted, any motion in proximity (around 1 m depending on speed) of the laptop will cause Doppler-shifted reflections to be picked up by the microphone, which is continuously sampled at 44.1 kHz. We buffer the incoming time-domain signal from the microphone and compute the Fast Fourier Transform (FFT) with 2048-point Hamming window vectors. This yields 1024-point magnitude vectors that are spread equally over the spectral width of 22.05 kHz. After each FFT vector is computed, it is further processed by our pipeline: signal conditioning, bandwidth extraction, motion detection, and feature extraction.
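The FFT front end described above can be sketched with NumPy. A minimal illustration: beyond the stated 44.1 kHz sampling rate, 2048-point Hamming window, and 1024-point magnitude vectors, the details are assumptions.

```python
import numpy as np

FS = 44_100      # microphone sampling rate (Hz)
N_FFT = 2048     # FFT size -> 1024 magnitude points over 22.05 kHz

def magnitude_spectrum(samples: np.ndarray) -> np.ndarray:
    """One pipeline step: Hamming-windowed FFT of a 2048-sample buffer."""
    windowed = samples * np.hamming(N_FFT)
    return np.abs(np.fft.rfft(windowed))[:N_FFT // 2]  # 1024 points

# Synthetic check: a pure 18 kHz "pilot tone" should peak at the bin
# closest to 18 kHz (bin width = FS / N_FFT, about 21.5 Hz).
t = np.arange(N_FFT) / FS
mags = magnitude_spectrum(np.sin(2 * np.pi * 18_000 * t))
print(int(np.argmax(mags)), round(18_000 / (FS / N_FFT)))  # 836 836
```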
Signal Conditioning: Informal tests with multiple people indicated that the fastest speed at which they could move their hands in front of a laptop was about 3.9 m/sec. Hence, we conservatively bound signals of interest at 6 m/sec. Given our sampling rate and FFT size, this yields about 33 frequency bins on either side of the emitted peak.
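The 33-bin figure follows directly from these numbers. A quick check, assuming 343 m/s for the speed of sound and a 20 kHz tone (neither constant is stated in the excerpt):

```python
import math

# Bin width of a 2048-point FFT at 44.1 kHz sampling:
fs, n_fft = 44_100, 2048
bin_hz = fs / n_fft                 # ~21.5 Hz per bin

# Maximum one-sided Doppler shift for a 6 m/s bound, using the
# small-velocity approximation delta_f ~ 2 * v / c * f0:
c, v, f0 = 343.0, 6.0, 20_000.0
max_shift_hz = 2 * v / c * f0       # ~700 Hz

print(round(bin_hz, 1), math.ceil(max_shift_hz / bin_hz))  # 21.5 33
```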
Bandwidth Extraction: As seen in Figure 2, motion around the device creates a shifted frequency that effectively increases the bandwidth of the pilot tone (i.e., window averaging and spectral leakage blur the movement of the peak). To detect this, SoundWave computes the bandwidth of the pilot tone by scanning the frequency bins on both sides until the amplitude drops below 10% of the peak.
Figure 2: (a) Pilot tone with no motion. (b and c) Increase in bandwidth on left and right due to motion away from and towards the laptop respectively. (d) Shift in frequency large enough for a separate peak. A single scan would not capture the true shift in frequency and would terminate at the local minima. A second scan compensates for the bandwidth of the shifted peak.
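One plausible reading of this two-scan rule, sketched below. The 10% and 30% thresholds come from the slide notes; the scan logic itself is an assumption, not the paper's exact implementation.

```python
def bandwidth_one_side(mags, peak_idx, step, drop=0.10, second=0.30):
    """Scan outward from the pilot peak in direction `step` (+1 or -1).

    First scan: advance while the amplitude stays above `drop` * peak.
    Second scan: keep looking for a separate Doppler-shifted peak whose
    amplitude exceeds `second` * peak, and extend the bandwidth past it.
    Returns the number of bins from the peak to the bandwidth edge.
    """
    peak = mags[peak_idx]
    i = peak_idx
    # first scan: stop where the amplitude drops below 10% of the peak
    while 0 <= i + step < len(mags) and mags[i + step] >= drop * peak:
        i += step
    edge = i
    # second scan: a shifted peak above 30% of the pilot indicates motion
    j = i
    while 0 <= j + step < len(mags):
        j += step
        if mags[j] >= second * peak:
            edge = j  # extend past the separate peak (Figure 2d case)
    return abs(edge - peak_idx)

# Toy spectrum: pilot at index 5, a separate shifted peak at index 9.
spectrum = [0, 0, 0, 0.2, 0.6, 1.0, 0.5, 0.05, 0.02, 0.4, 0.03, 0]
print(bandwidth_one_side(spectrum, 5, +1))  # -> 4, edge at the shifted peak
```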
Session: Sensory Interaction Modalities CHI 2012, May 5–10, 2012, Austin, Texas, USA
(Gupta et al, 2012)
41 Samstag, 27. April 13
Doppler effect
Emitted sound: 18-22 kHz; input sampled -> FFT
22.05 kHz spectrum; the ~33 bins on each side of the pilot tone are scanned until the amplitude drops below 10%
A second scan continues until 30% away from the pilot tone
Sound Wave - Demo
(Gupta et al, 2012)
42 Samstag, 27. April 13
Wakes the laptop up and puts it to sleep automatically
Controls a media player
Sound Wave
§ Pro
§ No instrumentation of user
§ Accurate results
§ Even in noisy environments
§ Contra
§ Base tone may be audible
43 Samstag, 27. April 13
All sensors need a network
44 Samstag, 27. April 13
To conclude, we look at a completely different paper that discusses how the body itself can be used as a network for communication
Gesture Pad
§ The body as touch interface
§ The body as network
§ The body as transceiver
45 Samstag, 27. April 13
Taken from the GestureWrist paper: the capacitance-sensing wrist sensor
Pads communicate among themselves
Send data to the (touched) outside world
Humantenna inverted
Gesture Pad
Figure 4: Relation between hand shape and obtained values.
Figure 5: Example gesture commands
to a vector space (three dimensional, in this case), and a point in this space corresponds to a hand shape.
Figure 4 shows measured sensor values and their corresponding hand shapes. As shown here, the system can distinguish two hand shapes, grasping and pointing, clearly.
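Distinguishing the two shapes then amounts to a nearest-reference test in that sensor space. A minimal sketch with made-up reference vectors; the paper's actual measured values are those shown in Figure 4.

```python
import math

# Hypothetical 3-D capacitance vectors for the two reference hand shapes
# (illustrative numbers only, not the paper's measurements).
SHAPES = {"grasp": (0.8, 0.7, 0.9), "point": (0.3, 0.9, 0.2)}

def classify(sample):
    """Return the hand shape whose reference vector is closest to `sample`."""
    return min(SHAPES, key=lambda s: math.dist(sample, SHAPES[s]))

print(classify((0.75, 0.65, 0.85)))  # -> grasp
```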
4.2 Forearm movement measurement
In addition to the hand-shape measurement, an acceleration sensor (Analog Devices ADXL202) is mounted on the
Figure 6: Sensor configurations for GesturePad (Type-A, Type-B, and Type-B'; each diagram shows the transmitter, receiver, shield layer, fabric, and body).
wristwatch dial. This sensor is a solid-state 2-axis sensor and measures the inclination of the forearm.
4.3 Tactile feedback
When a gesture is recognized, the GestureWrist gives feedback to the user by tactile sensation. On the inside of the wristwatch dial, a ceramic piezoelectric actuator is attached to produce the feedback. We use 20-Hz square-wave signals to excite this actuator.
4.4 Combining two sensor inputs
By combining these two inputs, we designed simple gesture commands. We selected two hand shapes (making a fist and pointing) and six different arm positions (palm
Figure 7: Variation of GesturePad Type-B which is used in combination with GestureWrist. This module receives a signal from the GestureWrist through the body.
up, palm right, palm left, palm down, forearm up, and forearm down). The hand shapes are used to separate gesture commands into segments, and two consecutive arm positions (e.g., palm left -> palm down) make up one input command. Examples of gesture commands are shown in Figure 5.
Continuously adjusting parameters is also possible by twisting the forearm. For example, a user can first decide which parameter to change, and then control it by rotating his or her forearm.
Based on our experience, absolute values from capacitive sensors gradually change over a certain time period. This is mainly because the position of the wristband moves over time. On the other hand, the derivative of the capacitive values reflects the hand motion (e.g., from grasping to pointing) consistently. We are currently integrating this feature to add stability and robustness to gesture recognition.
5 GesturePad: A sensor module for interactive clothing
Our next trial is to transform conventional clothes into interactive objects. Previous work on interactive clothes [7] has used metallic yarns woven into fabrics. This approach requires specially designed clothes, and is difficult to apply to clothes that already exist. We chose a "retrofit" approach that allows users to attach interactive modules to clothes easily. In addition, we particularly concentrated on making the attachment as unnoticeable as possible. We believe that clothes are a highly social medium, and thus attaching obtrusive devices (such as [10]) is not an ideal solution.
The GesturePad is a module that consists of a layer of sensors that can be attached to the inside of clothes. A wearer can control this module from the outside. As a result, a part of the clothes becomes interactive without changing its appearance.
5.1 Sensor configurations
Figure 6 shows three configurations of the GesturePad. All types can be attached to the clothes on the inside, and the wearer controls it from the outside.

Figure 8: GesturePad prototype.
Figure 6-(A) shows Type-A, which consists of an array of capacitive sensors (a combination of transmitters and receivers) and a shield layer attached behind it. Each vertical grid line is a transmitter and each horizontal line a receiver electrode. The sensing of both the transmitter and the receiver is time-multiplexed, so the sensor can independently measure the capacitance value of each electrode crossing point.
When a user's finger is close enough to the sensor surface (typically within 1 cm), the sensor grid recognizes the finger position. During this operation, the shield layer attached on the backside of the module blocks influence from the wearer's body. For example, when a module is placed on the inside of a lapel, a finger stroke gesture on the lapel becomes an input to the computer. This could enable controlling the volume of a worn MP3 player. Multiple sensor points on the module also enable multiple finger inputs. For example, a chording-keyboard type input would also be possible.
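The time-multiplexed crossing-point scan of Type-A could look roughly like this. Purely illustrative: `measure_capacitance` is a hypothetical stand-in for the analog front end, not an API from the paper.

```python
# Sketch of the Type-A scan: drive one transmitter (column) at a time and
# read every receiver (row), yielding an independent capacitance value per
# electrode crossing point.

def scan_grid(n_tx, n_rx, measure_capacitance):
    """Return an n_rx x n_tx matrix of capacitance readings."""
    readings = [[0.0] * n_tx for _ in range(n_rx)]
    for tx in range(n_tx):          # time-multiplex the transmitters
        for rx in range(n_rx):
            readings[rx][tx] = measure_capacitance(tx, rx)
    return readings

# A finger near crossing point (tx=2, rx=1) raises the coupled signal there:
fake = lambda tx, rx: 1.0 if (tx, rx) == (2, 1) else 0.1
grid = scan_grid(4, 3, fake)
finger = max(((r, t) for r in range(3) for t in range(4)),
             key=lambda p: grid[p[0]][p[1]])
print(finger)  # -> (1, 2): receiver row 1, transmitter column 2
```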
Figure 6-(B) and (B') show another sensor structure, Type-B (and B'), that consists of a transmitter and a receiver layer separated by a shield layer. In this configuration, a signal from the transmitter layer is capacitively coupled to a receiver layer through the user's body (i.e., an on-body network). When the user's finger is within proximity of the GesturePad, a wave signal from the transmitter electrode is transmitted to the receiver one. This type could be put in a trouser pocket and operated from the outside of the pocket. One benefit of this configuration is that it can prevent other people from interacting with the sensor.
Type-B (and B') can also use an array of sensor electrodes so the user's finger motion is detected by comparing the received signal amplitudes. The difference between B and B' is the placement of transmitter and receiver electrodes. Type-B places multiple transmitter electrodes on the front side and one receiver on the backside, while Type-B' uses multiple receiver electrodes on the front side. Since multiple transmitters can easily be implemented by time-multiplexing a single transmitter, the hardware needed for Type-B is smaller than that of Type-B'.
Our current prototype for this Type-B integrates a trans-
(Rekimoto, 2001)
46 Samstag, 27. April 13
A: Transmitter/receiver multiplexed
B: Shield layer separates transmitter from receiver
Gesture Pad
§ Further Ideas
§ Use NFC transceivers inside pads
§ Identify the person touching by their signal
47 Samstag, 27. April 13
Comparison
| System | Mobility | Accuracy | Instrumentation | Main Application |
|---|---|---|---|---|
| Muscle Computer Interface | Designed for mobile use, data sent via WiFi/BT | 65% busy hand, no feedback, 4 fingers; 91% busy hand, feedback, 3 fingers | Armband at the upper forearm | Gesture recognition with busy hands |
| Gesture Wrist (capacitance sensing) | Designed for mobile use, data sent via body network | N/A | Wristwatch-like utility | Hand shape recognition, authentication |
| Wrist Shape (photosensors) | Designed for mobile use, offline processing at the moment | 45-48% | Wristwatch-like utility | Hand shape recognition |
| Digits (3D reconstruction) | Designed for mobile use, data sent via WiFi/BT | 91%, varying from finger to finger | Small camera worn on a wristband | Reconstructing a 3D model of the hand |
| Gesture Watch (in air over hand) | Designed for mobile use, data sent via WiFi/BT | 95% | Wristwatch-like utility | Simple gesture recognition using one hand |
| Sound Wave (in air over laptop) | Bound to a laptop | 90-95% | None, using existing hardware | Add simple gesture recognition to a laptop |
48 Samstag, 27. April 13
Different aspects that might be required from a gesture-based interface
§ Today
§ Gesture recognition is feasible
§ Accuracy varies across systems
§ Integration is still complicated
§ In the future we ...
§ need to control devices unobtrusively
§ can authenticate with an accessory
§ wear touchable clothes
§ use the body as a network
49 Samstag, 27. April 13
50 Samstag, 27. April 13
Vaporware!?
Commercial from Myo
A foresight of how gesture interaction could look
“Any sufficiently advanced technology is indistinguishable from magic.”
Arthur C. Clarke
51 Samstag, 27. April 13