Complex environments with many objects in the background
Small objects and fingers can be brought into the interaction space between interlocutors
Human hand can be brought into interaction space (non-virtual object)
Problems with 2D gaze tracking in 3D worlds
Difficult to disambiguate between gaze on foreground and background objects
This is also true for real environments!
Overview
Review of Visualization Methods for 2D Gaze Tracking
Going from 2D to 3D
Measuring the 3D Point of Regard
Geometry-based Approaches
Holistic Approaches
Visualizing Attention in Space
Conclusion
2D Visualisations for Gaze
Scanpath
The point of regard (PoR) (Dodge 1907) is the basis for associating eye fixations with the visual context
Scanpaths show the sequence of several points of regard (Yarbus 1967; the Russian original appeared earlier, in 1965)
Scanpaths are a qualitative visualization
Several variations exist: size of PoR scaled to duration, animated scanpaths, etc.
Scanpath on a 2D image
Yarbus, A. L. (1967). Eye Movements and Vision. Plenum Press.
2D Visualisations for Gaze
Regions of Interest
Aggregate fixations over certain regions
Regions of Interest (RoI) provide quantitative feedback
Several variations exist, for example: links between regions reveal transition probabilities
Mapping from fixations to Regions of Interest with transition probabilities
Fitts, Jones and Milton (1950). Eye movements of aircraft pilots during instrument-landing approaches. Aeronautical Engineering Review.
2D Visualisations for Gaze
Heatmaps / Fixation Maps / Attentional Landscapes
Concept introduced by Pomplun, Ritter and Velichkovsky (1996), today often referred to as Heatmaps
Re-Interpretation of Attentional Landscapes (Elias, Sherwin and Wise, 1984) on images
Elaborated by Wooding (2002) as Fixation Maps
Model the area of high visual acuity as a Gaussian distribution (SD 1 degree of visual angle)
Aggregate over several PoR
Heatmaps provide qualitative feedback
Heatmap of gaze on Boring figure
Pomplun, Ritter and Velichkovsky (1996). Disambiguating complex visual information: Towards communication of personal views of a scene. Perception.
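The Gaussian acuity model above can be sketched in a few lines (a minimal illustration; the function name, the pixel-based sigma, and the simple duration weighting are assumptions, not the original implementation):

```python
import numpy as np

def fixation_map(fixations, width, height, sigma_px):
    """Aggregate fixations into a 2D attention map (heatmap).

    Each fixation (x, y, duration) contributes a Gaussian whose
    standard deviation sigma_px models the area of high acuity
    (SD of about 1 degree of visual angle, converted to pixels),
    weighted by the fixation duration.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    amap = np.zeros((height, width))
    for fx, fy, dur in fixations:
        amap += dur * np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2)
                             / (2 * sigma_px ** 2))
    if amap.max() > 0:
        amap /= amap.max()   # normalize for display as a heatmap
    return amap
```

The conversion from 1 degree of visual angle to pixels depends on viewing distance and screen resolution, which is why sigma_px is left as a parameter here.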
2D Stimuli
Research Question
How do results from studies on 2D (or 2.5D) images scale to reality?
Motivation for investigating 3D stimuli
Problematic areas: navigation, motor control, spatial language, ...
Eye tracking is leaving the desktop-lab: mobile eye tracking, eye tracking in cockpits, eye tracking in virtual reality
3D content is becoming widespread (commercial interests)
Virtual prototyping is picking up speed
... you might know some more
Going from 2D to 3D
Requirements (see paper for details)
eye tracker: monocular or binocular (better)
body tracking: outside-in or inside-out tracking of at least head position and orientation
data fusion unit: integrates eye and body tracking
solution for calibration: depends on set-up, could be laser pointer, marker, 3D display
geometry model database: a must for geometry-based approaches
Focus of this talk
3D point of regard estimating unit: geometry-based or holistic approach
3D gaze visualization
Estimating the 3D Point of Regard
Geometry-based Approaches
2D Gaze Tracking
is already a 3D point of regard estimation
but it is based on hard constraints
a fixed position of the screen plane
a (relatively) fixed position of the user
the screen itself is normally not the object of interest (the objects of interest are shown on the screen)
Geometry-based Approaches
2.5D Gaze Tracking
2.5D object geometry acquisition used by Rötting et al. (1999)
Rötting, Göbel and Springer (1999). Automatic object identification and analysis of eye movement recordings. MMI-Interaktiv
2.5D Gaze Tracking
offline process for the semi-automatic detection of 2.5D PoR (Rötting et al. 1999)
based on monocular eye tracking, scene-camera and Ascension Flock of Birds 6DoF tracker
two-stage process
object regions were manually labeled in different views to extract 2.5D geometry-model
object regions were projected onto scene-camera for each frame and fixations clustered on the image plane
Geometry-based Approaches
Tanriverdi and Jacob (2000)
based on Virtual Reality with Head-Mounted-Display (HMD) and monocular eye tracking
cast visual line of sight into 3D world (geometries known) to identify model of interest
use model-based dwell times for the selection of objects: interactive use
HMD simplifies approach, as eye-screen transformation is fixed
Pfeiffer (2008)
extended this approach to CAVE-like setups with 3D projection screens
problem: eye-screen transformation is dynamic, several screens
idea: virtual calibration screen interlocked with head movements
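Casting the visual line of sight into a world of known geometries can be sketched as a nearest-hit ray cast (spheres stand in here for the geometry model database; the names and signature are illustrative assumptions, not code from the cited systems):

```python
import numpy as np

def first_hit(origin, direction, spheres):
    """Cast the visual line of sight into the scene and return the
    closest intersected object: "first model hit by the ray wins".

    spheres: list of (name, center, radius) tuples, a stand-in for
    the geometry model database (real systems intersect meshes).
    """
    direction = direction / np.linalg.norm(direction)
    best = (None, np.inf)
    for name, center, radius in spheres:
        oc = origin - center
        b = np.dot(oc, direction)
        c = np.dot(oc, oc) - radius ** 2
        disc = b * b - c
        if disc < 0:
            continue                     # ray misses this sphere
        t = -b - np.sqrt(disc)           # near intersection
        if t <= 0:
            t = -b + np.sqrt(disc)       # origin inside the sphere
        if 0 < t < best[1]:
            best = (name, t)
    return best[0]
```

This also makes the disadvantage listed below concrete: the first opaque hit always wins, so transparencies, holes, and partial occlusions need special handling.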
Geometry-based Approaches
Duchowski et al. (2001)
use binocular eye tracking with a HMD
designed for diagnostic use
returns visual line of sight for each eye and computes intersection point
creates virtual line of sight (cyclopean)
computes geometry intersection based on virtual line of sight
depth information of 3D PoR is thrown away in favor of geometry-based approach (!)
Setup for binocular eye tracking in HMD-VR
Duchowski et al. (2001): Binocular Eye Tracking in VR for Visual Inspection Training. Virtual Reality Software and Technology.
Geometry-based Approaches
Advantages
Object-centered
Moving objects are easier to handle
requires only monocular eye tracking
only standard calibration needed
suggest a high achievable precision (?)
Disadvantages
Object-centered
No distribution of attention on other objects
based on strong assumptions
tracking has a high acuity (small objects/letters?, partial occlusions?)
first model hit by the ray always wins (transparencies?, geometries with holes?)
static dominant eye (changes based on task?, dual target problem?)
problems with foreground/background disambiguation
Holistic Approaches
determine the 3D PoR based on measurements alone
require at least two viewing directions (binocular eye tracking, or monocular eye tracking over time)
mapping to geometries only done for quantitative interpretation
3D PoR triangulation based on vergence
Pfeiffer et al. (2009). Evaluation of Binocular Eye Trackers and Algorithms for 3D Gaze Interaction in Virtual Reality Environments. JVRB.
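The naive vergence-based triangulation can be sketched as the midpoint of the shortest segment between the two visual lines of sight (a baseline sketch; the holistic approaches discussed here improve on it, and all names and the signature are illustrative assumptions):

```python
import numpy as np

def por_from_vergence(p_left, d_left, p_right, d_right):
    """Naive holistic 3D PoR estimate: midpoint of the shortest
    segment between the left and right visual lines of sight.

    Eye positions p_* and gaze directions d_* are assumed to come
    from the data fusion of binocular eye and head tracking.
    """
    d1 = d_left / np.linalg.norm(d_left)
    d2 = d_right / np.linalg.norm(d_right)
    w0 = p_left - p_right
    b = np.dot(d1, d2)
    d = np.dot(d1, w0)
    e = np.dot(d2, w0)
    denom = 1.0 - b * b          # = a*c - b*b with unit directions
    if abs(denom) < 1e-9:        # (nearly) parallel rays: no vergence
        return None
    t1 = (b * e - d) / denom
    t2 = (e - b * d) / denom
    return 0.5 * ((p_left + t1 * d1) + (p_right + t2 * d2))
```

Because measurement noise grows with viewing distance, the triangulated depth degrades for far targets, which is exactly the accuracy limitation noted for holistic approaches below.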
Holistic Approaches
Essig et al. (2006)
binocular eye tracking on desktop VR with anaglyph stereo presentation of target dots
machine-learning approach (parameterized self-organizing map) to 3D PoR estimation
significant reduction of error (by 45 percent) compared to naive 3D triangulation
Setup with anaglyph stereo glasses and binocular eye tracker (Eye Link I)
Essig et al. (2006). A neural network for 3D gaze recording with binocular eye trackers. The International Journal of Parallel, Emergent and Distributed Systems.
Holistic Approaches
Pfeiffer et al. (2009)
extended the approach of Essig et al. (2006) to generic 3D environments
Virtual Reality environments presented with shutter glasses
Real World environments
applied this approach to object selection tasks
Binocular eye tracker (Arrington Research), polarized stereo glasses and marker for optical tracking
Holistic Approaches
Advantages
scene-centered
spreading of attention is built in
requires binocular eye tracking for real-time performance
grounded in the measurements, not in geometry assumptions
does not require geometry models
less accurate with increasing distance, but fallback to geometry-based approach possible
Disadvantages
scene-centered
problem with moving objects
more expensive (binocular vs. monocular)
more effort in calibration
visualization requires large memory (think of several hundreds of parallel heatmaps) or GPUs
Visualizing Attention in Space
Visualizations for Geometry-based Approaches
Attentional Maps
proposed by Stellmach, Nacke and Dachselt (2010)
come in different flavours
Projected Attentional Maps (2D projection == common heatmap)
Object-based Attentional Maps
Surface-based Attentional Maps
proposed solution based on geometry-based 3D POR/Desktop VR
Visualizations for Holistic Approaches
Target Structure
Interactive visualization of the example structure in immersive virtual reality used for testing
Example for Virtual Reality
Dense structure within a volume of $30\,cm \times 30\,cm \times 30\,cm$
Constructed out of a virtual version of a wooden toy-kit
Individual building blocks extend about $1\,cm$ to $2\,cm$ (not considering bars)
Visualizations for Holistic Approaches
3D Scanpath - VP 09
Individual 3D scanpath of VP 09, specific sequence of objects, holistic 3D PoR, static radius
with $d(t)$: amplification factor depending on the duration
and computing $\sigma$ as a function of the actual distance of $\vec{x}$ from the eye $\vec{p}_{eye}$ (here cyclopean)
Image to the left shows version with fixed $\sigma$, dynamic $\sigma$ shown in video on the next slide
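The formula referenced above is not reproduced in this text; a plausible reconstruction, consistent with the Gaussian acuity model (SD of 1 degree of visual angle) used for 2D fixation maps, is:

```latex
A(\vec{x}) = \sum_{i} d(t_i)\,
  \exp\!\left( -\frac{\|\vec{x} - \vec{x}_i\|^2}{2\,\sigma(\vec{x})^2} \right),
\qquad
\sigma(\vec{x}) = \tan(1^\circ)\,\|\vec{x} - \vec{p}_{eye}\|
```

where $\vec{x}_i$ is the $i$-th 3D PoR and $t_i$ its fixation duration; the exact form of the amplification factor $d(t)$ in the original work may differ.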
Visualizations for Holistic Approaches
3D Attention Volumes
Visualizations for Holistic Approaches
3D Attention Volumes for Real Objects
Object
Image of the real world 3D object
Attention Volumes from Different Perspectives
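A 3D attention volume can be sketched as a voxel grid that accumulates Gaussian contributions of holistic 3D PoRs (a minimal sketch; the grid resolution, the fixed sigma, and all names are illustrative assumptions, while the approach described above uses a distance-dependent sigma and benefits from GPU acceleration):

```python
import numpy as np

def attention_volume(fixations, grid_min, grid_max, n=32, sigma=0.01):
    """Accumulate 3D PoRs into an n x n x n voxel grid.

    fixations: list of ((x, y, z), duration) tuples in metres.
    sigma: fixed spread of 1 cm, a simplification of the
    distance-dependent sigma used in the actual visualization.
    """
    axes = [np.linspace(grid_min[k], grid_max[k], n) for k in range(3)]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    vol = np.zeros((n, n, n))
    for (x, y, z), dur in fixations:
        sq = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
        vol += dur * np.exp(-sq / (2 * sigma ** 2))
    return vol
```

The grid makes the memory concern raised above tangible: resolution grows cubically, so dense volumes quickly demand large memory or GPU-based rendering.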
Conclusion
Reviewed the state of the art to show that
3D gaze tracking is just around the corner
basic algorithms are there for tracking and visualization
costs are still high (eye tracking system plus motion capturing), but mobile systems with lower costs are within reach
Identified necessary steps for 3D gaze tracking and visualization
Presented 3D Attention Volumes
which extend the concept of heatmaps to 3D space
can visualize not only geometry-based but also holistic 3D PoRs
are independent of a 3D geometry model (unless the application needs a semantic interpretation)
can be applied to virtual and real worlds
Open Questions / Future Work
How can we increase the validity of the estimated distribution of attention around the 3D POR, especially in depth? Is there appropriate data available?
Can/Should we compare important findings based on 2D stimuli with 3D counterparts?
How to handle dynamic environments?
How can we solve the problem of real-time model acquisition and/or update in the real world?