AudibleEyed

Participants

Project Supervisors

Motivation

Application scenario

The user wears eye-tracking glasses while playing a regular game of chess. Initially, the software follows the game silently by analyzing the live video recorded by the glasses' camera. In addition, the user's fixations (the places they look at) are processed, so that the system knows which piece the user is currently interested in. Visual buttons, i.e. physical tokens placed near the chessboard, can be fixated to issue commands. The system recognizes the user's intent and, for example, activates a mode in which the software identifies the fixated piece and announces it in natural language. A hint may also be requested, in which case the integrated chess engine calculates the best move and likewise announces it to the user.

Objectives

The project goals are

Description

As soon as a game of chess is started, the internal representation of the game is updated with the players' moves on the physical chessboard, which is filmed by the eye-tracker's main camera. Direct object recognition of chess pieces appeared unrealistic due to the pieces' insufficient visual differences, complicated further by the flat viewing angle of the video, which causes pieces to overlap. Instead, it was decided to only differentiate between occupied and unoccupied squares and to update the internal representation whenever formerly occupied squares are recognized as unoccupied and vice versa. Of course, only valid moves should result in changes to the internal representation; for this reason (and to provide hints for next moves), the chess engine Brutus was integrated into the system.
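To make the occupancy-based approach concrete, the following minimal sketch shows how an ordinary move can be inferred from two consecutive occupancy grids. This is not the project's actual code; the names Occupancy, Move, and inferMove are invented for illustration, and captures or special moves are left to be resolved against the engine's list of legal moves.

```cpp
#include <array>
#include <iostream>
#include <optional>
#include <string>
#include <vector>

// Each frame, every square is classified only as occupied or empty;
// a move is inferred from the difference between two consecutive states.
using Occupancy = std::array<bool, 64>;  // a1 = 0 .. h8 = 63

struct Move {
    int from;  // square that became empty
    int to;    // square that became occupied
};

std::string squareName(int idx) {
    std::string s;
    s += static_cast<char>('a' + idx % 8);
    s += static_cast<char>('1' + idx / 8);
    return s;
}

// A plain move frees one square and occupies another. A capture only
// frees the mover's square (the target stays occupied), so that case
// must be disambiguated via the chess engine's legal moves (Brutus
// plays this role in the project).
std::optional<Move> inferMove(const Occupancy& before, const Occupancy& after) {
    std::vector<int> freed, filled;
    for (int i = 0; i < 64; ++i) {
        if (before[i] && !after[i]) freed.push_back(i);
        if (!before[i] && after[i]) filled.push_back(i);
    }
    if (freed.size() == 1 && filled.size() == 1)
        return Move{freed[0], filled[0]};  // ordinary move
    return std::nullopt;  // capture, castling, or recognition noise
}

int main() {
    Occupancy before{}, after{};
    before[12] = true;  // e2 occupied
    after[28] = true;   // e4 occupied
    if (auto m = inferMove(before, after))
        std::cout << squareName(m->from) << " -> " << squareName(m->to) << "\n";
}
```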

Besides the moves, the user's eye fixations (positions looked at) are monitored, and corresponding actions are triggered when either a chess piece or a visual button is fixated. Two types of visual buttons are introduced: On/Off buttons, which change the system's mode of operation (the only implemented mode is "Tell fixated piece"), and buttons which trigger one-shot actions (the only implemented action is "Tell hint for next move"). In this way, the user can operate the system by simply looking at these buttons and does not need to attend to the computer itself or its artificial input devices during the game. If the "Tell fixated piece" mode is activated, the system identifies fixated pieces and announces their types in natural language. Likewise, if the user requests a hint, the proposed move is vocalized.
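The distinction between the two button types could be handled by a dispatcher along the following lines. This is only a hedged sketch: the button ids and the dispatch structure are invented for this example, and in the real system the vision component decides which button (if any) a fixation hits.

```cpp
#include <functional>
#include <iostream>
#include <map>
#include <string>

// On/Off button: toggles a mode of operation on each fixation.
struct ToggleButton {
    bool active = false;
};

int main() {
    ToggleButton tellPieceMode;  // the "Tell fixated piece" mode

    // Action buttons fire a one-shot command when fixated.
    std::map<std::string, std::function<void()>> actionButtons = {
        {"hint", [] { std::cout << "Requesting hint from engine...\n"; }},
    };

    // Simulated stream of fixated button ids (placeholder input).
    for (const std::string& id : {"tell_piece", "hint", "tell_piece"}) {
        if (id == "tell_piece") {
            tellPieceMode.active = !tellPieceMode.active;
            std::cout << "Tell-piece mode "
                      << (tellPieceMode.active ? "on" : "off") << "\n";
        } else if (auto it = actionButtons.find(id); it != actionButtons.end()) {
            it->second();  // trigger the action once
        }
    }
}
```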

The whole system is developed in C++ on Microsoft Windows, because the eye-tracking glasses used here, made by SMI, ship with software for Windows environments only. The system consists of several components communicating with each other over RSB. Only the component publishing eye-tracking data and the component for speech synthesis (which uses the built-in Windows speech API) are actually restricted to Microsoft Windows, and RSB allows multi-platform communication; nevertheless, for simplicity, the whole system is developed and run under Microsoft Windows.
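For illustration, the core of the speech-synthesis component could look roughly like the following snippet, which follows the standard ISpVoice usage pattern of the Windows speech API (SAPI). It is not the project's actual code, and the spoken text is a placeholder; in the real system the text to speak would arrive over RSB.

```cpp
// Link against sapi.lib and ole32.lib.
#include <windows.h>
#include <sapi.h>

int main() {
    if (FAILED(::CoInitialize(NULL)))
        return 1;

    ISpVoice* voice = NULL;
    HRESULT hr = ::CoCreateInstance(CLSID_SpVoice, NULL, CLSCTX_ALL,
                                    IID_ISpVoice, (void**)&voice);
    if (SUCCEEDED(hr)) {
        // Speak synchronously; "White queen" stands in for the text
        // the component would receive (a piece name or a hint move).
        voice->Speak(L"White queen", SPF_DEFAULT, NULL);
        voice->Release();
    }
    ::CoUninitialize();
    return 0;
}
```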

Results

The described functionality was successfully implemented, resulting in a system of seven simultaneously running executables that interact through RSB. Communication via RSB was perceived as reliable and generally fast enough, although occasional lags occurred. These, however, may have been caused by the system itself and were not problematic here.

The only major drawback is that, for now, colored poker chips had to be used instead of traditional chess pieces, because the object recognition proved not stable enough: smaller pieces (such as pawns) may be concealed by larger ones (such as queens), and black pieces on black squares were hard to detect. Support for traditional pieces is planned; see the outlook below.

On January 29, 2013, the system was demonstrated live during the final event of the course. It performed as planned, and the presentation as a whole was received very positively. A recording of the presented system is available as an interaction video, which shows the described features: announcing fixated pieces and giving hints. The poor quality of the displayed images is not caused by bad video quality but is in fact an accurate visualization of the images processed by the system: the eye-tracker used is a development version in both hardware and software and apparently has quite a few problems delivering the recorded images. The frame rate itself is low (approx. 5 Hz) and not constant, because images without a valid fixation are not displayed. This output is shown only to illustrate the presentation and is normally not visible to the user.

Discussion and Conclusion

Once the system was usable, it soon became apparent that the user notices the eye-tracker much less after a few minutes of concentrated play. Furthermore, communication via visual buttons proved to be a very intuitive concept, and it is quite fascinating how even people who have never used an eye-tracker before immediately feel at ease using it to issue commands. One could imagine that humans (and primates in general) are inherently used to sending signals via eye movements. On the other hand, the mere fact that no intermediate mechanical device (such as a keyboard or mouse) is involved here could by itself explain the observed ease of use.
The concept of controlling systems via visual buttons is interesting and could find everyday use once light and wireless eye-trackers become available.

Because the internal representation is computed from changes between occupied and unoccupied squares, every game must be watched from the very beginning, and every move must be recognized before the next one is executed. Since moves are typically recognized in less than a second, this is no problem in real games. As feedback, the system is configured to inform the user whenever a move has been recognized.

Outlook