Controlling Mouse Cursor using Head Movement
Gunawan
Department of Electrical Engineering, Faculty of Industrial Technology
Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia
Email: [email protected]
FX. Ferdinandus, Tri Kurniawan Wijaya, Indra Maryati, and Edwin Seno Dwihapsoro
Department of Computer Science, Sekolah Tinggi Teknik Surabaya
Surabaya 60284, Indonesia
Email: [email protected], [email protected], [email protected]



Abstract— In this research we built a tool and a program that can replace the behavior of the mouse. The results show that the system can successfully replace mouse operations that are usually performed by hand. Furthermore, the system can be used to help people with quadriplegia (paralysis) operate a computer easily. To achieve high accessibility, the necessary tools have to be inexpensive and easy to purchase. We therefore selected an input method that uses a circuit of LEDs and a webcam. There are 2 LEDs on the circuit because we need 2 light sources to replace the mouse behavior. The image captured by the webcam is filtered using an image segmentation technique (thresholding). The first light source is used as the head-tilt detector; head tilt replaces the left-click and the right-click. The second light source is used as the head-motion detector, which replaces mouse movement. The head movement tracking device uses 2 LEDs, wires, several resistors, and an energy source (battery).
I. INTRODUCTION

A person who suffers from quadriplegia cannot move his or her arms and legs to operate a mouse and keyboard. A special tool is therefore needed so that the person can operate a computer. There is now equipment and software that can help people with quadriplegia interact with computers. Unfortunately, most of this equipment and software is difficult to purchase. The difficulty stems mostly from high market prices, the small number of units available, and the rules on buying and selling electronic goods across countries.
One recently released detector is the “Ocular Mouse”, an eye muscle detector. However, the tool is sold only to certain circles and is expensive. There is also a head movement tracking device called TrackIR, which is relatively well known compared to other tracking devices. Its primary function is playing games. However, a tool that can detect head movements can also be used to help people with quadriplegia operate a computer. Like the other head movement tracking devices available on the market, though, it is relatively expensive and difficult to buy in several countries, including Indonesia.

Manuscript received October 9, 2009.
FX. Ferdinandus, Tri Kurniawan Wijaya, and Indra Maryati are with the Department of Computer Science, Sekolah Tinggi Teknik Surabaya, Surabaya 60284, Indonesia (email: {ferdi, tritritri}@stts.edu, [email protected]).
Gunawan is with the Department of Electrical Engineering, Faculty of Industrial Technology, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia (email: [email protected]).
Edwin Seno Dwihapsoro was with the Department of Computer Science, Sekolah Tinggi Teknik Surabaya, Surabaya 60284, Indonesia.

II. IMAGE SEGMENTATION
Thresholding is a simple method of image segmentation. It creates a binary image from a grayscale image. A binary image is an image in which each pixel can take only 2 different values. The values usually chosen are those that represent black and white, although values that represent other colors can also be used.
In the binary image, one pixel value is used as the foreground/object color, while the other value is used as the background. The ultimate purpose of thresholding is to simplify the representation of the image so that it can be analyzed more easily.
We chose the thresholding method because it lets us easily detect and differentiate the features of the tracked object from the background. Since thresholding is a simple method, the required computation is not expected to overload the system, and the minimum specifications required can be kept as low as possible.
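The thresholding step described above can be illustrated with a minimal Python sketch. It is not the Ensotracker implementation (which was written in Delphi); the function names, the channel-average grayscale conversion, and the threshold level of 200 are assumptions chosen only for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of 0-255 gray levels."""
    # Assumption: a plain channel average; the paper does not state its formula.
    return [[(r + g + b) // 3 for (r, g, b) in row] for row in rgb_image]


def threshold(gray_image, level):
    """Return a binary image: 1 (object) where gray >= level, else 0 (background)."""
    return [[1 if pixel >= level else 0 for pixel in row] for row in gray_image]


# A 2x3 test frame: one bright pixel (the LED) on a dark background.
frame = [
    [(10, 10, 10), (250, 255, 245), (20, 30, 10)],
    [(0, 0, 0), (90, 100, 80), (5, 5, 5)],
]
binary = threshold(to_grayscale(frame), level=200)
print(binary)
```

Only the pixel brighter than the chosen level survives as an object pixel, which is why the method works best when the LED is much brighter than everything else in the frame.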
III. HEAD MOVEMENT TRACKING DEVICE
Several targets served as references in building the head movement tracking system. The goal is a system with high accessibility. The factors used as benchmarks of accessibility include:
1) Cost. So that the motion tracking system can be used by many people, the overall cost of the components the system needs to work well is kept as low as possible.
2) Availability of the required components. The components needed for the motion tracking system are chosen to be as common as possible, so that prospective users have no difficulty finding them.
3) Flexibility of use. The head movement tracking system is designed to be as flexible as possible. This can be seen in the many alternative forms and components that can be used to build the tracking device. The software also offers many settings that can be changed according to the user's wishes.
For the tracking system to reach the above targets, the cost must be kept as low as possible. One of the costs with the greatest potential for reduction is that of the video input device. In general, for a system to track an object, it must be able to clearly distinguish the special features of the tracked object from other objects in the image.
Low-end and mid-range video input devices generally produce poor images (dark and noisy). There are usually settings that can improve the quality of the captured image, but image quality is usually inversely proportional to the number of frames that the video input device can capture per second (FPS, frames per second). A decrease in FPS lengthens the response time of movement detection and tracking. To solve this problem (and to widen the range of video input devices that can be used), we made a tool that reduces the impact.
We made a head movement tracking system rather than an eye movement tracking system for several reasons: to solve the FPS problem (head movement is easier to detect than eye movement), to reduce the cost of the system, to develop a system with high accessibility, and to develop a system that is more practical to use.

Fig. 1. The recent eye movement tracking device

Eye movement tracking devices that use video input generally have a high manufacturing cost, are large, and are less practical to use. Fig. 1 (a) shows an eye movement tracking device that uses an early version of the video input devices. The device is still not very accurate, yet it requires a top-class camera. It is also bulky and impractical. Fig. 1 (b) shows an eye movement tracking device that uses the latest video inputs; it was developed through the cooperation of several companies, including Vision Systems International, Elbit Systems, Rockwell Collins, and Helmet Integrated Systems.
The eye movement tracking device in Fig. 1 (b) is used by United States combat pilots in the latest-generation fighter, the F-35 Lightning II. However, an eye movement tracking device with video input (even without the helmet) is still large and requires more expensive components.
Recently, a new eye movement tracking device, the “Ocular Mouse”, has come from Brazil. The Ocular Mouse was developed over 5 years by a team of Brazilian scientists headed by Professor Manuel Cardoso, a graduate in Electrical Engineering of the Federal University of Rio de Janeiro (UFRJ). Its development was sponsored by an organization named the Paulo Feitosa Brazil Foundation (PFF). The general function of the device is almost exactly the same as that of the head movement tracking system developed in this research: helping people with quadriplegia interact with computers.
However, the Ocular Mouse does not use a video input device; it uses special sensors that detect eye muscle movements. The sensors are placed around the user's eyes; at least 6 sensors need to be attached to specific parts of the user's head. The sensors are then plugged into a unit that processes the eye muscle movement data. The instrument used to obtain the data requires a PC with an RS-232 serial connection. The price is about US$ 200, and the device is available only to businesses.
IV. SYSTEM ARCHITECTURE

The block diagram of the main system is shown in Fig. 2. The system gets its input from the head marker tool, which emits light. The light is then captured by the video input device.
The light captured by the video input device is processed physically (by the device's sensor) and temporarily stored in the device as data that represents a digital image, for example in JPEG format. To produce moving images, the webcam captures several images per second (FPS), depending on the specification of the webcam and the software settings allowed.

Fig. 2. Block diagram of the main system

The video input device used in the system has to support Microsoft DirectShow. The video input device is connected to the operating system through a driver. In this context, a driver is a computer program that connects hardware with software, such as the operating system. Drivers connect the hardware and software through the computer bus or other communication subsystem to which the hardware is attached. A computer bus is a subsystem that transfers data between components inside a computer, or between computers.
DirectShow is an API (Application Programming Interface) developed by Microsoft. One of its functions is to help software developers with a variety of streaming media operations. An API is a set of classes, procedures, and functions of an operating system, library, or service made to help link a program with other programs.
The program requests the image captured by the webcam through DSPack234 (in this research, DSPack234 is a collection of components and additional classes that can be used to connect Delphi with DirectShow), which gets the image data from DirectShow. The image is then processed to obtain the relevant information so that the program can provide output in the form of emulated mouse movements and actions. In addition, the emulated mouse movements and actions can be used to activate and operate the Windows on-screen keyboard. The combination of the program's output and the Windows on-screen keyboard can then be used to activate or operate other applications, such as an internet browser for surfing the internet.
V. ENSOTRACKER
We named our system Ensotracker. As previously explained, Ensotracker gets the image from the Windows DirectShow API. The source image is captured by the video input device as a digital representation of the image. Fig. 3 shows that the program obtains the original image captured by the webcam via DSPack234. The image captured by the video input device used in Ensotracker is distributed through the Windows DirectShow API.
After the program obtains the original image, the image is segmented by light intensity using the thresholding method. In this method, the initial image is first converted into a grayscale image. Next, the grayscale image is transformed into a binary image using a threshold value predetermined by the user. In Ensotracker, the transformation (grayscale and binary) proceeds from the top-left coordinate to the bottom-right coordinate of the image.

Fig. 3. Diagram of the processes that occur in Ensotracker

Ensotracker obtains the light source positions as the program transforms the grayscale image into a binary image. During the transformation, when the program finds an object pixel, it stores that pixel's position as the first light source position. After that, the program does not record another object pixel position until the scan has advanced a certain distance. With the default settings, the required distance is 20 pixels downward (vertical). Once the scan has covered that distance, if another object pixel is found, its position is stored as the next light source position. This process continues until image acquisition from the video input device is stopped or the Ensotracker program is closed.
After the program has the first and second light source positions, it can perform the mouse emulation. However, Ensotracker cannot perform the mouse emulation when the number of light sources is more or less than 2, because the program uses 2 light sources as its input. The first light source is used to determine the head-tilt angle, while the second is used as the head position detector. Mouse emulation cannot be done with more than 2 light sources because the position and condition of the head marker tool in the captured image cannot be determined: the first and second detected positions may not belong to the head marker tool at all, but to noise.
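The scan described above can be sketched in Python as follows. This is one possible reading of the procedure, not the original Delphi code: the function name `find_light_sources`, the parameter `min_gap`, and the row-skipping logic are assumptions, with `min_gap=20` mirroring the default vertical distance mentioned in the text.

```python
def find_light_sources(binary_image, min_gap=20):
    """Scan top-left to bottom-right and return (x, y) of each light source.

    After an object pixel is recorded, further object pixels are ignored
    until the scan is at least `min_gap` rows below the recorded one.
    """
    positions = []
    next_row = 0  # first row in which a new light source may be recorded
    for y, row in enumerate(binary_image):
        if y < next_row:
            continue
        for x, pixel in enumerate(row):
            if pixel:
                positions.append((x, y))
                next_row = y + min_gap
                break
    return positions


# 10x60 binary test image with two blobs more than 20 rows apart.
image = [[0] * 10 for _ in range(60)]
image[5][3] = image[5][4] = image[6][3] = 1  # first LED (3 pixels)
image[40][7] = 1                             # second LED

sources = find_light_sources(image)
can_emulate = len(sources) == 2  # emulation needs exactly 2 light sources
print(sources, can_emulate)
```

Under this reading, the three pixels of the first blob collapse into a single position because rows 6 to 24 are skipped, and emulation is allowed only when exactly 2 positions are found.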
VI. ANGLE CALCULATION
As shown in Fig. 4, the angle θ is obtained from the deflection between the first light source (box 1) of triangle A and the second light source (box 2). If the X and Y positions of box 1 and box 2 (X1, X2, Y1, Y2) are known, θ is calculated as:
θ = arctan((X2 - X1) / (Y2 - Y1)) * 180/pi    (1)

Fig. 4. Angle calculation

The above formula can lead to an error if the X and Y values of box 1 and box 2 are not checked first. The calculation requires the differences in X and in Y between the two boxes to be at least 1. Special attention must be given to the difference in Y, because a zero difference causes a “division by zero” error. The factor 180/pi converts radians to degrees: we need θ in degrees, but the arctan calculation returns radians by default.
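A small Python sketch of formula (1) with the division-by-zero check might look like this. The function name `tilt_angle_degrees` and the choice to return `None` when the two boxes share the same Y value are assumptions; the paper does not say how Ensotracker handles that case.

```python
import math


def tilt_angle_degrees(x1, y1, x2, y2):
    """Angle from formula (1), in degrees, or None if Y2 - Y1 is zero."""
    dx = x2 - x1
    dy = y2 - y1
    if dy == 0:
        # Guard against the "division by zero" case noted in the text.
        return None
    # math.atan returns radians; math.degrees applies the 180/pi factor.
    return math.degrees(math.atan(dx / dy))


print(tilt_angle_degrees(0, 0, 10, 10))  # boxes on a diagonal
print(tilt_angle_degrees(5, 0, 5, 30))   # boxes vertically aligned
print(tilt_angle_degrees(0, 7, 10, 7))   # same Y: no angle computed
```

An alternative that avoids the explicit check is `math.atan2(dx, dy)`, which is defined even when `dy` is zero; whether that behavior suits the click detection is a design decision outside what the paper describes.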
VII. DIRECTSHOW
Microsoft DirectShow is an API developed by Microsoft that Windows applications can use to interact with and control Windows media input devices, for example camcorders, webcams, DVD drives, TV tuners, and analog video input devices. DirectShow can also be used to play media files.
DirectShow's flexibility comes from its modular approach. Audio and video files are treated as data streams, and software modules can control the streams as the media input device sends its data to the output device, for example webcam data before it reaches the monitor, or sound card data before it reaches the speaker.
VIII. TESTING
The system was tested on several cases: mouse clicks, internet browsing, and running other applications using Ensotracker.
The internet browsing test starts by opening the internet browser application, entering a URL, and clicking a link. In the test, the URL used is www.google.com and the link used is “About Google”.
To test running other applications using Ensotracker, we tested Notepad and a game, “Smashing”. In Notepad, we asked the user to type “ABCDEFGHIJ” (using the on-screen keyboard). In Smashing, the user plays a game similar to Arkanoid. Arkanoid is a simple game that requires the user to move a board to the right and to the left. A ball bounces continuously around the screen. The board is located at the bottom of the screen and is used to prevent the ball from falling below the screen (the ball bounces back when it reaches the board or the walls at the top, left, and right of the screen).
The results were collected from respondents who were asked to perform the test activities above. The respondents were asked to use only the Ensotracker system and were not allowed to use their hands or feet during a test run.
From the test results, it can be concluded that the system can help people browse the internet without using their hands or feet. It can thus also be concluded that the Ensotracker system can help people with quadriplegia browse the internet.
The respondents had no difficulty either opening a text file or playing “Smashing”. In Notepad, the respondents could type the letters “ABCDEFGHIJ” without problems. In “Smashing”, which is an Arkanoid-style game, the respondents had no problems operating the reflecting board, which is controlled with the mouse pointer. From these results it can be concluded that the system can help people run and operate applications other than the internet browser without using their hands or feet.

IX. CONCLUSION AND FUTURE WORK
Besides building the head movement tracking device, in this research we also showed that the thresholding method can be applied to detect and track the position of an object/light. However, the method has a limitation: to work properly, the ambient light and background must be darker than the tracked object/light.
In this research, we succeeded in implementing, in an inexpensive way, a device that can replace mouse input using a head movement tracking device. The device is useful primarily for people with quadriplegia. However, the resolution and image quality produced by the video input device strongly influence the quality of the head movement tracking and the mouse emulation.
For further development of the system, the light position tracking process could be improved with an algorithm that handles noise better, and a larger image resolution could refine the movement of the mouse pointer.