Eye-To-Text Communication Based on Human-Computer Interface Method
Abstract—Eye-to-text communication is a technology that has gained significant importance in recent years in the field of human-computer interaction (HCI), becoming increasingly necessary for people with speech impairments or movement disabilities. A webcam, a Raspberry Pi 3, and a display screen have been used to gather data from the eye's movement in the left, right, top, bottom, and blink directions. This research aims to design a low-cost, non-invasive system that converts eye movements to text and previews the typing on the display screen in real-time. The system utilizes the OpenCV2, Dlib, Numpy, and Pandas libraries to collect data from a webcam and export the values extracted from the user's eye movements to an Excel sheet. The proposed system calculates the ratio in two stages. First stage: detect eye blinking with the Dlib library by locating facial landmarks in the eye region and then calculating the eye aspect ratio (EAR) between these landmarks. Second stage: use the OpenCV2 library to convert the image into grayscale format, create a black mask to isolate the eye region, count the white pixels on the left and right sides, and divide the left-side pixels by the right-side pixels to obtain the ratio that indicates the gaze direction. In our experiments, the algorithms produced satisfying results, with an overall real-time accuracy of 93.7%: the Dlib algorithm achieved 94.5% and the OpenCV2 algorithm achieved 92.9%.

Keywords—Eye-to-text communication, landmarks, OpenCV2 library, Dlib library, ratio, pixels

I. INTRODUCTION

The development of eye-to-text communication devices has become a helpful resource for people with many impairments. These systems use computer vision (CV) to monitor eye movement and translate it into text, allowing users a more natural and efficient means of communication [1]. While OpenCV2 provides various tools for analyzing photos and videos, Dlib offers powerful algorithms for identifying facial landmarks, enabling precise eye tracking. Integrating these libraries can create an effective eye-tracking system that mediates data transfer from gaze to words [2]. Improvements to the system's algorithms allow for more accurate monitoring of the user's eyes and less lag time [3-5].

Furthermore, the system utilizes a webcam as its primary data source, making it easily accessible and inexpensive. Observing eye movements is made possible by a webcam's constant feed of live video of the user's face. The data is then fed into eye-tracking algorithms, which precisely identify the user's gaze direction and focus. Once the data has been retrieved, it may be converted into text, giving people a new means of expressing their thoughts and communicating efficiently [6].

Srinivas et al. [7] suggested a novel approach for overcoming the communication challenges faced by paralyzed people, who retain their mental faculties but cannot use their bodies in any way. They created an interactive system that responds in real-time to eye blinks, allowing the paralyzed to communicate again. The device creates an alert signal in response to patient needs, such as asking for water or food; when the blink count reaches the maximum threshold, the system sounds an alarm and transmits an audio message. Applying a Haar cascade classifier for face and eye detection enables real-time blink detection; the Euclidean distances between the eye landmarks then determine the eye aspect ratio (EAR). With reliable eye detection and face tracking, the system can determine how often a patient blinks in each frame. However, the system must be positioned in a well-lit area to cope with various lighting settings.

Bharath et al. [8] presented a system that allows people who are paralyzed or have disabilities to use a computer without their hands, for example through a virtual keyboard and mouse. The system captures facial expressions with a webcam as its primary input, especially eye and mouth movements, to control the virtual keyboard and mouse. The technique uses a Haar classifier to detect and extract the eye and facial regions. The system enables the user to scroll in various directions with mouse movement and to type on a virtual keyboard by selecting the desired keys through mouth movement, without any additional assistance from another person. The results demonstrated that the system integrates all types of user input and improves cursor utilization across various contexts.

Băiașu et al. [9] presented an eye-tracking system for driver fatigue and drowsiness detection. Because fatigue and distraction are significant causes of traffic accidents, the suggested device detects closed eyes, indicates tiredness, and alarms the driver to prevent accidents. The system regularly captures and analyzes eye pictures, applies the Haar algorithm to recognize the driver's eyes, and determines the best threshold for accurate identification. The technology recognizes the eyes, nose, lips, and eyebrows; the eye area is then measured vertically and horizontally using Euclidean distance, and these measures yield the Eye Aspect Ratio (EAR), which indicates the driver's attention. Accuracy could be increased by adding an infrared video camera for nighttime or low-light conditions and by considering drivers who wear sunglasses.
1. Proposed System
Authorized licensed use limited to: Amrita School of Engineering. Downloaded on February 11,2025 at 06:57:42 UTC from IEEE Xplore. Restrictions apply.
2023 IEEE 13th International Conference on System Engineering and Technology (ICSET), 2 October 2023, Shah Alam, Malaysia
Figure 4. Face detection (a) and (b) grayscale.
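The grayscale output shown in Figure 4 feeds the pixel-counting stage summarized in the abstract: the eye region is binarized and the white (sclera) pixels on each side are counted. A minimal sketch, assuming the eye region has already been cropped; the brightness threshold of 70 is an assumed value, not one taken from the paper.

```python
import numpy as np

def white_pixel_halves(eye_gray, thresh=70):
    """Binarize a grayscale eye region and count white (sclera) pixels
    in its left and right halves. `thresh` is an assumed brightness cutoff."""
    binary = eye_gray > thresh              # True where the sclera is visible
    h, w = binary.shape
    left = int(binary[:, : w // 2].sum())   # white pixels in the left half
    right = int(binary[:, w // 2 :].sum())  # white pixels in the right half
    return left, right
```

Dividing the two counts gives the left/right ratio used to decide the horizontal gaze direction; splitting the region into top and bottom halves instead gives the vertical counterpart.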
The distance between two points is measured horizontally and vertically. The left eye's horizontal line length is the distance between points 36 and 39, and its vertical line connects the midpoint of points 37 and 38 to the midpoint of points 40 and 41. As shown in Figure 8, the eye aspect ratio is obtained using equations (1) and (2) below:

EAR = (‖P2 − P6‖ + ‖P3 − P5‖) / (2‖P1 − P4‖)        (1)

The left-eye region uses the landmarks [36, 37, 38, 39, 40, 41], and the right-eye region uses the landmarks [42, 43, 44, 45, 46, 47].

A sideways gaze is indicated when the sclera covers the eye's left side while the pupil and iris point in the opposite direction [19]. Likewise, when the sclera covers the lower side and the pupil and iris are on the top side, the gaze is upward, as shown in Figure 10.
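Equation (1) translates directly into a few lines of Python. The sketch below assumes the six (x, y) landmarks of one eye are supplied in the order P1–P6 (for the left eye, Dlib points 36–41).

```python
import numpy as np

def eye_aspect_ratio(pts):
    """EAR per equation (1): the sum of the two vertical landmark
    distances divided by twice the horizontal distance.
    `pts` holds the six eye landmarks P1..P6 as (x, y) pairs."""
    p1, p2, p3, p4, p5, p6 = (np.asarray(p, dtype=float) for p in pts)
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = 2.0 * np.linalg.norm(p1 - p4)
    return vertical / horizontal
```

An open eye yields a roughly constant EAR, while a blink drives the vertical distances, and hence the EAR, toward zero.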
Ratio = Left-side white pixels / Right-side white pixels        (3)

Ratio = Top-side white pixels / Bottom-side white pixels        (4)

III. RESULTS

1. Eye blinking using the Dlib library

The graph of eye-blinking signals for the dataset is displayed in Figure 15. The EAR readings for eye blinks varied from 0.05 to 0.4 across the 14 video frames taken from the webcam; this variation depends on the chosen threshold. When the EAR is greater than the threshold value of 0.20, neither an eye blink nor a downward eye direction has occurred. However, if the EAR is less than the threshold of 0.20, an eye blink is registered; continual eyelid closing is detected when the EAR falls below 0.17, and an EAR between 0.18 and 0.19 signifies a downward eye direction. The results showed that the system achieved an accuracy of 94.5%.

Figure 17. Eye movement directions.
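The EAR thresholds reported for the Dlib stage can be collected into a single decision function. The band boundaries below are taken from the text; how values falling between the named bands are treated is an assumption.

```python
def classify_ear(ear):
    """Map an EAR value to an eye state using the reported thresholds:
    >= 0.20 open, < 0.17 continual eyelid closing, 0.18-0.19 looking
    down, otherwise counted as a blink (boundary handling is assumed)."""
    if ear >= 0.20:
        return "open"    # no blink and no downward gaze
    if ear < 0.17:
        return "closed"  # continual eyelid closing
    if 0.18 <= ear <= 0.19:
        return "down"    # downward eye direction
    return "blink"       # below 0.20 but outside the other bands
```

Running this classifier on the EAR of each frame yields the per-frame eye states from which blink counts and direction signals are derived.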