Investigating unimodal isolated signer-independent sign language recognition
- Authors: Marais, Marc Jason
- Date: 2024-04-04
- Subjects: Convolutional neural network , Sign language recognition , Human activity recognition , Pattern recognition systems , Neural networks (Computer science)
- Language: English
- Type: Academic theses , Master's theses , text
- Identifier: http://hdl.handle.net/10962/435343 , vital:73149
- Description: Sign language serves as the mode of communication for the Deaf and Hard of Hearing community, embodying a rich linguistic and cultural heritage. Recent Sign Language Recognition (SLR) system developments aim to facilitate seamless communication between the Deaf community and the broader society. However, most existing systems are limited by signer-dependent models, hindering their adaptability to diverse signing styles and signers, thus impeding their practical implementation in real-world scenarios. This research explores various unimodal approaches, both pose-based and vision-based, for isolated signer-independent SLR using RGB video input on the LSA64 and AUTSL datasets. The unimodal RGB-only input strategy provides a realistic SLR setting where alternative data sources are either unavailable or necessitate specialised equipment. Through systematic testing scenarios, isolated signer-independent SLR experiments are conducted on both datasets, primarily focusing on AUTSL – a signer-independent dataset. The vision-based R(2+1)D-18 model emerged as the top performer, achieving 90.64% accuracy on the unseen AUTSL test split, closely followed by the pose-based Spatio-Temporal Graph Convolutional Network (ST-GCN) model with an accuracy of 89.95%. Furthermore, these models achieved comparable accuracies at a significantly lower computational demand. Notably, the pose-based approach demonstrates robust generalisation to substantial background and signer variation, and it requires significantly less computational power and training time than the vision-based approach. Both proposed unimodal systems, pose-based and vision-based, proved effective at classifying the sign classes in the LSA64 and AUTSL datasets (see the illustrative sketch after this record). , Thesis (MSc) -- Faculty of Science, Ichthyology and Fisheries Science, 2024
- Full Text:
- Date Issued: 2024-04-04
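To make the vision-based pipeline described in the abstract concrete, the sketch below shows one way an off-the-shelf R(2+1)D-18 video backbone (as provided by torchvision) might be adapted for isolated sign classification on AUTSL. This is a minimal illustration under stated assumptions, not the thesis's implementation: the Kinetics-400 pretrained weights, clip length, frame resolution, and replaced classification head are choices made here for demonstration only.

```python
# Minimal sketch (not the thesis implementation): adapting torchvision's
# R(2+1)D-18 video model for isolated sign classification on AUTSL.
# Clip length, resolution and the use of Kinetics-400 weights are assumptions.
import torch
import torch.nn as nn
from torchvision.models.video import r2plus1d_18, R2Plus1D_18_Weights

NUM_CLASSES = 226   # AUTSL contains 226 sign classes
CLIP_LEN = 16       # assumed number of RGB frames sampled per sign video

# Pretrained spatio-temporal backbone with a new classification head.
model = r2plus1d_18(weights=R2Plus1D_18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# A dummy RGB clip shaped (batch, channels, frames, height, width).
clip = torch.randn(1, 3, CLIP_LEN, 112, 112)
logits = model(clip)                  # -> shape (1, 226)
predicted_sign = logits.argmax(dim=1)
```

The pose-based ST-GCN branch discussed in the abstract would instead consume per-frame skeleton keypoints rather than raw RGB frames, which is what keeps its computational cost and training time comparatively low.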
A natural user interface architecture using gestures to facilitate the detection of fundamental movement skills
- Authors: Amanzi, Richard
- Date: 2015
- Subjects: Human activity recognition , Human-computer interaction
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10948/6204 , vital:21055
- Description: Fundamental movement skills (FMSs) are considered to be one of the essential phases of motor skill development. The proper development of FMSs allows children to participate in more advanced forms of movement and sport. To perform an FMS correctly, children need to learn the right way of performing it. By making use of technology, a system can be developed to help facilitate the learning of FMSs. The objective of the research was to propose an effective natural user interface (NUI) architecture for detecting FMSs using the Kinect. In order to achieve this objective, an investigation into FMSs and the challenges faced when teaching them was presented. An investigation into NUIs was also presented, including the merits of the Kinect as the most appropriate device to facilitate the detection of an FMS. An NUI architecture was proposed that uses the Kinect to facilitate the detection of an FMS, and a framework was implemented from the design of the architecture. The successful implementation of the framework provides evidence that the design of the proposed architecture is feasible. An instance of the framework incorporating the jump FMS was used as a case study in the development of a prototype that detects the correct and incorrect performance of a jump (see the illustrative sketch after this record). The evaluation of the prototype demonstrated the following: (i) the developed prototype was effective in detecting the correct and incorrect performance of the jump FMS; and (ii) the implemented framework was robust for the incorporation of an FMS. The successful implementation of the prototype shows that an effective NUI architecture using the Kinect can be used to facilitate the detection of FMSs. The proposed architecture provides a structured way of developing such a system, allowing developers to add further FMSs in future. This dissertation therefore makes the following contributions: (i) an experimental design to evaluate the effectiveness of a prototype that detects FMSs; (ii) a robust framework that incorporates FMSs; and (iii) an effective NUI architecture to facilitate the detection of fundamental movement skills using the Kinect.
- Full Text:
- Date Issued: 2015
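As a companion to the record above, the sketch below illustrates one way a simple rule-based jump detector over Kinect-style skeleton data could look. It is a hypothetical illustration only: the joint name, the frame representation, and the displacement threshold are assumptions, and the dissertation's actual NUI architecture and framework are not reproduced here.

```python
# Hypothetical sketch (not the dissertation's framework): detecting a jump
# from a sequence of Kinect-style skeleton frames by tracking the vertical
# displacement of a single joint. Joint name, units and threshold are assumed.
from typing import Dict, List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) position in metres
SkeletonFrame = Dict[str, Joint]     # joint name -> position for one frame

def detect_jump(frames: List[SkeletonFrame],
                joint: str = "spine_base",
                min_rise: float = 0.15) -> bool:
    """Return True if the tracked joint rises at least `min_rise` metres
    above its height in the first frame at any point in the sequence."""
    if not frames:
        return False
    baseline_y = frames[0][joint][1]
    peak_y = max(frame[joint][1] for frame in frames)
    return (peak_y - baseline_y) >= min_rise
```

In a fuller system of the kind the abstract describes, detectors like this would be wrapped in a framework so that additional FMSs can be incorporated without changing the surrounding architecture.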