An evaluation of hand-based algorithms for sign language recognition
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465124 , vital:76575 , xlink:href="https://ieeexplore.ieee.org/abstract/document/9856310"
- Description: Sign language recognition is an evolving research field in computer vision, assisting communication for hearing-disabled people. Hand gestures contain the majority of the information when signing. Focusing on feature extraction methods to obtain the information stored in hand data may improve classification accuracy in sign language recognition. Pose estimation is a popular method for extracting body and hand landmarks. We implement and compare different feature extraction and segmentation algorithms, focusing only on the hands, on the LSA64 dataset. To extract hand landmark coordinates, MediaPipe Holistic is applied to the sign images. Classification is performed using popular CNN architectures, namely ResNet and a Pruned VGG network. A separate 1D-CNN is utilised to classify the hand landmark coordinates extracted using MediaPipe. The best performance was achieved on the unprocessed raw images using a Pruned VGG network, with an accuracy of 95.50%. However, the more computationally efficient model, using the hand landmark data and a 1D-CNN for classification, achieved an accuracy of 94.91%.
- Full Text:
- Date Issued: 2022
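A minimal sketch of the landmark-based pipeline this abstract describes, assuming Keras and the public MediaPipe API; the layer sizes, zero-padding strategy and class count are illustrative assumptions, not the authors' exact configuration:

```python
# Sketch: MediaPipe Holistic hand-landmark extraction + a small 1D-CNN.
import cv2
import mediapipe as mp
import numpy as np
import tensorflow as tf

holistic = mp.solutions.holistic.Holistic(static_image_mode=True)

def hand_landmarks(image_bgr):
    """Return a (126,) vector: 2 hands x 21 landmarks x (x, y, z)."""
    results = holistic.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    coords = []
    for hand in (results.left_hand_landmarks, results.right_hand_landmarks):
        if hand is None:
            coords.extend([0.0] * 63)  # zero-pad an undetected hand (assumption)
        else:
            coords.extend(v for lm in hand.landmark for v in (lm.x, lm.y, lm.z))
    return np.array(coords, dtype=np.float32)

# 1D-CNN over the landmark vector; inputs are reshaped to (126, 1) for training.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(126, 1)),
    tf.keras.layers.Conv1D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling1D(2),
    tf.keras.layers.Conv1D(128, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="softmax"),  # 64 LSA64 sign classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```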
Deep face-iris recognition using robust image segmentation and hyperparameter tuning
- Authors: Brown, Dane L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465145 , vital:76577 , xlink:href="https://link.springer.com/chapter/10.1007/978-981-16-3728-5_19"
- Description: Biometrics are increasingly being used for tasks that involve sensitive or financial data. Hitherto, security on devices such as smartphones has not been a priority. Furthermore, users tend to ignore the security features in favour of more rapid access to the device. A bimodal system is proposed that enhances security by utilizing face and iris biometrics from a single image. The motivation behind this is the ability to acquire both biometrics simultaneously in one shot. The system’s biometric components, namely face, iris(es) and their fusion, are evaluated and compared to related studies. The best results were yielded by a proposed lightweight Convolutional Neural Network architecture, outperforming tuned VGG-16, Xception, SVM and the related works. The system shows advancements to ‘at-a-distance’ biometric recognition for devices of limited and high computational capacity. All deep learning algorithms are provided with augmented data, included in the tuning process, enabling additional accuracy gains. Highlights include near-perfect fivefold cross-validation accuracy on the IITD-Iris dataset when performing identification. Verification tests carried out on the challenging CASIA-Iris-Distance dataset performed well with few training samples. The proposed system is practical for small or large amounts of training data and shows great promise for at-a-distance recognition and biometric fusion.
- Full Text:
- Date Issued: 2022
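A rough sketch of the single-image bimodal idea in Keras; the per-modality branches, input sizes and subject count below are assumptions, not the paper's lightweight architecture:

```python
# Sketch: feature-level fusion of face and iris CNN embeddings.
import tensorflow as tf

def branch(shape, name):
    """A small CNN branch producing an embedding for one modality."""
    inp = tf.keras.layers.Input(shape=shape, name=name)
    x = tf.keras.layers.Conv2D(32, 3, activation="relu")(inp)
    x = tf.keras.layers.MaxPooling2D()(x)
    x = tf.keras.layers.Conv2D(64, 3, activation="relu")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    return inp, x

face_in, face_feat = branch((128, 128, 3), "face")  # face crop
iris_in, iris_feat = branch((64, 64, 1), "iris")    # iris crop from same image

# Fuse the two modalities by concatenating their embeddings.
fused = tf.keras.layers.Concatenate()([face_feat, iris_feat])
out = tf.keras.layers.Dense(100, activation="softmax")(fused)  # 100 subjects assumed

model = tf.keras.Model([face_in, iris_in], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```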
Deep Learning Approach to Image Deblurring and Image Super-Resolution using DeblurGAN and SRGAN
- Authors: Kuhlane, Luxolo L , Brown, Dane L , Connan, James , Boby, Alden , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465157 , vital:76578 , xlink:href="https://www.researchgate.net/profile/Luxolo-Kuhlane/publication/363257796_Deep_Learning_Approach_to_Image_Deblurring_and_Image_Super-Resolution_using_DeblurGAN_and_SRGAN/links/6313b5a01ddd44702131b3df/Deep-Learning-Approach-to-Image-Deblurring-and-Image-Super-Resolution-using-DeblurGAN-and-SRGAN.pdf"
- Description: Deblurring is the task of restoring a blurred image to a sharp one, retrieving the information lost due to the blur of an image. Image deblurring and super-resolution, as representative image restoration problems, have been studied for a decade. Due to their wide range of applications, numerous techniques have been proposed to tackle these problems, inspiring innovations for better performance. Deep learning has become a robust framework for many image processing tasks, including restoration. In particular, generative adversarial networks (GANs), proposed by [1], have demonstrated remarkable performance in generating plausible images. However, training GANs for image restoration is a non-trivial task. This research investigates optimization schemes for GANs that improve image quality by providing meaningful training objective functions. In this paper, we apply DeblurGAN and a Super-Resolution Generative Adversarial Network (SRGAN) to the chosen dataset.
- Full Text:
- Date Issued: 2022
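One such "meaningful training objective" is the SRGAN generator loss: a VGG feature (content) loss plus a weighted adversarial term. A hedged Keras sketch, assuming inputs are already preprocessed for VGG-19; the 1e-3 weighting follows the original SRGAN paper, everything else is illustrative:

```python
# Sketch: SRGAN-style perceptual loss for the generator.
import tensorflow as tf

vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
features = tf.keras.Model(vgg.input, vgg.get_layer("block5_conv4").output)
features.trainable = False  # fixed feature extractor

bce = tf.keras.losses.BinaryCrossentropy()

def generator_loss(hr, sr, fake_pred):
    """Content (VGG feature) loss plus adversarial loss for the generator."""
    content = tf.reduce_mean(tf.square(features(hr) - features(sr)))
    adversarial = bce(tf.ones_like(fake_pred), fake_pred)  # fool the discriminator
    return content + 1e-3 * adversarial
```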
Deep Palmprint Recognition with Alignment and Augmentation of Limited Training Samples
- Authors: Brown, Dane L , Bradshaw, Karen L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/440249 , vital:73760 , xlink:href="https://doi.org/10.1007/s42979-021-00859-3"
- Description: This paper builds upon a previously proposed automatic palmprint alignment and classification system. The proposed system was geared towards palmprints acquired from either contact or contactless sensors. It was robust to finger location and fist shape changes—accurately extracting the palmprints in images without fingers. An extension to this previous work includes comparisons of traditional and deep learning models, both with hyperparameter tuning. The proposed methods are compared with related verification systems and a detailed evaluation of open-set identification. The best results were yielded by a proposed Convolutional Neural Network, based on VGG-16, and outperforming tuned VGG-16 and Xception architectures. All deep learning algorithms are provided with augmented data, included in the tuning process, enabling significant accuracy gains. Highlights include near-zero and zero EER on IITD-Palmprint verification using one training sample and leave-one-out strategy, respectively. Therefore, the proposed palmprint system is practical as it is effective on data containing many and few training examples.
- Full Text:
- Date Issued: 2022
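A hedged sketch of the augmentation-plus-transfer-learning recipe the abstract mentions, in Keras; the augmentation ranges, input size and class count are assumptions:

```python
# Sketch: augmented palmprint data feeding a VGG-16-based transfer model.
import tensorflow as tf

augment = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=10, width_shift_range=0.1, height_shift_range=0.1,
    zoom_range=0.1, rescale=1.0 / 255)

base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False  # freeze; fine-tune selectively during tuning

num_classes = 100  # placeholder subject count
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# train_gen = augment.flow_from_directory("palmprints/train",
#                                         target_size=(224, 224))
# model.fit(train_gen, epochs=20)
```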
Deep palmprint recognition with alignment and augmentation of limited training samples
- Authors: Brown, Dane L , Bradshaw, Karen L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/464074 , vital:76473 , xlink:href="https://doi.org/10.1007/s42979-021-00859-3"
- Description: This paper builds upon a previously proposed automatic palmprint alignment and classification system. The proposed system was geared towards palmprints acquired from either contact or contactless sensors. It was robust to finger location and fist shape changes—accurately extracting the palmprints in images without fingers. An extension to this previous work includes comparisons of traditional and deep learning models, both with hyperparameter tuning. The proposed methods are compared with related verification systems and a detailed evaluation of open-set identification. The best results were yielded by a proposed Convolutional Neural Network, based on VGG-16, and outperforming tuned VGG-16 and Xception architectures. All deep learning algorithms are provided with augmented data, included in the tuning process, enabling significant accuracy gains. Highlights include near-zero and zero EER on IITD-Palmprint verification using one training sample and leave-one-out strategy, respectively. Therefore, the proposed palmprint system is practical as it is effective on data containing many and few training examples.
- Full Text:
- Date Issued: 2022
Exploring the Incremental Improvements of YOLOv7 over YOLOv5 for Character Recognition
- Authors: Boby, Alden , Brown, Dane L , Connan, James , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463395 , vital:76405 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-35644-5_5"
- Description: Technological advances are being applied to many aspects of life to improve quality of living and efficiency, particularly through automation in industry. The growing number of vehicles on the road has presented a need to monitor more vehicles than ever to enforce traffic rules. One way to identify a vehicle is through its licence plate, which contains a unique string of characters that makes it identifiable within an external database. Detecting characters on a licence plate using an object detector has only recently been explored. This paper uses the latest versions of the YOLO object detector to perform character recognition on licence plate images. It expands upon existing object detection-based character recognition by investigating how improvements in the framework translate to licence plate character recognition accuracy compared to character recognition based on older architectures. Results indicate that the newer YOLO models outperform older YOLO-based character recognition models such as CRNET.
- Full Text:
- Date Issued: 2022
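To make the object-detection-as-OCR idea concrete: each character class becomes a detection, and the plate string is read by sorting boxes left to right. A sketch assuming YOLOv5 via torch.hub; the weights file plate_chars.pt is a hypothetical character-detection model, not one shipped by Ultralytics:

```python
# Sketch: reading a plate string from per-character YOLO detections.
import torch

model = torch.hub.load("ultralytics/yolov5", "custom", path="plate_chars.pt")

def read_plate(image_path, conf_threshold=0.5):
    df = model(image_path).pandas().xyxy[0]   # one row per detected character
    df = df[df["confidence"] >= conf_threshold]
    df = df.sort_values("xmin")               # left-to-right reading order
    return "".join(df["name"])                # class names are the characters

print(read_plate("plate.jpg"))
```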
Improving licence plate detection using generative adversarial networks
- Authors: Boby, Alden , Brown, Dane L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/464145 , vital:76480 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-04881-4_47"
- Description: The information on a licence plate is used for traffic law enforcement, access control, surveillance and parking lot management. Existing licence plate recognition systems work with clear images taken under controlled conditions. In real-world licence plate recognition scenarios, images are not as straightforward as the ‘toy’ datasets used to benchmark existing systems. Real-world data is often noisy as it may contain occlusion and poor lighting, obscuring the information on a licence plate. Cleaning input data before using it for licence plate recognition is a complex problem, and existing literature addressing the issue is still limited. This paper uses two deep learning techniques to improve licence plate visibility towards more accurate licence plate recognition. A one-stage object detector popularly known as YOLO is implemented for locating licence plates under challenging situations. Super-resolution generative adversarial networks are considered for image upscaling and reconstruction to improve the clarity of low-quality input. The main focus involves training these systems on datasets that include difficult-to-detect licence plates, enabling better performance in unfavourable conditions and environments.
- Full Text:
- Date Issued: 2022
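A hedged sketch of the detect-then-enhance pipeline, assuming YOLOv5 via torch.hub; srgan_generator stands in for a trained SRGAN model and the detector weights plates.pt are hypothetical:

```python
# Sketch: locate the plate, crop it, then super-resolve the crop.
import cv2
import torch

detector = torch.hub.load("ultralytics/yolov5", "custom", path="plates.pt")

def enhanced_plate(image_path, srgan_generator):
    image = cv2.imread(image_path)
    boxes = detector(image_path).pandas().xyxy[0]
    if boxes.empty:
        return None  # no plate found
    best = boxes.sort_values("confidence", ascending=False).iloc[0]
    crop = image[int(best.ymin):int(best.ymax), int(best.xmin):int(best.xmax)]
    return srgan_generator(crop)  # upscale/reconstruct the low-quality crop
```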
Improving signer-independence using pose estimation and transfer learning for sign language recognition
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463406 , vital:76406 , xlink:href="https://doi.org/10.1007/978-3-031-35644-5"
- Description: Automated Sign Language Recognition (SLR) aims to bridge the communication gap between the hearing and the hearing disabled. Computer vision and deep learning lie at the forefront of work toward these systems. Most SLR research focuses on signer-dependent SLR and fails to account for variation across signers, who gesticulate naturally. This paper investigates signer-independent SLR on the LSA64 dataset, focusing on different feature extraction approaches. Two approaches are proposed: an InceptionV3-GRU architecture, which uses raw images as input, and a pose estimation LSTM architecture. MediaPipe Holistic is implemented to extract pose estimation landmark coordinates. A final third model applies augmentation and transfer learning using the pose estimation LSTM model. The research found that the pose estimation LSTM approach achieved the best performance, with an accuracy of 80.22%. MediaPipe Holistic struggled with the augmentations introduced in the final experiment; thus, introducing more subtle augmentations may improve the model. Overall, the system shows significant promise toward addressing the real-world signer-independence issue in SLR.
- Full Text:
- Date Issued: 2022
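A minimal Keras sketch of the pose estimation LSTM idea, assuming per-frame MediaPipe landmark vectors; the sequence length, feature size and layer widths are assumptions:

```python
# Sketch: LSTM classifier over per-frame pose-landmark vectors.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(60, 126)),   # 60 frames x 126 hand coordinates
    tf.keras.layers.Masking(mask_value=0.0),  # ignore zero-padded frames
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(64, activation="softmax"),  # 64 LSA64 signs
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```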
Investigating signer-independent sign language recognition on the LSA64 dataset
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden , Kuhlane, Luxolo L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465179 , vital:76580 , xlink:href="https://www.researchgate.net/profile/Marc-Marais/publication/363174384_Investigating_Signer-Independ-ent_Sign_Language_Recognition_on_the_LSA64_Dataset/links/63108c7d5eed5e4bd138680f/Investigating-Signer-Independent-Sign-Language-Recognition-on-the-LSA64-Dataset.pdf"
- Description: Conversing with hearing-disabled people is a significant challenge; however, computer vision advancements have significantly improved this through automated sign language recognition. One of the common issues in sign language recognition is signer-dependence, where variations arise from varying signers, who gesticulate naturally. Utilising the LSA64 dataset, a small-scale Argentinian isolated sign language recognition dataset, we investigate signer-independent sign language recognition. An InceptionV3-GRU architecture is employed to extract and classify spatial and temporal information for automated sign language recognition. The signer-dependent approach yielded an accuracy of 97.03%, whereas the signer-independent approach achieved an accuracy of 74.22%. The signer-independent system shows promise towards addressing the real-world and common issue of signer-dependence in sign language recognition.
- Full Text:
- Date Issued: 2022
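A hedged Keras sketch of the InceptionV3-GRU pairing: a frozen InceptionV3 extracts per-frame spatial features and a GRU models the temporal dimension; the clip length of 30 frames and the GRU width are assumptions:

```python
# Sketch: frozen InceptionV3 features per frame, then a GRU over the sequence.
import tensorflow as tf

cnn = tf.keras.applications.InceptionV3(include_top=False, weights="imagenet",
                                        pooling="avg")
cnn.trainable = False  # spatial feature extractor

frames = tf.keras.layers.Input(shape=(30, 299, 299, 3))  # 30 frames per clip
feats = tf.keras.layers.TimeDistributed(cnn)(frames)     # -> (30, 2048)
x = tf.keras.layers.GRU(128)(feats)                      # temporal modelling
out = tf.keras.layers.Dense(64, activation="softmax")(x)  # 64 LSA64 signs

model = tf.keras.Model(frames, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```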
Investigating the Effects of Image Correction Through Affine Transformations on Licence Plate Recognition
- Authors: Boby, Alden , Brown, Dane L , Connan, James , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465190 , vital:76581 , xlink:href="https://ieeexplore.ieee.org/abstract/document/9856380"
- Description: Licence plate recognition has many real-world applications, which fall under security and surveillance. Deep learning for licence plate recognition has been adopted to improve existing image-based processing techniques in recent years. Object detectors are a popular choice for approaching this task, and all object detectors are some form of convolutional neural network. The You Only Look Once framework and Region-Based Convolutional Neural Networks are popular models within this field. A novel architecture called the Warped Planar Object Detector is a recent development by Zou et al. that takes inspiration from YOLO and Spatial Transformer Networks. This paper compares the performance of the Warped Planar Object Detector and YOLO on licence plate recognition by training both models on the same data, directing their output to an Enhanced Super-Resolution Generative Adversarial Network to upscale the output image, and lastly using an Optical Character Recognition engine to classify the characters detected in the images.
- Full Text:
- Date Issued: 2022
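The "image correction" in the title amounts to warping a skewed plate back to a frontal rectangle. A sketch with OpenCV, assuming the four plate corners are supplied by a WPOD-style detector; the corner coordinates below are placeholders:

```python
# Sketch: rectify a skewed licence plate via a perspective warp.
import cv2
import numpy as np

def rectify_plate(image, corners, out_w=240, out_h=80):
    """corners: four (x, y) points ordered TL, TR, BR, BL."""
    src = np.float32(corners)
    dst = np.float32([[0, 0], [out_w, 0], [out_w, out_h], [0, out_h]])
    M = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, M, (out_w, out_h))

image = cv2.imread("scene.jpg")
plate = rectify_plate(image, [(120, 210), (380, 190), (390, 270), (130, 295)])
```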
Iterative Refinement Versus Generative Adversarial Networks for Super-Resolution Towards Licence Plate Detection
- Authors: Boby, Alden , Brown, Dane L , Connan, James
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463417 , vital:76407 , xlink:href="https://link.springer.com/chapter/10.1007/978-981-99-1624-5_26"
- Description: Licence plate detection in unconstrained scenarios can be difficult because of the medium used to capture the data. Such data is not captured at very high resolution for practical reasons. Super-resolution can be used to improve the resolution of an image with fidelity beyond that of non-machine learning-based image upscaling algorithms such as bilinear or bicubic upscaling. Technological advances have introduced more than one way to perform super-resolution, with the best results coming from generative adversarial networks and iterative refinement with diffusion-based models. This paper puts the two best-performing super-resolution models against each other to see which is best for licence plate super-resolution. Quantitative results favour the generative adversarial network, while qualitative results lean towards the iterative refinement model.
- Full Text:
- Date Issued: 2022
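The quantitative side of such a comparison is typically PSNR and SSIM against a ground-truth image. A sketch with scikit-image; the file names are placeholders:

```python
# Sketch: scoring two super-resolved plates against the ground truth.
import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

gt = cv2.imread("plate_hr.png", cv2.IMREAD_GRAYSCALE)
for name in ("plate_gan.png", "plate_diffusion.png"):
    sr = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    print(name,
          "PSNR:", peak_signal_noise_ratio(gt, sr),
          "SSIM:", structural_similarity(gt, sr))
```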
Plant disease detection using deep learning on natural environment images
- Authors: De Silva, Malitha , Brown, Dane L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465212 , vital:76583 , xlink:href="https://ieeexplore.ieee.org/abstract/document/9855925"
- Description: Improving agriculture is one of the major concerns today, as it helps reduce global hunger. In past years, many technological advancements have been introduced to enhance harvest quality and quantity by controlling and preventing weeds, pests, and diseases. Several studies have focused on identifying diseases in plants, as this helps to make decisions on spraying fungicides and fertilizers. State-of-the-art systems typically combine image processing and deep learning methods to identify conditions with visible symptoms. However, they use already-available datasets or images taken in controlled environments. This study was conducted on two datasets of ten plants collected in a natural environment. The first dataset contained RGB visible images, while the second contained Near-Infrared (NIR) images of healthy and diseased leaves. The visible image dataset showed higher training and validation accuracies than the NIR image dataset with ResNet, Inception, VGG and MobileNet architectures. For the visible and NIR image datasets, ResNet-50V2 outperformed the other models, with validation accuracies of 98.35% and 94.01%, respectively.
- Full Text:
- Date Issued: 2022
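A hedged Keras sketch of the kind of ResNet-50V2 transfer-learning classifier evaluated here; the input size, class count and freezing policy are assumptions:

```python
# Sketch: ResNet-50V2 transfer learning for leaf-disease classification.
import tensorflow as tf

base = tf.keras.applications.ResNet50V2(include_top=False, weights="imagenet",
                                        input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # freeze the ImageNet backbone

num_classes = 10  # ten plant classes assumed
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```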
Plant disease detection using multispectral imaging
- Authors: De Silva, Malitha , Brown, Dane L
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463439 , vital:76409 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-35641-4_24"
- Description: People worldwide face many challenges, including food scarcity. Much research now focuses on improving agriculture to increase the harvest and reduce the cost. Identifying plant diseases and pests in the early stages helps to enhance the yield and reduce costs. However, most plant disease identification research with computer vision has been done with images taken in controlled environments on publicly available datasets. Near-Infrared (NIR) imaging is a favourable approach for identifying plant diseases. Therefore, this study collected NIR images of healthy and diseased leaves in the natural environment. The dataset is tested with eight Convolutional Neural Network (CNN) models with different train-test splits ranging from 10:90 to 90:10. The evaluated models attained their highest training and test accuracies from the 70:30 split onwards. Xception outperformed all the other models in all train-test splits and achieved 100% accuracy, precision and recall in the 80:20 train-test split.
- Full Text:
- Date Issued: 2022
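A sketch of sweeping the train-test splits from 10:90 to 90:10 with scikit-learn; the arrays X and y stand in for the pre-loaded NIR images and labels:

```python
# Sketch: evaluating a model across train-test splits from 10:90 to 90:10.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.load("nir_images.npy")  # placeholder image array
y = np.load("nir_labels.npy")  # placeholder label array

for train_frac in np.arange(0.1, 1.0, 0.1):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=float(train_frac), stratify=y, random_state=0)
    # model.fit(X_tr, y_tr) ... then evaluate on (X_te, y_te)
    print(f"train:test = {train_frac:.0%}:{1 - train_frac:.0%} ->",
          X_tr.shape[0], "training samples")
```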