A Practical Use for AI-Generated Images
- Boby, Alden, Brown, Dane L, Connan, James
- Authors: Boby, Alden , Brown, Dane L , Connan, James
- Date: 2023
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463345 , vital:76401 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-43838-7_12"
- Description: Collecting data for research can be costly and time-consuming, and available methods to speed up the process are limited. This research paper compares real data and AI-generated images for training an object detection model. The study aimed to assess how the utilisation of AI-generated images influences the performance of an object detection model. The study used a popular object detection model, YOLO, and trained it on a dataset with real car images as well as a synthetic dataset generated with a state-of-the-art diffusion model. The results showed that while the model trained on real data performed better on real-world images, the model trained on AI-generated images, in some cases, showed improved performance on certain images and was good enough to function as a licence plate detector on its own. The study highlights the potential of using AI-generated images for data augmentation in object detection models and sheds light on the trade-off between real and synthetic data in the training process. The findings of this study can inform future research in object detection and help practitioners make informed decisions when choosing between real and synthetic data for training object detection models.
- Full Text:
- Date Issued: 2023
Exploring the Incremental Improvements of YOLOv7 over YOLOv5 for Character Recognition
- Boby, Alden, Brown, Dane L, Connan, James, Marais, Marc
- Authors: Boby, Alden , Brown, Dane L , Connan, James , Marais, Marc
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463395 , vital:76405 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-031-35644-5_5"
- Description: Technological advances are being applied to many aspects of life to improve quality of living and efficiency. This applies particularly to automation, especially in industry. The growing number of vehicles on the road has created a need to monitor more vehicles than ever to enforce traffic rules. One way to identify a vehicle is through its licence plate, which contains a unique string of characters that makes it identifiable within an external database. Detecting characters on a licence plate using an object detector has only recently been explored. This paper uses the latest versions of the YOLO object detector to perform character recognition on licence plate images. It expands upon existing object detection-based character recognition by investigating how improvements in the framework translate to licence plate character recognition accuracy compared to character recognition based on older architectures. Results from this paper indicate that the newer YOLO models have increased performance over older YOLO-based character recognition models such as CRNET.
- Full Text:
- Date Issued: 2022
Improving signer-independence using pose estimation and transfer learning for sign language recognition
- Marais, Marc, Brown, Dane L, Connan, James, Boby, Alden
- Authors: Marais, Marc , Brown, Dane L , Connan, James , Boby, Alden
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463406 , vital:76406 , xlink:href="https://doi.org/10.1007/978-3-031-35644-5"
- Description: Automated Sign Language Recognition (SLR) aims to bridge the communication gap between the hearing and the hearing disabled. Computer vision and deep learning lie at the forefront in working toward these systems. Most SLR research focuses on signer-dependent SLR and fails to account for variations between signers who gesticulate naturally. This paper investigates signer-independent SLR on the LSA64 dataset, focusing on different feature extraction approaches. Two approaches are proposed: an InceptionV3-GRU architecture, which uses raw images as input, and a pose estimation LSTM architecture. MediaPipe Holistic is implemented to extract pose estimation landmark coordinates. A third and final model applies augmentation and transfer learning using the pose estimation LSTM model. The research found that the pose estimation LSTM approach achieved the best performance with an accuracy of 80.22%. MediaPipe Holistic struggled with the augmentations introduced in the final experiment. Thus, introducing more subtle augmentations may improve the model. Overall, the system shows significant promise toward addressing the real-world signer-independence issue in SLR.
- Full Text:
- Date Issued: 2022
Iterative Refinement Versus Generative Adversarial Networks for Super-Resolution Towards Licence Plate Detection
- Boby, Alden, Brown, Dane L, Connan, James
- Authors: Boby, Alden , Brown, Dane L , Connan, James
- Date: 2022
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/463417 , vital:76407 , xlink:href="https://link.springer.com/chapter/10.1007/978-981-99-1624-5_26"
- Description: Licence plate detection in unconstrained scenarios can be difficult because of the medium used to capture the data. Such data is not captured at very high resolution for practical reasons. Super-resolution can be used to improve the resolution of an image with fidelity beyond that of non-machine learning-based image upscaling algorithms such as bilinear or bicubic upscaling. Technological advances have introduced more than one way to perform super-resolution, with the best results coming from generative adversarial networks and iterative refinement with diffusion-based models. This paper pits the two best-performing super-resolution models against each other to see which is best for licence plate super-resolution. Quantitative results favour the generative adversarial network, while qualitative results lean towards the iterative refinement model.
- Full Text:
- Date Issued: 2022
A Robust Portable Environment for First-Year Computer Science Students
- Brown, Dane L, Connan, James
- Authors: Brown, Dane L , Connan, James
- Date: 2021
- Subjects: To be catalogued
- Language: English
- Type: text , article
- Identifier: http://hdl.handle.net/10962/465113 , vital:76574 , xlink:href="https://link.springer.com/chapter/10.1007/978-3-030-92858-2_6"
- Description: Computer science education, both in South African universities and worldwide, often aims to make students confident at problem solving by introducing various programming exercises. Standardising a computer environment where students can apply their computational thinking knowledge on a more even playing field, without worrying about software issues, can be beneficial for problem solving in a classroom of diverse students. Research shows that consistent access to such an environment exposes students to core concepts of Computer Science. However, with the diverse student base in South Africa, not everyone has access to a personal computer or expensive software. This paper describes a new approach at first-year level that uses a modified Linux distro on a flash drive to give every student access to the same fully-fledged, free and open-source environment, with the added convenience of portability. This is used as a means to even the playing field in a diverse country like South Africa and to address the lack of consistent access to a problem-solving environment. Feedback from students and staff at the institution is heeded, and an attempt is made to measure it.
- Full Text:
- Date Issued: 2021