YRPIPKU

Journal of Applied Engineering and Technological Science (JAETS)

Manual assessment of cocoa pod ripeness is subjective and inconsistent from one assessor to another: it is labor-intensive, and lighting and background conditions vary within the field. This research implements an automated approach to cocoa ripeness classification using a Vision Transformer (ViT) with Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA) to improve classification accuracy. The proposed model achieved an accuracy of 82.65% and a macro F1 score of 82.71 in testing with 1,559 images captured under varying illumination and backgrounds and in complex scenes. It also outperformed baseline CNN architectures such as VGG, MobileNet, and ResNet in identifying visually progressive stages of ripeness, and it generalized better in cocoa ripeness classification. The findings indicate that automated classification can reduce reliance on manual inspection without compromising quality assurance standards in cocoa production. This work demonstrates a new application of transformer models to computer vision problems in agriculture, a step towards precision and smart farming.
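The two reported metrics can be computed from predicted and true labels as in the minimal sketch below. The function name and the ripeness labels are hypothetical; the source does not list its class names or describe its evaluation code. Macro F1 averages the per-class F1 scores with equal weight, so a weak class lowers the score even when it has few images.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro F1) for two equal-length label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))

    # Accuracy: fraction of samples whose prediction matches the true label.
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    # Macro average: unweighted mean over classes.
    macro_f1 = sum(f1_scores) / len(f1_scores)
    return accuracy, macro_f1


# Hypothetical labels, for illustration only.
y_true = ["unripe", "unripe", "ripe", "ripe", "overripe", "overripe"]
y_pred = ["unripe", "ripe", "ripe", "ripe", "overripe", "unripe"]
acc, f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.4f} macro_f1={f1:.4f}")
```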

This research successfully developed a Vision Transformer (ViT)-based model enhanced with Shifted Patch Tokenization (SPT) and Locality Self-Attention (LSA) for classifying cocoa ripeness. The model achieved high classification accuracy and robustness, outperforming baseline ViT and CNN architectures. The findings demonstrate the potential of transformer-based architectures for complex agricultural image recognition tasks and offer a practical solution for automating cocoa ripeness classification.
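The source does not give the LSA formulation it used. As a rough illustration, the sketch below follows LSA as it is commonly described: each token is masked from attending to itself (diagonal masking), and the fixed scaling factor is replaced by a temperature that is learned during training. Here the temperature is a plain float, the function name and toy tokens are hypothetical, and the learned query, key, and value projections of a real model are omitted.

```python
import math


def locality_self_attention(queries, keys, values, temperature):
    """Single-head attention with diagonal masking and temperature scaling.

    Inputs are lists of equal-length float vectors, one per token.
    Assumes at least two tokens, since each token is masked from itself.
    Returns (outputs, attention_weights).
    """
    n = len(queries)
    dim = len(values[0])
    outputs, all_weights = [], []
    for i in range(n):
        # Dot-product similarity of token i to every token, temperature-scaled.
        scores = [
            sum(q * k for q, k in zip(queries[i], keys[j])) / temperature
            for j in range(n)
        ]
        # Diagonal masking: token i may not attend to itself.
        scores[i] = -math.inf
        # Numerically stable softmax over the remaining tokens.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        all_weights.append(weights)
        # Output for token i: attention-weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs, all_weights


# Toy patch embeddings, for illustration only.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = locality_self_attention(tokens, tokens, tokens, temperature=1.0)
for row in weights:
    print([round(w, 3) for w in row])
```

A lower temperature sharpens the attention distribution, and masking the diagonal forces each patch to draw on other patches instead of its own embedding.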

Future research should focus on expanding the dataset to encompass a wider range of cocoa pod variations and environmental conditions. Investigating the integration of multimodal data, such as hyperspectral or thermal imaging, could further enhance classification accuracy by providing complementary information about cocoa pod maturity. Furthermore, deploying the model on edge computing devices for real-time, in-field applications would facilitate immediate harvesting decisions and improve the efficiency of cocoa production. These advancements will contribute to the development of more robust and practical smart farming solutions for the cocoa industry, ultimately benefiting farmers and improving the quality of cocoa beans.
