Article

Deep Learning-Based Algal Detection Model Development Considering Field Application

1 Department of Civil and Environmental Engineering, Hanbat National University, 125 Dongsuh-Blvd, Yuseong-gu, Daejeon 34158, Korea
2 UnU Inc., #1005, Samsung IT Valley, 27, Digital-ro 33-gil, Guro-gu, Seoul 08380, Korea
3 Department of Environmental & Energy Engineering, Suwon University, 17 Wauan-gil, Bongdam-eup, Hwaseong-si 18323, Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Water 2022, 14(8), 1275; https://doi.org/10.3390/w14081275
Submission received: 16 March 2022 / Revised: 11 April 2022 / Accepted: 13 April 2022 / Published: 14 April 2022

Abstract

Algal blooms have various effects on drinking water supply systems; thus, proper monitoring is essential. Traditional visual identification using a microscope is time-consuming and requires extensive labor. Recently, advanced machine learning algorithms have been increasingly applied for the development of object detection models. The You-Only-Look-Once (YOLO) model is a novel machine learning algorithm used for object detection; it has been continuously improved in newer versions, and a tiny version of each standard model has also been presented. The tiny versions apply a less complicated architecture with fewer convolutional layers to enable faster object detection than the standard versions. This study compared the applicability of the YOLO models for algal image detection from a practical aspect in terms of classification accuracy and inference time. Automated algal cell detection models were developed using YOLO v3 and YOLO v4, and the tiny version of each model was also applied. The cell images of 30 algal genera were used for training and testing the models. The model performances were compared using the mean average precision (mAP). The mAP values of the four models were 40.9, 88.8, 84.4, and 89.8 for YOLO v3, YOLO v3-tiny, YOLO v4, and YOLO v4-tiny, respectively, demonstrating that YOLO v4 is more precise than YOLO v3. The tiny version models presented noticeably higher model accuracy than the standard models while allowing up to ten times faster object detection. These results demonstrate the practical advantage of tiny version models for object detection with a limited number of object classes.

Graphical Abstract

1. Introduction

Algae are important components of the natural ecosystem. However, their excessive growth causes various harmful effects on drinking water supply systems, including unfavorable taste and odor and release of toxic substances [1,2,3]. The management of algal blooms is an ongoing issue in water resource management, and the observation of algal cells in water resources is a fundamental approach.
Visual identification using a microscope is one of the most traditional and common methods for algal cell detection, which requires considerable expertise in research as well as time and labor. The development of an automated algal cell detection technology could potentially improve the efficiency of drinking water supply systems by reducing the labor and cost required for in-field algal monitoring.
In the past decade, advanced machine learning models have been continuously developed for the detection of target objects. The convolutional neural network (CNN) is the most widely used fundamental algorithm for object detection, and various models based on the CNN architecture have been developed [4,5]. The CNN process is composed of feature extraction using convolution and pooling operations. In the convolution operation, various types of filters are used to extract the characteristics of the input features, and in the pooling operation, the size of the input data is reduced to improve the practical applicability of the complicated deep learning model [6,7].
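As a minimal illustration of these two operations (an illustrative Python sketch only; the study implemented its models in C++ and C#, and the image and filter values below are hypothetical), the following code applies a single 3 × 3 filter to a grayscale patch and then reduces the resulting feature map with 2 × 2 max pooling.

import numpy as np

def conv2d(image, kernel):
    # Valid convolution of a 2-D image with a single 2-D filter.
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(fmap, size=2):
    # Non-overlapping max pooling that reduces each spatial dimension by 'size'.
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(8, 8)                      # hypothetical grayscale input patch
vertical_edge_filter = np.array([[1, 0, -1],
                                 [1, 0, -1],
                                 [1, 0, -1]])     # one example filter
features = np.maximum(conv2d(image, vertical_edge_filter), 0)   # convolution + ReLU
pooled = max_pool2d(features)                     # pooling reduces the data size
print(features.shape, pooled.shape)               # (6, 6) -> (3, 3)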
The main frameworks of CNN-based object detection models have been developed into two types: one-stage and two-stage models [8]. The two-stage models were developed first; in these, object detection is separated into region proposal and subsequent object detection [8]. The two-stage object detection models have been continuously improved from two aspects: improving the model precision and reducing the inference time required for object detection. Regions with CNN features (R-CNN) is considered one of the first two-stage object detection algorithms [9]. In the first stage of the model, approximately 2000 category-independent regions in which a target object may be located are proposed using a selective search method. In the second stage, a CNN is used to compute the features for each proposal, and each region is classified using a support vector machine (SVM). One limitation of R-CNN is its relatively slow inference time, and various algorithms based on R-CNN have been developed to improve both the model precision and the inference time, such as Fast R-CNN and Faster R-CNN [10,11]. R-CNN runs the CNN computation for each object proposal, which slows its object detection. Fast R-CNN was developed to overcome this limitation: it takes the entire image as the input for the CNN computation to produce a feature map, and a region of interest (ROI) pooling layer is then used to extract a feature vector from the feature map. The extracted feature vector is flattened to produce fully connected layers for the classification of objects using softmax and the regression of the object bounding box. With these modifications, Fast R-CNN considerably reduces the object detection time compared to R-CNN [11]. Faster R-CNN further reduces the detection time through the unified use of a region proposal network and Fast R-CNN [10], although reducing the object detection time remains a limitation of these two-stage models.
The reduction in inference time of an object detection model is essential to improve its practical applicability. The You-Only-Look-Once (YOLO) algorithm was developed to reduce the inference time of the object detection process [12]. YOLO handles region proposal and object classification in a single stage and is therefore often referred to as a one-stage object detection model. YOLO v1, the first version of the YOLO model, was one of the first one-stage object detection models and is still considered representative of this field. Although YOLO v1 reduced the time required for object detection (i.e., the inference time), its relatively low accuracy was considered a limitation compared with two-stage models. The second version, YOLO v2, offered comparable precision with up to 100 times faster inference than the two-stage models available at that time [4,8,13]. In the past decade, various one-stage object detection models, including the YOLO series, have been continuously developed, and the YOLO v3 and v4 models have been the most popular and widely used object detection models to date [14,15,16].
CNN-based deep learning is a relatively recent approach to the detection of algae and has been applied to only a limited number of algal classes. Several recent studies have applied deep learning models for the detection of algal cells [17,18,19]. Medina et al. [18] compared the algal detection accuracy of multilayer perceptron (MLP) and CNN classifiers in underwater pipelines, where the CNN showed better accuracy, up to 99.46%, on testing data. More recently, Sonmez et al. [17] applied various CNN models to classify two different types of algae, Cyanobacteria and Chlorophyta, where an SVM was used after the CNN process to improve the classification accuracy from 98% to 99.66%. These studies have demonstrated the possibility of using CNN-based deep learning models for algal detection.
In practice, both precision and inference time are important factors for evaluating the performance of the object detection models. The YOLO v3 and v4 models offer a tiny version with less accuracy but up to tens of times faster inference time than the standard versions, which is an important advantage, especially for the implementation of the model in field object detection devices for practical use.
Previous studies evaluated the performance of models for algal cell detection; however, studies that analyze the effect of input variable characteristics (e.g., relative size of algal cell) on the algal detection model are rare. This study compared the model performances of YOLO v3, YOLO v3 tiny, YOLO v4, and YOLO v4 tiny, considering the field applicability of the models, emphasizing two aspects. First, the model performances of the standard and tiny versions were compared. Second, the effect of relative object size within the input image was tested. The model performances were evaluated based on the classification of 30 algal genera. The characteristics of each model were compared from a practical aspect in terms of classification accuracy and inference time.

2. Material and Methods

2.1. Model Selection

Each YOLO version improved on the previous one. The first version, YOLO v1, processes object detection as a single regression to provide fast object detection, using rectangular bounding boxes for the detection of objects [12]. The input image is resized to 448 × 448 pixels, and the resized image is divided into an S × S grid, where each grid cell has B bounding boxes for the detection of an object. The model is optimized to minimize the total loss using a loss function composed of multiple parts, which compares the observed location and size of the target object (referred to as the ground truth) with the model prediction, as shown in Equations (1)–(4) [12]. Each grid cell calculates a confidence score (OC), which is the product of the probability that the bounding box contains an object and the intersection over union (IOU). The IOU is the ratio of the overlapped area to the union area of the ground truth and the predicted bounding box. Equation (2) represents the sum of the loss from the deviation between the bounding box and the ground truth of the target object, Equation (3) represents the sum of the object confidence score loss in each bounding box, and Equation (4) represents the sum of the object classification loss in each bounding box.
$$\mathrm{Loss}_{\mathrm{total}} = \mathrm{Loss}_{\mathrm{coordinate}} + \mathrm{Loss}_{\mathrm{object\;confidence}} + \mathrm{Loss}_{\mathrm{classification}} \quad (1)$$

$$\mathrm{Loss}_{\mathrm{coordinate}} = \alpha_{coord}\sum_{i=0}^{S\times S}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[\left(x_{i,obj}-x_{i,pred}\right)^{2}+\left(y_{i,obj}-y_{i,pred}\right)^{2}\right] + \alpha_{coord}\sum_{i=0}^{S\times S}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left[\left(\sqrt{w_{i,obj}}-\sqrt{w_{i,pred}}\right)^{2}+\left(\sqrt{h_{i,obj}}-\sqrt{h_{i,pred}}\right)^{2}\right] \quad (2)$$

$$\mathrm{Loss}_{\mathrm{object\;confidence}} = \sum_{i=0}^{S\times S}\sum_{j=0}^{B}\mathbb{1}_{ij}^{obj}\left(OC_{i,obs}-OC_{i,pred}\right)^{2} + \alpha_{noobj}\sum_{i=0}^{S\times S}\sum_{j=0}^{B}\mathbb{1}_{ij}^{noobj}\left(OC_{i,obs}-OC_{i,pred}\right)^{2} \quad (3)$$

$$\mathrm{Loss}_{\mathrm{classification}} = \sum_{i=0}^{S\times S}\mathbb{1}_{i}^{obj}\sum_{c\,\in\,classes}\left(p_{i,obj}(c)-p_{i,pred}(c)\right)^{2} \quad (4)$$

where the subscript obj represents the parameters related to the ground truth; the subscript pred represents the parameters predicted by the model; $x_i$, $y_i$ are the coordinates of the center of each bounding box relative to the bounds of the grid cell; $w_i$, $h_i$ are the width and height of the bounding box relative to the entire image size, respectively; $OC$ is the object confidence score; $p_i(c)$ is the object classification score that represents the probability of predicting class $c$; $\alpha_{coord}$ and $\alpha_{noobj}$ are weight factors; $\mathbb{1}_{i}^{obj}$ denotes that the loss is computed if an object appears in cell $i$; $\mathbb{1}_{ij}^{obj}$ denotes that the loss is computed when bounding box predictor $j$ in cell $i$ is responsible for that prediction; and $\mathbb{1}_{ij}^{noobj}$ denotes that the loss is computed when bounding box predictor $j$ in cell $i$ is not responsible for that prediction. The value of $\alpha_{noobj}$ is much smaller than that of $\alpha_{coord}$, which limits the effect of the loss from boxes that do not contain objects.
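To make these terms concrete, the following Python sketch computes the IOU between a ground-truth and a predicted box and evaluates Equations (2)–(4) for a single responsible bounding box. This is an illustration only, not the authors' implementation: the weight α_coord = 5 follows the default of the original YOLO paper [12], the α_noobj term for non-responsible boxes is omitted, and all box values are hypothetical.

import numpy as np

def to_corners(box):
    # Convert a box given by centre (x, y) and size (w, h) to corner form.
    return (box["x"] - box["w"] / 2, box["y"] - box["h"] / 2,
            box["x"] + box["w"] / 2, box["y"] + box["h"] / 2)

def iou(a, b):
    # Intersection over union of two boxes in (x_min, y_min, x_max, y_max) form.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def single_box_loss(gt, pred, p_gt, p_pred, alpha_coord=5.0):
    # Equations (2)-(4) evaluated for one responsible bounding box in one grid cell.
    # gt/pred hold normalized x, y, w, h; pred also holds the predicted confidence "oc".
    # The alpha_noobj term for non-responsible boxes is omitted in this illustration.
    loss_coord = alpha_coord * ((gt["x"] - pred["x"]) ** 2 + (gt["y"] - pred["y"]) ** 2
                                + (np.sqrt(gt["w"]) - np.sqrt(pred["w"])) ** 2
                                + (np.sqrt(gt["h"]) - np.sqrt(pred["h"])) ** 2)
    oc_obs = iou(to_corners(gt), to_corners(pred))       # observed confidence = Pr(object) x IOU
    loss_confidence = (oc_obs - pred["oc"]) ** 2
    loss_class = float(np.sum((np.asarray(p_gt) - np.asarray(p_pred)) ** 2))
    return loss_coord + loss_confidence + loss_class     # Equation (1): sum of the three parts

gt = {"x": 0.48, "y": 0.52, "w": 0.30, "h": 0.22}                 # hypothetical ground truth
pred = {"x": 0.50, "y": 0.50, "w": 0.28, "h": 0.20, "oc": 0.8}    # hypothetical prediction
print(single_box_loss(gt, pred, p_gt=[1, 0, 0], p_pred=[0.7, 0.2, 0.1]))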
The second version of YOLO, called YOLO9000, made several significant changes relative to the previous version. Its main model framework, often referred to as the backbone, is Darknet-19 [13]. One of the critical changes was replacing the fully connected layer after feature extraction with a convolutional layer, and anchor boxes were introduced to predict the bounding boxes. In YOLO9000, the priors of the bounding box dimensions were determined by k-means clustering of the labeled bounding boxes in the training dataset.
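A minimal sketch of this anchor selection step is shown below (illustrative Python, not code from this study). It clusters normalized bounding box (width, height) pairs into k anchor priors using 1 − IOU as the distance measure, as described for YOLO9000 [13]; the box data are hypothetical.

import numpy as np

def iou_wh(boxes, anchors):
    # IOU between boxes and anchors assuming shared centres; shapes (n, 2) and (k, 2).
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes_wh, k=5, iterations=100, seed=0):
    # Cluster labelled (width, height) pairs into k anchor priors using 1 - IOU as distance.
    rng = np.random.default_rng(seed)
    anchors = boxes_wh[rng.choice(len(boxes_wh), size=k, replace=False)]
    for _ in range(iterations):
        nearest = np.argmax(iou_wh(boxes_wh, anchors), axis=1)   # highest IOU = smallest distance
        updated = np.array([boxes_wh[nearest == i].mean(axis=0) if np.any(nearest == i)
                            else anchors[i] for i in range(k)])
        if np.allclose(updated, anchors):
            break
        anchors = updated
    return anchors

boxes_wh = np.random.rand(500, 2) * 0.5 + 0.05    # hypothetical normalized box sizes
print(kmeans_anchors(boxes_wh, k=5))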
The third version of the YOLO model, YOLO v3, also followed the YOLO9000 approach to predicting bounding boxes. The main model framework was replaced with a more complicated network called Darknet-53 [20]. The structure of YOLO v3 is shown in Figure 1. YOLO v3 performed better than the previous versions, especially for small target objects, by using three differently scaled feature maps for object detection, a concept similar to the feature pyramid network [14,20,21].
Bochkovskiy et al. [22] developed an improved one-stage detection model called YOLO v4 based on the core concept of YOLO v3. The YOLO v4 model uses CSPDarknet-53 as the backbone. The accuracy and speed of object detection were improved by implementing novel techniques in YOLO v4, such as spatial pyramid pooling (SPP) and the path aggregation network (PAN). SPP improves the object detection accuracy by increasing the receptive field while maintaining the operation speed, and PAN provides a shortcut path in the feature extraction process, resulting in improved fusion of the extracted features [16,22]. YOLO v4 also applies Mosaic, a new input data augmentation method in which four different images are mixed during training, thereby enabling the detection of objects outside their normal context [22].
Both YOLO v3 and v4 provide a tiny version [23,24,25]. The tiny version models follow the structure of the standard versions while reducing the number of convolutional layers, as shown in Table 1. YOLO v3 uses outputs at three different scales (i.e., 52 × 52, 26 × 26, and 13 × 13) for prediction, while the tiny versions use outputs at two scales (i.e., 26 × 26 and 13 × 13) [23,24,25]. The developer of YOLO v3 compared the performance of the standard and tiny versions [24]. The mAP of the standard YOLO v3 ranged from 51.5 to 60.6 depending on the resolution of the input images, with a detection speed of 20 to 45 fps, whereas the mAP and detection speed of YOLO v3-tiny were 33.1 and 220 fps, respectively. The fast object detection is an important advantage of the tiny version models for field applications, including real-time monitoring. In this study, the open-source one-stage object detection models YOLO v3, YOLO v3-tiny, YOLO v4, and YOLO v4-tiny were used for the development of an algal cell detection model.

2.2. Data Sources

The images of 30 algal genera collected by the Nakdonggang National Institute of Biological Resources in South Korea were used for model development (Table 2). Images were collected using a microscope (Eclipse Ni, Nikon, Tokyo, Japan). The number of images was 437, and the total number of labeled algae was 1164.

2.3. Model Training Environment

In this study, an algal cell detection model was developed for implementation on a field detection device. Thus, different programming languages, compatible with the individual devices used at each step, were used for model development.
The algal cells within an image were labeled using a labeling program developed in this study, in which the ground-truth coordinates and class were saved in txt format. The labeling program was developed using the C# programming language. The ratio of data used for training and testing of the model was 78:22 for the 437 images. The hyperparameters for the model simulation were set to the default values provided by the developer of each model (Table 3).
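A minimal Python sketch of this data preparation step is given below. It is an illustration only (the study's own labeling and training tools were written in C# and C++); it assumes a Darknet-style annotation layout of one "<class_id> <x_center> <y_center> <width> <height>" line per object with normalized coordinates, and the directory and file names are hypothetical.

import random
from pathlib import Path

def read_labels(txt_path):
    # Parse one annotation file assuming a Darknet-style layout:
    # one "<class_id> <x_center> <y_center> <width> <height>" line per labelled cell,
    # with coordinates normalized by the image size.
    boxes = []
    for line in Path(txt_path).read_text().splitlines():
        cls, x, y, w, h = line.split()
        boxes.append((int(cls), float(x), float(y), float(w), float(h)))
    return boxes

def split_dataset(image_paths, train_ratio=0.78, seed=0):
    # Shuffle the image list and split it roughly 78:22 into training and testing sets.
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n_train = round(len(paths) * train_ratio)
    return paths[:n_train], paths[n_train:]

# hypothetical usage with an assumed directory layout
images = sorted(Path("dataset/images").glob("*.jpg"))
train_images, test_images = split_dataset(images)
labels = {img: read_labels(img.with_suffix(".txt")) for img in train_images}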
The YOLO v3, YOLO v3-tiny, YOLO v4, and YOLO v4-tiny models were trained in an environment of OpenCV 4.4 and the NVIDIA GPU toolkit 1.1 on a computer with an Intel Xeon Silver 4216 2.1 GHz CPU, 64 GB of RAM, and an NVIDIA Quadro RTX 4000 8 GB GPU. The models were programmed in C++.
The trained weight file was downloaded and used to test the model inference. The inference program was written in C# to provide the genus of the detected algal cells, the object classification score, and the coordinates of the bounding box, so that the output can be visualized as shown in Figure 2.

2.4. Effect of the Relative Object Size within the Image

The effect of the relative size of algal cells within the image on model performance was tested using images with various relative object sizes. An image containing only one algal cell object was selected, and the object was enlarged or reduced so that the length of the longest axis of its bounding box became 10%, 30%, 50%, 70%, and 90% of the entire image size (Figure 3). One image from each of two genera, Closterium sp. and Pseudopediastrum boryanum, was selected for the analysis.
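A possible way to generate such a test set is sketched below in Python. This is an assumption about the procedure illustrated in Figure 3, not the authors' actual code: the object is cropped by its bounding box, rescaled so that its longest axis equals the target fraction of the output image, and pasted onto a plain background; the file names and bounding box coordinates are hypothetical.

from PIL import Image

def make_scaled_test_image(image_path, bbox, target_fraction,
                           canvas_size=(608, 608), background=(255, 255, 255)):
    # Rescale a single-object image so that the longest axis of the object's bounding
    # box occupies target_fraction of the output image; bbox is (x_min, y_min, x_max, y_max)
    # in pixels. The plain background and centred placement are simplifying assumptions.
    obj = Image.open(image_path).crop(bbox)
    scale = target_fraction * max(canvas_size) / max(obj.size)
    obj = obj.resize((max(1, int(obj.width * scale)), max(1, int(obj.height * scale))))
    canvas = Image.new("RGB", canvas_size, background)
    canvas.paste(obj, ((canvas_size[0] - obj.width) // 2,
                       (canvas_size[1] - obj.height) // 2))
    return canvas

# hypothetical usage: one test image per relative object size examined in Section 2.4
for fraction in (0.1, 0.3, 0.5, 0.7, 0.9):
    scaled = make_scaled_test_image("closterium.jpg", (120, 80, 420, 300), fraction)
    scaled.save(f"closterium_{int(fraction * 100)}.jpg")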

2.5. Model Evaluation

The model performance was evaluated using the mAP with an IOU of 0.5, which is the most commonly used indicator for the evaluation of object detection models.
The average precision (AP) of each object class is calculated as the area under the precision–recall curve.
Precision and recall are defined as in Equations (5) and (6).
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (5)$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (6)$$

where true positive (TP) is the number of positive observations that were correctly predicted as positive; false positive (FP) is the number of negative observations that were incorrectly predicted as positive; and false negative (FN) is the number of positive observations that were incorrectly predicted as negative.
A precision–recall curve plots precision (y-axis) against recall (x-axis), showing how precision changes as recall varies from 0 to 1. The AP of each class was calculated as the area under the precision–recall curve [26], and the mAP was calculated as the average of the AP values over all genera.
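The sketch below illustrates how these quantities are combined into the AP and mAP (a simplified Python illustration using a rectangle-rule integration over the recall steps, not the evaluation code used in the study; the detection results shown are hypothetical).

import numpy as np

def average_precision(detections, n_ground_truth):
    # detections: (confidence, is_true_positive) pairs for one genus, with matches
    # judged at IOU >= 0.5; n_ground_truth: number of labelled objects of that genus.
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = np.cumsum([1.0 if hit else 0.0 for _, hit in detections])
    fp = np.cumsum([0.0 if hit else 1.0 for _, hit in detections])
    recall = tp / n_ground_truth                  # Equation (6)
    precision = tp / (tp + fp)                    # Equation (5)
    ap, previous_recall = 0.0, 0.0
    for r, p in zip(recall, precision):           # area under the precision-recall curve
        ap += (r - previous_recall) * p
        previous_recall = r
    return ap

# hypothetical detection results for two genera
results = {
    "Closterium sp.":            ([(0.95, True), (0.80, True), (0.40, False)], 2),
    "Pseudopediastrum boryanum": ([(0.90, True), (0.60, False), (0.55, True)], 3),
}
mean_ap = np.mean([average_precision(dets, n_gt) for dets, n_gt in results.values()])
print(round(float(mean_ap), 3))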

3. Results and Discussion

3.1. Model Simulation Results

The performances of the four models are compared in Table 4. The training times of the models were 145, 19, 213, and 5 h for YOLO v3, YOLO v3-tiny, YOLO v4, and YOLO v4-tiny, respectively: YOLO v3-tiny trained more than seven times faster than its standard version, and YOLO v4-tiny more than 40 times faster.
The results revealed that the tiny versions of the models offered higher accuracy and shorter detection times, in contrast to previous findings (Table 5). In this study, YOLO v3-tiny showed approximately twice the accuracy and object detection speed of YOLO v3. YOLO v4-tiny also performed better than its standard version, with an approximately two times faster detection speed of 4.0 fps. The tiny versions also required considerably shorter model training times, which is another important advantage for model development and optimization.
Redmon [24], the original developer of the YOLO models (v1–v3), reported that YOLO v3-tiny achieves up to approximately ten times faster object detection speed but only about half the accuracy (mAP) of the standard version for models trained on the MS COCO dataset with 80 classes (Table 5). In another previous study, Jiang et al. [25] compared the performance of YOLO models using the same MS COCO dataset; the mAPs of the tiny versions were reduced to approximately 58% of those of the standard versions (Table 5).
The results of this study showed that the performances of the tiny version models were better than those of the standard versions, in contrast to the results of previous studies (Table 4 and Table 5). The MS COCO dataset is composed of 80 classes of objects, including persons, dogs, and cars, with complicated characteristics. In contrast, the dataset used in this study had a much smaller number of classes (30), and the shapes of the algal cells are relatively simple, such as circular or elliptical, compared with the objects in the MS COCO dataset (e.g., person, bicycle, bird).
The convolutional layers in the YOLO model extract the characteristics of an input image. An increase in the number of convolutional layers results in a decrease in the resolution of the feature maps and an increase in the channel depth. The tiny version models use fewer convolutional layers than the standard versions and therefore tend to carry less information while maintaining relatively higher-resolution feature maps. The algal cell images used in this study are assumed to require less information for classification because the dataset had fewer classes and simpler morphological shapes than the MS COCO dataset used in previous studies; thus, the tiny version models exhibited better performance in this study. The results verify the advantage of the tiny version models in practical applications, such as the implementation of the model in field object detection devices that require both accuracy and fast detection.

3.2. Effect of the Relative Object Size

The model performances varied considerably with the relative object size within the image, as shown in the example images (Figure 4, Figure 5 and Figure 6), where the percent value at the top of each bounding box represents the probability of class prediction. The model performance with relatively large (≥10% of the image size), densely distributed relatively small, and sparsely distributed relatively small algal cell objects is presented in Figure 4, Figure 5 and Figure 6, respectively. The tiny version models showed considerably better performance than the standard version models for the detection of relatively small objects (<10% of the image size), in this case algal cells (Figure 5 and Figure 6). The two tiny version models (v3-tiny and v4-tiny) detected the algal cells well, whereas YOLO v3 failed to detect both the densely distributed and the sparsely distributed small objects (Figure 5 and Figure 6). YOLO v4 performed better than YOLO v3 for the detection of small objects. These examples suggest that the accuracy of the models for algal cell detection is influenced by the relative size of the object within the image rather than by its crowdedness.
For a quantitative analysis of the effect of relative object size on model performance, the ratio of the number of target cells detected by each model to the number detected by the YOLO v4 model was compared (Figure 7). The target objects were classified into three groups by relative object size (i.e., <15%, 15–45%, and >45%), determined from the average length of the bounding box detected by the YOLO v4 model for the 30 algal genera. The other three models showed similar detection performance across the relative object sizes, while YOLO v3 showed noticeably lower detection efficiency for small (<15%) objects.
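A small Python helper like the one below (an illustrative sketch, not the study's code) can compute the relative object size from a detected bounding box and assign it to one of the three groups used in Figure 7.

def relative_object_size(box_width, box_height, image_width, image_height):
    # Longest axis of a detected bounding box as a percentage of the image size.
    return 100.0 * max(box_width / image_width, box_height / image_height)

def size_group(relative_size):
    # Three relative-size groups used for the comparison in Figure 7.
    if relative_size < 15:
        return "<15%"
    if relative_size <= 45:
        return "15-45%"
    return ">45%"

print(size_group(relative_object_size(70, 40, 608, 608)))   # 70/608 = 11.5% -> "<15%"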
To further examine the effect of relative object size, the model performances were tested using images in which the object occupied 10%, 30%, 50%, 70%, and 90% of the entire image size. Images of two genera, Closterium sp. and Pseudopediastrum boryanum, were used for the test; the relative object sizes in the original images were 61.7% and 35.2% of the entire image, respectively (Figure 8).
The object detection accuracy of the YOLO models for various object sizes was tested with these two sample images (Table 6).
For the image of Closterium sp., in which the relative object size of the original image was 61.7%, three of the YOLO models (v3-tiny, v4, and v4-tiny) stably detected the object at relative object sizes of 10%, 30%, and 50% of the entire image size but did not correctly detect the enlarged images with 70% and 90% relative object sizes (Table 6). YOLO v3 did not detect the object in this sample image at any size.
The models showed similarly stable performance at relative object sizes of 10%, 30%, and 50% for the second example image, Pseudopediastrum boryanum (Table 6). For this image, YOLO v3 did not correctly detect the enlarged or reduced images. The two tiny version models detected the object at relative object sizes from 10% to 70% but failed to detect it at a relative object size of 90%.
The results indicate that YOLO v4 performed better than YOLO v3 for images with various relative object sizes, and that the tiny versions performed more stably than the standard versions for both the v3 and v4 models.

4. Conclusions

In this study, algal cell detection models were developed using four YOLO model algorithms: YOLO v3, YOLO v3-tiny, YOLO v4, and YOLO v4-tiny.
The results showed that YOLO v4 performed better than YOLO v3, both in general and for the detection of small objects. The tiny version models achieved noticeably higher model accuracy than the standard models together with the up to ten times faster object detection reported in previous studies. The performance of the tiny version models can be attributed to the fact that fewer algal genera with less complicated morphological shapes were screened, so that less information needed to be extracted by the convolutional layers during the object detection process for the classification of algal cell images.
Further analyses of images with small objects and various relative object sizes also indicated that the tiny version models perform better than the standard version models, especially for small objects.
In general, the dominant algal genera growing at one site over a given period are often limited in number; thus, tiny version models can provide sufficient performance for field use in water quality monitoring. The results of this study verified the practical advantages of tiny version models for algal cell detection with a limited number of genera for classification.
The performance of the object detection model is affected by the various characteristics of the input image dataset. Thus, the result of this study may not generally be applicable to other academic/industrial fields or datasets with a larger number of classes. However, the results of this study provide a useful perspective to improve the practical applicability of the object detection model in future studies.

Author Contributions

J.P.: conceptualization, investigation, writing original draft preparation. J.B.: methodology, data curation. J.K.: visualization, investigation. K.Y.: software, validation. K.K.: conceptualization, writing review and editing, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by Korea Ministry of Environment (MOE) (2020003030006).

Data Availability Statement

Due to confidentiality agreements, supporting data can only be made available to bona fide researchers, subject to a non-disclosure agreement.

Acknowledgments

This work was supported by Korea Environment Industry & Technology Institute (KEITI) through Aquatic Ecosystem Conservation Research Program, funded by the Korea Ministry of Environment (MOE) (2020003030006).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Codd, G.A.; Morrison, L.F.; Metcalf, J.S. Cyanobacterial toxins: Risk management for health protection. Toxicol. Appl. Pharmacol. 2005, 203, 264–272. [Google Scholar] [CrossRef] [PubMed]
  2. Paerl, H.W.; Otten, T.G. Harmful cyanobacterial blooms: Causes, consequences, and controls. Microb. Ecol. 2013, 65, 995–1010. [Google Scholar] [CrossRef] [PubMed]
  3. Otten, T.G.; Crosswell, J.R.; Mackey, S.; Dreher, T.W. Application of molecular tools for microbial source tracking and public health risk assessment of a Microcystis bloom traversing 300 km of the Klamath River. Harmful Algae 2015, 46, 71–81. [Google Scholar] [CrossRef]
  4. Zhao, Z.-Q.; Zheng, P.; Xu, S.-T.; Wu, X. Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 3212–3232. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  5. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  6. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 818–833. [Google Scholar]
  7. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  8. Sultana, F.; Sufian, A.; Dutta, P. A review of object detection models based on convolutional neural network. In Intelligent Computing: Image Processing Based Applications; Springer: Singapore, 2020; Volume 1157, pp. 1–16. [Google Scholar]
  9. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
  10. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  11. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
  12. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 779–788. [Google Scholar]
  13. Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
  14. Tian, Y.; Yang, G.; Wang, Z.; Wang, H.; Li, E.; Liang, Z. Apple detection during different growth stages in orchards using the improved YOLO-V3 model. Comput. Electron. Agric. 2019, 157, 417–426. [Google Scholar] [CrossRef]
  15. Benjdira, B.; Khursheed, T.; Koubaa, A.; Ammar, A.; Ouni, K. Car detection using unmanned aerial vehicles: Comparison between faster R-CNN and yolov3. In Proceedings of the 2019 1st International Conference on Unmanned Vehicle Systems-Oman (UVS), Muscat, Oman, 5–7 February 2019; pp. 1–6. [Google Scholar]
  16. Wu, D.; Lv, S.; Jiang, M.; Song, H. Using channel pruning-based YOLO v4 deep learning algorithm for the real-time and accurate detection of apple flowers in natural environments. Comput. Electron. Agric. 2020, 178, 105742. [Google Scholar] [CrossRef]
  17. Sonmez, M.E.; Eczacıoglu, N.; Gumuş, N.E.; Aslan, M.F.; Sabanci, K.; Aşikkutlu, B. Convolutional neural network-Support vector machine based approach for classification of cyanobacteria and chlorophyta microalgae groups. Algal Res. 2021, 61, 102568. [Google Scholar] [CrossRef]
  18. Medina, E.; Petraglia, M.R.; Gomes, J.G.R.; Petraglia, A. Comparison of CNN and MLP classifiers for algae detection in underwater pipelines. In Proceedings of the 2017 Seventh International Conference on Image Processing Theory, Tools and Applications (IPTA), Montreal, QC, Canada, 28 November–1 December 2017; pp. 1–6. [Google Scholar]
  19. Park, J.; Baek, J.; You, K.; Nam, S.W.; Kim, J. Microalgae Detection Using a Deep Learning Object Detection Algorithm, YOLOv3. J. Korean Soc. Environ. Eng. 2021, 37, 275–285. [Google Scholar]
  20. Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
  21. Zhao, K.; Ren, X. Small aircraft detection in remote sensing images based on YOLOv3. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Kazimierz Dolny, Poland, 21–23 November 2019; p. 012056. [Google Scholar]
  22. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
  23. Bochkovskiy, A.D. Open Source Neural Networks in Python. 2020. Available online: https://github.com/AlexeyAB/darknet (accessed on 19 January 2021).
  24. Redmon, J. Open Source Neural Networks in C, 2013–2016. Available online: https://pjreddie.com/darknet/ (accessed on 15 July 2020).
  25. Jiang, Z.; Zhao, L.; Li, S.; Jia, Y. Real-time object detection method based on improved YOLOv4-tiny. arXiv 2020, arXiv:2011.04244. [Google Scholar]
  26. Ozenne, B.; Subtil, F.; Maucort-Boulch, D. The precision–recall curve overcame the optimism of the receiver operating characteristic curve in rare diseases. J. Clin. Epidemiol. 2015, 68, 855–859. [Google Scholar] [CrossRef] [PubMed]
Figure 1. A schematic of YOLO v3 architecture.
Figure 2. Illustration of the object detection result.
Figure 3. An example image set for testing the effect of object size.
Figure 4. Comparison of the algal detection model performance with relatively large-size (≥10% of the image size) objects.
Figure 5. Comparison of the algal detection model performance with densely distributed relatively small-size algal cell objects.
Figure 6. Comparison of the algal detection model performance with sparsely distributed relatively small-size algal cell objects.
Figure 7. Comparison of the algal detection model performance with different relative object size.
Figure 8. The original images used to test the effect of relative object size on model performance. (a) Closterium sp. (b) Pseudopediastrum boryanum.
Table 1. The comparison of the models used for algal detection in this study.

Model | YOLO v3 | YOLO v3 Tiny | YOLO v4 | YOLO v4 Tiny
Backbone (number of convolutional and pooling layers) | Darknet-53 (75) | 7 convolutional and 6 max pooling layers (13) | CSPDarknet-53 (110) | CSPDarknet-53-tiny (21)
Resolution of input image | 416 × 416 | 416 × 416 | 608 × 608 | 416 × 416
Table 2. Algal images used for model development.

Genera | Images (Train) | Images (Test) | Labels (Train) | Labels (Test)
Acutodesmus obliquus | 8 | 2 | 62 | 11
Ankistrodesmus falcatus | 12 | 4 | 20 | 6
Chlamydomonas asymmetrica | 16 | 6 | 55 | 6
Chlorella vulgaris | 12 | 3 | 60 | 11
Chlorococcum loculatum | 12 | 3 | 53 | 8
Chroomonas coerulea | 12 | 4 | 98 | 15
Closterium sp. | 7 | 2 | 11 | 2
Coelastrella sp. | 7 | 3 | 14 | 5
Coelastrum astroideum var. rugosum | 6 | 2 | 27 | 5
Cosmarium sp. | 13 | 5 | 46 | 6
Cryptomonas lundii | 22 | 6 | 22 | 6
Desmodesmus communis | 16 | 3 | 18 | 3
Diplosphaera chodatii | 2 | 1 | 32 | 6
Eudorina unicocca | 6 | 1 | 40 | 6
Euglena sp. | 28 | 7 | 28 | 7
Kirchneriella aperta | 13 | 3 | 30 | 17
Lithotrichon pulchrum | 8 | 2 | 42 | 11
Micractinium pusillum | 9 | 1 | 44 | 5
Micrasterias sp. | 6 | 2 | 6 | 2
Monoraphidium sp. | 21 | 8 | 40 | 15
Mychonastes sp. | 9 | 4 | 39 | 13
Nephrochlamys subsolitaria | 6 | 2 | 24 | 5
Pectinodesmus pectinatus | 32 | 7 | 39 | 8
Pediastrum duplex | 11 | 4 | 13 | 4
Pseudopediastrum boryanum | 10 | 3 | 10 | 3
Scenedesmus sp. | 6 | 3 | 9 | 4
Selenastrum capricornutum | 6 | 3 | 31 | 7
Sorastrum pediastriforme | 8 | 1 | 8 | 1
Tetrabaena socialis | 11 | 2 | 16 | 2
Tupiella speciosa | 4 | 1 | 25 | 2
Total | 339 | 98 | 962 | 202
Table 3. Hyperparameters used for model simulation.

Model | YOLO v3 | YOLO v3 Tiny | YOLO v4 | YOLO v4 Tiny
Batch size | 64 | 32 | 64 | 32
Learning rate | 0.001 | 0.001 | 0.0013 | 0.00261
max_batches | 60,000 | 60,000 | 60,000 | 500,200
Table 4. Summary of model evaluation results.

Model | YOLO v3 | YOLO v3 Tiny | YOLO v4 | YOLO v4 Tiny
mAP (IOU 0.5) | 40.9 | 88.8 | 84.4 | 89.8
Inference speed (fps) | 2.0 | 4.1 | 1.7 | 4.0
Table 5. YOLO model performance (accuracy (mAP)/speed (fps)) comparison in previous studies.

YOLO v3 | YOLO v3 Tiny | YOLO v4 | YOLO v4 Tiny | Dataset | Reference
51.5–60.6/20–45 | 33.1/220 | – | – | MS COCO (80 classes, 83K images) | Redmon [24]
52.5/49 | 30.5/277 | 64.9/41 | 38.1/270 | MS COCO (80 classes, 83K images) | Jiang et al. [25]
Table 6. The performance comparison of YOLO models for various relative object sizes. Values are the probability of class prediction (%).

Genera | Relative Size (%) | YOLO v3 | YOLO v3-Tiny | YOLO v4 | YOLO v4-Tiny
Closterium sp. | 61.7 * | ND | 98.3 | 99.6 | 90.6
 | 10 | ND | 51.1 | 68.9 | 81.2
 | 30 | ND | 99.9 | 98.6 | 98.1
 | 50 | ND | 41.9 | 97.3 | 55.5
 | 70 | ND | FD | FD | ND
 | 90 | ND | ND | FD | ND
Pseudopediastrum boryanum | 35.2 * | 81.5 | 100.0 | 99.3 | 98.9
 | 10 | FD | 96.3 | 99.9 | 92.5
 | 30 | ND | 100.0 | 99.7 | 99.8
 | 50 | ND | 100.0 | 99.0 | 100.0
 | 70 | ND | 99.9 | FD | 98.7
 | 90 | ND | FD | FD | AD
*: original image; FD: falsely detected; ND: not detected.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
