Codementor Events

New Segmentation Paradigm

Published Dec 01, 2017

The problem

The problem that we(me and my friend) have tackled is Bird detection and segmentation from images, this is a vital and challenging problem and has various real world applications such as environment check and protection, environmental dictionary and endangered animal rescue by checking endangered species. There are other various reasons to monitor birds species. To evaluate the environment quality in which we are living it is a very fine major, as it becomes important to know the situation and condition in which wild animals are living. As we know birds are in good number and are very sensitive to environmental and weather changes; also, and birds species are easier to monitor then other animals. This particular practical reason makes it more important to study bird species. In this project we aim to build a automated bird detection and segmentation system to evaluate the quality and diversity of birds that are living in our surrounding.

Saliency Algorithm

Saliency is a quality of an object to stand out relative to other things in an image. Visual attention is an the astonishing capability of human visual system to selectively process only the salient visual stimuli in details. These methods try to identify salient regions or objects using explicit saliency judgements for evaluation. Detecting large salient areas often causes severe false positives but there are no false negatives. Using a recent benchmarking of various saliency based methods, we first worked on identifying methods that would be best suited for segmentation of bird images.


Figure 1: Various saliency maps for a bird


Figure 2: Chosen Saliency Maps: FES, MC, PCA, SEG(Top to bottom).

After analysing many result and models we came up with four suitable saliency maps which were FES, SEG, MC, PCA. These were chosen because of specific reasons. Now we had to give weights to each saliency map so that we can find a weighted map of all saliency maps. For this we used a very simple neural network.


Figure 3: Results from single layered perceptron. Good results can be achieved using multilayered network.

Further improvements were done using deep learning model fig 4. This network divides the image into regions and provides the prediction accuracy and bounding boxes for each region and class present in that image. The bounding boxes are weighted by the predicted probabilities based on the class in which they are classified. We trained our network exclusively on bird images for better results.


Figure 4: Deep Network (Same as Yolo)

Results that we got after training on over 7000 image are in fig 5.


Figure 5: Results from Deep Network.

We used this deep network to further increase the accuracy of our saliency based model and to remove all those images which were not having any birds.

Some of Our Final results

I know you are thinking why so much pain if we can directly train a deep network like semantic segmentation or mask RCNN. I wrote in my heading, this is a different approach for some fun work!!

I hope you like it.

Discover and read more posts from Tushar Gupta
get started
post commentsBe the first to share your opinion
Show more replies