Conservation organizations increasingly rely on “remote camera traps” to capture images of wildlife. Remote camera traps are digital cameras enclosed in weather-proof cases and triggered by heat or motion. The images they capture are manually “classified” by volunteers to indicate what each image contains. Because camera traps generate hundreds of thousands of images, it is difficult for the limited pool of volunteers to classify them in a timely manner. The problem is further exacerbated by “false-positive” images that do not contain any animals or humans. False positives occur when a camera trap is triggered by trees or shrubs moving in the wind, and they are much more difficult to classify than images of animals or humans: any potential false positive could contain a hidden animal, so volunteers must scour the entire image to confirm that it really is a false positive.
A machine learning model was trained on actual remote camera trap images from multiple conservation organizations to quickly identify and separate false-positive images from images containing animals or humans. Two Python scripts were created to classify images with the model. A “Phased” script classifies each image individually, so no image has any effect on the classification of another. A “Sequenced” script groups images into “buckets” based on a configurable threshold on the time between consecutive captures, and every image in a bucket shares the bucket’s classification. Although the Sequenced script would in theory miss fewer images of animals, it could also misclassify false positives within a bucket when the animal has already moved out of the frame.

This stage is complete. The next step is to develop a web site or smartphone app that lets volunteers confirm the automated classifications. A third step is to embed the classifier in a third-generation camera trap that sorts images into false-positive, animal, and human folders so that reviewers can prioritize which images to examine. Such a camera trap could also be configured for additional tasks, e.g., alerting authorities via cell phone/SMS to the presence of potential poachers when it detects humans wearing camouflage or carrying guns, or gunshot sounds outside hunting season.
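To make the two strategies concrete, below is a minimal sketch of the Phased and Sequenced logic. The classify_image() helper, the label names, the bucket-wide label propagation, and the use of file modification time as a stand-in for the capture timestamp are illustrative assumptions, not the actual scripts.

```python
# Minimal sketch of the two strategies. classify_image() is a hypothetical
# stand-in for the trained model, the label names are assumed, and file
# modification time substitutes for the real capture timestamp (a real
# pipeline would read it from EXIF metadata).
import os
from typing import Dict, List

def classify_image(path: str) -> str:
    """Placeholder for per-image model inference; returns one of
    'false_positive', 'animal', or 'human'."""
    return "false_positive"  # dummy result so the sketch runs end to end

def phased(paths: List[str]) -> Dict[str, str]:
    """Phased: every image is classified independently."""
    return {p: classify_image(p) for p in paths}

def sequenced(paths: List[str], threshold_s: float = 60.0) -> Dict[str, str]:
    """Sequenced: images captured within threshold_s seconds of the previous
    image share a bucket, and the whole bucket takes the 'strongest' label
    found in it (human beats animal beats false_positive)."""
    paths = sorted(paths, key=os.path.getmtime)

    # Group consecutive images whose capture-time gap is under the threshold.
    buckets: List[List[str]] = []
    last_time = None
    for p in paths:
        t = os.path.getmtime(p)
        if last_time is None or t - last_time > threshold_s:
            buckets.append([])
        buckets[-1].append(p)
        last_time = t

    # Propagate the strongest label to every image in the bucket; this is
    # why a trailing empty frame can be misclassified once the animal leaves.
    priority = {"false_positive": 0, "animal": 1, "human": 2}
    results: Dict[str, str] = {}
    for bucket in buckets:
        best = max((classify_image(p) for p in bucket),
                   key=priority.__getitem__)
        for p in bucket:
            results[p] = best
    return results
```

Run over a burst of images, sequenced() will label an empty trailing frame “animal” if an earlier frame in the same bucket contained one, which is exactly the trade-off described above.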
I assume that I will have enough remote camera trap images to train a machine learning model.
I assume that I will have the GPU resources to train a machine learning model.
One of the biggest difficulties I encountered was obtaining enough remote camera trap images to build a model. When I first began the project, I was working with a single, relatively small organization, so I did not have enough remote camera trap images to compile a full training set and relied on Google Images instead. I soon discovered this was not a viable option, so I reached out to many conservation organizations, explaining the project and asking for their remote camera trap images. Few organizations responded and even fewer agreed to the request, but after a couple of weeks I had enough remote camera trap images to train and test my machine learning model.
I am working on possibly switching to an object detection model, as well as increasing the number of remote camera trap images used to train the machine learning model.
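As a sketch of what the object detection switch might look like, the snippet below runs an off-the-shelf pretrained detector over a single image. The choice of Faster R-CNN from torchvision (assuming torchvision >= 0.13) and the 0.5 score cutoff are assumptions for illustration, not the final design; a production model would be retrained on the camera trap images.

```python
# Hypothetical sketch of the object detection direction, using an
# off-the-shelf pretrained Faster R-CNN from torchvision (>= 0.13 assumed).
# A production version would be fine-tuned on the camera trap images, and
# the 0.5 score cutoff is an arbitrary illustrative choice.
import torch
from torchvision.io import read_image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(path: str, score_cutoff: float = 0.5) -> dict:
    """Return the detections (boxes, labels, scores) above the cutoff."""
    img = convert_image_dtype(read_image(path), torch.float)
    with torch.no_grad():
        pred = model([img])[0]
    keep = pred["scores"] > score_cutoff
    return {k: v[keep] for k, v in pred.items()}

# An image with zero surviving detections would be routed to the
# false-positive folder; otherwise the boxes show where in the frame the
# animal is, which the current whole-image classifier cannot provide.
```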
More experience with machine learning and coding, and identification (“labeling”) of object extents (bounding boxes) within the existing images.