Multi-Spectral Image Synthesis for Crop/Weed Segmentation in Precision Farming

Farming robots can perform precise weed control by identifying and localizing crops and weeds in the field. Image processing for this task usually relies on machine learning, which in turn requires a large and diverse training dataset.


A recent paper suggests employing Generative Adversarial Networks (GANs) to generate semi-artificial images that enlarge and diversify the original training dataset: regions of the image corresponding to crop and weed plants are replaced with synthesized, photo-realistic counterparts.
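The region-replacement idea can be illustrated with a minimal compositing sketch. This is not the authors' code: the generator below is a random stand-in stub, whereas the paper uses a conditional GAN to synthesize the photo-realistic patch.

```python
import numpy as np

def fake_generator(mask):
    """Stand-in for a cGAN generator: returns a synthetic RGB patch
    of the same spatial size as the mask (random values here)."""
    h, w = mask.shape
    return np.random.rand(h, w, 3).astype(np.float32)

def composite(real_image, mask):
    """Paste synthesized pixels into the region selected by a binary
    mask, keeping the real background pixels untouched."""
    synthetic = fake_generator(mask)
    out = real_image.copy()
    region = mask.astype(bool)
    out[region] = synthetic[region]
    return out

# Toy example: a 4x4 image with a 2x2 'plant' region to replace.
img = np.zeros((4, 4, 3), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
semi_artificial = composite(img, mask)
```

Only the masked plant pixels change; the surrounding soil and background stay real, which is what makes the resulting samples semi-artificial rather than fully synthetic.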

In addition, near-infrared (NIR) data are used together with the RGB channels. The performance evaluation shows that segmentation quality improves substantially when the original dataset is augmented with the synthetic images, compared to training on the original dataset alone. Training on the synthetic dataset alone also yields performance competitive with training on the original one.
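Segmentation quality in crop/weed benchmarks is commonly measured with per-class intersection-over-union (IoU); the specific metric choice is our assumption here, not stated in the summary. A minimal sketch:

```python
import numpy as np

def class_iou(pred, target, cls):
    """IoU for one class, given integer label maps of equal shape."""
    p = pred == cls
    t = target == cls
    inter = np.logical_and(p, t).sum()
    union = np.logical_or(p, t).sum()
    return inter / union if union > 0 else float("nan")

# Toy label maps: 0 = soil, 1 = crop, 2 = weed.
pred = np.array([[0, 1],
                 [2, 1]])
target = np.array([[0, 1],
                   [1, 1]])
crop_iou = class_iou(pred, target, 1)  # intersection 2, union 3
```

Comparing such per-class scores between a model trained on the original dataset and one trained on the augmented dataset is how the benefit of the synthetic images would be quantified.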

An effective perception system is a fundamental component for farming robots, as it enables them to properly perceive the surrounding environment and to carry out targeted operations. The most recent approaches make use of state-of-the-art machine learning techniques to learn an effective model for the target task. However, those methods need a large amount of labelled data for training. A recent approach to deal with this issue is data augmentation through Generative Adversarial Networks (GANs), where entire synthetic scenes are added to the training data, thus enlarging and diversifying their informative content. In this work, we propose an alternative solution with respect to the common data augmentation techniques, applying it to the fundamental problem of crop/weed segmentation in precision farming. Starting from real images, we create semi-artificial samples by replacing the most relevant object classes (i.e., crop and weeds) with their synthesized counterparts. To do so, we employ a conditional GAN (cGAN), where the generative model is trained by conditioning on the shape of the generated object. Moreover, in addition to RGB data, we also take near-infrared (NIR) information into account, generating four-channel multi-spectral synthetic images. Quantitative experiments, carried out on three publicly available datasets, show that (i) our model is capable of generating realistic multi-spectral images of plants and (ii) the usage of such synthetic images in the training process improves the segmentation performance of state-of-the-art semantic segmentation Convolutional Networks.
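The data layout described in the abstract can be sketched as follows: RGB and NIR channels are stacked into a four-channel multi-spectral image, and the generator is conditioned on the object's shape mask. The exact conditioning scheme (mask concatenated to noise channels at the generator input) is our assumption for illustration, not the paper's verbatim architecture.

```python
import numpy as np

h, w = 64, 64

# Four-channel multi-spectral image: three RGB channels plus one NIR channel.
rgb = np.random.rand(h, w, 3).astype(np.float32)   # stand-in RGB data
nir = np.random.rand(h, w, 1).astype(np.float32)   # stand-in NIR channel
multispectral = np.concatenate([rgb, nir], axis=-1)  # shape (64, 64, 4)

# Conditioning input for the cGAN generator: the binary shape mask of the
# plant to synthesize, concatenated with noise channels (our assumed scheme).
mask = np.zeros((h, w, 1), dtype=np.float32)
mask[16:48, 16:48] = 1.0
noise = np.random.randn(h, w, 3).astype(np.float32)
generator_input = np.concatenate([mask, noise], axis=-1)  # shape (64, 64, 4)
```

Conditioning on the mask ties the synthesized plant to a realistic silhouette taken from a real image, which is what lets the generated patch drop back into the original scene coherently.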