Extending the service life of bridges has recently attracted research attention. In Japan, bridges are inspected once every five years by close visual inspection. This inspection method is costly. Because of shortages of engineers and budget, some local governments in Japan have been unable to carry out proactive preventive maintenance of their bridges. To address these problems, automation has been studied as a way to reduce the cost of inspection work that depends on human labor.
Deep learning-based damage detection is one way to reduce the human labor required. Such image processing methods can detect damage from a photograph of a bridge. We focus on state-of-the-art semantic segmentation methods that use the Transformer model, which detect targets in an input image with high accuracy. However, the effect of image size on these new methods has not been sufficiently discussed. In many cases, the size of an input image differs from the size assumed by the input layer of the detection model, so preprocessing such as rescaling, cropping, or splitting is necessary. Such preprocessing changes the information contained in an input image, yet its effect on Transformer-based methods has not been sufficiently discussed.
In this study, we use the detection of peeling and rebar exposure in surface images of bridges as the evaluation task. We prepared three image datasets for training detection models, all generated from a single image dataset. Each dataset consists of split images generated with a different split size. We evaluated three detection models, each on the dataset preprocessed to suit that model, and compared the results.
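As an illustration of how several split-image datasets can be derived from one source dataset, the following is a minimal sketch in Python with NumPy. It is not the procedure used in this study: the function names `split_into_tiles` and `build_datasets`, the split sizes 256, 512, and 1024, and the zero-padding of image borders are all assumptions made for the example.

```python
import numpy as np


def split_into_tiles(image, tile_size):
    """Split an H x W (x C) array into non-overlapping square tiles.

    Borders are zero-padded so that every tile has the same shape.
    This border handling is an assumption made for the sketch.
    """
    height, width = image.shape[:2]
    pad_h = (-height) % tile_size  # rows needed to reach a multiple of tile_size
    pad_w = (-width) % tile_size   # columns needed to reach a multiple of tile_size
    pad_width = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="constant")

    tiles = []
    for top in range(0, padded.shape[0], tile_size):
        for left in range(0, padded.shape[1], tile_size):
            tiles.append(padded[top:top + tile_size, left:left + tile_size])
    return tiles


# Hypothetical split sizes, chosen only for illustration.
SPLIT_SIZES = (256, 512, 1024)


def build_datasets(images, split_sizes=SPLIT_SIZES):
    """Return one list of tiles per split size, all from the same images."""
    return {
        size: [tile for img in images for tile in split_into_tiles(img, size)]
        for size in split_sizes
    }
```

For a segmentation task, the label masks would need to be split with the same function and the same tile size as the photographs, so that each image tile stays aligned with its mask tile.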