Today, a large number of ships and nautical elements are active at sea. According to UNCTAD, approximately 80% of world trade is transported by sea, and this share is expected to increase further in the coming years. In addition, shipping companies have long reported that disruptions and deviations from the initial plan occur frequently, resulting in delays. These delays contribute to poor port optimization, disruptions in the supply chain, and increased pollution, mainly greenhouse gas (GHG) emissions and underwater radiated noise, due to prolonged idle times of vessels awaiting port calls. In April 2018, the IMO adopted the Initial Strategy for the reduction of GHG emissions from shipping, which sets key ambitions, including cutting annual GHG emissions from international shipping by at least half by 2050 compared with their 2008 level. This strategy is in line with Zero-Emission Waterborne Transport, the Horizon Europe partnership that aims to deliver and demonstrate zero-emission solutions for all major ship types and services before 2030.
In this paper, we present an innovative approach that combines artificial intelligence (AI) models, specifically machine learning (ML), with preprocessing techniques to estimate the sailing time of vessels in port surroundings. The approach relies on historical vessel data, such as ship characteristics, movement patterns, weather conditions, and port-specific factors (docks and areas of action). Because preprocessing is crucial to obtaining accurate models that learn effectively from the data, we also study the impact of preprocessing on prediction quality.
In addition, by applying an underwater acoustic propagation model to each ship along its route, we study aspects directly related to underwater noise pressure in the port context. This study aligns with the Marine Strategy Framework Directive (MSFD), in particular Descriptor 11, seeking a balance between optimizing economic maritime activities and maintaining good environmental status.
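The abstract does not state which propagation model is applied, so the following is only a minimal sketch of the general idea, assuming simple geometric spreading (received level = source level minus transmission loss). The function names, the spreading coefficient, and the 1 m reference distance are illustrative assumptions, not the model used in the study.

```python
import math


def transmission_loss_db(range_m: float, spreading_coeff: float = 20.0) -> float:
    """Geometric spreading loss in dB (assumed model, not the paper's).

    spreading_coeff = 20 corresponds to spherical spreading,
    spreading_coeff = 10 to cylindrical spreading.
    """
    if range_m < 1.0:
        raise ValueError("range_m must be at least 1 m (the reference distance)")
    return spreading_coeff * math.log10(range_m)


def received_level_db(
    source_level_db: float, range_m: float, spreading_coeff: float = 20.0
) -> float:
    """Received level at range_m for a source level referenced to 1 m."""
    return source_level_db - transmission_loss_db(range_m, spreading_coeff)


# Hypothetical example: a 120 dB source observed 1 km away.
level_spherical = received_level_db(120.0, 1000.0)
level_cylindrical = received_level_db(120.0, 1000.0, spreading_coeff=10.0)
print(level_spherical, level_cylindrical)
```

Evaluating such a function at each position along a ship's route would give a rough noise footprint; a realistic port study would also need bathymetry, absorption, and frequency dependence, which this sketch ignores.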
The data used to train the model cover one calendar year, from January 2022 to December 2022, in the Port of Cartagena area (Spain). The raw dataset consists of 32 columns and 1,259,616 rows. It was provided as several CSV files (each covering a half-month period); after concatenating them into a single file, we performed a descriptive analysis of the data. This analysis yielded several interesting findings, such as the random routes taken by "tugboat" type ships, which only navigate when they need to assist another ship entering the port, and the patterns that other ship types, such as "support fishing vessel", follow along their routes. We also observed that most routes last around 2 hours on average and that ship speed in the study area averages around 9 knots, a relatively high speed. The general source levels of the ships in this study range from 110 to 120 dB re 1 µPa.
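The concatenation and descriptive step described above could look roughly like the following sketch with pandas. The file names, the column names (`ship_type`, `duration_h`, `speed_kn`), and the tiny synthetic data are hypothetical placeholders; the real dataset has 32 columns whose names are not given here.

```python
import glob
import os
import tempfile

import pandas as pd


def load_vessel_files(folder: str) -> pd.DataFrame:
    """Concatenate every CSV file in `folder` into a single DataFrame."""
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    frames = [pd.read_csv(path) for path in paths]
    return pd.concat(frames, ignore_index=True)


def summarize_by_ship_type(df: pd.DataFrame) -> pd.DataFrame:
    """Mean route duration (hours) and mean speed (knots) per ship type."""
    return df.groupby("ship_type")[["duration_h", "speed_kn"]].mean()


# Tiny synthetic stand-in for two half-month files (all values are made up).
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame(
        {"ship_type": ["tugboat", "cargo"], "duration_h": [1.0, 2.0], "speed_kn": [6.0, 10.0]}
    ).to_csv(os.path.join(tmp, "2022-01-a.csv"), index=False)
    pd.DataFrame(
        {"ship_type": ["tugboat", "cargo"], "duration_h": [3.0, 2.0], "speed_kn": [8.0, 12.0]}
    ).to_csv(os.path.join(tmp, "2022-01-b.csv"), index=False)
    data = load_vessel_files(tmp)

summary = summarize_by_ship_type(data)
print(data.shape)
print(summary)
```

Grouping by ship type is one simple way to surface per-type differences such as those noted for tugboats; the actual analysis may have used other statistics.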
Several models were evaluated in this study, including Artificial Neural Networks (ANN), Gradient Boosting, Random Forest, and linear models. After several tests with cross-validation, the Gradient Boosting model was selected as the best among them, providing a first version of a model with an R² of 0.82 and an MSE of 0.20.
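A comparison of this kind can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the study's pipeline: the data are synthetic (`make_regression`), the hyperparameters are library defaults, 5-fold cross-validation is an arbitrary choice, and the ANN is omitted for brevity.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_validate

# Synthetic regression data standing in for the vessel dataset.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}

# Same folds for every model so the scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    scores = cross_validate(
        model, X, y, cv=cv, scoring=("r2", "neg_mean_squared_error")
    )
    results[name] = {
        "r2": scores["test_r2"].mean(),
        # scikit-learn reports negated MSE, so flip the sign back.
        "mse": -scores["test_neg_mean_squared_error"].mean(),
    }

for name, metrics in results.items():
    print(f"{name}: R2={metrics['r2']:.3f}, MSE={metrics['mse']:.3f}")
```

On this synthetic, purely linear data the linear model is expected to do best, so the ranking says nothing about the real dataset; the point is only the evaluation procedure.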
Our results show promising accuracy in estimating vessel navigation times in a specific zone, providing valuable information for port operators, shipping companies, and other stakeholders (including bunkering services) to optimize port operations, streamline logistics processes, and reduce environmental impacts. This research therefore represents a significant step towards harnessing the potential of AI, specifically ML, to improve maritime logistics and address the challenges of port optimization.