Industry 4.0, the fourth industrial revolution, emphasized automation, connectivity, and data-driven decision-making. As the world transitions to Industry 5.0, the focus shifts toward human-centred, robust, sustainable, and intelligent industrial systems. Here, collaborative robots (cobots) emerge as key enablers, working in shared spaces and enhancing human capability without compromising safety or flexibility.
Computer vision plays a key role in enabling this integration by providing cobots with perception intelligence for tasks such as object detection, gesture recognition, defect detection, and adaptive navigation. These capabilities are powered by artificial intelligence: classical approaches, including feature extraction, template matching, and traditional machine learning, continue to offer robust solutions for structured tasks, while newer methods such as deep learning, reinforcement learning, and transformer-based architectures enable adaptability in unstructured and dynamic industrial environments.
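As an illustration of the classical approaches named above, the core of template matching can be sketched in a few lines of NumPy as a normalized cross-correlation search; the synthetic image, part location, and function name below are invented for illustration, not taken from any surveyed system:

```python
import numpy as np

def match_template(scene: np.ndarray, template: np.ndarray) -> tuple:
    """Return (row, col) of the best normalized cross-correlation match."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    best_score, best_pos = -np.inf, (0, 0)
    # Slide the template over every window of the scene
    for r in range(scene.shape[0] - th + 1):
        for c in range(scene.shape[1] - tw + 1):
            w = scene[r:r + th, c:c + tw].astype(float)
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum()) * t_norm
            if denom == 0:  # skip flat (e.g. all-background) windows
                continue
            score = (w * t).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# Synthetic scene: a bright square "part" on a dark background
scene = np.zeros((60, 60))
scene[20:32, 30:42] = 255.0
template = scene[16:36, 26:46].copy()  # crop that includes some background
print(match_template(scene, template))  # → (16, 26)
```

In practice, a library implementation such as OpenCV's `cv2.matchTemplate` replaces this loop with an optimized version of the same computation; the robustness limits of this approach in unstructured scenes are what motivate the deep-learning methods discussed above.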
Software platforms are equally critical for implementation and deployment. MATLAB remains an excellent choice for rapid prototyping and algorithm validation, whereas Python-based frameworks (e.g., TensorFlow, PyTorch, OpenCV) provide scalability, open-source flexibility, and integration with edge and cloud platforms. Comparing them is essential to understand the trade-offs among performance, accessibility, and deployment readiness.
Applications of AI-driven, vision-enabled cobots include assembly, quality inspection, adaptive manufacturing, and safe human-robot collaboration. This paper surveys conventional and emerging computer vision approaches, identifies open gaps, and outlines future research directions, including edge AI deployment, multimodal sensor fusion, and explainable vision systems, toward reliable and efficient adoption in Industry 5.0.
