Applied Sciences, Vol. 15, Pages 6442: Application of Multimodal AI to Aid Scene Perception for the Visually Impaired
Applied Sciences doi: 10.3390/app15126442
Authors:
Piotr Skulimowski
This paper proposes using generative multimodal models for image analysis to guide the selection of parameters for 3D scene segmentation algorithms in systems that assist blind individuals in navigation. The models enable scene type detection, assessment of lighting conditions, and determination of whether a scene is suitable for obtaining the parameters needed for system initialization, such as the orientation of the imaging sensors relative to the ground. The effectiveness of four multimodal models in extracting selected scene parameters is evaluated, and the results are compared with annotations made by a human. The results highlight the potential of such models to enhance the functionality of Electronic Travel Aid systems, particularly in parameter selection for scene segmentation algorithms and in scene presentation to visually impaired individuals.