University of Louisville
Dynamic-GAN: Learning Spatial-Temporal Attention for Dynamic Object Removal in Feature Dense Environments
Grade Level at Time of Presentation
Junior
Major
Mathematics & Computer Science Engineering
Institution
University of Louisville
KY House District #
28
KY Senate District #
37
Faculty Advisor/ Mentor
Sumit Kumar Das, PhD, and Dan O. Popa, PhD
Department
Computer Science Engineering
Abstract
Robot navigation using simultaneous localization and mapping (SLAM) relies on landmarks to localize the robot within its environment. Localization may fail in dynamic scenarios because moving elements of the environment can occlude critical landmarks; for example, a human walking in front of the robot can block the sensor's view. To alleviate this problem, the work presented here uses an attention-based deep learning framework that converts robot camera frames containing dynamic content into static frames, making SLAM algorithms easier to apply. The vast majority of SLAM methods struggle when dynamic objects appear in the environment and occlude the area captured by the camera, and despite past attempts to handle dynamic objects, reconstructing large occluded areas with complex backgrounds remains challenging. Our proposed Dynamic-GAN framework employs a generative adversarial network to remove dynamic objects from a scene and inpaint a static image free of dynamic objects. The novelty of our approach lies in using spatial-temporal attention to encourage the generative model to focus on the areas of the image occluded by dynamic content rather than weighting the whole image equally. We evaluated Dynamic-GAN both quantitatively and qualitatively, testing it on benchmark datasets and on a mobile robot in indoor navigation environments. As people appeared dynamically in close proximity to the robot, results showed that large, feature-rich occluded areas can be accurately reconstructed in real time with our attention-based deep learning framework for dynamic object removal. Through experiments we demonstrated that our proposed algorithm performs about 25% better on average than standard benchmark algorithms under various conditions.
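The core idea in the abstract, a generator whose reconstruction loss is steered by a spatial-temporal attention map so that pixels occluded by dynamic objects dominate the penalty, can be sketched as follows. This is an illustrative approximation under stated assumptions, not the authors' implementation: the layer sizes, the two-frame history, the mask input, and the attention_l1 weighting are all assumptions made for the example.

    # Minimal sketch of attention-weighted GAN inpainting as described in the
    # abstract. All module names, shapes, and the loss weighting are
    # illustrative assumptions, not the published Dynamic-GAN architecture.
    import torch
    import torch.nn as nn

    class SpatialTemporalAttention(nn.Module):
        """Maps a short stack of frames plus a dynamic-object mask to a
        per-pixel attention map in [0, 1], so occluded regions can be
        emphasized rather than weighting the whole image equally."""
        def __init__(self, in_channels: int):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(32, 1, kernel_size=3, padding=1),
                nn.Sigmoid(),  # attention values in [0, 1]
            )

        def forward(self, frames: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
            # frames: (B, T*3, H, W) stacked RGB history; mask: (B, 1, H, W)
            return self.net(torch.cat([frames, mask], dim=1))

    def attention_l1(pred, target, attn, eps=1e-6):
        """Attention-weighted L1 reconstruction loss: high-attention
        (occluded) pixels contribute more than the static background."""
        return (attn * (pred - target).abs()).sum() / (attn.sum() + eps)

    # Usage with a two-frame history (T = 2): in_channels = 2*3 + 1 = 7.
    attn_net = SpatialTemporalAttention(in_channels=7)
    frames = torch.randn(1, 6, 64, 64)   # two stacked RGB frames
    mask = torch.rand(1, 1, 64, 64)      # dynamic-object mask
    attn = attn_net(frames, mask)        # (1, 1, 64, 64) attention map

In a full training loop, a loss of this form would be added to the usual adversarial loss of the GAN, so the generator is penalized most heavily where dynamic content must be inpainted.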