Abstract: Recently, video recognition is emerging with the help of multi-modal learning, which focuses on integrating distinct modalities to improve the performance or robustness of the model.
Visual (Single) Object Tracking aims to continuously localize and estimate the scale of a target in subsequent video frames, given only its initial state in the first frame. This task can be ...
Abstract: This paper explores the use of visual scripting languages in game engines, particularly in Unity, comparing different third-party frameworks. The typical applications of visual scripting in ...
We plan to release TensorRT accelerated implementation and adapting more matching networks for MAC-VO. If you are interested, please star ⭐ this repo to stay tuned. [Nov 2025] We release the ...