The vast majority of multimodal AI systems work like a relay race. For example, an image comes in through the Vision ...