Multimodal

Describes an AI system that can process more than one type of input (modality), such as text, images, audio, or video.