How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering?

Publication
Proceedings of the 23rd International Conference on Image Analysis and Processing (ICIAP)