Popular Articles

TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding

2 months ago

Video Annotator: A framework for efficiently building video classifiers using vision-language models and active learning

4 months ago