LISA-AVS: LISA 7B Model Finetuned on AVS-Bench Dataset

This is an adapted version of the online demo for LISA, where we finetune the LISA model (7B) from scratch with data from AVS-Bench (Search-TTA).

Note: Different prompts can lead to significantly different results. Please standardize your input text prompts to avoid ambiguity, and check that the punctuation of the input is correct.

Usage:
 (1) To have LISA-AVS segment something, enter a prompt like: "Where can I find the Common Name (Taxonomy Name) in this image? Please output segmentation mask.";
 (2) To have LISA-AVS also output an explanation, enter a prompt like: "Where can I find the Common Name (Taxonomy Name) in this image? Please output segmentation mask and explain why.";
 (3) To obtain language output only, enter a prompt as you would for a current multi-modal LLM (e.g., LLaVA), like: "Where can I find the Common Name (Taxonomy Name) in this image?"
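
Because results are sensitive to wording and punctuation, it may help to generate the three prompt variants above from a single template rather than typing them by hand. The following is a minimal sketch, not part of LISA-AVS itself: the function name build_prompt, the mode names, and the example species are hypothetical and chosen only for illustration; the prompt wording is copied from the usage list above.

```python
# Hypothetical helper (not part of LISA-AVS): builds the three prompt
# variants from the usage list with consistent wording and punctuation.
PROMPT_BASE = "Where can I find the {common} ({taxonomy}) in this image?"

# Mode names are assumptions made for this sketch.
SUFFIXES = {
    "segment": " Please output segmentation mask.",
    "explain": " Please output segmentation mask and explain why.",
    "text": "",
}


def build_prompt(common_name: str, taxonomy_name: str, mode: str = "segment") -> str:
    """Return a standardized prompt for the given names and output mode."""
    if mode not in SUFFIXES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(SUFFIXES)}")
    base = PROMPT_BASE.format(
        common=common_name.strip(),
        taxonomy=taxonomy_name.strip(),
    )
    return base + SUFFIXES[mode]


if __name__ == "__main__":
    # "Red Fox" / "Vulpes vulpes" are placeholder values for illustration only.
    for mode in ("segment", "explain", "text"):
        print(build_prompt("Red Fox", "Vulpes vulpes", mode))
```

The returned string can be pasted into the text instruction field as-is; stray leading or trailing whitespace in the names is stripped so that the punctuation stays consistent.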

In-Domain Taxonomy


Out-Domain Taxonomy
