LISA-AVS: LISA 7B Model Finetuned on AVS-Bench Dataset
This is an adapted version of the online LISA demo, in which we finetune the LISA model (7B) from scratch on data from AVS-Bench (Search-TTA).
Note: Different prompts can lead to significantly different results. Please standardize your input text prompts to avoid ambiguity, and check that the punctuation of the input is correct.
Usage:
(1) To have LISA-AVS segment something, use a prompt like: "Where can I find the Common Name (Taxonomy Name) in this image? Please output segmentation mask.";
(2) To have LISA-AVS also output an explanation, use a prompt like: "Where can I find the Common Name (Taxonomy Name) in this image? Please output segmentation mask and explain why.";
(3) To obtain language output only, prompt it as you would a current multi-modal LLM (e.g., LLaVA), for example: "Where can I find the Common Name (Taxonomy Name) in this image?"
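
Because small differences in wording or punctuation can change the result, it may help to generate the three prompt variants above from a single template rather than typing them by hand. The following is a minimal sketch, not part of LISA-AVS itself: the function name `build_prompt`, the `mode` values, and the example species are hypothetical, and the exact form of the taxonomy name expected by the model is an assumption.

```python
def build_prompt(common_name: str, taxonomy_name: str, mode: str = "segment") -> str:
    """Build a prompt string following the three usage patterns listed above.

    mode (hypothetical labels, chosen for this sketch):
      "segment" -> ask for a segmentation mask
      "explain" -> ask for a segmentation mask plus an explanation
      "text"    -> ask for language output only
    """
    # Shared question stem, copied from the usage examples.
    base = f"Where can I find the {common_name} ({taxonomy_name}) in this image?"

    # Suffixes copied verbatim from the usage examples, including punctuation.
    suffixes = {
        "segment": " Please output segmentation mask.",
        "explain": " Please output segmentation mask and explain why.",
        "text": "",
    }
    if mode not in suffixes:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(suffixes)}")
    return base + suffixes[mode]


if __name__ == "__main__":
    # Placeholder species for illustration only.
    for mode in ("segment", "explain", "text"):
        print(build_prompt("Red Fox", "Vulpes vulpes", mode))
```

Keeping the suffixes in one place means every query uses identical wording and punctuation, which is the standardization the note above asks for.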
In-Domain Taxonomy
Examples
| Input Image | Text Instruction |
|---|---|
Out-Domain Taxonomy
Examples
| Input Image | Text Instruction |
|---|---|