
* First commit * Improvements * More improvements * Converted original checkpoint to HF checkpoint * Fix style * Fixed forward * More improvements * More improvements * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Remove asserts * Remove unnecessary attributes * Changed model name to camel case * Improve forward doc * Improve tests * More improvements * Fix copies * Fix doc * Make SegGptImageProcessor more flexible * Added few-shot test * Fix style * Update READMEs and docs * Update READMEs * Make inputs required * Add SegGptForImageSegmentation * Make tests pass * Rename to out_indicies * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Fixed naming convention * Copying SegGptMlp from modeling_sam.py * Some minor improvements * Remove mlp_ratio * Fix docstrings * Fixed docstring match * Objects defined before use * Storing only patch_size and beta for SegGptLoss * removed _prepare_inputs method * Removed modified from headers * Renamed to output_indicies * Removed unnecessary einsums * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixing issues * Raise error as soon as possible * More fixes * Fix merge * Added palette to SegGptImageProcessor * Fixed typo * Fixed shape typo * Added permute before doing palette to class mapping * Fixed style * Fixed and added tests * Fixed docstrings * Matching SegFormer API for post_processing_semantic_segmentation * Fixed copies * Fixed SegGptImageProcessor to handle both binary and RGB masks * Updated docstrings of SegGptImageProcessor * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/model_doc/seggpt.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Object definitions above & fix style * Renamed output_indices to intermediate_feature_indices * Removed unnecessary check on bool_masked_pos * Loss first in the outputs * Added validation for do_normalize * Improved SegGptImageProcessor and added new tests * Added comment * Added docstrings to SegGptLoss * Reimplemented ensemble condition logic in SegGptEncoder * Update src/transformers/models/seggpt/__init__.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/convert_seggpt_to_hf.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Updated docstrings to use post_process_semantic_segmentation * Fixed typo on docstrings * moved pixel values test to test_image_processing_seggpt * Addressed comments * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/image_processing_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated docstrings for SegGptLoss * Address comments * Added SegGpt example to model docs * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * moved patchify and unpatchify * Rename checkpoint * Renamed intermediate_features to intermediate_hidden_states for consistency * Update src/transformers/models/seggpt/configuration_seggpt.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> * Replaced post_process_masks for post_process_semantic_segmentation in the docs --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by: Niels <niels.rogge1@gmail.com> Co-authored-by: Eduardo Pacheco <eduardo.pacheco@limehome.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
4.0 KiB
SegGPT
Overview
The SegGPT model was proposed in SegGPT: Segmenting Everything In Context by Xinlong Wang, Xiaosong Zhang, Yue Cao, Wen Wang, Chunhua Shen, Tiejun Huang. SegGPT employs a decoder-only Transformer that can generate a segmentation mask given an input image, a prompt image and its corresponding prompt mask. The model achieves remarkable one-shot results with 56.1 mIoU on COCO-20 and 85.6 mIoU on FSS-1000.
The abstract from the paper is the following:
We present SegGPT, a generalist model for segmenting everything in context. We unify various segmentation tasks into a generalist in-context learning framework that accommodates different kinds of segmentation data by transforming them into the same format of images. The training of SegGPT is formulated as an in-context coloring problem with random color mapping for each data sample. The objective is to accomplish diverse tasks according to the context, rather than relying on specific colors. After training, SegGPT can perform arbitrary segmentation tasks in images or videos via in-context inference, such as object instance, stuff, part, contour, and text. SegGPT is evaluated on a broad range of tasks, including few-shot semantic segmentation, video object segmentation, semantic segmentation, and panoptic segmentation. Our results show strong capabilities in segmenting in-domain and out-of
Tips:
- One can use [
SegGptImageProcessor
] to prepare image input, prompt and mask to the model. - It's highly advisable to pass
num_labels
(not considering background) during preprocessing and postprocessing with [SegGptImageProcessor
] for your use case. - When doing infenrece with [
SegGptForImageSegmentation
] if yourbatch_size
is greater than 1 you can use feature ensemble across your images by passingfeature_ensemble=True
in the forward method.
Here's how to use the model for one-shot semantic segmentation:
import torch
from datasets import load_dataset
from transformers import SegGptImageProcessor, SegGptForImageSegmentation
model_id = "BAAI/seggpt-vit-large"
image_processor = SegGptImageProcessor.from_pretrained(checkpoint)
model = SegGptForImageSegmentation.from_pretrained(checkpoint)
dataset_id = "EduardoPacheco/FoodSeg103"
ds = load_dataset(dataset_id, split="train")
# Number of labels in FoodSeg103 (not including background)
num_labels = 103
image_input = ds[4]["image"]
ground_truth = ds[4]["label"]
image_prompt = ds[29]["image"]
mask_prompt = ds[29]["label"]
inputs = image_processor(
images=image_input,
prompt_images=image_prompt,
prompt_masks=mask_prompt,
num_labels=num_labels,
return_tensors="pt"
)
with torch.no_grad():
outputs = model(**inputs)
target_sizes = [image_input.size[::-1]]
mask = image_processor.post_process_semantic_segmentation(outputs, target_sizes, num_labels=num_labels)[0]
This model was contributed by EduardoPacheco. The original code can be found here.
SegGptConfig
autodoc SegGptConfig
SegGptImageProcessor
autodoc SegGptImageProcessor - preprocess - post_process_semantic_segmentation
SegGptModel
autodoc SegGptModel - forward
SegGptForImageSegmentation
autodoc SegGptForImageSegmentation - forward