Hi,
Firstly, thank you for this work. I've been trying to use it to segment diseases from X-ray images, and I've tried it with the "pretrained" checkpoints of CLIP and CRIS, using the following script:
import torch
import matplotlib.pyplot as plt
import open_clip
import torchvision.transforms as T
from PIL import Image
from transformers import CLIPTokenizer

from model import CRIS  # import path may differ; use wherever CRIS is defined in this repo

device = "cuda" if torch.cuda.is_available() else "cpu"

prompts = [
    "Atelectasis",
    "Cardiomegaly",
    "Consolidation",
    "Edema",
    "Pleural Effusion",
]
clip_pretrain = "pretrained/RN50.pt"
word_len = 77
fpn_in = [512, 1024, 1024]
fpn_out = [256, 512, 1024]
vis_dim = 512
word_dim = 1024
num_layers = 3
num_head = 8
dim_ffn = 2048
dropout = 0.2
context_length = 77 # 77 for clipseg
intermediate = False
cris_pretrain = "pretrained/cris.pt"
tokenizer_type = "clipseg"
img_mean = [0.48145466, 0.4578275, 0.40821073]
img_std = [0.26862954, 0.26130258, 0.27577711]
prompts = [f"Findings of {el}" for el in prompts] + ["Support devices"]
model = CRIS(
    clip_pretrain=clip_pretrain,
    word_len=word_len,
    fpn_in=fpn_in,
    fpn_out=fpn_out,
    vis_dim=vis_dim,
    word_dim=word_dim,
    num_layers=num_layers,
    num_head=num_head,
    dim_ffn=dim_ffn,
    dropout=dropout,
    intermediate=intermediate,
    cris_pretrain=cris_pretrain,
)
model.to(device)

if tokenizer_type == "biomedclip":
    tokenizer = open_clip.get_tokenizer(
        "hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224"
    ).tokenizer
else:  # i.e. tokenizer_type == "clipseg"
    tokenizer = CLIPTokenizer.from_pretrained("CIDAS/clipseg-rd64-refined")
# Load image and text
transform = T.Compose([
    T.Resize((416, 416)),
    T.ToTensor(),
    T.Normalize(mean=img_mean, std=img_std),
])
# convert to RGB so the 3-channel normalization also works on single-channel X-ray PNGs
image = Image.open("image.png").convert("RGB")
pixel_values = transform(image).unsqueeze(0)
input_ids = tokenizer(
    prompts,
    max_length=context_length,
    truncation=True,
    padding="max_length",
    return_tensors="pt",
).input_ids
print(pixel_values.shape, input_ids.shape)
pixel_values = pixel_values.to(device).expand(len(prompts), -1, -1, -1)
input_ids = input_ids.to(device)
out = model(pixel_values, input_ids)
out = out.squeeze(1)
_, ax = plt.subplots(1, len(prompts) + 1, figsize=(15, 4))
[a.axis("off") for a in ax.flatten()]
ax[0].imshow(image)
[ax[i + 1].imshow(torch.sigmoid(out[i]).cpu().detach().numpy()) for i in range(len(prompts))]
[ax[i + 1].text(0, -15, prompts[i]) for i in range(len(prompts))]
plt.savefig("result.png")
This was just to try the model out. As expected, I got the result saved in result.png, which is not exactly what I needed. So I wanted to ask: are you planning to share your trained models, so that I can try them and see whether they work better than a pretrained densenet121 + Grad-CAM baseline (rough sketch below)?
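For reference, the baseline I have in mind is roughly the following Grad-CAM-style sketch over torchvision's densenet121 (the ImageNet weights and the file name image.png are placeholders here; my actual baseline is fine-tuned on chest X-ray labels):

# Minimal Grad-CAM sketch over torchvision's densenet121 (illustrative only).
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

model = models.densenet121(weights="IMAGENET1K_V1").eval()
store = {}
# capture the last feature map and its gradient
model.features.register_forward_hook(lambda m, i, o: store.update(act=o))
model.features.register_full_backward_hook(lambda m, gi, go: store.update(grad=go[0]))

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
x = preprocess(Image.open("image.png").convert("RGB")).unsqueeze(0)

logits = model(x)
logits[0, logits[0].argmax()].backward()  # backprop the top predicted class

weights = store["grad"].mean(dim=(2, 3), keepdim=True)            # channel importance
cam = F.relu((weights * store["act"]).sum(dim=1, keepdim=True))   # weighted activation map
cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)          # normalize to [0, 1]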
Thanks in advance, and looking forward to your answer.
Berke