Adding PaliGemma2 to KerasHub #1998
Merged: divyashreepathihalli merged 5 commits into keras-team:master from divyashreepathihalli:paligemma2 on Dec 5, 2024
* Add PaliGemma2 arch
* Enable mixed precision check for PaliGemma
* Add conversion script
* Revert ImageConverter and reduce mem usage in the conversion script
* Remove `compute_output_spec`
* Fix `compute_output_shape` issue for keras 3.1
* Add model cards and update conversion script
* update presets
---------
Co-authored-by: divyashreepathihalli <[email protected]>
mattdangerw approved these changes on Dec 5, 2024
Lgtm!
divyashreepathihalli added a commit to divyashreepathihalli/keras-nlp that referenced this pull request on Dec 5, 2024
* Add PaliGemma2 (keras-team#96)
* Add PaliGemma2 arch
* Enable mixed precision check for PaliGemma
* Add conversion script
* Revert ImageConverter and reduce mem usage in the conversion script
* Remove `compute_output_spec`
* Fix `compute_output_shape` issue for keras 3.1
* Add model cards and update conversion script
* update presets
---------
Co-authored-by: divyashreepathihalli <[email protected]>
* Update pali_gemma_presets.py - remove mix presets
* Update pali_gemma_presets.py
* Update convert_pali_gemma2_checkpoints.py
---------
Co-authored-by: james77777778 <[email protected]>
divyashreepathihalli added a commit that referenced this pull request on Dec 5, 2024
* Adding PaliGemma2 to KerasHub (#1998)
* Add PaliGemma2 (#96)
* Add PaliGemma2 arch
* Enable mixed precision check for PaliGemma
* Add conversion script
* Revert ImageConverter and reduce mem usage in the conversion script
* Remove `compute_output_spec`
* Fix `compute_output_shape` issue for keras 3.1
* Add model cards and update conversion script
* update presets
---------
Co-authored-by: divyashreepathihalli <[email protected]>
* Update pali_gemma_presets.py - remove mix presets
* Update pali_gemma_presets.py
* Update convert_pali_gemma2_checkpoints.py
---------
Co-authored-by: james77777778 <[email protected]>
* Version bump to 0.18.0
* Update pali_gemma_presets.py (#2003)
* Update pali_gemma_presets.py
* code reformat
* Adding PaliGemma2 to KerasHub (#1998)
* Add PaliGemma2 (#96)
* Add PaliGemma2 arch
* Enable mixed precision check for PaliGemma
* Add conversion script
* Revert ImageConverter and reduce mem usage in the conversion script
* Remove `compute_output_spec`
* Fix `compute_output_shape` issue for keras 3.1
* Add model cards and update conversion script
* update presets
---------
Co-authored-by: divyashreepathihalli <[email protected]>
* Update pali_gemma_presets.py - remove mix presets
* Update pali_gemma_presets.py
* Update convert_pali_gemma2_checkpoints.py
---------
Co-authored-by: james77777778 <[email protected]>
* Update pali_gemma_presets.py (#2003)
* Update pali_gemma_presets.py
* code reformat
---------
Co-authored-by: james77777778 <[email protected]>
Sanity check Colab: https://colab.sandbox.google.com/drive/1xejmaZvLgMFrzIrIRm2gHzrXF5cprp7P
Model summary
PaliGemma 2 is an update of the PaliGemma vision-language model (VLM) that incorporates the capabilities of the Gemma 2 models. The PaliGemma family of models is inspired by PaLI-3 and based on open components such as the SigLIP vision model and the Gemma 2 language models. It takes both image and text as input and generates text as output, supporting multiple languages. It is designed for class-leading fine-tuning performance on a wide range of vision-language tasks such as image and short-video captioning, visual question answering, text reading, object detection, and object segmentation.
Model architecture
PaliGemma 2 is the composition of a Transformer decoder and a Vision Transformer image encoder. The text decoder is initialized from Gemma 2 in the 2B, 9B, and 27B parameter sizes. The image encoder is initialized from SigLIP-So400m/14. Similar to the original PaliGemma model, PaliGemma 2 is trained following the PaLI-3 recipes.
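As a rough illustration of how the image encoder feeds the decoder: the "/14" in SigLIP-So400m/14 refers to 14×14-pixel patches, so the number of image tokens grows with the square of the input resolution. The sketch below assumes one token per patch (no pooling, no extra class token) and uses 224, 448, and 896 as example resolutions; neither assumption is stated in this PR, and `num_image_tokens` is a hypothetical helper, not a KerasHub function.

```python
def num_image_tokens(image_size: int, patch_size: int = 14) -> int:
    """Count patch tokens for a square image split into square patches.

    Assumption: the encoder emits exactly one token per patch.
    """
    if image_size % patch_size != 0:
        raise ValueError(
            f"image_size {image_size} is not a multiple of patch_size {patch_size}"
        )
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


if __name__ == "__main__":
    # Example resolutions are assumptions, not taken from this PR.
    for size in (224, 448, 896):
        print(f"{size}x{size} image: {num_image_tokens(size)} image tokens")
```

Under these assumptions, doubling the resolution quadruples the number of image tokens the decoder has to attend over, which is worth keeping in mind when estimating memory use.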
Inputs and outputs
Input: Image and text string, such as a prompt to caption the image, or a question.
Output: Generated text in response to the input, such as a caption of the image, an answer to a question, a list of object bounding box coordinates, or segmentation codewords.
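Because bounding boxes come back as generated text, they need to be decoded before use. The sketch below shows one way to do that, assuming the convention documented for the original PaliGemma: each box is four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max, with values binned into 1024 steps, followed by a label, and multiple detections separated by `;`. This PR does not state the format, so treat it as an assumption; `parse_detections` is a hypothetical helper, not part of KerasHub.

```python
import re

# Four consecutive <locNNNN> tokens followed by an optional label.
# Assumption: this token format matches what the model emits.
DETECTION_PATTERN = re.compile(r"((?:<loc\d{4}>){4})\s*([^<;]*)")
LOC_VALUE_PATTERN = re.compile(r"<loc(\d{4})>")


def parse_detections(text, image_width, image_height, bins=1024):
    """Convert location tokens in `text` into pixel-space boxes.

    Returns a list of dicts with a "label" and a "box" given as
    (x_min, y_min, x_max, y_max) in pixels.
    """
    detections = []
    for match in DETECTION_PATTERN.finditer(text):
        values = [int(v) / bins for v in LOC_VALUE_PATTERN.findall(match.group(1))]
        # Assumed token order: y_min, x_min, y_max, x_max.
        y_min, x_min, y_max, x_max = values
        detections.append(
            {
                "label": match.group(2).strip(),
                "box": (
                    x_min * image_width,
                    y_min * image_height,
                    x_max * image_width,
                    y_max * image_height,
                ),
            }
        )
    return detections


if __name__ == "__main__":
    sample = "<loc0256><loc0128><loc0768><loc0896> cat"
    print(parse_detections(sample, image_width=1024, image_height=1024))
```

Text without location tokens, such as a plain caption, yields an empty list, so the same helper can be applied to any output without first checking the task type.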
Model implementation author: @james77777778
KerasHub PaliGemma implementation lead: @divyashreepathihalli