llms/bedrock: Fixed error when using Claude3 model and giving MessageContent with images #713

Merged

Contributor

@mashiike mashiike commented Mar 22, 2024

The problem this PR aims to solve

This PR concerns the execution of a prompt to recognize images using llms/bedrock's Claude3 model. In this PR, I would like to address some errors that occur when executing the following code.

package main

import (
	"context"
	_ "embed"
	"log"

	"github.com/tmc/langchaingo/llms"
	"github.com/tmc/langchaingo/llms/bedrock"
	"github.com/tmc/langchaingo/schema"
)

//go:embed image.png
var image []byte

func main() {
	llm, err := bedrock.New(
		bedrock.WithModel(bedrock.ModelAnthropicClaudeV3Haiku),
	)
	if err != nil {
		log.Fatal(err)
	}
	ctx := context.Background()
	resp, err := llm.GenerateContent(
		ctx,
		[]llms.MessageContent{
			{
				Role: schema.ChatMessageTypeHuman,
				Parts: []llms.ContentPart{
					llms.BinaryPart("image/png", image),
					llms.TextPart("Please text what is written on this image."),
				},
			},
		},
	)
	if err != nil {
		log.Fatal(err)
	}
	for _, choice := range resp.Choices {
		log.Println(choice.Content)
	}
}

The code itself is simple: it asks the model to read the text in an embedded image/png image.

Executing this code produces the following two errors.

  • 2024/03/22 11:23:14 operation error Bedrock Runtime: InvokeModel, https response error StatusCode: 400, RequestID: 0fa522ef-9201-4f62-9f69-f40ff9046896, ValidationException: messages.0.content.0.image.source: Input tag 'image' found using 'type' does not match any of the expected tags: 'base64'
  • 2024/03/22 11:37:33 operation error Bedrock Runtime: InvokeModel, https response error StatusCode: 400, RequestID: 5741321f-54e4-4ef0-9351-d28eb104ee07, ValidationException: messages: roles must alternate between "user" and "assistant", but found multiple "user" roles in a row

The relevant specifications are described in the following documents.

https://docs.anthropic.com/claude/reference/claude-on-amazon-bedrock
https://docs.anthropic.com/claude/reference/messages_post

PR Checklist

  • Read the Contributing documentation.
  • Read the Code of conduct documentation.
  • Name your Pull Request title clearly, concisely, and prefixed with the name of the primarily affected package you changed according to Good commit messages (such as memory: add interfaces for X, Y or util: add whizzbang helpers).
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. Fixes #123).
  • Describes the source of new concepts.
  • References existing implementations as appropriate.
  • Contains test coverage for new functions.
  • Passes all golangci-lint checks.

…InputSource.Type is fixed to base64.

According to the Claude3 API documentation, the current image input format only accepts base64.
https://docs.anthropic.com/claude/reference/messages_post
Therefore, the existing implementation generates the following error when making a request with an image:

```
operation error Bedrock Runtime: InvokeModel, https response error StatusCode: 400, RequestID: 00000000-0000-0000-0000-0000000000000000, ValidationException: messages.0.content.0.image.source: Input tag 'image' found using 'type' does not match any of the expected tags: 'base64'
exit status 1
```

This commit corrects the above error and allows Claude3 to be called via Bedrock with image input.
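
For readers unfamiliar with the request shape, here is a minimal sketch of an image content block whose `source.type` is fixed to `base64`, as the Messages API expects. The names `imageSource`, `imageBlock`, and `newImageBlock` are hypothetical and exist only for illustration; they are not the structs used inside llms/bedrock.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// imageSource and imageBlock are hypothetical types for illustration only;
// they are not the types defined in llms/bedrock.
type imageSource struct {
	Type      string `json:"type"`       // always "base64"
	MediaType string `json:"media_type"` // e.g. "image/png"
	Data      string `json:"data"`       // base64-encoded image bytes
}

type imageBlock struct {
	Type   string      `json:"type"` // "image"
	Source imageSource `json:"source"`
}

// newImageBlock builds an image content block with source.type set to "base64".
func newImageBlock(mediaType string, raw []byte) imageBlock {
	return imageBlock{
		Type: "image",
		Source: imageSource{
			Type:      "base64",
			MediaType: mediaType,
			Data:      base64.StdEncoding.EncodeToString(raw),
		},
	}
}

func main() {
	// Placeholder bytes stand in for a real PNG file.
	b, err := json.Marshal(newImageBlock("image/png", []byte("fake image bytes")))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

The point of the sketch is only that the media type travels in `media_type`, while `source.type` stays `base64`.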
The current implementation of llms/bedrock/internal/bedrockclient/provider_anthropic.processInputMessagesAnthropic does not seem to account for a MessageContent with multiple parts.

Passing a MessageContent like the following will result in an error.
```
		[]llms.MessageContent{
			{
				Role: schema.ChatMessageTypeHuman,
				Parts: []llms.ContentPart{
					llms.BinaryPart("image/png", image),
					llms.TextPart("Please text what is written on this image."),
				},
			},
		},
```

```
operation error Bedrock Runtime: InvokeModel, https response error StatusCode: 400, RequestID: 00000000-0000-0000-0000-0000000000000000, ValidationException: messages: roles must alternate between "user" and "assistant", but found multiple "user" roles in a row
````

This happens because each part of a []llms.MessageContent is converted to its own bedrockclient.Message, so a single human message with two parts becomes two consecutive "user" messages.
So, this commit fixes the above by modifying the processInputMessagesAnthropic code.
It chunks the []bedrockclient.Message argument into groups of consecutive messages with the same Role.
Then, each Chunk is converted to anthropicTextGenerationInputMessage.
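
The chunking step described above can be sketched roughly as follows. This is an illustrative approximation, not the code merged in this PR: `message` is a hypothetical stand-in for bedrockclient.Message, and `chunkByRole` is a made-up helper name.

```go
package main

import "fmt"

// message is a hypothetical stand-in for bedrockclient.Message.
type message struct {
	Role    string
	Type    string
	Content string
}

// chunkByRole groups consecutive messages that share the same Role,
// so each group can later be turned into one API message with several parts.
func chunkByRole(msgs []message) [][]message {
	var chunks [][]message
	for _, m := range msgs {
		n := len(chunks)
		if n > 0 && chunks[n-1][0].Role == m.Role {
			// Same role as the previous message: extend the current chunk.
			chunks[n-1] = append(chunks[n-1], m)
			continue
		}
		// Role changed (or first message): start a new chunk.
		chunks = append(chunks, []message{m})
	}
	return chunks
}

func main() {
	msgs := []message{
		{Role: "user", Type: "image", Content: "<base64 data>"},
		{Role: "user", Type: "text", Content: "Please text what is written on this image."},
		{Role: "assistant", Type: "text", Content: "..."},
	}
	for _, c := range chunkByRole(msgs) {
		fmt.Printf("%s: %d part(s)\n", c[0].Role, len(c))
	}
}
```

With this grouping, the image part and the text part end up in one "user" chunk, which avoids sending two "user" messages in a row.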
@mashiike mashiike changed the title Feature/llms bedrock provider anthropic for vision Fix llms/bedrock for Claude3 Vision Prompt Mar 22, 2024
…ting `currentChunk` (prealloc)

golang-ci lint error message
```
Error: Consider pre-allocating `currentChunk` (prealloc)
```
Fix this.

fix golang lint
```
string `text` has 3 occurrences, but such constant `AnthropicMessageTypeText` already exists (goconst)
```
@mashiike mashiike changed the title Fix llms/bedrock for Claude3 Vision Prompt Fix llms/bedrock package for Claude3 Vision Prompt Mar 22, 2024
@mashiike mashiike changed the title Fix llms/bedrock package for Claude3 Vision Prompt llms/bedrock: Fixed error when using Claude3 model and giving MessageContent with images Mar 22, 2024
Owner

@tmc tmc left a comment

Lovely contribution, thank you. Please consider including the sample above as either an example program or even as a documentation site page; I think it would be helpful to others.

I welcome your contributions! Please consider driving up line coverage and including in-line comments that explain the rationale for subtle parts.

@tmc tmc merged commit 5460983 into tmc:main Mar 22, 2024
3 checks passed
@mashiike mashiike deleted the feature/llms-bedrock-provider-anthropic-for-vision branch March 24, 2024 07:03