Figure out why images seem to be getting sent for translation despite logic to strip them out #5942

Open
arsalansufi opened this issue Jan 31, 2025 · 2 comments

Contributor

arsalansufi commented Jan 31, 2025

Not a huge deal, but this could be relevant for our VxMark pilot town (New London) if their ballot measure includes an image, so it's worth getting ahead of.

https://votingworks.slack.com/archives/C085YT4798C/p1738349866764009

@arsalansufi arsalansufi added this to the VxDesign Self-Serve Pilot milestone Jan 31, 2025
@eventualbuddha
Collaborator

This may be wholly or partly because we don't actually use <img> tags or src= attributes for images; we convert them to SVG here:

async function bitmapImageToSvg(imageDataUrl: string) {
  const { width, height } = await getBitmapImageDimensions(imageDataUrl);
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" viewBox="0 0 ${width} ${height}">
    <image href="${imageDataUrl}" width="${width}" height="${height}" />
  </svg>`;
}
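A quick way to check this theory is to run a src-based pattern against markup shaped like the output above. This is only a sketch: `sampleSvg` is a hand-written stand-in rather than real ballot content, and `hrefRegex` is an assumed alternative pattern, not something that exists in the codebase.

```typescript
// Hypothetical stand-in for the output of bitmapImageToSvg: the data URL
// lives in an href attribute on <image>, not in a src attribute on <img>.
const sampleSvg =
  '<svg xmlns="http://www.w3.org/2000/svg" width="2" height="1" viewBox="0 0 2 1">' +
  '<image href="data:image/png;base64,AAAA" width="2" height="1" />' +
  '</svg>';

// Same pattern as the existing src-stripping logic.
const srcRegex = /src="([^"]*)"/g;
// Assumed alternative that targets the href attribute instead.
const hrefRegex = /href="([^"]*)"/g;

// String.prototype.match with a global regex returns all full matches, or null.
const srcMatches = sampleSvg.match(srcRegex) || [];
const hrefMatches = sampleSvg.match(hrefRegex) || [];

console.log(srcMatches.length); // 0: nothing is stripped, so the base64 data stays in the string
console.log(hrefMatches.length); // 1
```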

So we should probably update both locations to look for SVG instead:

// Google Cloud will preserve HTML tags fairly well, so we can pass HTML
// rich text directly to the API. However, it has a max string length limit,
// so base64 encoded img src attributes are generally too long to include.
// We strip them out in order and replace them after translating.
const srcRegex = /src="([^"]*)"/g;
const srcAttrsArray = textArray.map((text) =>
  iter(text.matchAll(srcRegex))
    .map((match) => match[0])
    .toArray()
);

return { textContent: '[image]' };
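
For the first location, one possible shape of the fix is sketched below. It is an illustration under assumptions, not the actual change: `IMAGE_HREF_REGEX`, `stripImageHrefs`, and `restoreImageHrefs` are hypothetical names, and the restore step assumes the translation API returns the emptied `<image>` tags intact and in their original order.

```typescript
// Assumed pattern: captures the href value of an SVG <image> element, along
// with the text on either side of it so the tag can be rebuilt.
const IMAGE_HREF_REGEX = /(<image\b[^>]*?\bhref=")([^"]*)(")/g;

interface StrippedText {
  text: string;
  hrefs: string[];
}

// Empties each <image> href and collects the removed values in order.
function stripImageHrefs(text: string): StrippedText {
  const hrefs: string[] = [];
  const stripped = text.replace(
    IMAGE_HREF_REGEX,
    (_match: string, before: string, href: string, after: string) => {
      hrefs.push(href);
      return `${before}${after}`;
    }
  );
  return { text: stripped, hrefs };
}

// Puts the collected values back, matching emptied hrefs by position.
function restoreImageHrefs(text: string, hrefs: string[]): string {
  let index = 0;
  return text.replace(
    IMAGE_HREF_REGEX,
    (match: string, before: string, _empty: string, after: string) => {
      const href = hrefs[index];
      index += 1;
      // Leave the tag untouched if there are more images than saved hrefs.
      return href === undefined ? match : `${before}${href}${after}`;
    }
  );
}

// Usage with a hand-written example string (not real ballot content).
const exampleHtml =
  '<p>Shall the town approve the plan?</p>' +
  '<svg xmlns="http://www.w3.org/2000/svg" width="2" height="1" viewBox="0 0 2 1">' +
  '<image href="data:image/png;base64,AAAA" width="2" height="1" /></svg>';
const exampleStripped = stripImageHrefs(exampleHtml);
// exampleStripped.text would be sent for translation in place of exampleHtml.
const exampleRestored = restoreImageHrefs(exampleStripped.text, exampleStripped.hrefs);
```

Matching by position mirrors the existing "strip them out in order and replace them after translating" approach. It would need a different strategy if the translation step can reorder or drop tags.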

@arsalansufi arsalansufi assigned kshen0 and unassigned kshen0 Feb 5, 2025
@arsalansufi
Contributor Author

Confirmed that New London doesn't have an image in their ballot, so moving this to medium priority. We still want to get to it, but it's not blocking for this ballot production and machine configuration run.
