-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audio block: test the block for visual regressions #59986
Conversation
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.
To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Warning: Type of PR label mismatch To merge this PR, it requires exactly 1 label indicating the type of PR. Other labels are optional and not being checked here.
Read more about Type labels in Gutenberg. Don't worry if you don't have the required permissions to add labels; the PR reviewer should be able to help with the task. |
Size Change: 0 B Total Size: 1.71 MB ℹ️ View Unchanged
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I think this could be helpful if
- we use it sparingly only when there is no other way to provide confidence and the potential for the regression is particularly impactful
- we mitigate the potental for false flags by using Element screenshots
I get what you mean. Thanks for pointing out to Element screenshots, it would make sense to me to just target the main element and keep the screenshots to one instance of the block like in this example, to keep it contained. The way I see it, this doesn't help if we don't cover all the blocks, because changes like the one Isabel is trying to make on #57841 are already wide enough in scope that if we don't cover as much as we can, we might as well not do any of this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together @MaggieCabrera 👍
If we proceed, the element screenshots approach Dave suggested gets my vote.
we use it sparingly only when there is no other way to provide confidence and the potential for the regression is particularly impactful
I also agree in sentiment with this as visual regression testing can be a bit flaky.
That said, if these tests aren't automated, it might be trickier to keep them updated and/or meaningful as you suggested.
If such tests wouldn't provide long term value for effort, it also makes it hard to argue for writing them all for the purposes of #57841 only. Don't get me wrong, I think they'd be a great help for that testing 🔍
I guess the flakiest blocks are going to be the blocks that aren't ready immediately—any with a loading state or other async nature. Something else to mention is the concern about storing large binary files in the repo, as git doesn't handle them very well. Github has a size limit for files for this reason - https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-large-files-on-github. I know Git LFS is the generally considered solution, though I've never tried it, so not much of a clue about how it works 😄 |
@afercia and @kevin940726 have PRs for introducing visual regression testing for the editor. IIRC loading state/styles was an issue for those. See #51287 and #46210. |
Thank you all so much for your input! @kevin940726 also mentioned how different OSes render fonts differently, and the flakiness of these kinds of tests is also a concern. I think it's still valid to explore even if we don't merge it into trunk, so I will start there, and then we can decide how valuable this is. |
I think a separate repo to store all the images might be beneficial. I believe GH Actions can be configured to be triggered from different repos. So that might be something we can explore? 🤔 |
The Playwright's |
This is an interesting idea! But in my experience, they will still have some minor differences even when they share the same font. The only way to create stable images is to use the same OS (docker, linux), AFAIK. |
Do the images need to be stored though? Can they not be generated locally? |
Another problem is that it is not always the default settings that regress. |
If we run it locally, it needs to be run before the changes and then after the changes. Most people will not even bother. But like you said, the coverage that this kind of testing needs is pretty big. We can't really expect to get to every case, but we could cover a lot that gets complicated, like layouts, and with minimal changes, we could test the same thing for all default themes automatically. |
The main problem is that block settings get changed every day, the permutations I think are already way too many. |
What?
This PR introduces screenshots to test for visual regressions on blocks. Hopefully it can help testing changes like #57841 against regressions. This is a proof of concept and right now it's only testing the audio block, but if we agree this is a good thing to have we can extend to other blocks and variations of them (or even test against different default themes)
Why?
Visual regressions are tricky to prevent some times, having another tool (automated even better) should help with confidence checks on our changes.
How?
Every time we run the test, we compare against the screenshot that the test uses as a snapshot. If there are any visual changes, the test will fail.
Testing Instructions
Run
npm run test:e2e:playwright -- -g audio.spec.js
on a first run this should passNow make a change to
packages/block-library/src/audio/style.scss
like:Run the test again, it should fail now.