-
-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix escaped symbols breaking markdown checks #209
fix escaped symbols breaking markdown checks #209
Conversation
75f6e88
to
1154f49
Compare
public interface LineFormatterCallback { | ||
/** | ||
* Used in AbstractStaticCheck's findLineNumber method to apply rules | ||
* about escaping special characters to a line in the .mardown file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo .mardown -> markdown
But maybe it is better .md as this is the file extension that we are using in the check, or just say from the markdown file (without the .)
* Used in AbstractStaticCheck's findLineNumber method to apply rules | ||
* about escaping special characters to a line in the .mardown file | ||
* | ||
* @param line - an unparsed line from the .markdown file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is really an optimal solution I also do not see sufficient tests which prove that this actually resolves the problem and does not have some nasty corner cases where it might make things worse
|
||
@Override | ||
public String formatLine(String line) { | ||
Pattern needsEscape = Pattern.compile("\\\\" + Escaping.ESCAPABLE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But you can not be sure that all characters which have to escaped are actually escaped are you sure this will suffice for the most common or better all cases? Should we maybe determine the line number in a different way instead of trying to match it again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given such changes I would expect new tests? Or maybe changed tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Escaping.ESCAPABLE constant is defined in the external markdown library and is used in the markdown parsing logic. It will cover all the cases in which the markdown parser will make changes to the original text.
The reason why I didn't include any new tests is that when I use the verify method that is provided by com.puppycrawl.tools.checkstyle.BaseCheckTestSupport and I provide no error messages to that method, it doesn't fail the test, it only logs the error. Thus, every test that I include, would pass even if it shouldn't.
Do you have any ideas how to create tests that could fail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please discuss it with some of your college's first
* @return the number of the line starting from 1, where the searched text occurred for the first | ||
* time | ||
* @throws NoResultException when no match was found | ||
*/ | ||
protected int findLineNumber(FileText fileContent, String searchedText, int startLineNumber) | ||
throws NoResultException { | ||
protected int findLineNumberFormatted(FileText fileContent, String searchedText, int startLineNumber, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho this is not the best way forward it looks a little over designed to me with this callback logic, isn't there an easy way to solve this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems that source from which we receive the string has not the same origin as lines we are searching through which seems dangerous for a lot of reasons imho.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sources differ is because the searchedText is generated by the markdown parsing logic and the fileContent is the raw text in the markdown file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I did understand that, but isn't this part of the problem and the thing we would actually like to resolve, can't we just request the parsed text from markdown and search through that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martinvw I will try to jump in here, because we discussed it in the office with @lpapazow and @doandzhi : Originally the check is reporting errors into the raw markdown file and for this reason we want to find the exact line in the raw file and report the error there. So if we get the parsed text and report the errors into its lines, we will probably hit an inconsistent line. For this reason we decided to parse a bit the special symbols (taken from the parser) in order to find the correct raw content. What do you think, are you tend to agree or disagree ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand what we try to achieve but it feels a little fragile to me, in most languages there are more character which can be escaped than the ones you have to escape so mismatches are possible. Without any proper tests that is hard to tell without me diving into the code myself.
I found an issue referring to the actual underlying problem that the markdown Node
s do not have a reference back to the original location: commonmark/commonmark-java#1
Given that the issue is there, seems it cannot be done (easily) so we might go for this current solution but not without proper tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that good example where escaping is not always performed can be seen here:
http://docs.openhab.org/addons/bindings/netatmo/readme.html
https://github.com/openhab/openhab2-addons/edit/master/addons/binding/org.openhab.binding.netatmo/README.md on line 49 there is a unescaped _ which does not cause problems in some views but in others it does. Docs renders fine, github view renders fine but diff's don't.
*/ | ||
protected int findLineNumber(FileText fileContent, String searchedText, int startLineNumber) | ||
throws NoResultException { | ||
return findLineNumberFormatted(fileContent, searchedText, startLineNumber, null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like null
I would have prefered a no-op LineFormatterCallback
if we go for this solution than you also do not need the null check, just let it explode if you are called with a null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this suggestion, it does seem to make the code more robust. I will rewrite it in the suggested way.
} | ||
|
||
@Override | ||
public void log(int line, String message) { | ||
MarkdownCheck.this.log(line, message); | ||
} | ||
}; | ||
MarkdownVisitor visitor = new MarkdownVisitor(callBack, fileText); | ||
MarkdownVisitor visitor = new MarkdownVisitor(callBack, fileText, new LineFormatterCallback() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the design stays as is, I would make the LineFormatterCallback
a FunctionalInterface
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great suggestion, I will use it.
a6ed210
to
01c9beb
Compare
01c9beb
to
b43b1f6
Compare
@lpapazow, do you have any updates on this PR ? |
Replaced by #245 |
Create a fix for issue #205 .
Fix a single backslash before special symbols (!"#$%&'()*+,./:;<=>?@][^_`{|}~-) not escaping that symbol in .markdown files.