Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect when f-string-syntax is used in a string, but not marked as an f-string #2507

Open
glasnt opened this issue Sep 21, 2018 · 8 comments
Labels
Blocked 🚧 Blocked by a particular issue Checkers Related to a checker Enhancement ✨ Improvement to a component Hacktoberfest Help wanted 🙏 Outside help would be appreciated, good for new contributors High priority Issue with more than 10 reactions

Comments

@glasnt
Copy link

glasnt commented Sep 21, 2018

Problem statement

Consider the following codeblock

name = "Katie"

stmt1 = f"Hello {name}"
stmt2 = "Hello {name}"

While the block is correct syntax-wise, the content is almost certainly logically invalid, as seen below:

>>> print(stmt1)
Hello Katie
>>> print(stmt2)
Hello {name}

stmt2 is a valid string, but declares string interpolation formatting that is compatible with f-strings, but the string does not have the f prefix to invoke the interpolation.

Describe the solution you'd like

PyLint should detect strings that use f-string-like syntax, but have not been declared as f-strings, in Python 3.6+

This could either be by:

  • attempting to evaluate the content within {}, and if valid Python, return a warning
  • attempting to evaluate the entire string as an f-string, and if the evaluation does not equal the original string, return a warning

Edge Case One: Having { and } characters are perfectly valid, but on the edge case that the content within them is a reference to an in-scope variable, or some other expression, it is likely the string was meant to be an f-string. This would be desirable to catch as a linting error.

Edge Case Two: f'...' is effectively shorthand for '...'.format(locals()); so this:

mystring = 'hello {name}'
print(mystring.format(name='Katie'))

is entirely legal, and isn't an error. There is a possibly design decision about whether this linting check should be

  • enforced by default
  • enforceable as an option
  • enforceable if the string that triggers the problem doesn't have '.format' invoked on it in the same method.

Additional context

I understand PyLint already warns against f-strings in logging, but I can't locate the base functionality to detect f-strings in the first place. I'd be willing to help implement this feature, but would require some mentoring.

@PCManticore
Copy link
Contributor

Hey @glasnt !

This makes sense, but will be tricky due to the string potentially being used as a formatting string later on down the line. But we can definitely try to look for this pattern as it might become more prevalent now with the adoption of f-strings.

Here's what we can do. Let me know if something of the following doesn't make sense.

  • f-strings are internally marked as astroid.JoinedStr (because we use typed_ast to parse the source code, there's currently a bug where some f-strings are marked as FormattedValue, but we can check just JoinedStr for now until that gets fixed)

  • being a new check regarding strings, this can go in pylint/checkers/strings.py, where you only have to implement a def visit_str(self, node) method to visit every string in the analyzed program. This does not include JoinedStr/f-strings, but it's exactly what we want here.

  • next we'll probably need to figure out that a given string was intended as an f-string in the first place. As you mentioned, we can take a hint from the local scope, which you can grab with node.scope(). If you have the variables names extracted from the string, you can grab the relevant nodes that they represent with node.scope().locals.get(name). .locals is just a dictionary from strings to nodes. Once we know that all the names are there (or in the global scope, which you can access with node.root(), there's only the matter of figuring out if the said string was formatted later on via .format(**locals()) or something akin to that construct. You might be able to check by looking for Attribute nodes in the same scope where the left hand side of the attribute is our string, something along the lines of:

# the skip_class is to make sure we don't go into inner scopes, but might not be needed per se
for attr in scope.nodes_of_class(astroid.Attribute, skip_class=(astroid.FunctionDef, )):
    if isinstance(attr.expr, astroid.Name): # LHS of `name.format`
      # Check if it's using `.format`
       if attr.expr.name == 'the name you are looking for' and attr.expr.attrname == 'format':
           # then the string was being used to be formatted after all so stop checking

  • as mentioned early, you'll need most likely to extract the variables that the said string contains. You can use pylint.checkers.strings.parse_format_method_string for that, which is already being used in the mentioned file.

  • another note is that node.parent and node._astroid_fields are your friends. .parent walks up the tree to the parent, so if you need to grab the name of the string that was supposed to be used as a f-string, just call .parent until you get to an Assign node. If it has a name, it might be possible though that it was intended for .format() to be called on it, but you can follow the above instructions to make sure that's the case. And _astroid_fields is usually a sequence that shows what AST children a node can have. There's also _other_fields which is useful as well, as it contains non-AST members of a node. For instance Attribute.attrname, which is the name of the attribute being accessed, will be found in _other_fields. Another useful utility is node.repr_tree() which shows exactly how the subtree looks for that node.

That would be all that's needed to get this going. Let me know if there's anything else I can do to help.

@wadetregaskis-linkedin
Copy link

I would also like this check. Missing those dastardly little 'f's is a common occurrence in my experience, and while the edge cases noted in this thread are hypothetically valid, as best I can tell, not a single one of them apply to the code bases I worked in day to day. So I'd be more than happy to have at least a way to opt-in to this check (or, of course, having this be the default).

@rwdrich
Copy link

rwdrich commented Apr 3, 2020

I think this would be a very useful addition to have - +1

Though it's worth noting that there are edge cases where { and } exist in a string that is not an f string, and that's a very valid case - i.e. s = "awk '{print $1}'"

I still think some analysis on these would be an improvement

@YodaEmbedding
Copy link

The analysis would need to check if the contents are

  1. valid python and
  2. any referenced variables are in locals/globals

@kbauer
Copy link

kbauer commented Jan 11, 2021

Example of relevance: Many examples on https://thedailywtf.com/series/errord

One of the most common errors people run into is interpolation code not being interpolated, resulting in things like

Dear {customer.displayname},
...

or frustrating things like

An error has occurred: {errormessage}

Such examples have been posted from many sources, including some of the largest IT companies, so missing interpolation seems to be a widespread and very user-facing bug type.

@Pierre-Sassoulas Pierre-Sassoulas added the High priority Issue with more than 10 reactions label May 30, 2021
@Pierre-Sassoulas Pierre-Sassoulas added the Help wanted 🙏 Outside help would be appreciated, good for new contributors label Jul 1, 2021
@Pierre-Sassoulas Pierre-Sassoulas pinned this issue Jul 20, 2021
DanielNoord added a commit to DanielNoord/pylint that referenced this issue Aug 2, 2021
This checks if text in strings in between `{}`'s are variables.
The var the string is assigned to is also checked for a format() call.
If this does not happen, the string should probably be a f-string and a
message is emitted.
This closes pylint-dev#2507
@DanielNoord DanielNoord unpinned this issue Mar 23, 2022
@Pierre-Sassoulas
Copy link
Member

Pierre-Sassoulas commented Jul 2, 2022

Blocked by pylint-dev/astroid#1156 or by the end of life of python 3.7, see #4787 (comment).

@Pierre-Sassoulas Pierre-Sassoulas added the Blocked 🚧 Blocked by a particular issue label Jul 2, 2022
@bje-
Copy link

bje- commented Aug 17, 2022

A related issue is when using multi-line strings like so:

s = f'{x}' \
     '{y}'

Without thinking too hard, my initial assumption was that the second string wouldn't need an f qualifier. It's a new string, of course, so it does. This would be a pretty common mistake, I imagine.

@sam-s
Copy link

sam-s commented Feb 7, 2023

Additionally, it would be nice to detect old-style unused formatting, e.g., "foo %d" should elicit a warning when used without a % or outside a logging call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Blocked 🚧 Blocked by a particular issue Checkers Related to a checker Enhancement ✨ Improvement to a component Hacktoberfest Help wanted 🙏 Outside help would be appreciated, good for new contributors High priority Issue with more than 10 reactions
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants