diff --git a/peps/pep-0750.rst b/peps/pep-0750.rst index 3291da787ea..689e09c4008 100644 --- a/peps/pep-0750.rst +++ b/peps/pep-0750.rst @@ -6,14 +6,15 @@ Author: Jim Baker , Koudai Aono , Lysandros Nikolaou , Dave Peck -Discussions-To: https://discuss.python.org/t/pep-750-tag-strings-for-writing-domain-specific-languages/60408 +Discussions-To: https://discuss.python.org/t/pep750-template-strings-new-updates/71594 Status: Draft Type: Standards Track Created: 08-Jul-2024 Python-Version: 3.14 Post-History: `09-Aug-2024 `__, `17-Oct-2024 `__, - `21-Oct-2024 `__ + `21-Oct-2024 `__, + `18-Nov-2024 `__ Abstract @@ -124,11 +125,16 @@ the ability to nest template strings within interpolations, as well as the abili to use all valid quote marks (``'``, ``"``, ``'''``, and ``"""``). Like other string prefixes, the ``t`` prefix must immediately precede the quote. Like f-strings, both lowercase ``t`` and uppercase ``T`` prefixes are supported. Like -f-strings, t-strings may not be combined with the ``b`` or ``u`` prefixes. +f-strings, t-strings may not be combined with the ``u`` prefix. + Additionally, f-strings and t-strings cannot be combined, so the ``ft`` -prefix is invalid as well. t-strings *may* be combined with the ``r`` prefix; +prefix is invalid. t-strings *may* be combined with the ``r`` prefix; see the `Raw Template Strings`_ section below for more information. +The combination of t-strings and bytes (``tb``) is considered out of scope for +this PEP. However, unlike f-strings, there is no fundamental reason why t-strings +and bytes cannot be combined. Support could be considered in a future PEP. + The ``Template`` Type --------------------- @@ -138,11 +144,23 @@ Template strings evaluate to an instance of a new type, ``templatelib.Template`` .. code-block:: python class Template: - args: Sequence[str | Interpolation] + args: tuple[str | Interpolation, ...] def __init__(self, *args: str | Interpolation): ... + @property + def strings(self) -> tuple[str, ...]: + ... 
+ + @property + def interpolations(self) -> tuple[Interpolation, ...]: + ... + + @property + def values(self) -> tuple[object, ...]: + ... + The ``args`` attribute provides access to the string parts and any interpolations in the literal: @@ -256,6 +274,25 @@ It would be surprising if, for example, a template string that uses ``{value:.2f did not round the value to two decimal places when processed. +Convenience Accessors in ``Template`` +------------------------------------- + +The ``Template.strings`` and ``Template.interpolations`` properties provide +convenient strongly-typed access to the ``str`` and ``Interpolation`` instances +found in ``Template.args``. + +Finally, the ``Template.values`` property is equivalent to: + +.. code-block:: + + @property + def values(self) -> tuple[object, ...]: + return tuple(i.value for i in self.interpolations) + +Details about the layout of ``Template.args`` are explained in the +`Interleaving of Template.args`_ section below. + + Processing Template Strings --------------------------- @@ -382,6 +419,28 @@ Two instances of ``Interpolation`` are defined to be equal if their ``value``, ) +Template and Interpolation Hashing +---------------------------------- + +The ``Template`` and ``Interpolation`` types implement the ``__hash__()`` method +roughly as follows: + +.. code-block:: python + + class Template: + def __hash__(self) -> int: + return hash(self.args) + + class Interpolation: + def __hash__(self) -> int: + return hash((self.value, self.expr, self.conv, self.format_spec)) + +An ``Interpolation`` instance is hashable if its ``value`` attribute +is also hashable. Likewise, a ``Template`` instance is hashable if +all of its ``Interpolation`` instances are hashable. In all other cases, +``TypeError`` is raised. 
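The hashing rule above can be illustrated with a minimal sketch. The two classes below are hypothetical stand-ins written only for this example, not the proposed ``templatelib`` types; they copy the ``__hash__()`` bodies shown above, and the ``is_hashable()`` helper is likewise an assumption introduced for illustration.

```python
# Hypothetical stand-ins that mirror only the hashing behavior sketched above.
# They are NOT the real ``templatelib`` types.
class Interpolation:
    def __init__(self, value, expr, conv=None, format_spec=""):
        self.value = value
        self.expr = expr
        self.conv = conv
        self.format_spec = format_spec

    def __hash__(self) -> int:
        # Raises TypeError if ``value`` is unhashable (for example, a list).
        return hash((self.value, self.expr, self.conv, self.format_spec))


class Template:
    def __init__(self, *args):
        self.args = args  # stored as a tuple

    def __hash__(self) -> int:
        # Hashing the tuple hashes every string and every Interpolation in it.
        return hash(self.args)


def is_hashable(obj) -> bool:
    """Illustrative helper: report whether ``hash(obj)`` succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# A template whose only interpolated value is a string is hashable.
hashable = Template("Hello ", Interpolation("World", "name"), "!")

# A template holding a list value is not: the TypeError propagates upward.
unhashable = Template("Items: ", Interpolation(["a", "b"], "items"), "")
```

Because ``hash(self.args)`` hashes each element of the tuple in turn, a single unhashable ``value`` is enough to make the whole template unhashable.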
+ + No Support for Ordering ----------------------- @@ -399,7 +458,7 @@ The debug specifier, ``=``, is supported in template strings and behaves similar to how it behaves in f-strings, though due to limitations of the implementation there is a slight difference. -In particular, ``t'{expr=}'`` is treated as ``t'expr={expr}'``: +In particular, ``t'{expr=}'`` is treated as ``t'expr={expr!r}'``: .. code-block:: python @@ -407,6 +466,13 @@ In particular, ``t'{expr=}'`` is treated as ``t'expr={expr}'``: template = t"Hello {name=}" assert template.args[0] == "Hello name=" assert template.args[1].value == "World" + assert template.args[1].conv == "r" + +If a separate format string is also provided, ``t'{expr=:fmt}'`` is treated instead as +``t'expr={expr!s:fmt}'``. + +Whitespace is preserved in the debug specifier, so ``t'{expr =}'`` is treated as +``t'expr ={expr!r}'``. Raw Template Strings @@ -452,11 +518,22 @@ Exceptions raised in t-string literals are the same as those raised in f-string literals. +No ``Template.__str__()`` Implementation +---------------------------------------- + +The ``Template`` type does not provide a specialized ``__str__()`` implementation; +it inherits the default implementation from the ``object`` class. + +This is because ``Template`` instances are intended to be used by template processing +code, which may return a string or any other type. There is no canonical way to +convert a ``Template`` to a string. + + Interleaving of ``Template.args`` --------------------------------- -In the ``Template`` type, the ``args`` attribute is a sequence that will always -alternate between string literals and ``Interpolation`` instances. Specifically: +In the ``Template`` type, the ``args`` attribute is a tuple that always +alternates between string literals and ``Interpolation`` instances. Specifically: - Even-indexed elements (0, 2, 4, ...) are always of type ``str``, representing the literal parts of the template. 
@@ -475,7 +552,7 @@ For example, the following assertions hold: assert template.args[2] == "" These rules imply that the ``args`` attribute will always have an odd length. -As a consequence, empty strings are added to the sequence when the template +As a consequence, empty strings are added to the tuple when the template begins or ends with an interpolation, or when two interpolations are adjacent: .. code-block:: python @@ -490,15 +567,27 @@ begins or ends with an interpolation, or when two interpolations are adjacent: assert template.args[4] == "" Most template processing code will not care about this detail and will use -either structural pattern matching or ``isinstance()`` checks to distinguish -between the two types of elements in the sequence. +one or more of: -The detail exists because it allows for performance optimizations in template -processing code. For example, a template processor could cache the static parts -of the template and only reprocess the dynamic parts when the template is -evaluated with different values. Access to the static parts can be done with -``template.args[::2]``. +- Structural pattern matching +- ``isinstance()`` checks +- The ``strings``, ``interpolations``, and ``values`` properties +For example: + +.. code-block:: python + + name = "World" + template = t"Hello {name}!" + assert template.strings == ("Hello ", "!") + assert template.interpolations == (Interpolation(value=name, expr="name"),) + assert template.values == (name,) + +Interleaving allows for performance optimizations in template processing code. +For example, a template processor could cache the static parts of the template +and only reprocess the dynamic parts when the template is evaluated with +different values. + Interleaving is an invariant maintained by the ``Template`` class. Developers can take advantage of it but they are not required to themselves maintain it. 
Specifically, ``Template.__init__()`` can be called with ``str`` and @@ -546,7 +635,6 @@ specifiers like ``:.2f``. The full code is fairly simple: return str(value) return value - def f(template: Template) -> str: parts = [] for arg in template.args: @@ -571,10 +659,9 @@ specifiers like ``:.2f``. The full code is fairly simple: Example: Structured Logging --------------------------- -Structured logging allows developers to log data in both a human-readable format -*and* a structured format (like JSON) using only a single logging call. This is -useful for log aggregation systems that process the structured format while -still allowing developers to easily read their logs. +Structured logging allows developers to log data in machine-readable +formats like JSON. With t-strings, developers can easily log structured data +alongside human-readable messages using just a single log statement. We present two different approaches to implementing structured logging with template strings. @@ -793,6 +880,15 @@ developers call a processing function that they get the result they want: typically, a string, although processing code can of course return any arbitrary type. +Developers will also want to understand how template strings relate to other +string formatting methods like f-strings and :meth:`str.format`. They will need +to decide when to use each method. If a simple string is all that is needed, and +there are no security implications, f-strings are likely the best choice. For +most cases where a format string is used, it can be replaced with a function +wrapping the creation of a template string. In cases where the format string is +obtained from user input, the filesystem, or databases, it is possible to write +code to convert it into a ``Template`` instance if desired. + Because developers will learn that t-strings are nearly always used in tandem with processing functions, they don't necessarily need to understand the details of the ``Template`` type. 
As with descriptors and decorators, we expect many more @@ -857,12 +953,12 @@ The structure of ``Template`` objects allows for effective memoization: .. code-block:: python - source = template.args[::2] # Static string parts - values = [i.value for i in template.args[1::2]] # Dynamic interpolated values + strings = template.strings # Static string parts + values = template.values # Dynamic interpolated values -This separation enables caching of processed static parts, while dynamic parts can be -inserted as needed. Authors of template processing code can use the static -``source`` as cache keys, leading to significant performance improvements when +This separation enables caching of processed static parts while dynamic parts +can be inserted as needed. Authors of template processing code can use the static +``strings`` as cache keys, leading to significant performance improvements when similar templates are used repeatedly. @@ -1008,17 +1104,17 @@ and await it before processing the template string: return "Sleepy" template = t"Hello {get_name}" - assert await aformat(template) == "Hello Sleepy" + assert await async_f(template) == "Hello Sleepy" -This assumes that the template processing code in ``aformat()`` is asynchronous +This assumes that the template processing code in ``async_f()`` is asynchronous and is able to ``await`` an interpolation's value. .. note:: Example code - See `aformat.py`__ and `test_aformat.py`__. + See `afstring.py`__ and `test_afstring.py`__. - __ https://github.com/davepeck/pep750-examples/blob/main/pep/aformat.py - __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_aformat.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/afstring.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_afstring.py Approaches to Template Reuse @@ -1038,11 +1134,70 @@ values, they can write a function to return a ``Template`` instance: This is, of course, no different from how f-strings can be reused. 
+Relation to Format Strings +-------------------------- + +The venerable :meth:`str.format` method accepts format strings that can later +be used to format values: + +.. code-block:: python + + alas_fmt = "We're all out of {cheese}." + assert alas_fmt.format(cheese="Red Leicester") == "We're all out of Red Leicester." + +If one squints, one can think of format strings as a kind of function definition. +The *call* to :meth:`str.format` can be seen as a kind of function call. The +t-string equivalent is to simply define a standard Python function that returns +a ``Template`` instance: + +.. code-block:: python + + def make_alas(*, cheese: str) -> Template: + return t"We're all out of {cheese}." + + alas_t = make_alas(cheese="Red Leicester") + # Using the f() function from the f-string example, above + assert f(alas_t) == "We're all out of Red Leicester." + +The ``make_alas()`` function itself can be thought of as analogous to the +format string. The call to ``make_alas()`` is analogous to the call to +:meth:`str.format`. + +Of course, it is common to load format strings from external sources like a +filesystem or database. Thankfully, because ``Template`` and ``Interpolation`` +are simple Python types, it is possible to write a function that takes an +old-style format string and returns an equivalent ``Template`` instance: + +.. code-block:: python + + def from_format(fmt: str, /, *args: object, **kwargs: object) -> Template: + """Parse `fmt` and return a `Template` instance.""" + ... + + # Load this from a file, database, etc. + alas_fmt = "We're all out of {cheese}." + alas_t = from_format(alas_fmt, cheese="Red Leicester") + # Using the f() function from the f-string example, above + assert f(alas_t) == "We're all out of Red Leicester." + +This is a powerful pattern that allows developers to use template strings in +places where they might have previously used format strings. A full implementation +of ``from_format()`` is available in the examples repository. 
It supports the +full grammar of format strings including positional and keyword arguments, +automatic and manual field numbering, etc. + +.. note:: Example code + + See `format.py`__ and `test_format.py`__. + + __ https://github.com/davepeck/pep750-examples/blob/main/pep/format.py + __ https://github.com/davepeck/pep750-examples/blob/main/pep/test_format.py + + Reference Implementation ======================== -At the time of this PEP's announcement, a fully-working implementation is -`available `_. +A CPython implementation of PEP 750 is `available `_. There is also a public repository of `examples and tests `_ built around the reference implementation. If you're interested in playing with @@ -1117,8 +1272,11 @@ This was rejected for several reasons: static analysis. Most importantly, there are viable (if imperfect) alternatives to implicit -lambda wrapping when lazy evaluation is desired. See the section on -`Approaches to Lazy Evaluation`_, above, for more information. +lambda wrapping in many cases where lazy evaluation is desired. See the section +on `Approaches to Lazy Evaluation`_, above, for more information. + +While delayed evaluation was rejected for *this* PEP, we hope that the community +continues to explore the idea. Making ``Template`` and ``Interpolation`` Into Protocols @@ -1143,10 +1301,10 @@ allowing combination of ``r`` and ``t`` prefixes. Other Homes for ``Template`` and ``Interpolation`` -------------------------------------------------- -Previous versions of this PEP proposed that the ``Template`` and ``Interpolation`` -types be placed in the ``types`` module. This was rejected in favor of creating -a new top-level standard library module, ``templatelib``. This was done to avoid -polluting the ``types`` module with seemingly unrelated types. +Previous versions of this PEP proposed placing the ``Template`` and +``Interpolation`` types in the ``types`` module. 
Members of the Python core +team internally discussed the various options and ultimately decided to create +a new top-level standard library module, ``templatelib``. Enable Full Reconstruction of Original Template Literal @@ -1251,16 +1409,6 @@ This was rejected in favor of keeping t-string syntax as close to f-string synta as possible. -A Lazy Conversion Specifier ---------------------------- - -We considered adding a new conversion specifier, ``!()``, that would explicitly -wrap the interpolation expression in a lambda. - -This was rejected in favor of the simpler approach of using explicit lambdas -when lazy evaluation is desired. - - Alternate Layouts for ``Template.args`` --------------------------------------- @@ -1268,10 +1416,10 @@ During the development of this PEP, we considered several alternate layouts for the ``args`` attribute of the ``Template`` type. This included: - Instead of ``args``, ``Template`` contains a ``strings`` attribute of type - ``Sequence[str]`` and an ``interpolations`` attribute of type - ``Sequence[Interpolation]``. There are zero or more interpolations and + ``tuple[str, ...]`` and an ``interpolations`` attribute of type + ``tuple[Interpolation, ...]``. There are zero or more interpolations and there is always one more string than there are interpolations. Utility code - could build an interleaved sequence of strings and interpolations from these + could build an interleaved tuple of strings and interpolations from these separate attributes. This was rejected as being overly complex. - ``args`` is typed as a ``Sequence[tuple[str, Interpolation | None]]``. Each @@ -1279,7 +1427,7 @@ the ``args`` attribute of the ``Template`` type. This included: string part has no corresponding interpolation. This was rejected as being overly complex. -- ``args`` remains a ``Sequence[str | Interpolation]`` but does not support +- ``args`` remains a ``tuple[str | Interpolation, ...]`` but does not support interleaving. 
As a result, empty strings are not added to the sequence. It is no longer possible to obtain static strings with ``args[::2]``; instead, instance checks or structural pattern matching must be used to distinguish