Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new "type system concepts" section #1743

Merged
merged 48 commits into from
Jun 24, 2024
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
b323afb
initial draft of new concepts section
carljm May 22, 2024
bd4416e
review feedback
carljm May 22, 2024
3c332bf
prefer 'assignable to' over 'consistent subtype of'
carljm May 22, 2024
6603f43
minor tweaks
carljm May 22, 2024
4e2a1ce
Merge branch 'main' into concepts
carljm May 22, 2024
137ab2a
terms have types
carljm May 22, 2024
24e6ee5
review comments
carljm May 22, 2024
9e9ebb4
use asymmetric example of assignable-to
carljm May 22, 2024
1271558
Any is equivalent to Any
carljm May 22, 2024
2ddd4ad
add a short para on the gradual guarantee
carljm May 22, 2024
971e9f2
conciseness tweak
carljm May 22, 2024
10fda4e
use 'assignable to' in Any description
carljm May 22, 2024
f4110bb
incorporate some of Eric Traut's feedback
carljm May 23, 2024
ac6273e
more on union
carljm May 24, 2024
15c0d56
add an example of bounded gradual type
carljm May 24, 2024
efdbc5f
add section on attributes and methods
carljm May 24, 2024
e4537ac
more feedback and tweaks
carljm May 26, 2024
c3bbb52
Merge branch 'main' into concepts
carljm Jun 1, 2024
6bebab4
review comments
carljm Jun 1, 2024
2900cce
a bit more on gradual unions
carljm Jun 1, 2024
182a058
a few more review comments
carljm Jun 1, 2024
351136e
add terms to glossary
carljm Jun 1, 2024
cd03de8
Update glossary.rst
carljm Jun 2, 2024
07941d7
Update glossary.rst
carljm Jun 2, 2024
c55d40a
review comments on glossary
carljm Jun 2, 2024
b1775b1
re-apply review comment
carljm Jun 2, 2024
1a71a72
Apply suggestions from code review
carljm Jun 2, 2024
c18d9e1
audit remainder of type spec for terminology usage
carljm Jun 2, 2024
cca3bcd
review comments
carljm Jun 2, 2024
3d6b406
some review comments
carljm Jun 2, 2024
5d40036
one more review tweak on protocol wording
carljm Jun 2, 2024
1e0d118
Merge branch 'main' into concepts
carljm Jun 12, 2024
cbe6e23
add equivalent, narrow, and wide to glossary
carljm Jun 12, 2024
8954595
add table of type relations
carljm Jun 12, 2024
efaa7ab
explicitly allow inference on missing function annotations
carljm Jun 12, 2024
12ae9eb
some review comments
carljm Jun 12, 2024
ccfef86
Update docs/spec/callables.rst
carljm Jun 12, 2024
e990bda
more review comments
carljm Jun 12, 2024
045d7c2
link 'assignable' to glossary more often in callables doc
carljm Jun 14, 2024
dd6ffd1
add nominal/structural to concepts and glossary
carljm Jun 14, 2024
673a467
don't use 'compatible' in callables doc
carljm Jun 14, 2024
0f5fba4
equivalence of gradual types and union simplification
carljm Jun 14, 2024
a6b3ab0
simplify description of structural subtyping
carljm Jun 15, 2024
e5943a4
define equivalence of gradual types in glossary
carljm Jun 15, 2024
effcdec
more review comments
carljm Jun 17, 2024
9ebaef8
review comments
carljm Jun 19, 2024
6a3b716
Update glossary.rst
carljm Jun 20, 2024
ab7f9ac
review comments
carljm Jun 20, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
273 changes: 181 additions & 92 deletions docs/spec/concepts.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,97 +3,186 @@
Type system concepts
====================

.. _`union-types`:
Static, dynamic, and gradual typing
carljm marked this conversation as resolved.
Show resolved Hide resolved
-----------------------------------

A **statically-typed** programming language runs a type-checker before running
a program. The program is required to be well-typed according to the language's
type system. The type system assigns a type to all expressions in the language
and verifies that their uses obey the typing rules. Normally, a program that is
not well-typed (i.e., one that contains a type error) will not run. Java and
C++ are examples of statically-typed object-oriented languages.

A **dynamically-typed** programming language does not run a type-checker before
running a program. Instead, it will check the types of values before performing
carljm marked this conversation as resolved.
Show resolved Hide resolved
operations on them at runtime. This is not to say that the language is
"untyped". Values at runtime have a type and their uses obey typing rules. Not
every operation will be checked, but certain primitive operations in the
language such as attribute access or arithmetic will be. Python was
carljm marked this conversation as resolved.
Show resolved Hide resolved
historically a dynamically-typed language.

**Gradual typing** is the name for a specific way to combine static and dynamic
carljm marked this conversation as resolved.
Show resolved Hide resolved
typing. It allows mixing static and dynamic checking at a fine level of
carljm marked this conversation as resolved.
Show resolved Hide resolved
granularity. For instance, in type-annotated Python, individual variables,
parameters, and returns can optionally be given a static type. Data structures
such as objects can have a mix of static and dynamic-only checking. As an
carljm marked this conversation as resolved.
Show resolved Hide resolved
example, a Python dictionary could be annotated to have static checking of the
key type but only dynamic checking of the value type.

In gradual typing, a dynamically typed value is indicated by a special
carljm marked this conversation as resolved.
Show resolved Hide resolved
"unknown" or "dynamic" type. In Python, the unknown type is spelled
:ref:`Any`. It indicates to the static type checker that this value should not
be subject to static checking. The system should not signal a static type
error for use of an expression with type :ref:`Any`. Instead, the expression's
value will be dynamically checked, according to the Python runtime's usual
checks on the types of runtime values.

This specification describes a gradual type system for Python.

Static, dynamic, and gradual types
----------------------------------

We will refer to types that do not contain :ref:`Any` as a sub-part as **static
carljm marked this conversation as resolved.
Show resolved Hide resolved
carljm marked this conversation as resolved.
Show resolved Hide resolved
types**. :ref:`Any` itself is the **dynamic type**. A **gradual type** can be a
static type, :ref:`Any` itself, or a type that contains :ref:`Any` as a
sub-part.

Static types
~~~~~~~~~~~~

A static type can intuitively be understood as denoting a set of runtime
carljm marked this conversation as resolved.
Show resolved Hide resolved
values. For instance, the Python static type ``object`` is the set of all
Python objects. The Python static type ``bool`` is the set of values ``{ True,
False }``. The Python static type ``str`` is the set of all Python strings;
more precisely, the set of all Python objects whose runtime type (i.e.
``__class__`` attribute) is either ``str`` or a class that inherits ``str``,
including transitively; i.e. a type with ``str`` in its method resolution
carljm marked this conversation as resolved.
Show resolved Hide resolved
order. Static types can also be specified in other ways. For example,
:ref:`Protocols` specify a static type which is the set of all objects which
share a certain set of attributes and/or methods.

The dynamic type
~~~~~~~~~~~~~~~~

The dynamic type :ref:`Any` represents an unknown static type. It denotes some
carljm marked this conversation as resolved.
Show resolved Hide resolved
unknown set of values.

At first glance, this may appear similar to the static type ``object``, which
carljm marked this conversation as resolved.
Show resolved Hide resolved
represents the set of all Python objects. But it is quite different.
carljm marked this conversation as resolved.
Show resolved Hide resolved

If a value has the static type ``object``, a static type checker should ensure
that all operations on the value are valid for all Python objects, or else emit
a static type error. This allows very few operations! For example, if ``x`` is
typed as ``object``, ``x.foo`` should be a static type error, because not all
carljm marked this conversation as resolved.
Show resolved Hide resolved
Python objects have an attribute ``foo``.

A value typed as :ref:`Any`, on the other hand, should be assumed to have
carljm marked this conversation as resolved.
Show resolved Hide resolved
_some_ specific static type, but _which_ static type is not known. A static
checker should not emit any errors that depend on assuming a particular static
type; a static checker should instead assume that the runtime is responsible
for checking the type of operations on this value, as in a dynamically-typed
language.

The subtype relation
--------------------

A static type ``B`` is a **subtype** of another static type ``A`` if and only
carljm marked this conversation as resolved.
Show resolved Hide resolved
if the set of values represented by ``B`` is a subset of the set of values
represented by ``A``. Because the subset relation on sets is transitive and
reflexive, the subtype relation is also transitive (if ``C`` is a subtype of
``B`` and ``B`` is a subtype of ``A``, then ``C`` is a subtype of ``A``) and
reflexive (``A`` is always a subtype of ``A``).
carljm marked this conversation as resolved.
Show resolved Hide resolved

The **supertype** relation is the inverse of subtype: ``A`` is a supertype of
``B`` if and only if ``B`` is a subtype of ``A``; or equivalently, if and only
if the set of values represented by ``A`` is a superset of the values
represented by ``B``. The supertype relation is also transitive and reflexive.

We also define an **equivalence** relation on static types: the types ``A`` and
``B`` are equivalent (or "the same type") if and only if ``A`` is a subtype
of ``B`` and ``B`` is a subtype of ``A``. This means that the set of values
represented by ``A`` is both a superset and a subset of the values represented
by ``B``, meaning ``A`` and ``B`` must represent the same set of values.

We may describe a type ``B`` as "narrower" than a type ``A`` (or as a "strict
carljm marked this conversation as resolved.
Show resolved Hide resolved
carljm marked this conversation as resolved.
Show resolved Hide resolved
subtype" of ``A``) if ``B`` is a subtype of ``A`` and ``B`` is not equivalent
to ``A``.

The consistency relation
------------------------

Since :ref:`Any` represents an unknown static type, it does not represent any
known single set of values, and thus it is not in the domain of the subtype,
carljm marked this conversation as resolved.
Show resolved Hide resolved
supertype, or equivalence relations on static types described above.
carljm marked this conversation as resolved.
Show resolved Hide resolved

We define a **materialization** relation on gradual types as follows: if
carljm marked this conversation as resolved.
Show resolved Hide resolved
replacing zero or more occurrences of ``Any`` in gradual type ``A`` with some
gradual type (which can be different for each occurrence of ``Any``) results in
the gradual type ``B``, then ``B`` is a materialization of ``A``. For instance,
``tuple[int, str]`` (a static type) and ``tuple[Any, str]`` (a gradual type)
are both materializations of ``tuple[Any, Any]``.

If ``B`` is a materialization of ``A``, we can say that ``B`` is a "more
carljm marked this conversation as resolved.
Show resolved Hide resolved
static" type than ``A``, and ``A`` is a "more dynamic" type than ``B``.

The materialization relation is both transitive and reflexive, so it defines a
preorder on gradual types.

We also define a **consistency** relation on gradual types.

A static type ``A`` is consistent with another static type ``B`` if and only if
carljm marked this conversation as resolved.
Show resolved Hide resolved
they are the same type (``A`` is equivalent to ``B``.)

A gradual type ``A`` is consistent with a gradual type ``B``, and ``B`` is
consistent with ``A``, if and only if ``B`` is a materialization of ``A`` or
``A`` is a materialization of ``B``.
carljm marked this conversation as resolved.
Show resolved Hide resolved

The dynamic type ``Any`` is consistent with every type, and every type is
consistent with ``Any``. (This must follow from the above definitions of
carljm marked this conversation as resolved.
Show resolved Hide resolved
materialization and consistency, but is worth stating explicitly.)
carljm marked this conversation as resolved.
Show resolved Hide resolved

The consistency relation is not transitive. ``tuple[int, int]`` is consistent
with ``tuple[Any, int]`` and ``tuple[Any, int]`` is consistent with
``tuple[str, int]``, but ``tuple[int, int]`` is not consistent with
``tuple[str, int]``.

The consistency relation is symmetric. If ``A`` is consistent with ``B``, ``B``
carljm marked this conversation as resolved.
Show resolved Hide resolved
is also consistent with ``A``. It is also reflexive: ``A`` is always consistent
with ``A``.

The consistent subtype relation
-------------------------------

Given the materialization relation and the subtyping relation, we define the
**consistent subtype** relation over all types. A type ``A`` is a consistent
carljm marked this conversation as resolved.
Show resolved Hide resolved
subtype of a type ``B`` if there exists a materialization ``A'`` of ``A`` and a
materialization ``B'`` of ``B``, where ``A'`` and ``B'`` are both static types,
and ``A'`` is a subtype of ``B'``.

For example, ``Any`` is a consistent subtype of ``int``, because ``int`` is a
carljm marked this conversation as resolved.
Show resolved Hide resolved
materialization of ``Any``, and ``int`` is a subtype of ``int``. The same
materialization also gives that ``int`` is a consistent subtype of ``Any``.

Consistent subtyping defines assignability
------------------------------------------

Consistent subtyping defines "assignability" for Python. An expression can be
assigned to a variable (including passed as a parameter or returned from a
carljm marked this conversation as resolved.
Show resolved Hide resolved
function) if it is a consistent subtype of the variable's type annotation
carljm marked this conversation as resolved.
Show resolved Hide resolved
(respectively, parameter's type annotation or return type annotation).

We can say that a type ``A`` is "assignable to" a type ``B`` if ``A`` is a
consistent subtype of ``B``.

In the remainder of this specification, we will usually prefer the term
**assignable to** over "consistent subtype of". The two are synonyms, but
"assignable to" is shorter, and may communicate a clearer intuition to many
readers.

References
----------

Union types
-----------
The concepts presented here are derived from the research literature in gradual
typing. See e.g.:

Since accepting a small, limited set of expected types for a single
argument is common, the type system supports union types, created with the
``|`` operator.
Example::

def handle_employees(e: Employee | Sequence[Employee]) -> None:
if isinstance(e, Employee):
e = [e]
...

A type factored by ``T1 | T2 | ...`` is a supertype
of all types ``T1``, ``T2``, etc., so that a value that
is a member of one of these types is acceptable for an argument
annotated by ``T1 | T2 | ...``.

One common case of union types are *optional* types. By default,
``None`` is an invalid value for any type, unless a default value of
``None`` has been provided in the function definition. Examples::

def handle_employee(e: Employee | None) -> None: ...

A past version of this specification allowed type checkers to assume an optional
type when the default value is ``None``, as in this code::

def handle_employee(e: Employee = None): ...

This would have been treated as equivalent to::

def handle_employee(e: Employee | None = None) -> None: ...

This is no longer the recommended behavior. Type checkers should move
towards requiring the optional type to be made explicit.

Support for singleton types in unions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A singleton instance is frequently used to mark some special condition,
in particular in situations where ``None`` is also a valid value
for a variable. Example::

_empty = object()

def func(x=_empty):
if x is _empty: # default argument value
return 0
elif x is None: # argument was provided and it's None
return 1
else:
return x * 2

To allow precise typing in such situations, the user should use
a union type in conjunction with the ``enum.Enum`` class provided
by the standard library, so that type errors can be caught statically::

from enum import Enum

class Empty(Enum):
token = 0
_empty = Empty.token

def func(x: int | None | Empty = _empty) -> int:

boom = x * 42 # This fails type check

if x is _empty:
return 0
elif x is None:
return 1
else: # At this point typechecker knows that x can only have type int
return x * 2

Since the subclasses of ``Enum`` cannot be further subclassed,
the type of variable ``x`` can be statically inferred in all branches
of the above example. The same approach is applicable if more than one
singleton object is needed: one can use an enumeration that has more than
one value::

class Reason(Enum):
timeout = 1
error = 2

def process(response: str | Reason = '') -> str:
if response is Reason.timeout:
return 'TIMEOUT'
elif response is Reason.error:
return 'ERROR'
else:
# response can be only str, all other possible values exhausted
return 'PROCESSED: ' + response
* `Victor Lanvin. A semantic foundation for gradual set-theoretic types. <https://theses.hal.science/tel-03853222/file/va_Lanvin_Victor.pdf>`_ Computer science. Université Paris Cité, 2021. English. NNT : 2021UNIP7159. tel-03853222
Loading