Long list optimization (#52)
* Avoid unnecessary stringifying of long lists

Lists are truncated to "List (n elements)" if the stringified
form is too long, but string conversion is expensive for long lists.
To avoid that, we can calculate the minimum possible string length
and skip stringifying when even that minimum would not fit within
MAX_INLINE_LENGTH. Based on benchmarking with the S2 image
collection, this is a 10-20% speedup for convert_to_html.

The minimum length formula takes brackets, delimiters, and
whitespace into account, so e.g. the shortest possible 3-element
list is 9 characters: "[1, 1, 1]".
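
The formula can be checked with a short sketch. The helper name `min_list_repr_length` is hypothetical and not part of eerepr; it only restates the `3 * (n - 1) + 3` expression from the patch, assuming every element stringifies to at least one character.

```python
def min_list_repr_length(n: int) -> int:
    """Lower bound on len(str(lst)) for a list of n elements.

    Restates the commit's formula, 3 * (n - 1) + 3: every element needs at
    least 1 character, each element after the first adds the 2-character
    delimiter ", ", and the brackets add 2 more.
    """
    return 3 * (n - 1) + 3


# The shortest possible 3-element list hits the bound exactly.
assert len(str([1, 1, 1])) == min_list_repr_length(3) == 9

# For other lists the bound never exceeds the real stringified length.
for items in ([], [1], ["a", "b"], list(range(100))):
    assert min_list_repr_length(len(items)) <= len(str(items))
```

Because the bound only depends on `len(obj)`, it costs O(1) no matter how long the list is.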

* Update changelog
aazuspan authored Jan 21, 2025
1 parent a5ed9e3 commit de03fe8
Showing 2 changed files with 13 additions and 4 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -4,7 +4,9 @@ All notable changes to this project will be documented in this file.

 ## [Unreleased]

-Nothing yet.
+### Performance
+
+- Avoid stringifying long lists that will definitely be truncated in the repr (~20% speedup when testing with a 25-image Sentinel-2 collection)

 ## [0.1.0] - 2025-01-10

13 changes: 10 additions & 3 deletions eerepr/html.py
@@ -54,13 +54,20 @@ def convert_to_html(obj: Any, key: Hashable | None = None) -> str:

 def list_to_html(obj: list, key: Hashable | None = None) -> str:
     """Convert a Python list to an HTML <li> element."""
-    contents = str(obj)
     n = len(obj)
     noun = "element" if n == 1 else "elements"
     header = f"{key}: " if key is not None else ""
-    header += f"List ({n} {noun})" if len(contents) > MAX_INLINE_LENGTH else contents
-    children = [convert_to_html(item, key=i) for i, item in enumerate(obj)]

+    # Skip the expensive stringification for lists that are definitely too long to
+    # include inline (counting whitespace and delimiters). This is a substantial
+    # performance improvement for large collections.
+    min_length = 3 * (n - 1) + 3
+    if min_length < MAX_INLINE_LENGTH and len(contents := str(obj)) < MAX_INLINE_LENGTH:
+        header += contents
+    else:
+        header += f"List ({n} {noun})"
+
+    children = [convert_to_html(item, key=i) for i, item in enumerate(obj)]
     return _make_collapsible_li(header, children)
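
The short-circuit in the patched condition can be seen in isolation with a minimal sketch. It is not the eerepr implementation: `list_header` and `CountingItem` are hypothetical names, and the value of `MAX_INLINE_LENGTH` is an assumption, since the real constant is not shown in this diff. The counter shows that `str(obj)` is never evaluated once the lower bound rules out an inline header.

```python
MAX_INLINE_LENGTH = 50  # assumed value for illustration only


class CountingItem:
    """Hypothetical element that records how often it is stringified."""

    repr_calls = 0

    def __repr__(self) -> str:
        CountingItem.repr_calls += 1
        return "1"


def list_header(obj: list) -> str:
    """Build only the header text, using the same condition as the patch."""
    n = len(obj)
    noun = "element" if n == 1 else "elements"
    min_length = 3 * (n - 1) + 3
    # `and` short-circuits: str(obj) runs only if the lower bound could fit.
    if min_length < MAX_INLINE_LENGTH and len(contents := str(obj)) < MAX_INLINE_LENGTH:
        return contents
    return f"List ({n} {noun})"


short_header = list_header([CountingItem() for _ in range(3)])
calls_for_short = CountingItem.repr_calls  # each of the 3 elements stringified once

CountingItem.repr_calls = 0
long_header = list_header([CountingItem() for _ in range(100)])
calls_for_long = CountingItem.repr_calls  # lower bound is 300, so str() is skipped
```

One detail visible in the diff: the new condition uses a strict `<`, while the removed line truncated only when `len(contents) > MAX_INLINE_LENGTH`, so a list whose string is exactly `MAX_INLINE_LENGTH` characters long is now shown as "List (n elements)" instead of inline.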

