Rollup merge of #112725 - notriddle:notriddle/advanced-search, r=GuillaumeGomez

rustdoc-search: add support for type parameters

r? `@GuillaumeGomez`

## Preview

* https://notriddle.com/rustdoc-html-demo-4/advanced-search/rustdoc/read-documentation/search.html
* https://notriddle.com/rustdoc-html-demo-4/advanced-search/std/index.html?search=option%3Coption%3CT%3E%3E%20-%3E%20option%3CT%3E
* https://notriddle.com/rustdoc-html-demo-4/advanced-search/std/index.html?search=option%3CT%3E,%20E%20-%3E%20result%3CT,%20E%3E
* https://notriddle.com/rustdoc-html-demo-4/advanced-search/std/index.html?search=-%3E%20option%3CT%3E

## Description

When writing a type-driven search query in rustdoc with more than one query element, non-existent types become generic type parameters instead of being auto-corrected (which is currently only done for single-element queries) or giving no results. You can also force a generic type parameter by writing `generic:T`, or force a name to not be treated as a generic type parameter with a filter like `struct:T` (though if you need this, the thing you're looking for doesn't exist and you will get no results).

There is no syntax provided for specifying type constraints for generic type parameters.

When you have a generic type parameter in a search query, it will only match generic type parameters in the actual function. It will not match concrete types, including concrete types that implement a trait. Matching is also strict about which parameters are the same and which are different, so `option<T>, option<U> -> option<U>` matches `Option::and`, but not `Option::or`. Similarly, `option<T>, option<T> -> option<T>` matches `Option::or`, but not `Option::and`.
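
To make the same-or-different rule concrete, here is a small sketch built on the two standard-library methods named above. The wrapper names `and_demo` and `or_demo` are made up for illustration; only `Option::and` and `Option::or` come from `std`.

```rust
/// Shape: option<T>, option<U> -> option<U>
/// Two distinct type parameters; the result uses the second one.
pub fn and_demo<T, U>(a: Option<T>, b: Option<U>) -> Option<U> {
    a.and(b)
}

/// Shape: option<T>, option<T> -> option<T>
/// A single type parameter appears in all three positions.
pub fn or_demo<T>(a: Option<T>, b: Option<T>) -> Option<T> {
    a.or(b)
}
```

The two wrappers differ only in whether the second argument shares its type parameter with the first, which is exactly the distinction the query has to express.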

## Motivation

This feature is motivated by the many "combinator"-type functions found in generic libraries, such as Option, Future, Iterator, and Entry. These highly generic functions have names that are almost completely arbitrary, and type signatures that tell you what they actually do.

This PR is a major step towards[^closure] being able to easily search for generic functions by their type signature instead of by name. Some examples of combinators that can be found using this PR (try them out in the preview):

* `option<option<T>> -> option<T>` returns `Option::flatten`
* `option<T> -> result<T>` returns `Option::ok_or`
* `option<result<T>> -> result<option<T>>` returns `Option::transpose`
* `entry<K, V>, FnOnce -> V` returns `Entry::or_insert_with` (and `or_insert_with_key`, since there's no way to specify the generics on FnOnce)
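
For readers who don't have these combinators memorized, the sketch below calls each of them once. The `*_demo` wrappers and the concrete types (`i32`, `String`, `HashMap`) are illustrative choices, not part of this PR; the methods themselves are the standard-library ones listed above.

```rust
use std::collections::HashMap;

/// option<option<T>> -> option<T>
pub fn flatten_demo(x: Option<Option<i32>>) -> Option<i32> {
    x.flatten()
}

/// option<T> -> result<T>
pub fn ok_or_demo(x: Option<i32>) -> Result<i32, &'static str> {
    x.ok_or("missing")
}

/// option<result<T>> -> result<option<T>>
pub fn transpose_demo(x: Option<Result<i32, String>>) -> Result<Option<i32>, String> {
    x.transpose()
}

/// entry<K, V>, FnOnce -> V
/// Inserts 0 for a missing key, then returns the stored value.
pub fn or_insert_with_demo(map: &mut HashMap<String, i32>, key: &str) -> i32 {
    *map.entry(key.to_string()).or_insert_with(|| 0)
}
```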

[^closure]:

    For this feature to be as useful as it ought to be, you should be able to search for *trait-associated types* and *closures*. This PR does not implement either of these: they are **Future possibilities**.

    Trait-associated types would allow queries like `option<T> -> iterator<item=T>` to return `Option::iter`. We should also allow `option<T> -> iterator<T>` to match the associated type version.

    Closures would make a good way to query for things like `Option::map`. Closure support needs associated types to be represented in the search index, since `FnOnce() -> i32` desugars to `FnOnce<Output=i32, ()>`, so associated trait types should be implemented first. Also, we'd want to expose an easy way to query closures without specifying which of the three traits you want.
GuillaumeGomez authored Sep 19, 2023
2 parents ae9c330 + 4cf06e8 commit 3f68468
Showing 18 changed files with 1,199 additions and 462 deletions.
1 change: 1 addition & 0 deletions src/doc/rustdoc/src/SUMMARY.md
@@ -4,6 +4,7 @@
 - [Command-line arguments](command-line-arguments.md)
 - [How to read rustdoc output](how-to-read-rustdoc.md)
     - [In-doc settings](read-documentation/in-doc-settings.md)
+    - [Search](read-documentation/search.md)
 - [How to write documentation](how-to-write-documentation.md)
     - [What to include (and exclude)](write-documentation/what-to-include.md)
     - [The `#[doc]` attribute](write-documentation/the-doc-attribute.md)
55 changes: 5 additions & 50 deletions src/doc/rustdoc/src/how-to-read-rustdoc.md
@@ -75,56 +75,11 @@ or the current item whose documentation is being displayed.
 ## The Theme Picker and Search Interface

 When viewing `rustdoc`'s output in a browser with JavaScript enabled,
-a dynamic interface appears at the top of the page composed of the search
-interface, help screen, and options.
-
-### The Search Interface
-
-Typing in the search bar instantly searches the available documentation for
-the string entered with a fuzzy matching algorithm that is tolerant of minor
-typos.
-
-By default, the search results given are "In Names",
-meaning that the fuzzy match is made against the names of items.
-Matching names are shown on the left, and the first few words of their
-descriptions are given on the right.
-By clicking an item, you will navigate to its particular documentation.
-
-There are two other sets of results, shown as tabs in the search results pane.
-"In Parameters" shows matches for the string in the types of parameters to
-functions, and "In Return Types" shows matches in the return types of functions.
-Both are very useful when looking for a function whose name you can't quite
-bring to mind when you know the type you have or want.
-
-Names in the search interface can be prefixed with an item type followed by a
-colon (such as `mod:`) to restrict the results to just that kind of item. Also,
-searching for `println!` will search for a macro named `println`, just like
-searching for `macro:println` does.
-
-Function signature searches can query generics, wrapped in angle brackets, and
-traits are normalized like types in the search engine. For example, a function
-with the signature `fn my_function<I: Iterator<Item=u32>>(input: I) -> usize`
-can be matched with the following queries:
-
-* `Iterator<u32> -> usize`
-* `trait:Iterator<primitive:u32> -> primitive:usize`
-* `Iterator -> usize`
-
-Generics and function parameters are order-agnostic, but sensitive to nesting
-and number of matches. For example, a function with the signature
-`fn read_all(&mut self: impl Read) -> Result<Vec<u8>, Error>`
-will match these queries:
-
-* `Read -> Result<Vec<u8>, Error>`
-* `Read -> Result<Error, Vec>`
-* `Read -> Result<Vec<u8>>`
-
-But it *does not* match `Result<Vec, u8>` or `Result<u8<Vec>>`.
-
-Function signature searches also support arrays and slices. The explicit name
-`primitive:slice<u8>` and `primitive:array<u8>` can be used to match a slice
-or array of bytes, while square brackets `[u8]` will match either one. Empty
-square brackets, `[]`, will match any slice regardless of what it contains.
+a dynamic interface appears at the top of the page composed of the [search]
+interface, help screen, and [options].
+
+[options]: read-documentation/in-doc-settings.html
+[search]: read-documentation/search.md

 Paths are supported as well, you can look for `Vec::new` or `Option::Some` or
 even `module::module_child::another_child::struct::field`. Whitespace characters
237 changes: 237 additions & 0 deletions src/doc/rustdoc/src/read-documentation/search.md
@@ -0,0 +1,237 @@
# Rustdoc search

Typing in the search bar instantly searches the available documentation,
matching either the name and path of an item, or a function's approximate
type signature.

## Search By Name

To search by the name of an item (items include modules, types, traits,
functions, and macros), write its name or path. As a special case, the parts
of a path that normally get divided by `::` double colons can instead be
separated by spaces. For example:

* [`vec new`] and [`vec::new`] both show the function `std::vec::Vec::new`
as a result.
* [`vec`], [`vec vec`], [`std::vec`], and [`std::vec::Vec`] all include the struct
`std::vec::Vec` itself in the results (and all but the last one also
include the module in the results).

[`vec new`]: ../../std/vec/struct.Vec.html?search=vec%20new&filter-crate=std
[`vec::new`]: ../../std/vec/struct.Vec.html?search=vec::new&filter-crate=std
[`vec`]: ../../std/vec/struct.Vec.html?search=vec&filter-crate=std
[`vec vec`]: ../../std/vec/struct.Vec.html?search=vec%20vec&filter-crate=std
[`std::vec`]: ../../std/vec/struct.Vec.html?search=std::vec&filter-crate=std
[`std::vec::Vec`]: ../../std/vec/struct.Vec.html?search=std::vec::Vec&filter-crate=std

As a quick way to trim down the list of results, there's a drop-down selector
below the search input, labeled "Results in \[std\]". Clicking it can change
which crate is being searched.

Rustdoc uses a fuzzy matching function that tolerates typos in name searches,
though how many typos it tolerates depends on the length of the name that's
typed in. A good example of how this works is [`HahsMap`]. To turn off fuzzy
matching, wrap the item in quotes: searching for `"HahsMap"` returns no results.

[`HahsMap`]: ../../std/collections/struct.HashMap.html?search=HahsMap&filter-crate=std
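
The typo tolerance described above can be pictured as an edit-distance check whose budget grows with the length of the query. The sketch below is only an illustration: the use of Levenshtein distance and the budget of one edit per three characters are assumptions made for this example, not a description of rustdoc's actual matcher, and `edit_distance` and `fuzzy_matches` are hypothetical names.

```rust
/// Classic Levenshtein distance, computed one row at a time.
pub fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // `prev` is the row of distances for the previous prefix of `a`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for i in 1..=a.len() {
        // cur[0] stays `i`: deleting all `i` characters of the prefix.
        let mut cur = vec![i; b.len() + 1];
        for j in 1..=b.len() {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            cur[j] = (prev[j] + 1).min(cur[j - 1] + 1).min(prev[j - 1] + cost);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Case-insensitive match with a length-based edit budget
/// (assumed here to be one edit per three query characters).
pub fn fuzzy_matches(query: &str, name: &str) -> bool {
    let query = query.to_lowercase();
    let name = name.to_lowercase();
    let budget = query.chars().count() / 3;
    edit_distance(&query, &name) <= budget
}
```

Under these assumptions `HahsMap` is two substitutions away from `HashMap` and has a budget of two edits, so it matches, while a three-letter query gets a budget of only one edit.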

### Tabs in the Search By Name interface

Using [`HahsMap`] as the example again, the results page shows that you're
searching "In Names" by default, and it lists two other tabs below the crate
drop-down: "In Parameters" and "In Return Types".

These two tabs are lists of functions, defined on the closest matching type
to the search (for `HahsMap`, it loudly auto-corrects to `hashmap`). This
auto-correct only kicks in if nothing is found that matches the literal.

These tabs are not just methods. For example, searching the alloc crate for
[`Layout`] also lists functions that accept layouts even though they're
methods on the allocator or free functions.

[`Layout`]: ../../alloc/index.html?search=Layout&filter-crate=alloc

## Searching By Type Signature for functions

If you know more specifically what the function you want to look at does,
Rustdoc can search by more than one type at once in the parameters and return
value. Multiple parameters are separated by `,` commas, and the return value
is written after a `->` arrow.

Before describing the syntax in more detail, here are a few sample searches of
the standard library and functions that are included in the results list:

| Query | Results |
|-------|--------|
| [`usize -> vec`][] | `slice::repeat` and `Vec::with_capacity` |
| [`vec, vec -> bool`][] | `Vec::eq` |
| [`option<T>, fnonce -> option<U>`][] | `Option::map` and `Option::and_then` |
| [`option<T>, fnonce -> option<T>`][] | `Option::filter` and `Option::inspect` |
| [`option -> default`][] | `Option::unwrap_or_default` |
| [`stdout, [u8]`][stdoutu8] | `Stdout::write` |
| [`any -> !`][] | `panic::panic_any` |
| [`vec::intoiter<T> -> [T]`][iterasslice] | `IntoIter::as_slice` and `IntoIter::next_chunk` |

[`usize -> vec`]: ../../std/vec/struct.Vec.html?search=usize%20-%3E%20vec&filter-crate=std
[`vec, vec -> bool`]: ../../std/vec/struct.Vec.html?search=vec,%20vec%20-%3E%20bool&filter-crate=std
[`option<T>, fnonce -> option<U>`]: ../../std/vec/struct.Vec.html?search=option<T>%2C%20fnonce%20->%20option<U>&filter-crate=std
[`option<T>, fnonce -> option<T>`]: ../../std/vec/struct.Vec.html?search=option<T>%2C%20fnonce%20->%20option<T>&filter-crate=std
[`option -> default`]: ../../std/vec/struct.Vec.html?search=option%20-%3E%20default&filter-crate=std
[`any -> !`]: ../../std/vec/struct.Vec.html?search=any%20-%3E%20!&filter-crate=std
[stdoutu8]: ../../std/vec/struct.Vec.html?search=stdout%2C%20[u8]&filter-crate=std
[iterasslice]: ../../std/vec/struct.Vec.html?search=vec%3A%3Aintoiter<T>%20->%20[T]&filter-crate=std

### How type-based search works

In a complex type-based search, Rustdoc always treats every item's name as literal.
If a name is used and nothing in the docs matches the individual item, such as
a typo-ed [`uize -> vec`][] search, the item `uize` is treated as a generic
type parameter (resulting in `vec::from` and other generic vec constructors).

[`uize -> vec`]: ../../std/vec/struct.Vec.html?search=uize%20-%3E%20vec&filter-crate=std

After deciding which items are type parameters and which are actual types, it
then searches by matching up the function parameters (written before the `->`)
and the return types (written after the `->`). Type matching is order-agnostic,
and allows items to be left out of the query, but items that are present in the
query must be present in the function for it to match.
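
As a rough mental model of this rule, each side of the `->` can be treated as a bag of names, with the query required to fit inside the function's bag. The sketch below assumes flat lists of names and ignores generics and type parameters entirely; `names_match` is a hypothetical helper, not rustdoc's implementation.

```rust
use std::collections::HashMap;

/// Returns true when every name in `query` can be paired with a distinct
/// name in `function`, regardless of order. Extra names in `function`
/// are allowed; extra names in `query` are not.
pub fn names_match(query: &[&str], function: &[&str]) -> bool {
    // Count how many times each name is available in the function.
    let mut available: HashMap<&str, usize> = HashMap::new();
    for name in function {
        *available.entry(*name).or_insert(0) += 1;
    }
    // Consume one available occurrence per query name.
    query.iter().all(|name| match available.get_mut(*name) {
        Some(count) if *count > 0 => {
            *count -= 1;
            true
        }
        _ => false,
    })
}
```

In this model the check runs once for the parameters and once for the return types, so a query such as `vec, vec` needs a function with at least two `vec` parameters.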

Function signature searches can query generics, wrapped in angle brackets, and
traits will be normalized like types in the search engine if no type parameters
match them. For example, a function with the signature
`fn my_function<I: Iterator<Item=u32>>(input: I) -> usize`
can be matched with the following queries:

* `Iterator<u32> -> usize`
* `Iterator -> usize`

Generics and function parameters are order-agnostic, but sensitive to nesting
and number of matches. For example, a function with the signature
`fn read_all(&mut self: impl Read) -> Result<Vec<u8>, Error>`
will match these queries:

* `Read -> Result<Vec<u8>, Error>`
* `Read -> Result<Error, Vec>`
* `Read -> Result<Vec<u8>>`

But it *does not* match `Result<Vec, u8>` or `Result<u8<Vec>>`.
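
The nesting rule can be sketched as a recursive match over a small type tree. This is a simplified model under stated assumptions: names compare exactly, type parameters and trait normalization are ignored, and `Ty`, `ty`, and `node_matches` are hypothetical names rather than anything in rustdoc.

```rust
/// A type name with its generic arguments, e.g. Result<Vec<u8>, Error>.
#[derive(Debug, Clone, PartialEq)]
pub struct Ty {
    pub name: &'static str,
    pub generics: Vec<Ty>,
}

pub fn ty(name: &'static str, generics: Vec<Ty>) -> Ty {
    Ty { name, generics }
}

/// A query node matches a function node when the names agree and each of
/// the query's generics can be paired with a distinct generic at the same
/// nesting level of the function node, in any order.
pub fn node_matches(query: &Ty, function: &Ty) -> bool {
    let mut used = vec![false; function.generics.len()];
    query.name == function.name && assign(&query.generics, &function.generics, &mut used)
}

/// Backtracking search over pairings of query generics to unused
/// function generics.
fn assign(query: &[Ty], function: &[Ty], used: &mut Vec<bool>) -> bool {
    let Some((first, rest)) = query.split_first() else {
        return true; // every query generic has been paired
    };
    for (i, candidate) in function.iter().enumerate() {
        if !used[i] && node_matches(first, candidate) {
            used[i] = true;
            if assign(rest, function, used) {
                return true;
            }
            used[i] = false; // undo and try the next candidate
        }
    }
    false
}
```

With the function type `Result<Vec<u8>, Error>`, this model accepts `Result<Error, Vec>` and `Result<Vec<u8>>` but rejects `Result<Vec, u8>` and `Result<u8<Vec>>`, in line with the examples above.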

Function signature searches also support arrays and slices. The explicit name
`primitive:slice<u8>` and `primitive:array<u8>` can be used to match a slice
or array of bytes, while square brackets `[u8]` will match either one. Empty
square brackets, `[]`, will match any slice or array regardless of what
it contains, while a slice with a type parameter, like `[T]`, will only match
functions that actually operate on generic slices.

### Limitations and quirks of type-based search

Type-based search is still a buggy, experimental, work-in-progress feature.
Most of these limitations should be addressed in a future version of Rustdoc.

* There's no way to write trait constraints on generic parameters.
You can name traits directly, and if there's a type parameter
with that bound, it'll match, but `option<T> -> T where T: Default`
cannot be precisely searched for (use `option<Default> -> Default`).

* Type parameters match type parameters, such that `Option<A>` matches
`Option<T>`, but never match concrete types in function signatures.
A trait named as if it were a type, such as `Option<Read>`, will match
a type parameter constrained by that trait, such as
`Option<T> where T: Read`, as well as matching `dyn Trait` and
`impl Trait`.

* `impl Trait` in argument position is treated exactly like a type
parameter, but in return position it will not match type parameters.

* Any type named in a complex type-based search will be assumed to be a
type parameter if nothing matching the name exactly is found. If you
want to force a type parameter, write `generic:T` and it will be used
as a type parameter even if a matching name is found. If you know
that you don't want a type parameter, you can force it to match
something else by giving it a different prefix like `struct:T`.

* It's impossible to search for references, pointers, or tuples. The
wrapped types can be searched for, so a function that takes `&File` can
be found with `File`, but you'll get a parse error when typing an `&`
into the search field. Similarly, `Option<(T, U)>` can be matched with
`Option<T, U>`, but `(` will give a parse error.

* Searching for lifetimes is not supported.

* It's impossible to search for closures based on their parameters or
return values.

* It's impossible to search based on the length of an array.
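
The rule that type parameters match type parameters regardless of the letter used, while still respecting which ones are the same and which are different, can be sketched as a consistent renaming check. The example assumes the query and the function have already been lined up slot by slot, which is a simplification; `params_consistent` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Checks that query type parameters map one-to-one onto function type
/// parameters: equal names must map to equal names, and different names
/// must map to different names.
pub fn params_consistent(query: &[&str], function: &[&str]) -> bool {
    if query.len() != function.len() {
        return false;
    }
    let mut forward: HashMap<&str, &str> = HashMap::new();
    let mut backward: HashMap<&str, &str> = HashMap::new();
    query.iter().zip(function).all(|(q, f)| {
        // Record the first pairing seen, then require later ones to agree.
        *forward.entry(*q).or_insert(*f) == *f && *backward.entry(*f).or_insert(*q) == *q
    })
}
```

For instance, the slots `T, U, U` line up with a function whose slots are `A, B, B`, but not with one whose slots are `A, A, A`.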

## Item filtering

Names in the search interface can be prefixed with an item type followed by a
colon (such as `mod:`) to restrict the results to just that kind of item. Also,
searching for `println!` will search for a macro named `println`, just like
searching for `macro:println` does. The complete list of available filters is
given under the <kbd>?</kbd> Help area, and in the detailed syntax below.

Item filters can be used in both name-based and type signature-based searches.

## Search query syntax

```text
ident = *(ALPHA / DIGIT / "_")
path = ident *(DOUBLE-COLON ident) [!]
slice = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
arg = [type-filter *WS COLON *WS] (path [generics] / slice / [!])
type-sep = COMMA/WS *(COMMA/WS)
nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep)
generics = OPEN-ANGLE-BRACKET [ nonempty-arg-list ] *(type-sep)
CLOSE-ANGLE-BRACKET
return-args = RETURN-ARROW *(type-sep) nonempty-arg-list
exact-search = [type-filter *WS COLON] [ RETURN-ARROW ] *WS QUOTE ident QUOTE [ generics ]
type-search = [ nonempty-arg-list ] [ return-args ]
query = *WS (exact-search / type-search) *WS
type-filter = (
"mod" /
"externcrate" /
"import" /
"struct" /
"enum" /
"fn" /
"type" /
"static" /
"trait" /
"impl" /
"tymethod" /
"method" /
"structfield" /
"variant" /
"macro" /
"primitive" /
"associatedtype" /
"constant" /
"associatedconstant" /
"union" /
"foreigntype" /
"keyword" /
"existential" /
"attr" /
"derive" /
"traitalias" /
"generic")
OPEN-ANGLE-BRACKET = "<"
CLOSE-ANGLE-BRACKET = ">"
OPEN-SQUARE-BRACKET = "["
CLOSE-SQUARE-BRACKET = "]"
COLON = ":"
DOUBLE-COLON = "::"
QUOTE = %x22
COMMA = ","
RETURN-ARROW = "->"
ALPHA = %x41-5A / %x61-7A ; A-Z / a-z
DIGIT = %x30-39
WS = %x09 / " "
```
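
As an illustration of how the `ident` and `path` rules read in practice, here is a minimal sketch of a recognizer for just those two productions. It is not rustdoc's parser: `is_ident` and `parse_path` are hypothetical names, and unlike the grammar above (whose `*` repetition permits an empty `ident`), this sketch deliberately rejects empty segments.

```rust
/// ident: one or more of ALPHA / DIGIT / "_" (empty idents rejected here).
pub fn is_ident(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// path: ident *(DOUBLE-COLON ident) [!]
/// Returns the path segments and whether a trailing `!` was present.
pub fn parse_path(s: &str) -> Option<(Vec<&str>, bool)> {
    let (body, has_bang) = match s.strip_suffix('!') {
        Some(rest) => (rest, true),
        None => (s, false),
    };
    let segments: Vec<&str> = body.split("::").collect();
    if segments.iter().all(|seg| is_ident(seg)) {
        Some((segments, has_bang))
    } else {
        None
    }
}
```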
4 changes: 0 additions & 4 deletions src/librustdoc/clean/types.rs
@@ -1639,10 +1639,6 @@ impl Type {
         matches!(self, Type::Generic(_))
     }

-    pub(crate) fn is_impl_trait(&self) -> bool {
-        matches!(self, Type::ImplTrait(_))
-    }
-
     pub(crate) fn is_unit(&self) -> bool {
         matches!(self, Type::Tuple(v) if v.is_empty())
     }
19 changes: 15 additions & 4 deletions src/librustdoc/html/render/mod.rs
@@ -101,7 +101,7 @@ pub(crate) struct IndexItem {
     pub(crate) path: String,
     pub(crate) desc: String,
     pub(crate) parent: Option<DefId>,
-    pub(crate) parent_idx: Option<usize>,
+    pub(crate) parent_idx: Option<isize>,
     pub(crate) search_type: Option<IndexItemFunctionType>,
     pub(crate) aliases: Box<[Symbol]>,
     pub(crate) deprecation: Option<Deprecation>,
@@ -122,7 +122,10 @@ impl Serialize for RenderType {
         let id = match &self.id {
             // 0 is a sentinel, everything else is one-indexed
             None => 0,
-            Some(RenderTypeId::Index(idx)) => idx + 1,
+            // concrete type
+            Some(RenderTypeId::Index(idx)) if *idx >= 0 => idx + 1,
+            // generic type parameter
+            Some(RenderTypeId::Index(idx)) => *idx,
             _ => panic!("must convert render types to indexes before serializing"),
         };
         if let Some(generics) = &self.generics {
@@ -140,14 +143,15 @@ impl Serialize for RenderType {
 pub(crate) enum RenderTypeId {
     DefId(DefId),
     Primitive(clean::PrimitiveType),
-    Index(usize),
+    Index(isize),
 }

 /// Full type of functions/methods in the search index.
 #[derive(Debug)]
 pub(crate) struct IndexItemFunctionType {
     inputs: Vec<RenderType>,
     output: Vec<RenderType>,
+    where_clause: Vec<Vec<RenderType>>,
 }

 impl Serialize for IndexItemFunctionType {
impl Serialize for IndexItemFunctionType {
Expand All @@ -170,10 +174,17 @@ impl Serialize for IndexItemFunctionType {
_ => seq.serialize_element(&self.inputs)?,
}
match &self.output[..] {
[] => {}
[] if self.where_clause.is_empty() => {}
[one] if one.generics.is_none() => seq.serialize_element(one)?,
_ => seq.serialize_element(&self.output)?,
}
for constraint in &self.where_clause {
if let [one] = &constraint[..] && one.generics.is_none() {
seq.serialize_element(one)?;
} else {
seq.serialize_element(constraint)?;
}
}
seq.end()
}
}
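
The id encoding in the `RenderType` serializer above packs three cases into one signed integer: 0 for "no id", positive values for concrete types (one-indexed), and negative values for generic type parameters. The sketch below restates that encoding as a standalone function. `encode_id` mirrors the match arms in the diff, while `Decoded` and `decode_id` are hypothetical: they show one way a reader of the index could invert the encoding and are not taken from this commit.

```rust
/// Mirrors the match arms shown in the diff: `None` becomes the 0 sentinel,
/// non-negative indexes are shifted up by one, negative indexes (generic
/// type parameters) are written unchanged.
pub fn encode_id(index: Option<isize>) -> isize {
    match index {
        None => 0,
        Some(idx) if idx >= 0 => idx + 1,
        Some(idx) => idx,
    }
}

/// Hypothetical decoded form, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Decoded {
    None,
    Concrete(usize),
    Generic(isize),
}

/// Hypothetical inverse of `encode_id`.
pub fn decode_id(id: isize) -> Decoded {
    match id {
        0 => Decoded::None,
        n if n > 0 => Decoded::Concrete((n - 1) as usize),
        n => Decoded::Generic(n),
    }
}
```

Because concrete ids are shifted to start at 1 and generic ids are strictly negative, the three ranges never overlap, which is what makes switching the index type from `usize` to `isize` sufficient.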