Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking issue for LinkedList cursors #58533

Open
1 of 4 tasks
Amanieu opened this issue Feb 17, 2019 · 36 comments
Open
1 of 4 tasks

Tracking issue for LinkedList cursors #58533

Amanieu opened this issue Feb 17, 2019 · 36 comments
Labels
A-collections Area: `std::collection` B-RFC-implemented Blocker: Approved by a merged RFC and implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@Amanieu
Copy link
Member

Amanieu commented Feb 17, 2019

This is a tracking issue for the RFC "Linked list cursors" (rust-lang/rfcs#2570).

Steps:

Unresolved questions:

  • Only for linked lists?
@kennytm kennytm added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Feb 17, 2019
@Centril Centril added the B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. label Feb 17, 2019
@gnzlbg
Copy link
Contributor

gnzlbg commented Feb 18, 2019

While the RFC doesn't explicitly show how these APIs should be documented. I think that for operations like insert_list, etc. it might be worth documenting that they have constant-time complexity.

@programmerjake
Copy link
Member

one thing that I think would be a useful addition is a Walker type for linked lists, where the walker keeps track of a position in a linked list, but to access/modify the list, it has to be combined with borrowing the list, allowing you to have several walkers that are not limited in which elements they can point to that can all be used to modify the list, similar to being able to have several indexes into a Vec that can all be used to modify elements of the Vec.

@gnzlbg
Copy link
Contributor

gnzlbg commented Feb 21, 2019

one thing that I think would be a useful addition is a Walker type for linked lists, where the walker keeps track of a position in a linked list, but to access/modify the list, it has to be combined with borrowing the list,

Does the Walker "borrow" the list, preventing removals? If so, why does one need to combine it with borrowing the list? If not, how does the Walker handle the case, in which the position it refers to has been deleted?

@programmerjake
Copy link
Member

Does the Walker "borrow" the list, preventing removals? If so, why does one need to combine it with borrowing the list? If not, how does the Walker handle the case, in which the position it refers to has been deleted?

no, the walker doesn't borrow the list. the list would panic or return None when passed a walker to a deleted item. to prevent needing the extra space in a list to track deleted nodes to prevent dangling pointers, we could add a WalkableList struct that borrows the list and keeps deleted nodes from being freed until the WalkableList is dropped or changes Walkers pointing to the deleted item to point to nothing. all access of the list would go through WalkableList. WalkableList would keep track of the Walkers and on drop would panic if there are any walkers left.

we may want to make a generic Walkable trait to allow walking types other than linked lists.

Something like this:

struct WalkableList<'a, T> {
    list: &'a mut LinkedList<T>,
    deletedNodes: Option<NonNull<Node<T>>, // element already dropped when in this list
    walkerTracker: Rc<()>,
}

#[derive(Clone)]
struct Walker<T> {
    node: Option<NonNull<Node<T>>>,
    walkerTracker: Rc<()>,
}

struct Node<T> {
    next: Option<NonNull<Node<T>>>,
    prev: Option<NonNull<Node<T>>>,
    element: ManuallyDrop<T>,
}

impl<T> Drop for WalkableList<'_, T> {
    fn drop(&mut self) {
        assert!(self.walkerTracker.get_mut().is_some(), "Walkers left");
        while let Some(node) = self.deletedNodes {
            let node = Box::from_raw(node.as_ptr());
            self.deletedNodes = node.next;
            // don't drop node.element; already dropped
        }
    }
}

impl<T> WalkableList<'_, T> {
    pub fn begin(&self) -> Walker<T> {
        Walker { node: self.list.head, walkerTracker: self.walkerTracker.clone(), }
    }
    pub fn remove(&mut self, w: &Walker<T>) -> Option<T> {
        assert!(Rc::ptr_eq(&self.walkerTracker, &w.walkerTracker);
        let node = w.node?;
        unsafe {
            if node.as_ref().prev == Some(node) {
                // node is deleted
                return None;
            }
            self.list.unlink_node(node);
            let retval = ManuallyDrop::take(&mut node.as_mut().element);
            node.prev = Some(node); // mark as deleted
            node.next = self.deletedNodes;
            self.deletedNodes = node;
            Some(retval)
        }
    }
    pub fn get(&self, w: &Walker<T>) -> Option<&T> {
        // ...
    }
    pub fn get_mut(&mut self, w: &Walker<T>) -> Option<&mut T> {
        assert!(Rc::ptr_eq(&self.walkerTracker, &w.walkerTracker);
        let node = w.node?;
        unsafe {
            if node.as_ref().prev == Some(node) {
                // node is deleted
                return None;
            }
            Some(&mut *(*node.as_ptr()).element)
        }
    }
    // other members
}

@crlf0710
Copy link
Member

I'm working on this. Currently i'm writing unit tests. Will put up a PR soon.

@crlf0710
Copy link
Member

rust-lang/rfcs#2847 was open to suggest making a number of changes to this rfc.

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Jan 14, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Jan 15, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Jan 15, 2020
Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue Jan 15, 2020
@crlf0710
Copy link
Member

#68123 has been merged.

@smessmer
Copy link

smessmer commented Apr 8, 2020

Any reason why Cursors aren't Clone?

@Amanieu
Copy link
Member Author

Amanieu commented Apr 9, 2020

It seems to just be an oversight. Feel free to send a PR to add Clone.

@main--
Copy link
Contributor

main-- commented May 2, 2020

Something I needed just now was to move an element from somewhere within the list to the front. remove_current does this nicely, but also destroys the list node which then has to be re-allocated in the followup push_front call. If I want to avoid that, the only way I see is to call both split_after and split_before and then splice everything back together in the right order. Perhaps something like remove_current_node(&mut self) -> LinkedList<T> could be added, to make this case more convenient.

@Amanieu
Copy link
Member Author

Amanieu commented May 2, 2020

Maybe remove_current_as_list? Feel free to submit a PR!

Dylan-DPC-zz pushed a commit to Dylan-DPC-zz/rust that referenced this issue May 4, 2020
Add remove_current_as_list to LinkedList's CursorMut

The `remove_current` method only returns the inner `T` and deallocates the list node. This is unnecessary for move operations, where the element is going to be linked back into this (or even a different) `LinkedList`. The `remove_current_as_list` method avoids this by returning the unlinked list node as a new single-element `LinkedList` structure.

(per rust-lang#58533 (comment))
@crlf0710 crlf0710 added B-RFC-implemented Blocker: Approved by a merged RFC and implemented. and removed B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. labels May 18, 2020
@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Jul 29, 2020
Manishearth added a commit to Manishearth/rust that referenced this issue Jul 31, 2020
Remove `linked_list_extras` methods.

Removing these in favor of the `Cursor` API in rust-lang#58533 .
Closes rust-lang#27794.

r? @Amanieu
JohnTitor added a commit to JohnTitor/rust that referenced this issue Jul 31, 2020
Remove `linked_list_extras` methods.

Removing these in favor of the `Cursor` API in rust-lang#58533 .
Closes rust-lang#27794.

r? @Amanieu
@Amanieu
Copy link
Member Author

Amanieu commented Apr 1, 2021

A cursor API for BTree is a big item on my wishlist. I started working on one a while ago (gist) but I never ended up completing it. If you are interested in implementing BTree cursors, feel free to use my draft code as a base.

@crlf0710
Copy link
Member

crlf0710 commented Apr 3, 2021

I've got one for Vec too :)
Not sure if it would be useful.

@Amanieu
Copy link
Member Author

Amanieu commented Apr 3, 2021

I'd say that a Vec cursor is less useful since you can always implement it yourself with just a usize. LinkedList and BTreeMap cursors actually need unsafe access to the internals of the data structure to be efficiently implemented.

@iwahbe
Copy link
Contributor

iwahbe commented Apr 10, 2021

Is there any reason why CursorMut can't expose push_front, pop_front, pop_front, etc, from LinkedList?

I often want to both browse and append to a LinkedList. This is possible with combinations of split and splice, but I think it makes sense to include these methods directly on CursorMut.

@yoshuawuyts
Copy link
Member

C++ has a notion of "Bidirectional Iterators" (ref1, ref2). If I understand what it does correctly, translated to Rust it would resemble:

/// An iterator able to step backwards.
pub trait BidirectionalIterator: Iterator {
    /// Steps backwards in the iterator and returns the previous value.
    ///
    /// Returns [`None`] when there is no previous item left to return.
    fn prev(&mut self) -> Option<Self::Item>;
}

Cursor::move_next feels like it would naturally map to Iterator::next. And Cursor::move_prev could map to BidirectionalIterator::prev. I haven't dug too deep into the discussion around linked_list::Cursor; but I didn't see any mention of exposing forward / backward iteration through traits in this thread, so I figured I'd bring it up (:

@programmerjake
Copy link
Member

programmerjake commented Aug 4, 2021

C++ has a notion of "Bidirectional Iterators" (ref1, ref2). If I understand what it does correctly, translated to Rust it would resemble:

C++'s bidirectional iterators are waay more similar to cursors than to Rust's iterators. A C++ bidirectional iterator is like (as of C++17):

// traits mixed-up slightly, check links for actual requirements

// https://en.cppreference.com/w/cpp/named_req/Iterator
unsafe trait CppIterator: Clone + Default {
    type Item;
    /// Safety: must not be past-the-end or otherwise invalid
    /// Equivalent to C++'s `++iter`
    unsafe fn move_to_next(&mut self);
    // omitted `iter++`
    /// Safety: must be past-the-end or before-begin or otherwise invalid
    /// Equivalent to `*iter` and `iter.operator ->()`
    unsafe fn get(&self) -> *mut Self::Item;
}

unsafe impl<T> CppIterator for *mut T {
    type Item = T;
    unsafe fn move_to_next(&mut self) {
        *self = self.add(1);
    }
    unsafe fn get(&self) -> *mut Self::Item {
        *self
    }
}

// https://en.cppreference.com/w/cpp/named_req/InputIterator
unsafe trait InputIterator: PartialEq + CppIterator {}

unsafe impl<T> InputIterator for *mut T {}

// https://en.cppreference.com/w/cpp/named_req/ForwardIterator
unsafe trait ForwardIterator: InputIterator {}

unsafe impl<T> ForwardIterator for *mut T {}

// https://en.cppreference.com/w/cpp/named_req/BidirectionalIterator
unsafe trait BidirectionalIterator: ForwardIterator {
    /// Safety: must not be before-begin or otherwise invalid
    /// Equivalent to C++'s `--iter`
    unsafe fn move_to_prev(&mut self);
    // omitted `iter--`
}

unsafe impl<T> BidirectionalIterator for *mut T {}

// https://en.cppreference.com/w/cpp/iterator/begin
// https://en.cppreference.com/w/cpp/iterator/end
unsafe trait BeginAndEnd {
    type Iter: CppIterator;
    /// return an iterator that is either past-the-end (in case of an empty container) or at the beginning
    unsafe fn begin(self: *mut Self) -> Self::Iter;
    /// return a past-the-end iterator
    unsafe fn end(self: *mut Self) -> Self::Iter;
}

unsafe impl<T> BeginAndEnd for [T] {
    type Iter = *mut T;
    unsafe fn begin(self: *mut Self) -> Self::Iter {
        self as *mut T
    }
    unsafe fn end(self: *mut Self) -> Self::Iter {
        let len = (*self).len();
        (self as *mut T).add(len)
    }
}

Example usage:

fn cpp_example() {
    let mut range = [1, 2, 3];
    let range_ptr = &mut range[..] as *mut [_];
    let mut iter = unsafe { BeginAndEnd::begin(range_ptr) };
    while iter != unsafe { BeginAndEnd::end(range_ptr) } {
        let v = unsafe { *iter.get() };
        println!("{}", v);
        unsafe { iter.move_to_next() };
    }
}

fn rust_equivalent() {
    for v in &mut [1, 2, 3] {
        println!("{}", v);
    }
}

@bbqsrc
Copy link

bbqsrc commented Sep 26, 2022

Any progress on this lately? One of my crates is still pegged to nightly as a consequence of this not being stable. Alternatively, would be happy to be directed to a crate that implements this that I can use on stable.

@Amanieu
Copy link
Member Author

Amanieu commented Sep 30, 2022

I just noticed that the documentation for cursors say this:

Cursors always rest between two elements in the list, and index in a logically circular way.

This is incorrect: cursors point to an element, not to a location between two elements.

@crlf0710
Copy link
Member

I just noticed that the documentation for cursors say this:

Cursors always rest between two elements in the list, and index in a logically circular way.

This is incorrect: cursors point to an element, not to a location between two elements.

I think i wrote this doc comment, sorry for my bad English! Basically i wanted to mean that "there's always a prior element and a next element", sorry for the confusing wording. The actual code behavior matches what the RFC says, i think.

@typedrat
Copy link

typedrat commented Nov 6, 2022

What work would need to be done to help move this forward? Is it a matter of process that I can't contribute to as an outsider, or what exactly is the current blocker?

@FichteFoll
Copy link

FichteFoll commented Dec 21, 2022

I was looking for a way to cheaply wrap a CursorMut around at the end of the list but that does not appear to be possible. The best workaround I found was to create a new cursor from the original list and replace the old one because it keeps a mutable reference to the list. Considering how various *front and *back opations are natively available on the cursor, I'd also expect a seek_to_front/seek_to_back or move_next_wrapping to be possible. Whether the use case is frequent enough probably remains up for debate.

I implemented the following macro for moving n times in an arbitrary direction, but it's not portable.

    use std::cmp::Ordering::*;
    // Note: can exit at the "ghost" element.
    macro_rules! move_cursor_by {
        ($by:expr) => {
            match ($by).cmp(&0) {
                Equal => (),
                Greater => {
                    for _ in 0..($by) {
                        if cursor.current().is_none() {
                            cursor = list.cursor_front_mut();
                        }
                        cursor.move_next();
                    }
                }
                Less => {
                    for _ in 0..-($by) {
                        if cursor.current().is_none() {
                            cursor = list.cursor_back_mut();
                        }
                        cursor.move_prev();
                    }
                }
            };
        };
    }

@iwahbe
Copy link
Contributor

iwahbe commented Dec 21, 2022

I think we can divide methods on the cursor that function by delegating to the underlying list into two camps:

  1. Methods that effect the list, but preserve cursor position, such as pop_front and push_back.
  2. Methods the don't preserve cursor position, such as your proposed seek_to_front/seek_to_back.

I think (1) needs to be handled by dedicated methods on the Cursor(Mut) to get around the borrow checker.

I think (2) could be handled in the general case by implementing Into<&'a mut LinkedList> for CursorMut<'a> and Into<&'a LinkedList> for Cursor<'a>. Methods that don't preserve the cursor position could discard the cursor for the underlying LinkedList, and then call the relevant method there.

Reverting to the start of the list could be done in O(1) with cursor.into().cursor_front_mut().

@FichteFoll
Copy link

FichteFoll commented Dec 21, 2022

Going back from a cursor to a list sounds like a good idea, especially if you want to use some traits/functions on the list itself like .iter() or IntoIter. This would make it possible for a custom function to only take in a cursor and implement what I did with the macro by consuming the cursor and potentially returning a new one.

A function like insert_before_n(&mut self, usize: n, item: T) that performs n curser movements before inserting an item but does not alter the cursor's position would then fall into (1) and need an implementation on CursorMut, correct? It seems like a rather specific use case but it doesn't seem possible to implement this with safe Rust in the current state.

@iwahbe
Copy link
Contributor

iwahbe commented Dec 21, 2022

A function like insert_before_n(&mut self, usize: n, item: T) that performs n curser movements before inserting an item but does not alter the cursor's position would then fall into (1) and need an implementation on CursorMut, correct? It seems like a rather specific use case but it doesn't seem possible to implement this with safe Rust in the current state.

You could implement it in safe rust efficiently, it would just be annoying. You can preserve cursor state by splitting and splicing the list.

insert_before_n could be implemented as (in un-optimized sudo-code):

fn insert_before_n(list, item, n) {
  let before = list.split_before()
  // Not handling the case where `n > cursor.index()`, 
  // or the optimization where `n < before.len()/2`
  before.cursor_back_mut().move_back_by(n).insert_after(item);
  list.splice_before(before)
  return list
}

@Amanieu
Copy link
Member Author

Amanieu commented Dec 29, 2022

I like the idea of converting a CursorMut back into a reference to the original list. I think this should be done with an explicit method rather than an Into impl, since the semantics aren't immediately obvious.

@safinaskar
Copy link
Contributor

@4e554c4c and others. This article https://diziet.dreamwidth.org/13476.html says that allowing not more than one mutable cursor in a time makes linked list concept useless (see section "The Cursor concept")

@4e554c4c
Copy link

4e554c4c commented Mar 4, 2023

@safinaskar First of all, that article does not use the same terminology that you did. It just described this as

a serious API limitation

which is true! Linked lists are a niche datastructure. That doesn't mean that the API is useless, and it will still be faster than other vec-based APIs when you must splice several different LLs together: a linear operation for linked lists, but quadratic for vectors.
Secondly, the article fails to give examples of where long-lived mutable cursors are simultaneously useful and safe. I would appreciate if you would give examples of these, instead of mentioning the author of a five-year-old RFC out of the blue, because someone wrote a blog post.

@Salabar
Copy link

Salabar commented Jun 29, 2024

I would suggest to remove index and add unsafe fn cursor_from_ptr to LinkedList which takes a pointer to the element of the list. The one useful property of linked list is O(1) insertions. It's completely nullified if you cannot access an element in O(1). E.g.,
struct LRU {
entries : LinkedList,//Sorted in last used order
map : HashSet<*const T>//points to Entries
}
As it stands, I can cache pointers, but there is no way to use them for list manipulation, safe or unsafe.

@GrigorenkoPV
Copy link
Contributor

I like the idea of converting a CursorMut back into a reference to the original list. I think this should be done with an explicit method rather than an Into impl, since the semantics aren't immediately obvious.

I have thought the same and made a PR: #127189

@jmelo11
Copy link

jmelo11 commented Jul 6, 2024

hi guys, quick question, when do you think this api will be stabilized?

@Noratrieb
Copy link
Member

Noratrieb commented Jul 7, 2024

@jmelo11 when someone who cares about this in my eyes pretty useless collection does the work of looking at the new API surface, writing down the reasoning behind everything, making sure everything makes sense and then proposing stabilization for it. that's a bunch of work though, and it seems like no one has cared enough to do it, which i understand.

edit: accidentally closed this

@Noratrieb Noratrieb reopened this Jul 7, 2024
@donottellmetonottellyou

Is there a reason this couldn't be generalized for other non-Ord data types such as Vec, VecDeque, and arrays? I understand there is a lack of interest in a LinkedList-specific implementation.

@4e554c4c
Copy link

the point of this API proposal is that splitting and indexing are constant time about a certain element for linked lists. This isn't too useful for vecs where there's no need to traverse the container, and where splitting is a linear time operation

cursors could be useful for trees (and other linked structures), but I haven't looked into how it would be useful for B-trees

@Amanieu
Copy link
Member Author

Amanieu commented Jul 26, 2024

There is an unstable Cursor API for BTreeMap: #107540

It's extremely useful when you want to mutate or remove multiple adjacent entries in a map.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-collections Area: `std::collection` B-RFC-implemented Blocker: Approved by a merged RFC and implemented. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests