Spec refining: Support of IPLD pointers as links #3

nicola · 2016-07-25T10:18:57Z

I define the following terms:

IPLD pointers (HASH/path):

a hash (object) pointer: HASH
optionally followed by an attribute pointer (e.g. /friends/0/name)

IPLD links: {'/': HASH}

As far as I remember, current implementations only support links that contain hash pointers. However, IPLD links should support full IPLD pointers, so that things like this can work:

hashNicola
{
  name: {
    'first': 'Nicola',
    'last': 'Greco'
  }
}
hashNicola2
{
  fullname: {'/': 'hashNicola/name' }
}

hashNicola2/fullname/first === 'Nicola'
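
A minimal sketch of how a link carrying a full path could be resolved, assuming a hypothetical in-memory block store keyed by the placeholder hashes above. The names `BLOCKS`, `is_link`, and `resolve` are illustrative and do not come from any existing IPLD library.

```python
# Hypothetical block store: placeholder hash -> decoded object.
BLOCKS = {
    "hashNicola": {"name": {"first": "Nicola", "last": "Greco"}},
    "hashNicola2": {"fullname": {"/": "hashNicola/name"}},
}


def is_link(node):
    """A link is a map whose only key is '/'."""
    return isinstance(node, dict) and set(node) == {"/"}


def resolve(path, blocks=BLOCKS):
    """Resolve 'HASH/seg/seg...' to a value, following links that carry paths."""
    head, *segments = path.strip("/").split("/")
    node = blocks[head]
    while True:
        while is_link(node):
            # The link value may be HASH or HASH/path: load the target block
            # and walk the link's own path before the remaining segments.
            target, *extra = node["/"].split("/")
            node = blocks[target]
            segments = extra + segments
        if not segments:
            return node
        segment = segments.pop(0)
        # Lists are indexed by integer position, maps by key.
        node = node[int(segment)] if isinstance(node, list) else node[segment]


print(resolve("hashNicola2/fullname/first"))
```

The resolver treats the path inside a link as a prefix of the remaining traversal, so a link to `hashNicola/name` behaves as if the `name` object had been embedded in place.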

cc @jbenet, @mildred, @dignifiedquire


dignifiedquire · 2016-07-26T11:15:48Z

I am 👍 on adding this to the libraries and making it explicit in the spec.

jbenet · 2016-08-06T18:02:42Z

Yeah, this should be the case. 👍 to this.
It's necessary for full addressability.
I meant the current spec to enable this, but this should be explained better.

jbenet · 2016-08-06T18:03:51Z

Why the new abstraction "IPLD Pointer"? it seems exactly the same as an "IPLD Path" to me. Seems superfluous. maybe i'm missing something.

nicola · 2016-08-07T08:26:58Z

Perfect, it sounds like we are approving this! 🎉

I called them pointers to keep the wording consistent within this issue (I am trying out this new term). The way I see it, a path is everything after the CID (I treat the CID as a sort of origin). Path is still a valid (and preferred) name.

jbenet · 2016-08-07T11:04:27Z

We've used structures like "/ipfs//a/b/c" as IPFS Paths for a long
time. We've used "/ipld//a/b/c" to be IPLD Paths for months. The path
is the whole thing. This is important for treating these "fs-IRIs" as just
unix paths.

I do not think we gain much from IPLD Pointers vs the existing notion of
IPLD Paths.

Every name we add causes friction and baggage for newcomers. Since they're
very expensive, abstractions should prove to have substantial winnings to
remain in our model.

nicola · 2016-08-07T12:52:15Z

🎉 Let's call them IPLD paths then, I will update the rest of my writings soon

jakobvarmose · 2016-08-22T08:46:39Z

Will this allow paths to traverse multiple objects? I.e. is this object valid:

hashTest:
{test: {"/":"hashNicola2/fullname/first"}}

This would add a lot of complexity because each of the path components could potentially also resolve multiple other objects. So essentially, path resolving is no longer a constant-time operation but may resolve a potentially infinite number of objects.

So I think only links in their canonical format should be allowed within objects:

hashTest:
{test: {"/":"hashNicola/name/first"}}
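
A rough sketch of what rewriting a multi-object path into this canonical form could look like. The block store and the `canonicalize` helper are hypothetical, and the sketch assumes the graph is acyclic (which holds for real content-addressed hashes, though not necessarily for placeholder names).

```python
# Hypothetical block store using the placeholder hashes from this thread.
BLOCKS = {
    "hashNicola": {"name": {"first": "Nicola", "last": "Greco"}},
    "hashNicola2": {"fullname": {"/": "hashNicola/name"}},
}


def is_link(node):
    """A link is a map whose only key is '/'."""
    return isinstance(node, dict) and set(node) == {"/"}


def canonicalize(path, blocks=BLOCKS):
    """Rewrite a path so it names the final block directly, with no link hops."""
    head, *segments = path.split("/")
    node = blocks[head]
    canonical = [head]  # block hash, then the segments consumed inside that block
    while True:
        while is_link(node):
            target, *extra = node["/"].split("/")
            node = blocks[target]
            canonical = [target]  # a hop restarts the canonical path at the target
            segments = extra + segments
        if not segments:
            return "/".join(canonical)
        segment = segments.pop(0)
        node = node[segment]
        canonical.append(segment)


print(canonicalize("hashNicola2/fullname/first"))
```

Under this scheme, an object would only ever store the output of `canonicalize`, so a reader never has to load more than one block per link.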

nicola · 2016-08-22T17:29:09Z

Hey, thanks for chipping in!

> potentially infinite number of objects

it would not be infinite (it can't!)

I do see your concern about making IPLD slightly more complex than just pointing to simple hashes (or to a hash plus the path of an attribute within that same object), and so wanting to avoid paths that work across objects.

However, I don't see why this would be a problem. IPLD should abstract away the resolving of a hash; in this particular case, we'd need to resolve two hashes. I guess it would be equivalent to linking to a hash that links to another hash and doing the hop yourself.

A great idea I had from thinking about this is that we could layer IPLD into different levels:

IPLD-level-0: links are only hashes
IPLD-level-1: links are hashes + paths that can be resolved in the object itself
IPLD-level-2: links are hashes + paths that can take multiple hops to be resolved

One of the purposes of the IPLD path was originally IPLD-level-2. Imagine having a Merkle tree where we want to point to the first leaf: I could just do hash/left/left. This type of pointing would not be possible without level-2.

Although it is great to have these layers in mind, I think it would be rather complex to distinguish among them, and the parsers are only slightly more complex, to the point of being trivial. In this case, however, we turn resolving into O(n), n being the number of hops across objects. If that is the number of hops, there is little we can do (so for time-critical applications it might be a good idea to structure your data as in IPLD-level-0/1).
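
To illustrate the three levels and the O(n) hop cost, here is a small sketch that counts how many blocks a link has to load. The Merkle tree contents and the helper names (`resolve_with_hops`, `link_level`) are made up for the example.

```python
# Hypothetical two-level Merkle tree; every child is a hash-only link.
BLOCKS = {
    "hashRoot": {"left": {"/": "hashL"}, "right": {"/": "hashR"}},
    "hashL": {"left": {"/": "hashLL"}, "right": {"/": "hashLR"}},
    "hashR": {"data": "leaf-2"},
    "hashLL": {"data": "leaf-0"},
    "hashLR": {"data": "leaf-1"},
}


def resolve_with_hops(path, blocks=BLOCKS):
    """Return (value, number_of_blocks_loaded) for 'HASH/seg/seg...'."""
    head, *segments = path.split("/")
    node, loaded = blocks[head], 1
    while True:
        while isinstance(node, dict) and set(node) == {"/"}:
            target, *extra = node["/"].split("/")
            node, loaded = blocks[target], loaded + 1
            segments = extra + segments
        if not segments:
            return node, loaded
        node = node[segments.pop(0)]


def link_level(link_value, blocks=BLOCKS):
    """Classify a link value as level 0, 1, or 2 in the sense described above."""
    if "/" not in link_value:
        return 0  # bare hash
    _, loaded = resolve_with_hops(link_value, blocks)
    return 1 if loaded == 1 else 2  # more than one block means hops were needed


print(resolve_with_hops("hashRoot/left/left"))
print(link_level("hashLL"), link_level("hashLL/data"), link_level("hashRoot/left/left"))
```

Note that the level of a link cannot be read off its syntax alone: telling level-1 from level-2 requires actually loading the target blocks, which is one reason distinguishing the layers is awkward in practice.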

cc @jbenet

jakobvarmose · 2016-08-22T18:20:19Z

> it would not be infinite (it can't!)

Not in the mathematical sense, but a simple-looking path like hashEvil/a could download 1 billion objects (or more).

Example:

hashEvil: {
  a: {"/": "hash1/a"}
}
hash1: {
  a: {"/": "hash2/a"}
}
...
[999999997 objects omitted]
...
hash999999999: {
  a: {"/": "hash1000000000/a"}
}
hash1000000000: {
  a: "someValue"
}

jbenet · 2016-08-23T03:33:31Z

Thanks for noting this.

However: Resolvers could have a resolution max. Nothing prevents infinite
loops in DNS, HTTP 302, etc. or just good old regular "a href"s. The
unlimitedness is by design. Note that the same exact thing (downloading a
billion objects) can happen if your graph is just very deep.

This is an implementation detail that's subject to the times. In 20 years,
our limitations will look foolish. Resolvers and retrievers can establish
limits if they wish. The mathematical model will not pull in such
complexity.
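
As an illustration of a resolver-side limit, here is a minimal sketch in which the resolver refuses to load more than a configurable number of blocks. The exception name, the `max_blocks` parameter, and the shortened 100-object chain are assumptions made for the example, not part of any spec.

```python
class ResolutionLimitExceeded(Exception):
    """Raised when resolving a path would load more blocks than allowed."""


def resolve(path, blocks, max_blocks=32):
    """Resolve 'HASH/seg/...' while loading at most max_blocks blocks."""
    head, *segments = path.split("/")
    node, loaded = blocks[head], 1
    while True:
        while isinstance(node, dict) and set(node) == {"/"}:
            loaded += 1
            if loaded > max_blocks:
                raise ResolutionLimitExceeded(f"more than {max_blocks} blocks needed")
            target, *extra = node["/"].split("/")
            node = blocks[target]
            segments = extra + segments
        if not segments:
            return node
        node = node[segments.pop(0)]


# A shortened version of the hashEvil chain: 100 hops instead of a billion.
chain = {f"hash{i}": {"a": {"/": f"hash{i + 1}/a"}} for i in range(1, 100)}
chain["hashEvil"] = {"a": {"/": "hash1/a"}}
chain["hash100"] = {"a": "someValue"}

print(resolve("hashEvil/a", chain, max_blocks=1000))
try:
    resolve("hashEvil/a", chain, max_blocks=10)
except ResolutionLimitExceeded as err:
    print("gave up:", err)
```

The limit lives entirely in the resolver, so the data model itself stays unbounded, and the same cap also stops a resolver from spinning on a cyclic set of placeholder links.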

Also note that we could have a "resolve paths" operation that transforms a
level2 graph into level0.

Stebalien · 2018-01-29T20:30:37Z

So, someone has brought up the fact that we haven't fixed this, and we really should. However, we'll have to decide how to encode these in CBOR.

Option 1: Just append it. That is, we could just have CID/path and take advantage of the fact that CIDs are length-delimited.
Option 2: Use an array. We'd use the CID tag on an array instead of a byte string. The array would then be populated with the path components. The advantage is that this would allow us to support references (at this point, they aren't really paths) with slashes in them (do we even want to do that?).
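
To make the two options more concrete, here is a rough sketch of the payloads before any CBOR tagging is applied. It assumes the binary CIDv1 layout (version, content codec, and multihash code and length as unsigned varints, followed by the digest); all function names are hypothetical, and the CBOR encoding step itself is left out.

```python
import hashlib


def read_varint(buf, pos):
    """Decode an unsigned varint starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def cid_length(buf):
    """Length in bytes of the binary CIDv1 at the start of buf."""
    pos = 0
    for _ in range(3):  # CID version, content codec, multihash function code
        _, pos = read_varint(buf, pos)
    digest_len, pos = read_varint(buf, pos)
    return pos + digest_len


def encode_appended(cid, path):
    """Option 1: CID bytes immediately followed by the UTF-8 path."""
    return cid + path.encode("utf-8")


def decode_appended(buf):
    """Split an option-1 payload back into (cid_bytes, path)."""
    n = cid_length(buf)
    return buf[:n], buf[n:].decode("utf-8")


def encode_array(cid, segments):
    """Option 2: an array holding the CID followed by one entry per segment."""
    return [cid, *segments]


# Example CIDv1: version 1, dag-cbor (0x71), sha2-256 (0x12), 32-byte digest.
cid = bytes([0x01, 0x71, 0x12, 0x20]) + hashlib.sha256(b"example block").digest()

print(decode_appended(encode_appended(cid, "/name/first")) == (cid, "/name/first"))
print(encode_array(cid, ["name", "a/b"])[1:])
```

In option 1 the decoder finds the end of the CID from the multihash length alone, so no extra delimiter is needed. In option 2 a segment such as `a/b` stays unambiguous because segments are never joined into a single string.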

@diasdavid ^^

daviddias · 2018-01-30T05:01:23Z

> Option 1: Just append it. That is, we could just have CID/path and take advantage of the fact CIDs are length delimited.

👍 I vote for this one

mikeal · 2019-05-06T17:24:20Z

I’m trying to mentally rebase this against the current stack and I’m running into some problems.

- We’d need to add this as another “kind” to the Data Model, which adds another barrier to creating new codecs that support the full Data Model.
- What exactly does “path” mean?
  - Is this restricted to a path contained in the same block?
  - If this path could traverse through multiple blocks, is this restricted to pathing at the Data Model or does it need to be schema aware (in other words, if it refers to data in a HAMT, do we need to interpret it through a collection lookup or is it already parsed down into a Data Model path)?
- Where does validation live? If this refers to data that does not exist, who would be responsible for presenting an error? All of our encoders/decoders operate on a per-block basis; they don’t have any means to validate a link that traverses through many blocks.

I’m also lacking clarity on what the use case here is. I’m just not aware of any particular use case this solves or aware of any place where we can’t do something because we don’t have this.

TBH, this throws a wrench in most of what we’ve built and the direction we’ve taken so I’m inclined to close it for now and return to the concept much later when we tackle some form of “mutable links.”

Stebalien · 2019-05-06T21:48:04Z

I believe this is the same as #83.

> We’d need to add this as another “kind” to the Data Model, which adds another barrier to creating new codecs that support the full Data Model.

I assume we'd just replace the Cid kind with a more generalized Link or Pointer kind.

> Is this restricted to a path contained in the same block?

No.

> If this path could traverse through multiple blocks, is this restricted to pathing at the Data Model or does it need to be schema aware (in other words, if it refers to data in a HAMT, do we need to interpret it through a collection lookup or is it already parsed down into a Data Model path)?

Data model, I assume. We can resolve non-datamodel links when serializing the DAG.

mikeal · 2019-05-06T21:57:38Z

> I assume we'd just replace the Cid kind with a more generalized Link or Pointer kind.

See, this sort of requires us to tear up everything we’ve built recently. We’ve already built quite a bit on top of the Data Model, so changing it now has much broader implications.

And if this path stretches multiple blocks then it’s a much better fit for the Schema layer anyway. We could create a “Link Type” at the Schema layer that is much more flexible and could accomplish this as well as open the door for extensions down the road for IPNS based links. There’s actually a long list of “more things we need to be able to do with links,” similar to our long list of collections people need, so it’s probably better to do what we did with Maps and split between the simple “kind” that is in the Data Model and the “type” which is extensible and at the Schema layer.

This would also make the implementations easier, since we would only need to implement these once in schemas rather than for every codec that supports the data model.

Stebalien · 2019-05-06T23:00:45Z

> See, this sort of requires us to tear up everything we’ve built recently. We’ve already built quite a bit on top of the Data Model, so changing it now has much broader implications.

It shouldn't make much of a difference at all. The issue is that "links" can currently only point to blocks. This issue is about allowing those links to point to nodes.

This already came up: https://github.com/ipld/specs/blob/master/REQUIREMENTS.md#linked.

mikeal · 2019-05-06T23:41:21Z

Perhaps in your conceptual model this doesn’t break anything, but in the actual code we’ve written most things break. Just one example: every IPLD codec in JS returns instances of CID for links.

The amount of code that depends on current libraries and assumes “Link === CID” is also quite large. AFAIK every piece of code that even handles links makes this assumption.

Changing the link kind definition is a substantial breaking change; even just adding this as another kind in the data model is a substantial amount of new code we’d have to write throughout the stack.

By comparison, doing a Link Type in schemas can be done without breaking anything. Even if we had far more code written and dependent on schemas we would be able to add the type without breaking anything. And most importantly, we don’t need to get everything perfect in the first implementation because iterating in the Schema Layer is 100x easier than at the Data Model layer.

> This already came up: https://github.com/ipld/specs/blob/master/REQUIREMENTS.md#linked.

Nobody is arguing that this should not be supported, we just don’t want to support it at the Data Model layer for all the reasons we’ve already mentioned.

Stebalien · 2019-05-29T22:00:10Z

I'm aware this is hard, but this is absolutely critical. The alternative is:

Introduce a Link type at the schema layer.
Never use CIDs anywhere in schemas. Instead, always use this Link type.
Never use the /ipld namespace, always use some path namespace that actually understands this link type and can traverse paths through it.
Implement this everywhere.

But that seems like even more work.

mikeal · 2019-05-29T23:43:56Z

I’ve got an early implementation of the multi-block type system which can support this. I should have a demo ready in a few days.

We expect most people to “live” in this layer anyway, so implementing this above the data model should be fine. The reality is, any non-trivial use cases require features we just don’t have with only the data model, so user facing abstractions will always be a layer up.

> Never use the /ipld namespace

My assumption has always been that once we actually write a spec for this, it would be for Layer 2 paths. Not just for use cases like (Link + Path), but so that it can support HAMT and other multi-block collections. Given that /ipfs transparently resolves through HAMTs, I had assumed our goal was always to support the same in IPLD once we had a way to do it modularly.

rvagg · 2019-08-14T08:20:07Z

Closing due to staleness as per team agreement to clean up the issue tracker a bit (ipld/team-mgmt#28). This doesn't mean this issue is off the table entirely, it's just not on the current active stack but may be revisited in the near future. If you feel there is something pertinent here, please speak up, reopen, or open a new issue. [/boilerplate]

nicola added the needs-agreement label Jul 25, 2016

nicola mentioned this issue Jul 25, 2016

Roadmap towards IPLD ipfs/specs#115

Closed

15 tasks

nicola added the spec label Jul 25, 2016

nicola mentioned this issue Jul 26, 2016

IPLD spec refining call - round 1 ipfs/team-mgmt#124

Closed

nicola mentioned this issue Jul 26, 2016

Spec refining: requiring namespace in IPLD pointers #7

Closed

nicola mentioned this issue Aug 4, 2016

Captain.log - IPLD v1 spec #13

Closed

2 tasks

nicola added awaiting review and removed discussion labels Aug 9, 2016

nicola mentioned this issue Sep 11, 2016

IPLD refining call 2 ipfs/team-mgmt#181

Closed

nicola mentioned this issue Sep 19, 2016

IPLD refining call 3 ipfs/team-mgmt#197

Closed

daviddias mentioned this issue Feb 13, 2017

The IPLD ROADMAP #41

Closed

15 tasks

Stebalien mentioned this issue Jan 29, 2018

Suggestion for New IPLD Interface ipfs/go-ipld-format#32

Open

Stebalien mentioned this issue Feb 7, 2018

Think through the CoreAPI Path API and implementation. ipfs/kubo#4666

Closed

daviddias added the status/deferred Conscious decision to pause or backlog label Mar 19, 2018

daviddias removed needs spec labels May 12, 2018

ghost assigned mikeal May 6, 2019

ghost added awaiting review status/in-progress In progress and removed status/deferred Conscious decision to pause or backlog labels May 6, 2019

mikeal mentioned this issue May 6, 2019

Schema <> Block interaction + "advanced layouts logic" #118

Closed

rvagg closed this as completed Aug 14, 2019
