feat: add pinned storage to user account API #1044
Conversation
Nice one!
Left some comments!
packages/db/postgres/functions.sql (Outdated)
```sql
JOIN content c ON c.cid = pr.content_cid
JOIN auth_key a ON a.id = pr.auth_key_id
WHERE a.user_id = query_user_id::BIGINT
  AND a.deleted_at is null
```
Interesting!
TL;DR: I'm wondering whether we should keep consistency with uploads and ignore `auth_key.deleted_at` for now.

I can see why you added this, but I'm wondering whether we should care about it or not. At the moment it's not entirely "consistent" what an `auth_key` represents: in the context of the pinning APIs it can be considered a child "scope" (or application, if you like), while in the web3.storage HTTP APIs tokens are just a means of authorizing requests.

I'm not too opinionated really, and not sure what's best, but I'm wondering if we should take the path of consistency over semantics. Once we migrate to UCAN auth, I feel we'll have a clearer definition and we'll be more informed as to if/how to exclude uploads for deleted keys.
OK, have removed it for now, for consistency.
packages/db/index.js (Outdated)
```diff
   */
   async getUsedStorage (userId) {
-    /** @type {{ data: string, error: PostgrestError }} */
+    /** @type {{ data: import('./db-client-types').UsedStorage, error: PostgrestError }} */
```
The function is returning

`/** @type {{ data: {uploaded: string, pinned: string}, error: PostgrestError }} */`

rather than `UsedStorage`, isn't it?

(Please run `node scripts/cli.js pg-rest-api-types` and you will get the right type from `import('./postgres/pg-rest-api-types').definitions`.)
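For illustration, the annotation could then reference the generated definitions directly; the exact key name below is an assumption:

```js
/** @type {{ data: import('./postgres/pg-rest-api-types').definitions['user_used_storage'], error: PostgrestError }} */
```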
Have made this function responsible for converting to a number, so this return type is now correct.
packages/db/index.js (Outdated)
```diff
   const { data, error } = await this._client.rpc('user_used_storage', { query_user_id: userId }).single()

   if (error) {
     throw new DBError(error)
   }

-  return data || 0 // No uploads for the user
+  return data
```
As mentioned in the comment above, the values of this map aren't numbers at the moment. I wonder if we should make this function accountable for parsing the strings. Given that the largest safe integer is 9007199254740991, I wonder if it would be OK to just convert the string to a Number here. If we do, I reckon we should use `Number.isSafeInteger` and throw otherwise.
Conversion is now done here, with error handling for invalid numbers.
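A minimal sketch of what that conversion could look like, using `Number.isSafeInteger` as suggested; the helper name matches the `parseTextToNumber` call seen in the later diff, but the implementation here is an assumption:

```js
/**
 * Convert a Postgres bigint (serialized as text) into a JS number,
 * throwing if it cannot be represented exactly.
 * A sketch only; the actual helper in the PR may differ.
 *
 * @param {string} text
 * @returns {number}
 */
function parseTextToNumber (text) {
  const num = Number(text)
  if (!Number.isSafeInteger(num)) {
    throw new Error(`invalid number: ${text}`)
  }
  return num
}
```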
Good call on separating used storage into uploads and pinning, so the consumer can decide what to do with them 👍🏼
```diff
   const { data, error } = await this._client.rpc('user_used_storage', { query_user_id: userId }).single()

   if (error) {
     throw new DBError(error)
   }

-  return data || 0 // No uploads for the user
+  return {
+    uploaded: parseTextToNumber(data.uploaded),
```
These numbers might get super big and not fit in a JS number. Should we keep them as text and let the WebUI deal with it?
`MAX_SAFE_INTEGER` is 9007199254740991; if I've done my conversion right, that translates to 7.999999999999999112 pebibytes, which for a single user should be safe enough? And even if it's not safe, we're throwing in that case.

I wonder if letting the client deal with it is just moving a problem that we'd have anyway?
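As a quick check of that conversion (one pebibyte being 2^50 bytes):

```js
// Number.MAX_SAFE_INTEGER is 2^53 - 1 bytes; one PiB is 2^50 bytes,
// so the largest safely representable usage is 8 - 2^-50 PiB.
console.log(Number.MAX_SAFE_INTEGER / 2 ** 50) // → 7.999999999999999
```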
> which for a single user should be safe enough?

I think it should be enough, yes, even though I feel it would be safer to simply rely on BigInt in the client instead.

> I wonder if letting the client deal with it is just moving a problem that we'd have anyway?

The client would be able to use BigInt. Otherwise we would need to serialize from the API to the client. What do you think?
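For illustration, a minimal sketch of the client-side BigInt approach, assuming the API returned the values as strings (the endpoint and response shape here are assumptions):

```js
// Parse string values losslessly with BigInt instead of Number.
const res = await fetch('/user/account')
const { usedStorage } = await res.json() // e.g. { uploaded: '9007199254740993', pinned: '1024' }
const total = BigInt(usedStorage.uploaded) + BigInt(usedStorage.pinned) // exact, no precision loss
```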
I think both solutions have their merits and drawbacks, and I'm not opinionated enough to go one route or the other. Here are the main drivers for having the DB package accountable for the conversion (and the possible error):

- DB client packages don't need to be aware of the casting (and its problems). That means if we end up having to use that number in other contexts, we're less likely to end up with "transparent" bugs.
- The same applies to API clients.

I'd like to avoid that; those are the kind of bugs that are really tricky to find.
It's not a big deal for now. If we are lucky enough to get a user with 7 PiB, then we can review this. For now, let's do the most convenient thing for the calling code and give them numbers as numbers. If this turns out to be a problem, then we can send them as strings and let the rendering code parse them with BigInt, which is well supported now.
packages/website/pages/account.js (Outdated)
```diff
   const mailTo = useMemo(() => {
     const { mail, subject, body } = emailContent
     return `mailto:${mail}?subject=${subject}&body=${encodeURIComponent(body.join('\n'))}`
   }, [])

   const isLoaded = !isLoading && !isFetching
-  const percentage = Math.ceil((storageData.usedStorage || 0) / MAX_STORAGE * 100)
+  const percentage = Math.ceil(((storageData.usedStorage.uploaded)) / MAX_STORAGE * 100)
```
Maybe not part of your PR, but should we consider both uploaded and pinned here?
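A sketch of what considering both might look like, assuming the `usedStorage` shape from the diff above:

```js
// Hypothetical: count both uploaded and pinned bytes against the quota.
const used = (storageData.usedStorage.uploaded || 0) + (storageData.usedStorage.pinned || 0)
const percentage = Math.ceil(used / MAX_STORAGE * 100)
```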
packages/db/postgres/functions.sql (Outdated)
```sql
FROM psa_pin_request psa_pr
JOIN content c ON c.cid = psa_pr.content_cid
JOIN pin p ON p.content_cid = psa_pr.content_cid
JOIN auth_key a ON a.id = psa_pr.auth_key_id
```
This query is dangerous in terms of execution time, considering we are joining 4 tables. We should add an index for `psa_pin_request.content_cid`. I'm also not entirely sure whether we need an individual index for `pin.content_cid`. Can we have some benchmarks on these?
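A sketch of the indexes being discussed; the index names are assumptions and the actual migration may differ:

```sql
-- Hypothetical migration: index the join columns used by user_used_storage.
CREATE INDEX IF NOT EXISTS psa_pin_request_content_cid_idx
  ON psa_pin_request (content_cid);
CREATE INDEX IF NOT EXISTS pin_content_cid_idx
  ON pin (content_cid);
```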
So, I think we didn't get much better results with the `psa_pin_request.content_cid` index because we still have a low number of pin requests. I think we should just add both indexes.

@alanshaw @hugomrdias, what do you think about these indexes?
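For reference, one way such benchmarks could be gathered (a sketch; it assumes `user_used_storage` returns a row, and `1` is a placeholder user id):

```sql
-- Overall timing of the function as the API calls it:
EXPLAIN ANALYZE SELECT * FROM user_used_storage(1);
-- To confirm the planner actually uses the new indexes, run
-- EXPLAIN ANALYZE on the function's SELECT body directly.
```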
@vasco-santos I'm happy to get this PR shipped without the indexes and monitor it. We can follow up with a PR to add indexes on all the foreign key source columns that are used in joins.
Has this happened? Are you satisfied that this change will be performant enough not to be a problem?
Please update the PR with conflict fixes.
Please always include relevant links in the PR description, like a link to the issue where one exists.
Have included the issue number in the PR description for context, and will resolve conflicts. Re: "should be further tested against large data volumes before deploying to production": I have done some benchmarking as suggested (screenshots here: #1044 (comment)), but against thousands, not millions, of rows, so I wanted to flag this as a possible risk.
Thinking about it, Heroku supports forking the DB, so we could clone prod and test it out manually if we think the query perf is a risk. A fork of prod will have real amounts of data, but won't have the same load, so it's not perfect, but it is an option that's available.
Also, we can roll it out and monitor it. If it starts to get slow we can refactor to store the usage stats in the DB and update them during an upload.
Nice! Good work on adding the extra test to check that it's summing the `dag_size` for unique root CIDs.

One nitpick: this changeset includes `packages/website/package-lock.json`, which shouldn't be included in this PR. In general, we shouldn't have `package-lock.json` files in the `packages/*` directories, and they are not used in prod, so it's minor, but please trim your PRs down to just the files related to the issue being fixed.

Otherwise, merge when ready!
All changes addressed.
Resolves #1043

Amends the `/user/account` API endpoint to return both uploaded and pinned storage usage for the user. The DB function is amended to include a separate query to get the aggregated size of pinned content. This query is potentially expensive, joining 4 tables. Some benchmarking was done with indexes added to the `psa_pin_request` table, but it should be further tested against large data volumes before deploying to production.
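For reference, here is a sketch of the response shape implied by the diffs above; the field values are placeholders and the exact shape is an assumption:

```js
// Hypothetical response from GET /user/account after this change:
const exampleResponse = {
  usedStorage: {
    uploaded: 1024, // total bytes of content uploaded directly
    pinned: 2048 // total bytes of content pinned via the Pinning Service API
  }
}
```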