Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query results are not deterministic #271

Closed
1 of 5 tasks
thescientist13 opened this issue Jan 17, 2020 · 7 comments
Closed
1 of 5 tasks

query results are not deterministic #271

thescientist13 opened this issue Jan 17, 2020 · 7 comments
Assignees
Labels
bug Something isn't working Content as Data P0 Critical issue that should get addressed ASAP v0.5.0 Data w/ GraphQL
Milestone

Comments

@thescientist13
Copy link
Member

thescientist13 commented Jan 17, 2020

Type of Change

  • New Feature Request
  • Documentation / Website
  • Improvement / Suggestion
  • Bug
  • Other (please clarify below)

Summary

One of the current drawbacks of the implementation in #115 is that data is returned in the order it is on disk, which is essentially alpha-sorted. However, like for the Getting Started docs on our website, this navigation needs to be in a certain order.

Details

And also observed is that order is not guaranteed either, possibly due to JSON.stringify.

For that reason, it should get fixed and put under test, before adding the sorting and filtering.

@thescientist13 thescientist13 added enhancement Improve something existing (e.g. no docs, new APIs, etc) Content as Data labels Jan 17, 2020
@thescientist13 thescientist13 added this to the MVP milestone Jan 17, 2020
@thescientist13 thescientist13 added the P0 Critical issue that should get addressed ASAP label Jan 23, 2020
@thescientist13
Copy link
Member Author

Observed as part of #115 that not only should user's be able to provide a filter for results, but that at the very least, the default order should always be deterministic each and every time, at minimum.

So as part of this, we should make sure that we can guarantee basis results order (like the blog posts test cases) and then we can add additional enhancements on top of that if needed.

@thescientist13 thescientist13 changed the title declarative queries and / or sorting and filtering of queries deterministic queries and / or sorting and filtering of queries Jan 23, 2020
@hutchgrant
Copy link
Member

The correct way to do this IMO is through the graphql schema and resolver.
https://www.howtographql.com/graphql-js/8-filtering-pagination-and-sorting/

For our built in schemas, we can add predetermined sorting/filtering capabilities.

@thescientist13
Copy link
Member Author

I think I should just fold #280 into this, since this is such a critical expectation of this feature, it should include unit tests too for not only default determinism, but also put it under test.

@thescientist13 thescientist13 changed the title deterministic queries and / or sorting and filtering of queries deterministic queries w/ sorting and / or filtering capabilities Jan 25, 2020
@thescientist13 thescientist13 mentioned this issue Jan 29, 2020
13 tasks
@hutchgrant
Copy link
Member

hutchgrant commented Feb 6, 2020

I would prefer to merge navigation query with shelf query into one single menu query which uses links and submenus. Navigation is basically a menu, the only difference is it has no submenus. We can create a map of menus(with recursive children and links) and then filter it based on the menu request. Currently within markdown we can get the menus based on front-matter during the graph lifecycle(#274).

e.g.

Some page.md

---
label: 'getting-started'
menu: navigation
title: 'Getting Started'
index: 2
---
## Introduction
First off, thank you for taking the time to check out Greenwood! 

schema

  type Link {
    label: String,
    link: String
  }

  type Menu {
    item: Link
    children: [Menu]
  }

  enum MenuOrderBy {
    label_asc,
    label_desc
    index_asc,
    index_desc
  }

  type Query {
    menu(filter: String, orderBy: MenuOrderBy): Menu
  }

our query from our components(both header and shelf):

query($menu: String!) {
  menu(filter: $menu, orderby: index_asc) {
    item {
      label,
      link
    }
    children {
      item {
        label,
        link
      },
      children {
        item {
          label,
          link
        }
      }
    }
  }
}

This will work for both navigation and side menus.

@hutchgrant
Copy link
Member

This actually resolves #275 as well.

@thescientist13
Copy link
Member Author

Not that I disagree per se, but I'm not sure the right move right now is to a bunch of refactoring without locking everything down and putting it under test and making sure determinism is in place. After that, I would be happy to consider improvement to our default query implementations.

I've been working a little bit on the unit test part just to at least get the determinism locked in so we confidently refactor and experiment.

@thescientist13
Copy link
Member Author

thescientist13 commented Feb 16, 2020

So was able to identify the root cause of the differing outputs and why sometimes the pages are not ordered correctly, and it doesn't have to do with GraphQL, client.extract, or JSON.[stringify|parse]. It's actually Greenwood's graph itself.

By looking at what is coming out of GraphQL

In the case where the tests pass

getChildrenFromParentRoute [ { mdFile: './index.md',
    label: '0ec50e3c102dc02',
    route: '/',
    template: 'page',
    filePath:
     '/Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/packages/cli/test/cases/build.data.graph/src/pages/index.md',
    fileName: 'index',
    relativeExpectedPath: '\'../index/index.js\'',
    title: 'Greenwood App',
    meta: [] },
  { mdFile: './blog/first-post.md',
    label: 'f3f1bb94286324a',
    route: '/blog/first-post',
    template: 'blog',
    filePath:
     '/Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/packages/cli/test/cases/build.data.graph/src/pages/blog/first-post.md',
    fileName: 'first-post',
    relativeExpectedPath: '\'../blog/first-post/first-post.js\'',
    title: 'Greenwood App',
    meta: [] },
  { mdFile: './blog/second-post.md',
    label: '9a4ce9aeeb41a94',
    route: '/blog/second-post',
    template: 'blog',
    filePath:
     '/Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/packages/cli/test/cases/build.data.graph/src/pages/blog/second-post.md',
    fileName: 'second-post',
    relativeExpectedPath: '\'../blog/second-post/second-post.js\'',
    title: 'Greenwood App',
    meta: [] } ]

In the case where the tests fail

getChildrenFromParentRoute [ { mdFile: './index.md',
    label: '0ec50e3c102dc02',
    route: '/',
    template: 'page',
    filePath:
     '/Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/packages/cli/test/cases/build.data.graph/src/pages/index.md',
    fileName: 'index',
    relativeExpectedPath: '\'../index/index.js\'',
    title: 'Greenwood App',
    meta: [] },
  { mdFile: './blog/second-post.md',
    label: '9a4ce9aeeb41a94',
    route: '/blog/second-post',
    template: 'blog',
    filePath:
     '/Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/packages/cli/test/cases/build.data.graph/src/pages/blog/second-post.md',
    fileName: 'second-post',
    relativeExpectedPath: '\'../blog/second-post/second-post.js\'',
    title: 'Greenwood App',
    meta: [] },
  { mdFile: './blog/first-post.md',
    label: 'f3f1bb94286324a',
    route: '/blog/first-post',
    template: 'blog',
    filePath:
     '/Users/owenbuckley/Workspace/project-evergreen/repos/greenwood/packages/cli/test/cases/build.data.graph/src/pages/blog/first-post.md',
    fileName: 'first-post',
    relativeExpectedPath: '\'../blog/first-post/first-post.js\'',
    title: 'Greenwood App',
    meta: [] } ]

We can see that in the first example it's

  1. first-post
  2. second-post

In the second example, it's

  1. second-post
  2. first-post

Based on our logic for walkDirectory, these results would make sense, since we are grabbing all files and pushing them to pages as each one resolves. This results in A race condition where pages could come back in any order.

So I see a short and long term solution here...

For now, just guarantee ordering, by assigning each page an index, so we can still try and maintain async walking but still have order. Or some other method of sequential execution.

// instead of
pages.push({ /* data */})

// do something like
pages[someIndex] = { /* data */ }; 

In the long run, maybe we can apply a data structure of some kind to Greenwood's graph to make it overall more useful? Something like unified?


So next steps here are:

  1. Change this issue to reflect just limit itself to the specific issue of determinism
  2. Move sorting / filtering to a new issue, and pull in the proposal presented in [WIP] task/issue-275 Create Extensible Menus Query #285
  3. Create a new issue (RFC?) to discuss data structures and the Greenwood graph

@thescientist13 thescientist13 changed the title deterministic queries w/ sorting and / or filtering capabilities deterministic queries Feb 17, 2020
@thescientist13 thescientist13 changed the title deterministic queries query results are not deterministic Feb 17, 2020
@thescientist13 thescientist13 added bug Something isn't working and removed enhancement Improve something existing (e.g. no docs, new APIs, etc) labels Feb 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Content as Data P0 Critical issue that should get addressed ASAP v0.5.0 Data w/ GraphQL
Projects
None yet
Development

No branches or pull requests

2 participants