Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filling some gaps in the DTS manifest #78

Merged
merged 9 commits into from
Aug 13, 2024
76 changes: 46 additions & 30 deletions auth/kbase_auth_server.go
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,22 @@ type KBaseAuthServer struct {
AccessToken string
}

// a record containing information about a user logged into the KBase Auth2
// server
type kbaseUserInfo struct {
// KBase username
Username string `json:"user"`
// KBase user display name
Display string `json:"display"`
// User email address
Email string `json:"email"`
// Identities with associated providers
Idents []struct {
Provider string `json:"provider"`
UserName string `json:"provusername"`
} `json:"idents"`
}

// here's how KBase represents errors in responses to API calls
type kbaseAuthErrorResponse struct {
HttpCode int `json:"httpcode"`
Expand All @@ -55,15 +71,6 @@ type kbaseAuthErrorResponse struct {
Time time.Duration `json:"time"`
}

// here's what we use to fetch user information
type kbaseAuthUserInfo struct {
Username string `json:"user"`
Idents []struct {
Provider string `json:"provider"`
UserName string `json:"provusername"`
} `json:"idents"`
}

// here's a set of instances to the KBase auth server, mapped by OAuth2
// access token
var instances map[string]*KBaseAuthServer
Expand All @@ -87,7 +94,7 @@ func NewKBaseAuthServer(accessToken string) (*KBaseAuthServer, error) {
}

// verify that the access token works (i.e. that the user is logged in)
userInfo, err := server.getUserInfo()
userInfo, err := server.kbaseUserInfo()
if err != nil {
return nil, err
}
Expand Down Expand Up @@ -162,8 +169,9 @@ func (server *KBaseAuthServer) get(resource string) (*http.Response, error) {
return client.Do(req)
}

func (server *KBaseAuthServer) getUserInfo() (kbaseAuthUserInfo, error) {
var userInfo kbaseAuthUserInfo
// returns information for the current KBase user accessing the auth server
func (server *KBaseAuthServer) kbaseUserInfo() (kbaseUserInfo, error) {
var userInfo kbaseUserInfo
resp, err := server.get("me")
if err != nil {
return userInfo, err
Expand All @@ -180,31 +188,39 @@ func (server *KBaseAuthServer) getUserInfo() (kbaseAuthUserInfo, error) {
return userInfo, err
}
err = json.Unmarshal(body, &userInfo)
return userInfo, err
}

// returns the username for the current KBase user accessing the
// KBase auth server
func (server *KBaseAuthServer) Username() (string, error) {
userInfo, err := server.getUserInfo()
return userInfo.Username, err
// make sure we have at least one ORCID for this user
if len(userInfo.Idents) < 1 {
return userInfo, fmt.Errorf("KBase Auth2: No providers associated with this user!")
}
foundOrcid := false
for _, pid := range userInfo.Idents {
if pid.Provider == "OrcID" {
foundOrcid = true
break
}
}
if !foundOrcid {
return userInfo, fmt.Errorf("KBase Auth2: No ORCID IDs associated with this user!")
}
Comment on lines +196 to +205
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it worth setting the OrcID here, so that in the UserInfo function you don't do the same iteration through the list of Idents?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters, unless a user has tens of thousands of ORCIDs associated(!). As it is now, kbaseUserInfo mirrors the information maintained by the KBase auth server, which doesn't prioritize the first ORCID. This mirrored structure is separate from our interpretation, which provides some clarity as to what we assume, which to me is more valuable than avoiding an extra loop.

return userInfo, err
}

// returns the current KBase user's registered ORCID identifiers (and/or an error)
// a user can have 0, 1, or many associated ORCID identifiers
func (server *KBaseAuthServer) Orcids() ([]string, error) {
userInfo, err := server.getUserInfo()
// returns a normalized user info record for the current KBase user
func (server *KBaseAuthServer) UserInfo() (UserInfo, error) {
kbUserInfo, err := server.kbaseUserInfo()
if err != nil {
return nil, err
return UserInfo{}, err
}
if len(userInfo.Idents) < 1 {
return nil, fmt.Errorf("No ORCID IDs associated with this user!")
userInfo := UserInfo{
Name: kbUserInfo.Display,
Username: kbUserInfo.Username,
Email: kbUserInfo.Email,
}
orcidIds := make([]string, 0)
for _, pid := range userInfo.Idents {
for _, pid := range kbUserInfo.Idents {
if pid.Provider == "OrcID" {
orcidIds = append(orcidIds, pid.UserName)
userInfo.Orcid = pid.UserName
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it guaranteed that there is only ever one Orcid associated with the user?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, but I don't know what we would do with more than one ORCID, so I'm assuming for now that we need to select one, and that the one we select is the first one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the user has different OrcIDs associated with their accounts on different platforms? (well, I know what would happen -- nothing. The more useful case to think about is where they have a bunch of OrcIDs assoc'd with their KBase account, one of which is associated with their account on another platform).

Not deliberately trying to be annoying here -- just thinking of the kinds of annoying situations that may pop up in the wild.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I guess my logic is that I'd like to start simple, encounter real problems, and address accordingly. At least we have separate KBase and DTS user information so we can articulate our assumptions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have been burned too many times by trying to keep it simple and assuming that KBase is doing sensible things...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it! Here's an issue: #81

}
}
return orcidIds, nil
return userInfo, nil
}
39 changes: 16 additions & 23 deletions auth/kbase_auth_server_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -44,34 +44,27 @@ func TestNewKBaseAuthServer(t *testing.T) {
assert.Nil(err, "Authentication server constructor triggered an error")
}

// tests whether the authentication server can return the username for the
// for the owner of the developer token
func TestUsername(t *testing.T) {
// tests whether an invalid KBase token prevents a proxy for the auth server
// from being constructed
func TestInvalidToken(t *testing.T) {
assert := assert.New(t)
devToken := os.Getenv("DTS_KBASE_DEV_TOKEN")
server, _ := NewKBaseAuthServer(devToken)
assert.NotNil(server)
username, err := server.Username()
assert.Nil(err)
assert.True(len(username) > 0)
devToken := "INVALID_KBASE_TOKEN"
server, err := NewKBaseAuthServer(devToken)
assert.Nil(server, "Authentication server created with invalid token")
assert.NotNil(err, "Invalid token for authentication server triggered no error")
}

// tests whether the authentication server can return the proper credentials
// for the owner of the developer token
func TestOrchids(t *testing.T) {
// tests whether the authentication server can return information for the
// user associated with the specified developer token
func TestUserInfo(t *testing.T) {
jeff-cohere marked this conversation as resolved.
Show resolved Hide resolved
assert := assert.New(t)
devToken := os.Getenv("DTS_KBASE_DEV_TOKEN")
orcid := os.Getenv("DTS_KBASE_TEST_ORCID")
assert.False(orcid == "")
server, _ := NewKBaseAuthServer(devToken)
orcids, err := server.Orcids()
assert.NotNil(server)
userInfo, err := server.UserInfo()
assert.Nil(err)
var foundOrcid bool
for _, id := range orcids {
if orcid == id {
foundOrcid = true
break
}
}
assert.True(foundOrcid)

assert.True(len(userInfo.Username) > 0)
assert.True(len(userInfo.Email) > 0)
assert.Equal(os.Getenv("DTS_KBASE_TEST_ORCID"), userInfo.Orcid)
}
36 changes: 36 additions & 0 deletions auth/user_info.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
// Copyright (c) 2023 The KBase Project and its Contributors
// Copyright (c) 2023 Cohere Consulting, LLC
//
// Permission is hereby granted, free of charge, to any person obtaining a copy of
// this software and associated documentation files (the "Software"), to deal in
// the Software without restriction, including without limitation the rights to
// use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
// of the Software, and to permit persons to whom the Software is furnished to do
// so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in all
// copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
// SOFTWARE.

package auth

// a record containing information about a DTS user
type UserInfo struct {
// user's name (human-readable and display-friendly)
Name string
// username used to access DTS
Username string
// user's email address
Email string
// ORCID identifier associated with this user
Orcid string
// organization with which this user is affiliated
Organization string
}
93 changes: 66 additions & 27 deletions frictionless/frictionless.go
Original file line number Diff line number Diff line change
Expand Up @@ -31,50 +31,73 @@ import (
// a Frictionless data package describing a set of related resources
// (https://specs.frictionlessdata.io/data-package/)
type DataPackage struct {
// list of contributors to the data package
Contributors []Contributor `json:"contributors,omitempty"`
// a timestamp indicated when the package was created
Created string `json:"created,omitempty"`
// a Markdown description of the data package
Description string `json:"description,omitempty"`
// a URL for a web address related to the data package
Homepage string `json:"homepage,omitempty"`
// an image to use for this data package (URL or POSIX path)
Image string `json:"image,omitempty"`
// a machine-readable set of instructions for processing
Instructions json.RawMessage `json:"instructions,omitempty"`
// an array of string keywords to assist users searching for the data package
// in catalogs
Keywords []string `json:"keywords,omitempty"`
// a list identifying the license or licenses under which this resource is
// managed (optional)
Licenses []DataLicense `json:"licenses,omitempty"`
// the name of the data package
Name string `json:"name"`
// the profile of this descriptor per the DataPackage profiles specification
// (https://specs.frictionlessdata.io/profiles/#language)
Profile string `json:"profile,omitempty"`
// a list of resources that belong to the package
Resources []DataResource `json:"resources"`
// a list identifying the license or licenses under which this resource is
// managed (optional)
Licenses []DataLicense `json:"licenses,omitempty"`
// a list identifying the sources for this resource (optional)
Sources []DataSource `json:"sources,omitempty"`
// a title or one sentence description for the data package
Title string `json:"title,omitempty"`
// a version string identifying the version of the data package, conforming to
// semantic versioning if relevant
Version string `json:"version,omitempty"`
}

// a Frictionless data resource describing a file in a search
// (https://specs.frictionlessdata.io/data-resource/)
type DataResource struct {
// a unique identifier for the resource
Id string `json:"id"`
// the name of the resource's file, with any suffix stripped off
Name string `json:"name"`
// a relative path to the resource's file within a data package directory
Path string `json:"path"`
// a title or label for the resource (optional)
Title string `json:"title,omitempty"`
// the size of the resource's file in bytes
Bytes int `json:"bytes"`
// credit metadata associated with the resource (optional for now)
Credit credit.CreditMetadata `json:"credit,omitempty"`
// a description of the resource (optional)
Description string `json:"description,omitempty"`
// indicates the format of the resource's file, often used as an extension
Format string `json:"format"`
// the mediatype/mimetype of the resource (optional, e.g. "test/csv")
MediaType string `json:"media_type,omitempty"`
// the character encoding for the resource's file (optional, default: UTF-8)
Encoding string `json:"encoding,omitempty"`
// the size of the resource's file in bytes
Bytes int `json:"bytes"`
// any other fields requested e.g. by a search query (optional, raw JSON object)
Extra json.RawMessage `json:"extra,omitempty"`
// indicates the format of the resource's file, often used as an extension
Format string `json:"format"`
// the hash for the resource's file (algorithms other than MD5 are indicated
// with a prefix to the hash delimited by a colon)
Hash string `json:"hash"`
// a list identifying the sources for this resource (optional)
Sources []DataSource `json:"sources,omitempty"`
// a unique identifier for the resource
Id string `json:"id"`
// a list identifying the license or licenses under which this resource is
// managed (optional)
Licenses []DataLicense `json:"licenses,omitempty"`
// credit metadata associated with the resource (optional for now)
Credit credit.CreditMetadata `json:"credit,omitempty"`
// any other fields requested e.g. by a search query (optional, raw JSON object)
Extra json.RawMessage `json:"extra,omitempty"`
// the mediatype/mimetype of the resource (optional, e.g. "test/csv")
MediaType string `json:"media_type,omitempty"`
// the name of the resource's file, with any suffix stripped off
Name string `json:"name"`
// a relative path to the resource's file within a data package directory
Path string `json:"path"`
// a list identifying the sources for this resource (optional)
Sources []DataSource `json:"sources,omitempty"`
// a title or label for the resource (optional)
Title string `json:"title,omitempty"`
}

// call this to get a string containing the name of the hashing algorithm used
Expand All @@ -90,12 +113,12 @@ func (res DataResource) HashAlgorithm() string {

// information about the source of a DataResource
type DataSource struct {
// a descriptive title for the source
Title string `json:"title"`
// a URI or relative path pointing to the source (optional)
Path string `json:"path,omitempty"`
// an email address identifying a contact associated with the source (optional)
Email string `json:"email,omitempty"`
// a URI or relative path pointing to the source (optional)
Path string `json:"path,omitempty"`
// a descriptive title for the source
Title string `json:"title"`
}

// information about a license associated with a DataResource
Expand All @@ -107,3 +130,19 @@ type DataLicense struct {
// the descriptive title of the license (optional)
Title string `json:"title,omitempty"`
}

// information about a contributor to a DataPackage
type Contributor struct {
// the contributor's email address
Email string `json:"email"`
// a string describing the contributor's organization
Organization string `json:"organization"`
// a fully qualified http URL pointing to a relevant location online for the
// contributor
Path string `json:"path"`
// the role of the contributor ("author", "publisher", "maintainer",
// "wrangler", "contributor")
Role string `json:"role"`
// name/title of the contributor (name for person, name/title of organization)
Title string `json:"title"`
}
Loading
Loading