Comparing changes

base repository: ropensci/patentsview
base: v0.3.0
...
head repository: ropensci/patentsview
compare: master

Commits on Aug 19, 2022

  1. dbe2fa5

Commits on Aug 21, 2022

  1. 95fae4c
  2. 37c7787
  3. Refactor to_singular and to_plural so that they return values, and test those function during builds

    crew102 committed Aug 21, 2022
    a1291b3
  4. 09e2dec

Commits on Aug 22, 2022

  1. 897bf5a
  2. adding throttling retry

    mustberuss committed Aug 22, 2022
    0a2042f
  3. Merge branch 'api-redesign-contribute' of github.com:mustberuss/patentsview into api-redesign-contribute

    mustberuss committed Aug 22, 2022
    5c56a80

Commits on Sep 16, 2022

  1. 511f9e4
  2. Refactor search_pv so that a) we retry via recursion upon throttle error and b) pull response processing/processing out of one_request function

    crew102 committed Sep 16, 2022
    41f84b2
  3. Refactor tests to use TEST_QUERIES object, have better names, and not be as verbose re: comments

    crew102 committed Sep 16, 2022
    5b6ac05
  4. Update get_endpoints function so that we are only reflecting current endpoints and also not listing endpoint names when they can be ID'd via the function

    crew102 committed Sep 16, 2022
    3750937
  5. a340053
  6. a96983a
  7. Rework how we handle larger result sizes

    * Default to 1000 records per page instead of 25 so that people don't have to deal with 45 requests per minute throttle more than they might have to
    * Error when user requests a result size that can't be returned b/c of API limitations (e.g., 10,000 records total, 10 pages total)
    * Add test to confirm we can indeed pull these larger result sets from API
    crew102 committed Sep 16, 2022
    bfa086a
  8. Update docs

    crew102 committed Sep 16, 2022
    406e571

Commits on Sep 18, 2022

  1. HATEOAS feature

    mustberuss committed Sep 18, 2022
    7372fc1
  2. Merge pull request #27 from mustberuss/api-redesign-contribute

    mvp for the new api version
    crew102 authored Sep 18, 2022
    4cc28e1
  3. Document mtchd_subent_only

    crew102 committed Sep 18, 2022
    419a4f0
  4. Don't try to keep up with all the various endpoint changes when it comes to pretty formatting, as this was resulting in errors when printing objects

    crew102 committed Sep 18, 2022
    e66504d
  5. Refactor retrieve_linked_data

    crew102 committed Sep 18, 2022
    87e6a3f
  6. 6ac5174
  7. Make minor doc changes

    crew102 committed Sep 18, 2022
    c644355
  8. ad4e304
  9. Don't check query

    crew102 committed Sep 18, 2022
    16f3685
  10. 052e560
  11. Add Russ as an author

    crew102 committed Sep 18, 2022
    d4a06c5

Commits on Dec 27, 2022

  1. Merge pull request #28 from ropensci/api-redesign-contribute

    Merge first API redesign PR into master
    crew102 authored Dec 27, 2022
    489b2f7

Commits on Dec 29, 2022

  1. 723b9b2
  2. 64439c6
  3. See if we can collapse diffs on github for generated rd files by telling git they're generated files

    crew102 committed Dec 29, 2022
    722b42a

Commits on Feb 4, 2023

  1. Update _pkgdown.yml

    mustberuss authored Feb 4, 2023
    808fc98
  2. Merge pull request #31 from ropensci/mustberuss-patch-1

    Update _pkgdown.yml
    crew102 authored Feb 4, 2023
    101468a

Commits on Jul 17, 2024

  1. docs: modernize pkgdown config to not lose search

    maelle committed Jul 17, 2024
    44711f9

Commits on Jul 20, 2024

  1. Merge pull request #34 from ropensci/maelle-patch-1

    crew102 authored Jul 20, 2024
    bf3d452

Commits on Nov 30, 2024

  1. generated files

    mustberuss committed Nov 30, 2024
    1817df0
  2. updated to parse the newest openapi.json

    mustberuss committed Nov 30, 2024
    e4e13e4
  3. Merge pull request #36 from mustberuss/master

    Updating fieldsdf
    crew102 authored Nov 30, 2024
    e1ca36b

Commits on Dec 10, 2024

  1. updated endpoint list (#37)

    * updated endpoint list
    
    * endpoints are now singular
    
    * removed to_singular and to_plural
    
    * simplified get_endpoints()
    
    * put back to_plural()
    
    * generated files
    
    * restored group handling
    
    * generated files
    
    * endpoint name changes
    
    * endpoint name changes
    
    * generated files
    
    * endpoint name changes
    
    * generated files
    
    * removed to_plural() again
    
    * back to regular group checking
    
    * bumped endpoint count and RoxygenNote version
    
    * generated files
    mustberuss authored Dec 10, 2024
    e0ebd97

Commits on Dec 15, 2024

  1. fieldsdf setting groups on top level attributes (#38)

    mustberuss authored Dec 15, 2024
    1972fca

Commits on Dec 28, 2024

  1. Try to use pull_request_target-triggered actions in a secure way, given those actions have access to repo secrets

    crew102 committed Dec 28, 2024
    9da4f5b
  2. Try again to look up the identity of the person triggering the build

    crew102 committed Dec 28, 2024
    f583965
Showing with 1,726 additions and 2,372 deletions.
  1. +1 −0 .gitattributes
  2. +46 −3 .github/workflows/R-CMD-check.yaml
  3. +9 −4 DESCRIPTION
  4. +2 −0 NAMESPACE
  5. +10 −14 R/data.R
  6. +14 −30 R/get-fields.R
  7. +7 −0 R/patentsview-package.R
  8. +1 −8 R/print.R
  9. +5 −4 R/process-error.R
  10. +18 −5 R/process-resp.R
  11. +1 −1 R/query-dsl.R
  12. +123 −69 R/search-pv.R
  13. BIN R/sysdata.rda
  14. +10 −16 R/unnest-pv-data.R
  15. +4 −36 R/utils.R
  16. +64 −16 R/validate-args.R
  17. +23 −27 _pkgdown.yml
  18. +140 −44 data-raw/fieldsdf.R
  19. +385 −1,214 data-raw/fieldsdf.csv
  20. BIN data/fieldsdf.rda
  21. +84 −157 docs/reference/get_fields.html
  22. +62 −132 docs/reference/get_ok_pk.html
  23. +128 −0 docs/reference/patentsview-package.html
  24. +192 −265 docs/reference/search_pv.html
  25. +66 −140 docs/reference/unnest_pv_data.html
  26. +12 −16 man/fieldsdf.Rd
  27. +1 −0 man/figures/lifecycle-archived.svg
  28. +1 −0 man/figures/lifecycle-defunct.svg
  29. +1 −0 man/figures/lifecycle-deprecated.svg
  30. +1 −0 man/figures/lifecycle-experimental.svg
  31. +1 −0 man/figures/lifecycle-maturing.svg
  32. +1 −0 man/figures/lifecycle-questioning.svg
  33. +1 −0 man/figures/lifecycle-stable.svg
  34. +1 −0 man/figures/lifecycle-superseded.svg
  35. +2 −15 man/get_endpoints.Rd
  36. +9 −9 man/get_fields.Rd
  37. +6 −5 man/get_ok_pk.Rd
  38. +28 −0 man/patentsview-package.Rd
  39. +23 −23 man/qry_funs.Rd
  40. +50 −0 man/retrieve_linked_data.Rd
  41. +52 −55 man/search_pv.Rd
  42. +4 −4 man/unnest_pv_data.Rd
  43. +34 −0 tests/testthat/helpers.R
  44. +2 −1 tests/testthat/test-arg-validation.R
  45. +2 −1 tests/testthat/test-cast-pv-data.R
  46. +97 −57 tests/testthat/test-search-pv.R
  47. +2 −1 tests/testthat/test-unnest-pv-data.R
1 change: 1 addition & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -5,3 +5,4 @@ src/* text=lf
R/* text=lf
docs/* linguist-documentation=true
inst/* linguist-documentation=true
man/* linguist-documentation=true
49 changes: 46 additions & 3 deletions .github/workflows/R-CMD-check.yaml
@@ -1,31 +1,65 @@
# For help debugging build failures open an issue on the RStudio community with the 'github-actions' tag.
# https://community.rstudio.com/new-topic?category=Package%20development&tags=github-actions
on: [push, pull_request]

# Details on pull_request_target and why it's insecure:
# https://securitylab.github.com/resources/github-actions-preventing-pwn-requests/
# Post describing a workaround, from which we take inspiration:
# https://michaelheap.com/access-secrets-from-forks/

name: R-CMD-check

on:
push:
branches:
- master
- 'feature/**'
- 'bugfix/**'
pull_request_target:
types: [opened, synchronize]

jobs:
pre-check:
runs-on: ubuntu-latest
steps:
- name: Confirm crew102 triggered the build
run: |
if [ "${{ github.event.sender.login }}" == "crew102" ]; then
echo "Actor is crew102"
else
echo "Actor is ${{ github.actor }}, failing build."
exit 1
fi
R-CMD-check:
needs: [pre-check]
runs-on: ${{ matrix.config.os }}

name: ${{ matrix.config.os }} (${{ matrix.config.r }})

strategy:
# Run sequentially so that we don't run into rate limit errors that our
# code would normally work around via retry logic
max-parallel: 1
fail-fast: false
matrix:
config:
- {os: windows-latest, r: 'release'}
- {os: macOS-latest, r: 'release'}
- {os: ubuntu-20.04, r: 'release', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
- {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
# - {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}

env:
R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
RSPM: ${{ matrix.config.rspm }}
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
PATENTSVIEW_API_KEY: ${{ secrets.PATENTSVIEW_API_KEY }}

steps:
- uses: actions/checkout@v2
- name: Checkout code
uses: actions/checkout@v3
with:
# Use the head SHA for pull requests
ref: ${{ github.event.pull_request.head.sha || github.sha }}

- uses: r-lib/actions/setup-r@v1
with:
@@ -71,6 +105,15 @@ jobs:
rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "warning", check_dir = "check")
shell: Rscript {0}

- name: Run examples
env:
_R_CHECK_CRAN_INCOMING_REMOTE_: false
run: |
options(crayon.enabled = TRUE)
remotes::install_cran("devtools")
devtools::run_examples(run_dontrun = TRUE)
shell: Rscript {0}

- name: Upload check results
if: failure()
uses: actions/upload-artifact@main
13 changes: 9 additions & 4 deletions DESCRIPTION
@@ -2,12 +2,15 @@ Package: patentsview
Type: Package
Title: An R Client to the 'PatentsView' API
Version: 0.3.0
Authors@R: person("Christopher", "Baker", email = "chriscrewbaker@gmail.com",
role = c("aut", "cre"))
Authors@R: c(
person("Christopher", "Baker", email = "chriscrewbaker@gmail.com",
role = c("aut", "cre")),
person("Russ", "Allen", email = "rrjallen@yahoo.com", role = "aut")
)
Encoding: UTF-8
Description: Provides functions to simplify the 'PatentsView' API
(<https://patentsview.org/apis/purpose>) query language,
send GET and POST requests to the API's seven endpoints, and parse the data
send GET and POST requests to the API's twenty seven endpoints, and parse the data
that comes back.
URL: https://docs.ropensci.org/patentsview/index.html
BugReports: https://github.com/ropensci/patentsview/issues
@@ -17,11 +20,13 @@ Depends:
R (>= 3.1)
Imports:
httr,
lifecycle,
jsonlite,
utils
Suggests:
knitr,
rmarkdown,
testthat,
tidyr
RoxygenNote: 7.1.1
RoxygenNote: 7.3.2
Roxygen: list(markdown = TRUE)
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -11,6 +11,8 @@ export(get_endpoints)
export(get_fields)
export(get_ok_pk)
export(qry_funs)
export(retrieve_linked_data)
export(search_pv)
export(unnest_pv_data)
export(with_qfuns)
importFrom(lifecycle,deprecated)
24 changes: 10 additions & 14 deletions R/data.R
@@ -1,22 +1,18 @@
#' Fields data frame
#'
#' A data frame containing the names of retrievable and queryable fields for
#' each of the 7 API endpoints. A yes/no flag (\code{can_query}) indicates
#' which fields can be included in the user's query. You can also find this
#' data on the API's online documentation for each endpoint as well (e.g.,
#' the \href{https://patentsview.org/apis/api-endpoints/patents}{patents
#' endpoint field list table})
#' A data frame containing the names of retrievable fields for each of the
#' endpoints. You can find this data on the API's online documentation for each
#' endpoint as well (e.g., the
#' \href{https://patentsview.org/apis/api-endpoints/patents}{patents endpoint
#' field list table}).
#'
#' @format A data frame with 992 rows and 7 variables:
#' @format A data frame with the following columns:
#' \describe{
#' \item{endpoint}{The endpoint that this field record is for}
#' \item{field}{The name of the field}
#' \item{data_type}{The field's data type (string, date, float, integer,
#' fulltext)}
#' \item{can_query}{An indicator for whether the field can be included in
#' the user query for the given endpoint}
#' \item{field}{The complete name of the field, including the parent group if
#' applicable}
#' \item{data_type}{The field's input data type}
#' \item{group}{The group the field belongs to}
#' \item{common_name}{The field's common name}
#' \item{description}{A description of the field}
#' \item{common_name}{The field name without the parent group structure}
#' }
"fieldsdf"
44 changes: 14 additions & 30 deletions R/get-fields.R
@@ -7,35 +7,35 @@
#' possible fields for each endpoint).
#'
#' @param endpoint The API endpoint whose field list you want to get. See
#' \code{\link{get_endpoints}} for a list of the 7 endpoints.
#' \code{\link{get_endpoints}} for a list of the 27 endpoints.
#' @param groups A character vector giving the group(s) whose fields you want
#' returned. A value of \code{NULL} indicates that you want all of the
#' endpoint's fields (i.e., do not filter the field list based on group
#' membership). See the field tables located online to see which groups you
#' can specify for a given endpoint (e.g., the
#' \href{https://patentsview.org/apis/api-endpoints/patents}{patents
#' \href{https://search.patentsview.org/docs/docs/Search%20API/SearchAPIReference/#patent}{patent
#' endpoint table}), or use the \code{fieldsdf} table
#' (e.g., \code{unique(fieldsdf[fieldsdf$endpoint == "patents", "group"])}).
#' (e.g., \code{unique(fieldsdf[fieldsdf$endpoint == "patent", "group"])}).
#'
#' @return A character vector with field names.
#'
#' @examples
#' # Get all assignee-level fields for the patents endpoint:
#' fields <- get_fields(endpoint = "patents", groups = "assignees")
#' # Get all assignee-level fields for the patent endpoint:
#' fields <- get_fields(endpoint = "patent", groups = "assignees")
#'
#' #...Then pass to search_pv:
#' # ...Then pass to search_pv:
#' \dontrun{
#'
#' search_pv(
#' query = '{"_gte":{"patent_date":"2007-01-04"}}',
#' fields = fields
#' )
#'}
#' # Get all patent and assignee-level fields for the patents endpoint:
#' fields <- get_fields(endpoint = "patents", groups = c("assignees", "patents"))
#' }
#' # Get all patent and assignee-level fields for the patent endpoint:
#' fields <- get_fields(endpoint = "patent", groups = c("assignees", "patents"))
#'
#' \dontrun{
#' #...Then pass to search_pv:
#' # ...Then pass to search_pv:
#' search_pv(
#' query = '{"_gte":{"patent_date":"2007-01-04"}}',
#' fields = fields
@@ -48,34 +48,18 @@ get_fields <- function(endpoint, groups = NULL) {
if (is.null(groups)) {
fieldsdf[fieldsdf$endpoint == endpoint, "field"]
} else {
validate_groups(groups = groups)
validate_groups(endpoint, groups = groups)
fieldsdf[fieldsdf$endpoint == endpoint & fieldsdf$group %in% groups, "field"]
}
}

#' Get endpoints
#'
#' This function reminds the user what the 7 possible PatentsView API endpoints
#' This function reminds the user what the possible PatentsView API endpoints
#' are.
#'
#' @return A character vector with the names of the 7 endpoints. Those endpoints are:
#'
#' \itemize{
#' \item assignees
#' \item cpc_subsections
#' \item inventors
#' \item locations
#' \item nber_subcategories
#' \item patents
#' \item uspc_mainclasses
#' }
#'
#' @examples
#' get_endpoints()
#' @return A character vector with the names of each endpoint.
#' @export
get_endpoints <- function() {
c(
"assignees", "cpc_subsections", "inventors", "locations",
"nber_subcategories", "patents", "uspc_mainclasses"
)
unique(fieldsdf$endpoint)
}
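The R/get-fields.R change above replaces the hard-coded endpoint vector with a lookup against `fieldsdf`, and passes `endpoint` through to `validate_groups()`. A minimal sketch of that lookup pattern follows. It assumes a toy data frame (`toy_fieldsdf`, with made-up rows) in place of the package's real `fieldsdf`, and the helper names `toy_get_endpoints()` and `toy_get_fields()` are hypothetical. The body of `validate_groups()` is not shown in this part of the diff, so the per-endpoint group check here is only a guess at its intent.

```r
# Toy stand-in for the package's fieldsdf (rows are illustrative, not real data)
toy_fieldsdf <- data.frame(
  endpoint = c("patent", "patent", "patent", "assignee"),
  field = c("patent_id", "patent_date", "assignees.assignee_id", "assignee_id"),
  group = c("patents", "patents", "assignees", "assignees"),
  stringsAsFactors = FALSE
)

# Derive the endpoint list from the data rather than hard-coding it
toy_get_endpoints <- function(df = toy_fieldsdf) {
  unique(df$endpoint)
}

# Return field names for an endpoint, optionally filtered by group
toy_get_fields <- function(endpoint, groups = NULL, df = toy_fieldsdf) {
  in_endpoint <- df$endpoint == endpoint
  if (is.null(groups)) {
    return(df[in_endpoint, "field"])
  }
  # Hypothetical check: every requested group must exist for this endpoint
  ok_groups <- unique(df[in_endpoint, "group"])
  bad <- setdiff(groups, ok_groups)
  if (length(bad) > 0) {
    stop(
      "Unknown group(s) for endpoint '", endpoint, "': ",
      paste(bad, collapse = ", "),
      call. = FALSE
    )
  }
  df[in_endpoint & df$group %in% groups, "field"]
}

toy_get_endpoints()
toy_get_fields("patent", groups = "assignees")
```

Deriving the endpoint list from the data means it may not need separate maintenance when endpoints are renamed, which seems to be the motivation given in the "Update get_endpoints function" commit message.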
7 changes: 7 additions & 0 deletions R/patentsview-package.R
@@ -0,0 +1,7 @@
#' @keywords internal
"_PACKAGE"

## usethis namespace: start
#' @importFrom lifecycle deprecated
## usethis namespace: end
NULL
9 changes: 1 addition & 8 deletions R/print.R
@@ -15,18 +15,11 @@ print.pv_data_result <- function(x, ...) {

k <- vapply(names(df), function(y) class(df[, y]), FUN.VALUE = character(1))

dat_level <- c(
patents = "a patent", inventors = "an inventor",
assignees = "an assignee", locations = "a location",
cpc_subsections = "a CPC subsection", uspc_mainclasses = "a USPC main class",
nber_subcategories = "a NBER subcategory"
)

lst <- ifelse("list" %in% k, " (with list column(s) inside) ", " ")

cat(
"#### A list with a single data frame", lst, "on ",
dat_level[[names(x)[1]]], " level:\n\n",
names(x)[1], " level:\n\n",
sep = ""
)

9 changes: 5 additions & 4 deletions R/process-error.R
@@ -11,18 +11,19 @@ throw_if_loc_error <- function(resp) {
if (num_grps > 2) {
stop2(
"Your request resulted in a 500 error, likely because you have ",
"requested too many fields in your request (the locations endpoint ",
"requested too many fields in your request (the location endpoint ",
"currently has restrictions on the number of fields/groups you can ",
"request). Try slimming down your field list and trying again."
)
}
}
}

# Not sure this is still applicable
#' @noRd
hit_locations_ep <- function(url) {
grepl(
"^https://api.patentsview.org/locations/",
"^https://search.patentsview.org/api/v1/location/",
url,
ignore.case = TRUE
)
@@ -32,7 +33,7 @@ hit_locations_ep <- function(url) {
get_num_groups <- function(url) {
prsd_json_filds <- gsub(".*&f=([^&]*).*", "\\1", utils::URLdecode(url))
fields <- jsonlite::fromJSON(prsd_json_filds)
grps <- fieldsdf[fieldsdf$endpoint == "locations" &
grps <- fieldsdf[fieldsdf$endpoint == "location" &
fieldsdf$field %in% fields, "group"]
length(unique(grps))
}
@@ -52,5 +53,5 @@ xheader_er_or_status <- function(resp) {
#' @noRd
get_x_status <- function(resp) {
headers <- httr::headers(resp)
headers[grepl("x-status-reason", names(headers), ignore.case = TRUE)]
headers[grepl("x-status-reason$", names(headers), ignore.case = TRUE)]
}
23 changes: 18 additions & 5 deletions R/process-resp.R
@@ -1,37 +1,50 @@
#' @noRd
parse_resp <- function(resp) {
j <- httr::content(resp, as = "text", encoding = "UTF-8")
jsonlite::fromJSON(
j,
simplifyVector = TRUE, simplifyDataFrame = TRUE, simplifyMatrix = TRUE
)
}

#' @noRd
get_request <- function(resp) {
gp <- structure(
list(method = resp$req$method, url = resp$req$url),
class = c("list", "pv_request")
)

if (gp$method == "POST")
if (gp$method == "POST") {
gp$body <- rawToChar(resp$req$options$postfields)
}

gp
}

#' @noRd
get_data <- function(prsd_resp) {
structure(
list(prsd_resp[[1]]),
names = names(prsd_resp[1]),
list(prsd_resp[[4]]),
names = names(prsd_resp[4]),
class = c("list", "pv_data_result")
)
}

#' @noRd
# There used to be an endpoint specific _count ex total_assignee_count
# Now all endpoints return a total_hits attribute
get_query_results <- function(prsd_resp) {
structure(
prsd_resp[grepl("_count", names(prsd_resp))],
prsd_resp["total_hits"],
class = c("list", "pv_query_result")
)
}

#' @noRd
process_resp <- function(resp) {
prsd_resp <- parse_resp(resp)
if (httr::http_error(resp)) throw_er(resp)

prsd_resp <- parse_resp(resp)
request <- get_request(resp)
data <- get_data(prsd_resp)
query_results <- get_query_results(prsd_resp)
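The R/process-resp.R change above switches `get_data()` to the fourth element of the parsed response and `get_query_results()` to the `total_hits` element. A minimal sketch of how those two extractions behave, run on a hand-built list: the shape of `toy_resp` (element names and order) is an assumption made for illustration, not taken from the API documentation, and the `toy_` function names are hypothetical copies of the functions shown in the diff.

```r
# Hand-built stand-in for a parsed API response; the element names and their
# order are assumptions made for illustration
toy_resp <- list(
  error = FALSE,
  count = 2L,
  total_hits = 2L,
  patents = data.frame(patent_id = c("1", "2"), stringsAsFactors = FALSE)
)

# Mirrors get_data() in the diff: take the fourth element and keep its name
toy_get_data <- function(prsd_resp) {
  structure(
    list(prsd_resp[[4]]),
    names = names(prsd_resp[4]),
    class = c("list", "pv_data_result")
  )
}

# Mirrors get_query_results() in the diff: keep only the total_hits element
toy_get_query_results <- function(prsd_resp) {
  structure(
    prsd_resp["total_hits"],
    class = c("list", "pv_query_result")
  )
}

dat <- toy_get_data(toy_resp)
qr <- toy_get_query_results(toy_resp)

names(dat)
qr$total_hits
```

Because `toy_get_data()` indexes by position, it relies on the data element being fourth in the response. Indexing by name would avoid that dependence, but the name differs per endpoint.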
2 changes: 1 addition & 1 deletion R/query-dsl.R
@@ -1,4 +1,4 @@
# Design adapated from http://adv-r.had.co.nz/dsl.html
# Design adapted from http://adv-r.had.co.nz/dsl.html

#' @noRd
lapply2 <- function(...) sapply(..., USE.NAMES = TRUE, simplify = FALSE)