Context
For additional context see #919.
As well as an allow-list (only sites on the allow-list are allowed to be proxied with Via), Checkmate also has a separate block-list (any site on the block-list is not allowed to be proxied, even if it's on the allow-list). IIRC the block-list serves a few different purposes:
- If a site is on the allow-list it can still be blocked by adding the same site to the block-list as well. This can be easier than removing the site from the allow-list. In fact we don't have a documented process for removing a site from the allow-list and I'm not sure it would be trivial to do so.
- School assignments created by instructors using Hypothesis's LMS app (https://github.com/hypothesis/lms) bypass the allow-list: instructors are allowed to create assignments to annotate any URL they want, even if it's not on the allow-list. But URLs on the block-list will still be blocked even if used in an LMS assignment.
- Sub-resources of HTML pages bypass the allow-list. For example, if `nytimes.com` is on the allow-list and someone tries to proxy that page, then in order for the page to load properly Via also needs to proxy many requests for JS, CSS, images, fonts, API calls, ads, etc. These requests can cover many different URLs and domains, and adding them all to the allow-list would be impractical. For that reason any sub-resource requests made by an allow-listed HTML page are themselves allowed to bypass the allow-list. But if one of those sub-resource URLs is on the block-list it will still be blocked.
Problem
The process for adding a site to or removing a site from the blocklist is time-consuming and cumbersome. This process wastes the time of Hypothesis developers.
The blocklist is saved in a text file in an S3 bucket. The developer has to download this file from S3, edit it, and then re-upload it.
Checkmate has a Celery task that downloads the blocklist from S3 and imports it into Checkmate's DB: see checkmate/checkmate/celery_async/tasks.py, lines 13 to 34 in 0918278.
This process is documented in "How do I block particular URLs in Via?".
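To make the precedence rules from the Context section concrete, here is a minimal sketch of how the two lists interact. It is not Checkmate's actual code: the `UrlPolicy` class, the `bypass_allow_list` flag and the exact-hostname matching are all assumptions made for illustration, and the real service may match URLs quite differently.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class UrlPolicy:
    """Hypothetical in-memory stand-in for Checkmate's two lists."""

    allow_list: set = field(default_factory=set)
    block_list: set = field(default_factory=set)

    def is_allowed(self, url: str, *, bypass_allow_list: bool = False) -> bool:
        # Assumption: rules are matched on the exact lower-cased hostname.
        host = (urlsplit(url).hostname or "").lower()

        # The block-list always wins, even for allow-listed sites,
        # LMS assignments and sub-resource requests.
        if host in self.block_list:
            return False

        # LMS assignments and sub-resources skip the allow-list check.
        if bypass_allow_list:
            return True

        return host in self.allow_list
```

Under these assumptions, a site that is on both lists is blocked, and an LMS assignment or sub-resource request (`bypass_allow_list=True`) is allowed for any host that is not on the block-list.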
Solution
Add a `<textarea>` to Checkmate's admin pages (https://checkmate.hypothes.is/ui/admin) that shows the current contents of the blocklist and allows an admin to edit the blocklist and save their changes directly into Checkmate's DB, without going through an S3 bucket and Celery task.
Done when
- The `sync_blocklist()` Celery task has been removed from Checkmate.
- The `sync-blocklist` periodic task has been removed from h-periodic.
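As a rough illustration of the textarea workflow proposed under Solution, the sketch below parses submitted text into rules and works out what would need to change in the DB. Everything here is hypothetical: the function names, the "one `domain reason` pair per line with `#` comments" format and the default reason `"other"` are assumptions, not Checkmate's actual file format or schema.

```python
def parse_blocklist(text: str) -> dict:
    """Parse textarea contents into a {domain: reason} mapping.

    Assumed format: one "domain reason" pair per line; anything after
    a "#" is a comment; blank lines are ignored.
    """
    rules = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        domain = parts[0].lower()
        # Assumption: fall back to a generic reason when none is given.
        reason = parts[1].strip() if len(parts) > 1 else "other"
        rules[domain] = reason
    return rules


def diff_blocklist(current: dict, submitted: dict):
    """Return (rules to add or update, domains to remove)."""
    to_upsert = {
        domain: reason
        for domain, reason in submitted.items()
        if current.get(domain) != reason
    }
    to_remove = set(current) - set(submitted)
    return to_upsert, to_remove
```

An admin view could call `parse_blocklist()` on the submitted form value and apply the result of `diff_blocklist()` to the DB in one transaction, so that a save only touches the rows that changed.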