This repository has been archived by the owner on Mar 16, 2023. It is now read-only.

Sites recording a user entering and leaving a sensitive category #77

Closed
dmarti opened this issue Mar 22, 2021 · 3 comments


@dmarti
Contributor

dmarti commented Mar 22, 2021

If a site has a logged-in user or sets a first-party cookie, the site can record that user's cohort over time (see Longitudinal Privacy). If a user is in a normal cohort for several weeks, then in a blocked (sensitive) cohort, then in a normal cohort again, the site can infer that one of the following happened:

  • the user chose to reconfigure their browser to turn off FLoC, and then turned it back on again
  • the user has manually cleared history or other browser state that resulted in resetting FLoC
  • the browser was used too little in the previous time interval
  • the user experienced a short-term interest in sensitive topics

Depending on context and other information available to the site, a user entering and leaving a sensitive cohort may reveal to the site specific information about that user's activities during a certain time period (for example, that the user took part in a short-term labor action regarding work-related medical conditions).
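To make the observation concrete, here is a minimal sketch of what a site could compute from its own logs. It assumes the site stores one cohort reading per user per week, with `None` standing for "no cohort available". The function name and the weekly granularity are hypothetical and are not part of any FLoC API.

```python
from typing import List, Optional, Tuple


def null_cohort_intervals(
    weekly_cohorts: List[Optional[int]],
) -> List[Tuple[int, int]]:
    """Return (first_week, last_week) index pairs for every run of weeks
    with no cohort that is both preceded and followed by a week with one.

    `weekly_cohorts` is a hypothetical per-user log: one entry per week,
    None meaning the site saw no cohort for that user that week.
    """
    intervals = []
    run_start = None
    for week, cohort in enumerate(weekly_cohorts):
        if cohort is None:
            if run_start is None:
                run_start = week  # a run of missing readings begins
        elif run_start is not None:
            # The run just ended. Keep it only if a real cohort preceded it,
            # so that it is an "enter and leave" event.
            if run_start > 0:
                intervals.append((run_start, week - 1))
            run_start = None
    # A run that reaches the end of the log is still open, so it is not reported.
    return intervals


history = [1234, 1234, None, None, 1234, 5678]
print(null_cohort_intervals(history))
```

The output only says that the cohort went missing and came back. It cannot tell the listed causes apart, which is why the inference depends on the context and other information available to the site.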

@michaelkleber
Collaborator

There are many different reasons why a person might have no FLoC at a particular time. You mention two of them, but other examples include a person visiting too few web sites during some period of time, or a person clearing some cookies or history.

Also, even if you do restrict your attention to the sensitive-category situation, recall that most people in a FLoC that is dropped for sensitivity reasons have not actually visited any sensitive page. If there is a topic that is sensitive and is visited by 5% of the population, and we use t-closeness with t=0.1, then we will stop using a cohort if 15% of people in that cohort have visited a page that touches that sensitive topic. But still 85% of the people in the cohort have not visited such a page.
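The arithmetic can be checked with a short sketch. It assumes the simplified additive reading used in this comment, where a cohort is dropped once its share of visitors reaches the population rate plus t. Under that reading the 15% threshold corresponds to a population rate of 5% with t = 0.1. The function names are hypothetical, and this is not the actual Chrome implementation.

```python
from fractions import Fraction


def blocking_threshold(population_rate: Fraction, t: Fraction) -> Fraction:
    """Share of a cohort that must have visited the sensitive topic before
    the cohort is dropped, under the simplified additive reading."""
    return population_rate + t


def is_blocked(cohort_rate: Fraction, population_rate: Fraction, t: Fraction) -> bool:
    """True if a cohort with this visitor share would be dropped."""
    return cohort_rate >= blocking_threshold(population_rate, t)


def non_visitor_share(cohort_rate: Fraction) -> Fraction:
    """Share of cohort members who never visited the sensitive topic."""
    return 1 - cohort_rate


# Fractions avoid floating-point rounding at the exact threshold.
p = Fraction("0.05")   # population rate
t = Fraction("0.1")    # t-closeness parameter
threshold = blocking_threshold(p, t)

print(f"threshold: {float(threshold):.0%}")
print(f"non-visitors at threshold: {float(non_visitor_share(threshold)):.0%}")
```

Exact fractions are used so that a cohort sitting exactly at the threshold compares as blocked, which a float sum such as `0.05 + 0.1` would not guarantee.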

@dmarti
Contributor Author

dmarti commented Mar 22, 2021

It looks like we're thinking about two different threat models here.

  • Sites can identify a user as a likely member of a group that will receive extra scrutiny or other adverse consequences
  • Sites can determine to some confidence level whether a particular page or page topic has appeared in the user's history

This issue is not about detecting which pages appear in history (see #40 and the discussion in the TAG review of FLoC). Not every member of a group is going to have pages related to a sensitive topic in their browser history in any given week.

Example: an employer has 200 employees, of whom half are pro-union. The employer logs everyone's cohort, and notices that about 100 employees were without FLoC the week of a union meeting. Only 15 of the members of the pro-union cohort actually visited the union web site, but the entire cohort was correctly tagged as sensitive. The employer in this case is concerned about pro-union qualities in the user (that user's similarity to others in the pro-union cohort), not the user's web history as such, and can take action against the entire 100.

In a real situation, of course, there will be more than 2 cohorts represented among the employees, and the employer (or a specialized firm) will need to do more data collection and more comparisons between dates of union events and dates of disappearing cohort data.

@dmarti
Contributor Author

dmarti commented Apr 14, 2021

Related issue: #100. (That issue covers longitudinal tracking by changes in the observed real cohort; this issue covers longitudinal tracking by observing users entering and leaving the "null cohort".)
