- Background
- Security goals
- Shared Storage
- FLEDGE sellers
- FLEDGE bidders
- Details
- Alternatives considered
- Shared Storage in Fenced Frames
This document proposes a set of API changes to enhance the security of aggregatable reports by making it more difficult for bad actors to interfere with the accuracy of cross-site measurement. Note that a mechanism based on the Private State Tokens API has been proposed for the Attribution Reporting API.
The proposal is separated by the different contexts the Private Aggregation API can be invoked in as the constraints and designs differ substantially.
Our security goals match the Attribution Reporting proposal's; see here for details. Briefly, our primary security goals are:
- No reports out of thin air
- No replaying reports
We also share the secondary goals:
- Privacy of the invalid traffic (IVT) detector
- Limit the attack scope for bad actors that can bypass IVT detectors
- No report mutation (lower priority)
As with the Attribution Reporting API, we don’t currently prevent reports being created out of thin air, but practical attacks are challenging. More details here.
When triggering a Shared Storage operation that could send an aggregatable report, we propose allowing the site to specify a high-entropy ID from outside the isolated context. This ID would then be embedded unencrypted in the report issued by that worklet operation, e.g. adding the following key to the report:
"context_id" : "example_string",
This would be achieved by adding a new optional parameter to the Shared Storage `run()` and `selectURL()` APIs, e.g.:

```js
sharedStorage.run('someOperation', {'privateAggregationConfig': {'contextId': 'example_string'}});
```
Note that this design does not support report verification for Shared Storage operations run from within a fenced frame. See below for a discussion of that case.
An approach based on Private State Tokens was not proposed as it would add complexity and offer strictly less power than ID-based filtering for invalid traffic filtering.
One key concern with this approach is that the number of reports (with that ID)
could be used to exfiltrate cross-site information. So, when an ID is specified
for a Shared Storage operation, we ensure that a single report is sent no matter
how many calls to sendHistogramReport()
occur (including zero). Instead, this
report would have a variable number of contributions embedded (see batching
proposal).
To avoid leaking the number of contributions, we will need to
pad
the encrypted payload. Additionally, if a context has run out of budget, a
report should still be sent (containing no contributions).
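A minimal sketch of this one-report-per-operation behavior, assuming a hypothetical `finalizeReport()` helper and a padding size of 20 contributions (both names and the constant are illustrative, not part of the API):

```javascript
// Illustrative sketch, not the actual browser implementation: when a context
// ID is set, exactly one report is assembled per operation, regardless of how
// many contributions were made (including zero). The payload is padded to a
// fixed number of contributions so its size does not leak the true count.
// MAX_CONTRIBUTIONS and the "null" contribution shape are assumptions.
const MAX_CONTRIBUTIONS = 20;

function finalizeReport(contextId, contributions) {
  // Keep at most MAX_CONTRIBUTIONS real contributions.
  const kept = contributions.slice(0, MAX_CONTRIBUTIONS);
  // Pad with null contributions so every payload has the same length, even
  // when no contributions were made or the budget was exhausted.
  const padded = kept.concat(
      Array.from({length: MAX_CONTRIBUTIONS - kept.length},
                 () => ({bucket: 0n, value: 0})));
  return {context_id: contextId, payload: padded};
}
```

Note that the padding happens before encryption; the reporting origin only ever sees a fixed-size encrypted payload.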
This approach allows for a server to retroactively alter its decisions on report validity. For example, if a new signal for invalid traffic is determined, previous reports with that signal could be marked as invalid too (if they have not yet been processed).
This option easily achieves all of the higher-priority security goals:
- No reports out of thin air: Any report without an ID, or with an unexpected ID, can be discarded as invalid. These IDs are high-entropy and so can be made infeasible to guess.
- No replaying reports: Each ID can be unique, allowing discarding of reports with a repeated ID.
- Privacy of the invalid traffic (IVT) detector: The valid/invalid decision associated with an ID can be made server-side and need not be revealed to the client. In fact, the decision need not happen immediately, see Allows retrospective filtering.
- Limit the attack scope for bad actors that can bypass IVT detectors: By using a unique ID, server-side checks could be added to ensure that the metadata fields in the report match the expected values.
- No report mutation: Only partially addressed. The server can verify that plaintext fields in the report “reasonably match” their expectations. This does not prevent mutation of the payload or fields that could have multiple reasonable values (e.g. small changes to scheduled_report_time).
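The first two checks above can be sketched as a simple server-side filter. This is a hypothetical reporting-origin implementation (the function and field names are illustrative); the browser is not involved:

```javascript
// Hypothetical server-side filter for incoming aggregatable reports: discard
// any report whose context_id was never issued ("out of thin air") or was
// already seen (a replay). issuedIds is the set of high-entropy IDs the site
// handed to sharedStorage.run() / the auctionConfig.
function makeReportValidator(issuedIds) {
  const seen = new Set();
  return function isValid(report) {
    if (!issuedIds.has(report.context_id)) return false;  // never issued
    if (seen.has(report.context_id)) return false;        // replayed
    seen.add(report.context_id);
    return true;
  };
}
```

Because the decision is made entirely server-side, it can also be deferred or revised later, as discussed under retrospective filtering.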
Adding a high-entropy ID may allow for timing attacks. E.g. if a report is not issued until after a Shared Storage operation completes, the reporting origin could in principle use the scheduled reporting time to learn something about how long the operation took to run. This is currently mitigated by a randomized delay, but we also plan to add a timeout in Shared Storage, see Reduced delay below.
Adding a high-entropy ID also allows for the reports to be arbitrarily partitioned. However, by making the count of reports with the given ID deterministic, we avoid the major concern this introduces (non-noisy leaks through counts). We do not consider the ability to process only a chosen subset of reports to be a privacy concern, given other protections (e.g. adding noise to the summary report).
Currently, reports are delayed by up to one hour to avoid revealing an association between the issued reports and the original context. As this approach explicitly reveals this association (with other mitigations), we can shorten these delays. We plan to impose a 5 second timeout on Shared Storage operations making contributions. We then plan to wait until the timeout to send a report, even if execution finishes early. This avoids leaking information through how long the operation took to run. We also considered instead keeping a shorter randomized delay (e.g. up to 1 minute), but that did not seem necessary.
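The fixed-deadline behavior can be sketched as follows; `scheduledSendTime()` is a hypothetical helper, but the 5 second timeout matches the plan above:

```javascript
// Sketch of the deterministic send time: the report is scheduled for the
// operation's start time plus the fixed timeout, independent of when the
// operation actually finished, so the send time leaks nothing about runtime.
const TIMEOUT_MS = 5000;  // proposed Shared Storage operation timeout

function scheduledSendTime(operationStartMs, operationEndMs) {
  // The finish time is deliberately ignored.
  void operationEndMs;
  return operationStartMs + TIMEOUT_MS;
}
```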
We propose a very similar mechanism for FLEDGE seller reporting as for Shared Storage worklets. That is, we’ll allow the site to specify a high-entropy ID from outside the isolated context and this ID would then be embedded unencrypted in the report issued by that seller within that auction, e.g.:
"context_id" : "example_string",
The seller would specify this ID through an optional parameter in the `auctionConfig`, e.g.:

```js
const myAuctionConfig = {
  ...
  'privateAggregationConfig': {
    'contextId': 'example_string',
  }
};

const auctionResultPromise = navigator.runAdAuction(myAuctionConfig);
```
See the Shared Storage section for more details.
Like for Shared Storage, adding a high-entropy ID could allow for timing attacks, as the reporting origin could use the scheduled reporting time to learn something about when the report was triggered. This is partially mitigated by the existing randomized reporting delay (10-60 min); note also that FLEDGE auctions impose small timeouts (e.g. 0.5 s). As discussed above for Shared Storage, we avoid concerns about partitioning by making the number of reports deterministic (along with other protections).
Note: Unlike the above sections which offer relatively straightforward approaches, this section is highly complex and nuanced. Feedback is appreciated!
We can’t easily use a contextual ID for the FLEDGE bidder case as the existence of a bidder in a particular auction is inherently cross-site data, see below. So, our options are more limited and we focus on mechanisms using Private State Tokens.
However, note also that there are no existing network requests that we can easily reuse for token issuance. While there is a trusted signals fetch, that is intentionally uncredentialed. Much like using an ID, we can’t just add a network request for each bidder as that would reveal cross-site data.
So, we handle token issuance by adding a new optional parameter to `runAdAuction()`, e.g.:

```js
const myAuctionConfig = {
  ...
  'privateAggregationConfig': {
    ...
    'tokenIssuanceURLs': [
      'https://origin1.example/path?signal1=abc,signal2=def',
      'https://origin2.example/some-other-path',
      'https://origin3.example/etc',
    ],

    // How many tokens to request from each listed issuer. Optional, defaults to
    // each issuer's batch size.
    'numTokensPerIssuer': 10,
  }
};

const auctionResultPromise = navigator.runAdAuction(myAuctionConfig);
```
This would trigger a token issuance request for each listed origin (see below). Each token would be redeemed along with any later reports’ network requests (see below).
If this token successfully verifies, then the reporting origin has a guarantee that the report was associated with a `runAdAuction()` request that was signed.

This requires the addition of a new network request for each listed token issuer, emitted when `runAdAuction()` is invoked.
As the presence of bidders in an auction is inherently cross-site, we require listing all possible token issuers from the publisher site. The user agent will then unconditionally perform a token issuance request for each listed token issuer to avoid cross-site leakage, i.e. even if the issuer is not used by any bidder in the auction.
The user agent will also need to impose a limit on the number of token issuers listed in each auction to avoid too many network requests being added. Practically, this means interest group owners will likely need to use the delegation mechanism.
A single bidder origin may own multiple interest groups that a user is enrolled in. Additionally, multiple interest group owner origins may use the same token issuer (due to delegation). In these cases, the interest groups will have to share the tokens issued.
In the case of multiple owner origins using the same token issuer, tokens can’t be reused as we don’t want to reveal that both interest group owners were present in the same auction (for the same user). However, multiple tokens can be requested from a single token issuer to mitigate this. If not enough tokens were issued, some reports will be sent unattested.
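This allocation rule can be sketched as follows. The `allocateTokens()` helper and the data shapes are illustrative assumptions; the point is that tokens are never shared across owner origins and that owners beyond the issued supply go unattested:

```javascript
// Illustrative token allocation when several interest group owners delegate
// to the same issuer: each owner receives a distinct token (reusing one would
// reveal that both owners were present in the same auction for this user),
// and owners beyond the issued supply are allocated null (report unattested).
function allocateTokens(tokens, ownerOrigins) {
  const allocation = new Map();
  ownerOrigins.forEach((owner, i) => {
    allocation.set(owner, i < tokens.length ? tokens[i] : null);
  });
  return allocation;
}
```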
In the case of multiple interest groups with the same owner, the histogram
contributions should be
batched
together into a single report, avoiding the need to use multiple tokens.
However, the extended reporting
plans
for Private Aggregation allow for fenced frames to trigger reports indirectly
with window.fence.reportPrivateAggregationEvent()
. This could occur
arbitrarily later, so we may need to ignore events triggered too long after the
auction (e.g. after 1 hour). We could consider replacing the randomized delay
with simply waiting until the timeout, even if execution finishes early.
Interest group owners will be able to delegate their token issuance by hosting a
.well-known
file which specifies the origin to delegate to. This will be
optional (i.e. each origin can choose itself as its token issuer), but note that
all origins choosing themselves would likely exceed limits, see Need to limit
the list of token issuers above.
To ensure that the same file is served across different browser instances, the user agent vendor may re-distribute these files through a separate mechanism. Further, to ensure that the origin does not change frequently, the user agent could impose some limits on the rotation frequency.
Token issuance network requests will be sent to the specified token issuer URLs.
The URL path and query string allows for metadata to be embedded by the seller,
but note that only the token issuance origin is used for
delegation. Each request will have a
Sec-Private-Aggregation-Private-State-Token
header with one or more blinded
messages (each of which embeds a report_id) according to the number of tokens
requested. If the number of tokens is not specified, the token issuer's batch size
will be used. The token issuer will inspect the request and decide whether it is
valid, i.e. whether the issuer suspects it is coming from a real, honest client
and should therefore be allowed to generate aggregatable reports.
If the request is considered invalid and hence shouldn’t be taken into
account to calculate aggregate measurement results, the origin should respond
without adding a Sec-Private-Aggregation-Private-State-Token
response header.
If this header is omitted or is not valid, the browser will proceed normally,
but any report generated will not contain the report verification header. Note:
more advanced deployments can consider issuing an "invalid" token using private
metadata to avoid the client learning the detection result. See privacy of the
IVT detector in Security considerations for more
details.
If the request is considered valid, the origin should add a
Sec-Private-Aggregation-Private-State-Token
header with a blind token (the
blind signature over the blinded message) for each blinded message included in
the original request. The origin could also return a token for only a subset of
the blinded messages if it wishes to limit the number of tokens issued to limit
exfiltration risk.
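The issuer-side branching described above can be sketched as follows. `buildIssuanceResponse()` and `blindSign()` are hypothetical stand-ins (the real operation is the Private State Token blind-signing protocol); only the header name comes from the proposal:

```javascript
// Hypothetical issuer-side logic: if the request looks invalid, omit the
// response header entirely; if valid, blind-sign each blinded message, up to
// an issuer-chosen cap that limits exfiltration risk.
function buildIssuanceResponse(blindedMessages, looksValid, maxTokens, blindSign) {
  if (!looksValid) {
    // No Sec-Private-Aggregation-Private-State-Token header: the browser
    // proceeds normally, but resulting reports carry no verification token.
    return {headers: {}};
  }
  const signed = blindedMessages.slice(0, maxTokens).map(blindSign);
  return {
    headers: {'Sec-Private-Aggregation-Private-State-Token': signed.join(',')},
  };
}
```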
Internally, the browser will store the token associated with any generated report until it is sent.
If a token is allocated to an aggregatable
report, it will be sent along with the report’s request in the form of a new
request header Sec-Private-Aggregation-Private-State-Token
. If this token is
successfully verified, then the reporting origin has a guarantee that the report
was associated with a previous request that was signed.
Note: unlike the basic Private State Token API (which enables conveying tokens from one site to another), there are no redemption limits for Private Aggregation API integration. See Privacy considerations for discussion of other mitigations.
This option easily achieves the primary security goals plus some secondary security goals. The considerations largely match the Attribution Reporting proposal’s given the similar token-based approach, see here for details.
Much like Attribution Reporting’s proposal, this integration is intended to be as privacy-neutral as possible. In particular, we want to avoid cross-site information leakage. While each token’s issuance occurs using a request from a single site, this token – including its metadata, or no token if none was issued – will later be sent with a report from a bidder. The identity of which bidders participated in an auction is cross-site data.
If the count of reports is sensitive, this partitioning could amplify counting
attacks. However, note that reports can already be partitioned by the
scheduled_report_time
and api
fields. There are designs for protecting the
count of encrypted reports
to mitigate or eliminate the risk of counting attacks. These designs target the
Attribution Reporting API, but could be adapted for Private Aggregation. Still,
with less extreme mitigations, there are privacy benefits to reducing the
partitioning available.
For the initial design, we do not plan to implement any changes to the Private State Token protocol’s public/private metadata bits. So, each token will have six buckets of metadata embedded. Further, each report could either have a token or no token, allowing up to 7 total possibilities (~2.8 bits). This would therefore allow the reporting origin to partition its reports into 7 buckets.
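The partitioning arithmetic is simple to verify:

```javascript
// Six metadata buckets plus the "no token" case give 7 distinguishable
// states per report, i.e. about log2(7) ≈ 2.8 bits of partitioning capacity.
const METADATA_BUCKETS = 6;
const states = METADATA_BUCKETS + 1;  // +1 for "no token"
const bits = Math.log2(states);
```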
We could consider mitigations in the future. For example:
- restricting the public/private metadata to one bucket – or just a single private bit to avoid an invalid traffic oracle.
- refusing to send reports to reporting origins using report verification if no token was available/issued.
- sending null reports with some frequency for buyers that delegate to an issuer who issued tokens.
Alternatively, we could associate any trust signals available at the
joinAdInterestGroup()
call with reports later sent from a bidder under that
interest group.
Token issuance could be handled by adding a new optional parameter to `joinAdInterestGroup()`, e.g.:

```js
const myGroup = {
  ...
  'privateAggregationTokens': 10,  // number of tokens to request
};

const joinPromise = navigator.joinAdInterestGroup(myGroup, 30 * kSecsPerDay);
```
This would trigger a token issuance request (see above) with the requested number of blinded messages. Each resulting token would be redeemed along with the later report’s network request (see above).
If the token successfully verifies, then the reporting origin has a guarantee that the report was associated with a previous `joinAdInterestGroup()` request that was signed.

This requires the addition of one new network request at `joinAdInterestGroup()` time.
This approach uses a different security model from Attribution Reporting's, with a potentially large time delay between token issuance and use.
Due to this large time delay between token issuance and last possible use, it will be difficult to decide on the number of tokens to issue. If too few are issued, later auctions may not be able to be attested. Issuing too many may degrade performance, e.g. unnecessarily using storage, and may exacerbate token exfiltration attacks.
This approach requires Private State Tokens to be persisted for later use. This store will need to be separate from the existing Private State Token store. Note also that key rotations will cause issues here, as any tokens issued before the rotation would not be able to be used after the rotation.
We could combine the functionality of both the proposal and the above alternative. There are a few different ways we could do this.
We could allow for both mechanisms to be independently implemented. Separate headers could be used to distinguish between the two. This would allow for maximum flexibility, but comes at a possible complexity and privacy cost.
Privacy risk: By supporting two separate token fields, the number of possible token states is 'squared'. That is, without additional mitigations, adding a second Private State Token field would increase the number of states from 7 to 49 (~5.6 bits). This partitioning would allow for amplified counting attacks unless other mitigations are put in place, see above.
We could allow each origin to pick one of the two mechanisms, using a similar mechanism to picking a token issuer. Any attempt to use the other mechanism would be ignored or cause an error.
We could allow for both mechanisms, but only allow one token to be bound to each report. If a token is available via each mechanism, the browser will prefer one (e.g. the runAdAuction() associated token).
Instead of using Private State Tokens, we could also use a contextual ID here.
But, to avoid a cross-site leak, this would require that a report be sent to
each origin listed in interestGroupBuyers
, even if that bidder did not
actually participate in the auction. This could lead to a large number of (null)
reports, which would pose a performance concern.
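The performance concern can be made concrete with a small sketch; `reportsToSend()` is a hypothetical helper, not part of the API:

```javascript
// Sketch of why a contextual ID is costly here: to hide which buyers actually
// participated, a report must go to every origin listed in
// interestGroupBuyers, so the number of reports scales with the listed
// buyers, not with the actual participants. Non-participants get null reports.
function reportsToSend(interestGroupBuyers, participatingBuyers) {
  return interestGroupBuyers.map((origin) => ({
    origin,
    null_report: !participatingBuyers.has(origin),
  }));
}
```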
Ideally, for performance, the user agent would request a token only for reports that are actually going to be sent. However, that would inherently leak cross-site data, which we can't allow. It might be possible to design a trusted server architecture that performs the required invalid traffic determination and token issuance while ensuring that no cross-site data is persisted. This is not feasible in the short term, however, as it would require significant design and exploration.
When a shared storage operation is run from a fenced frame instead of a document, we can no longer set a contextual ID. Any cross-site information the fenced frame has could be embedded in the context ID, so the ability to set it is disabled.
Instead, we propose allowing a Private State Token to be bound to the FencedFrameConfig output of a FLEDGE auction. We would reuse the FLEDGE bidder mechanism chosen above and take an additional token from the same source for this purpose. When the shared storage worklet triggers a report to be sent, any context ID specified would be ignored and the token would be used instead.
As it uses the same token source, most details match the FLEDGE bidder discussion (see above). Additional considerations are listed below.
This proposal does not currently support cross-origin subframes or nested fenced frames within the top-level fenced frame.
As discussed above, adding a token allows reports to be partitioned, which exacerbates the risk of a counting attack.
This design also implicitly reveals whether a Shared Storage worklet’s aggregatable report came from an operation run by a document or a fenced frame. This may allow for further partitioning, but is unlikely to be a significant issue.
Further design work is needed to extend this mechanism to fenced frames
rendering the output of a selectURL()
operation.