Table of Contents
- John Delaney ([email protected])
- Arpana Hosabettu ([email protected])
- Nan Lin ([email protected])
The transitional debugging reports, which are currently supported by the Attribution Reporting API to facilitate testing, will be deprecated alongside third-party cookie deprecation.
Here we propose the aggregate debug reporting framework to allow developers to measure the performance of the Attribution Reporting API post third-party cookie deprecation. The aggregate debug reporting utilizes the mechanism introduced in the Aggregate Attribution Reporting to report debug data in aggregate. To prevent leakage, the cross-site data in the aggregatable reports would be encrypted to ensure it can only be processed by the aggregation service. During processing, this service will add noise and impose limits on how many queries can be performed.
The source registration and trigger registration will accept an optional dictionary field:
{
..., // existing fields
"aggregatable_debug_reporting": {
"budget": 1024, // required for sources, irrelevant for triggers
"key_piece": "0x120", // required, source- or trigger-side key piece
"debug_data": [
{
"types": ["source-destination-limit", "source-destination-rate-limit"], // required to be present and non-empty
"key_piece": "0x1", // required
"value": 123 // required
},
{
"types": ["unspecified"],
"key_piece": "0x7",
"value": 789
}
], // defaults to [], i.e. doesn't opt-in to any debug types
"aggregation_coordinator_origin": "https://publickeyservice.msmt.aws.privacysandboxservices.com" // defaults to implementation-defined default origin
}
}
The source registration can pre-allocate budget
from the total L1 budget
(L1 = 65536) for debug reporting, and the remaining will be used for aggregate
attribution reporting.
The configuration accepts an optional list field debug_data
to allow
developers to specify the aggregation key piece as a hex-string and the
corresponding histogram value for debug types they opt-in to receiving.
When the unspecified
value is set, any non-explicitly specified debug
types will be reported with the corresponding key piece and value; otherwise
they will not be reported. The list of supported debug types will be documented
in the specification.
The configuration also accepts an optional string field
aggregation_coordinator_origin
to allow developers to specify the deployment
option for the aggregation service supported by the browser. This defaults to
the implementation-defined default origin if omitted.
The final histogram bucket key is defined as a combination (bitwise OR) of the source-side key piece if available, the trigger-side key piece if available, and the key piece for the corresponding debug type.
Final keys will be restricted to a maximum of 128 bits. This means that hex strings in the JSON must be limited to at most 32 digits.
Note: Aggregate debug reporting is not supported for registrations inside a fenced frame tree to avoid breaking the privacy model of fenced frames.
Aggregatable debug reports will look very similar to
aggregatable attribution reports.
They will be reported via HTTP POST
to the reporting origin at the path
.well-known/attribution-reporting/debug/report-aggregate-debug
.
The report itself does not contain histogram contributions in the clear. Rather, the report embeds them in an encrypted payload that can only be read by a trusted aggregation service known to the browser. The report will be JSON-encoded with the following schema:
{
// Info that the aggregation services also need encoded in JSON
// for use with AEAD.
"shared_info": "{\"api\":\"attribution-reporting-debug\",\"attribution_destination\":\"https://advertiser.example\",\"report_id\":\"[UUID]\",\"reporting_origin\":\"https://reporter.example\",\"scheduled_report_time\":\"[timestamp in seconds]\",\"version\":\"[api version]\"}",
// Supports a list of payloads for future extensibility if multiple coordinators
// are necessary. Currently only supports a single coordinator.
"aggregation_service_payloads": [
{
"payload": "[base64-encoded HPKE encrypted data readable only by the aggregation service]",
"key_id": "[string identifying public key used to encrypt payload]",
},
],
// The deployment option for the aggregation service.
"aggregation_coordinator_origin": "https://publickeyservice.msmt.aws.privacysandboxservices.com"
}
Note: If the source registration specifies a list of destinations, the first one in lexicographical order will be used for debug reporting.
The payload
should be a CBOR map encrypted via
HPKE and then base64-
encoded. The map will have the following structure:
{
"operation": "histogram", // Allows for the service to support other operations in the future
"data": [{
"bucket": <bucket, encoded as a 16-byte (i.e. 128-bit) big-endian bytestring>,
"value": <value, encoded as a 4-byte (i.e. 32-bit) big-endian bytestring>
}, ...]
}
The browser may encode multiple contributions in the same payload. To avoid revealing the number of contributions in the payload through its encrypted size, the browser should pad the list of payloads with null (zero value) contributions up to the maximum, which is 2 for aggregatable debug reports.
We bound the contributions any source can make to a histogram. This bound is characterized by a single parameter: L1, the maximum sum of the contributions (values) across all buckets for a given source, including aggregate attribution reporting and debug reporting. L1 refers to the L1 sensitivity / norm of the histogram contributions per source. The budget for debug reporting is pre-allocated at source registration time, and exceeding the budget will cause future contributions to silently drop.
The aggregation service will add noise to the summary reports to achieve differential privacy, e.g. by using Laplace noise scaled based on the L1 sensitivity and a desired privacy parameter epsilon. The effective epsilon needs to be scaled based on the budget allocated for debug reporting.
Note: The L1 budget (65536) is shared between the aggregatable attribution reports and the aggregatable debug reports that are associated with the same source registration.
To fully protect the total count of debug reports, the browser will unconditionally send a report on every attribution registration that opts-in to aggregate debug reporting. A null report will be sent in the case that the attribution registration did not generate an aggregatable debug report. Null reports will not check or affect rate limits.
To reduce complexity, the source registration time will always be excluded from the aggregatable debug reports.
Additionally, this proposal also prevents attackers from inferring user data from the generated debug reports.
Note: If the reporting origins are concerned about the potentially large number of reports, they may consider opting-in to aggregate debug reporting with sampling on a per advertiser basis for debugging and monitoring.
To limit the amount of user-identity leakage, the browser should throttle the amount of total information sent through this API in a given time period for a user. The browser could bound the contributions that any <top-level site, reporting site> tuple can make to the histogram within a rolling time window. If this threshold is hit, the browser will stop scheduling aggregatable debug reports for the rest of the time period for attribution registrations matching that tuple.
Strawman: L1 bound of 65536 per <top-level site, reporting site> per day.
Additionally, to mitigate the attack in which multiple reporting sites collude with each other to infer user data from the debug reports, the browser can also bound the contributions that any top-level site can make to the histogram within a rolling time window.
Strawman: L1 bound of 220 = 1048576 per top-level site per day.
There will also be a maximum limit of 5 aggregatable debug reports per source to limit the cross-site leakage on a per-source level.
As the count of reports is deterministic with the introduction of null reports and the report does not contain cross-site data in the clear, there is no privacy concern with sending these reports immediately without delay.
We may consider allowing source and trigger registrations to specify a high-entropy debug context ID for aggregate debug reporting, which will be embedded unencrypted in the aggregatable debug report. This would uniquely tie every aggregatable debug report to the source or trigger context that generated it. Because the IDs are high entropy (and thus difficult to guess), they could serve the purpose of anonymous tokens.
This design is aligned with the trigger context ID proposal in aggregate attribution reporting.
The aggregate debug reporting mechanism should be extendable to aggregate error reporting for the Private Aggregation API.
The details of the approach will be explored in a separate GitHub issue.