Consider putting the file in /.well-known/consent-requests.json #9

Open
robrwo opened this issue Jun 18, 2021 · 18 comments

Comments

@robrwo

robrwo commented Jun 18, 2021

Adding a Link header to HTTP responses can add 100-200 bytes for each response, which can affect users on slower mobile connections.

Most users will likely choose an option when they first visit a website and will not need to see the request again, so there is no reason to keep sending the header. (Conditionally sending the header only when the consent metadata has changed or the user has not responded involves tracking the user, which the user may not have consented to.)

It makes more sense for a user agent to request a well-known file if it does not have consent configuration for a site.
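
A minimal sketch of the lookup proposed here, in Python: deriving the well-known URL for the origin of any resource the user agent has fetched. The path is the one from this issue's title; the function name and the example host are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Path proposed in this issue's title; not part of any published specification.
WELL_KNOWN_PATH = "/.well-known/consent-requests.json"


def well_known_consent_url(resource_url: str) -> str:
    """Return the proposed well-known URL for the origin serving resource_url."""
    parts = urlsplit(resource_url)
    # Keep only scheme and host[:port]; drop the path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))


print(well_known_consent_url("https://cdn.example.org/img/logo.png?v=2"))
```

Because the lookup is per origin, it works the same way for a page, an image, or a script served from a separate host.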

@michael-oneill

michael-oneill commented Jun 18, 2021

Yes, a /.well-known file would be a good idea. There will likely be a common per-origin resource for privacy aspects, combining common CSP and other headers, first party sets etc., and this would be a good place for this. Browsers would then only have one common resource to fetch and cache.

@coolharsh55
Contributor

Wouldn't this be better declared in the webpage metadata? E.g. as Example 13

<!doctype html>
<html>
  <head>
    <link href="/our-consent-requests.json" rel="consent-requests" />
    …

@robrwo
Author

robrwo commented Jun 18, 2021

Wouldn't this be better declared in the webpage metadata?

Not every document is a web page, e.g. images. There might even be a separate web server that serves images, or a CDN that serves scripts and stylesheets, such as jQuery.

The well-known URL could be linked to from web page HTTP response headers. But it won't need to be.

@robrwo
Author

robrwo commented Jun 18, 2021

Also note in Section 7.1.1 (Making zero requests), the lack of a file at this location indicates that the website is not asking for consent.

@da2x

da2x commented Jun 18, 2021

Well-known URIs are specified in RFC 5785 (since obsoleted by RFC 8615). This mechanism is much better suited for this than requiring every website to advertise a Link response header.

Also note in Section 7.1.1 (Making zero requests), the lack of a file at this location indicates that the website is not asking for consent.

Replying with an empty file at a well-known location, or replying with HTTP 204 No Content would fulfill the same purpose.

@robrwo
Author

robrwo commented Jun 21, 2021

Replying with an empty file at a well-known location

Also replying with 404 (Not Found).

@da2x

da2x commented Jun 21, 2021

Replying with an empty file at a well-known location

Also replying with 404 (Not Found).

No, that is a very clear signal that the server doesn’t understand or support the protocol.

@robrwo
Author

robrwo commented Jun 21, 2021

Replying with an empty file at a well-known location

Also replying with 404 (Not Found).

No, that is a very clear signal that the server doesn’t understand or support the protocol.

Which would signal that the website is not asking for consent.

@da2x

da2x commented Jun 21, 2021

Which would signal that the website is not asking for consent.

“I’m not asking consent for anything because I don’t do anything that requires me to ask for it” is not the same as “I don’t understand – here’s a generic error.”

@coolharsh55
Contributor

Also replying with 404 (Not Found).

A 204 would be more appropriate. It represents success and No Content.
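
To make the distinction in this exchange concrete, here is a minimal Python sketch of how a user agent might classify the response from the proposed well-known URL. It follows da2x's reading (204 or an empty 200 means "nothing to ask"; 404 means the server does not know the protocol). The enum, the function name, and the treatment of all other status codes are assumptions for illustration; the thread has not settled any of this, and robrwo argues that a 404 could equally be read as "not asking for consent".

```python
from enum import Enum


class ConsentDiscovery(Enum):
    REQUESTS_PRESENT = "site publishes consent requests"
    SUPPORTED_NO_REQUESTS = "site supports the protocol but asks for nothing"
    NOT_SUPPORTED = "site gives no sign of supporting the protocol"


def interpret_well_known_response(status: int, body: bytes) -> ConsentDiscovery:
    """Classify a response from /.well-known/consent-requests.json."""
    if status == 204:
        # Success with no content: protocol understood, nothing requested.
        return ConsentDiscovery.SUPPORTED_NO_REQUESTS
    if status == 200:
        if body.strip():
            return ConsentDiscovery.REQUESTS_PRESENT
        # An empty file is treated like a 204.
        return ConsentDiscovery.SUPPORTED_NO_REQUESTS
    # 404 and every other status: treated as "not supported" (an assumption).
    return ConsentDiscovery.NOT_SUPPORTED
```

Under robrwo's reading, the last two outcomes would collapse into one, since in neither case is the user shown a consent request.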

@gb-noyb
Collaborator

gb-noyb commented Jul 8, 2021

To the discussion above: we indeed thought it would be good if websites have a way to show they support the protocol, even when not requesting any consent.

To the original question, whether to use Link header or well-known file, we have considered both options, but ended up with the link header for multiple reasons:

  • The link header indicates that the website supports ADPC. Without it, the browser would have to fetch the consent requests file for every website it visits, even for websites that do not support ADPC. Besides wasting bytes and packets, this seems like bad protocol practice, reminiscent of the quaint old habit of fetching /favicon.ico.

  • The .well-known/ convention was introduced to avoid ‘polluting’ the URI space for cases where one cannot use links (e.g. autoconfiguration for an email client). But if you can use links, there seems to be no need for this workaround, and we had better just use links. The introduction of RFC 8615 is worth reading; here is the first paragraph:

    Some applications on the Web require the discovery of information
    about an origin [RFC6454] (sometimes called "site-wide metadata")
    before making a request. For example, the Robots Exclusion Protocol
    (http://www.robotstxt.org) specifies a way for automated processes to
    obtain permission to access resources; likewise, the Platform for
    Privacy Preferences [P3P] tells user agents how to discover privacy
    policy before interacting with an origin server.

    The requirement to discover “before interacting with an origin server” does not apply to our case: I suppose that (unlike with P3P) nobody will insist on reading consent requests, and adding another round trip, before visiting a website.

  • Also conceptually, it seems intuitive that the website initiates the interaction, as it is the website that requests consent. With the .well-known approach, the visitor would be asking the website whether the website wants to request consent.

  • Using a link gives flexibility to which consent is requested: for example, different consent request files could be used for people connecting from different jurisdictions, or to users that are logged in, etc.

    • Note that flexibility can also be considered a downside: customising requests to individuals might not seem desirable. E.g. in the privacy considerations section we discuss that malign websites could individualise consent requests to use them for user tracking. However, using a fixed .well-known URL for this JSON file does not ensure the file contents are fixed: a malign website could just return a different, customised file on each new request.
  • Note that there is no need to add the header to subresources, as it is ignored for those (an image cannot request consent).
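
The link-based discovery argued for in this list can be sketched as follows, in Python. The rel value and the example path come from the <link> example earlier in this thread; the function name and the simplified parsing are assumptions. The regular expressions handle only simple header values and will not cope with commas or semicolons inside quoted strings or URIs.

```python
import re
from typing import Optional
from urllib.parse import urljoin

# One link-value, e.g.: </our-consent-requests.json>; rel="consent-requests"
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_REL_PARAM = re.compile(r'\brel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def find_consent_requests_url(page_url: str, link_header: Optional[str]) -> Optional[str]:
    """Return the absolute consent-requests URL advertised by a Link header, if any."""
    if not link_header:
        # No header at all: the site is not advertising consent requests.
        return None
    for target, params in _LINK_VALUE.findall(link_header):
        rel = _REL_PARAM.search(params)
        # A rel parameter may hold several space-separated relation types.
        if rel and "consent-requests" in rel.group(1).lower().split():
            # Link targets may be relative; resolve them against the page URL.
            return urljoin(page_url, target)
    return None
```

A user agent following the last bullet would call this only for top-level documents and skip it for subresources such as images.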

@robrwo
Author

robrwo commented Jul 27, 2021

Also conceptually, it seems intuitive that the website initiates the interaction, as it is the website that requests consent.

The website doesn't initiate interaction. The user agent does.

Requiring the website to include a link has several problems:

  • This may require access to web server configuration, which is not available to all site authors.

  • Software developers will need to modify their web applications to support this. Most will probably wait for plugins that support it to be written for their web frameworks, especially for a new or immature protocol.

  • This will increase the HTTP response content size, which developers will want to avoid since that can have adverse effects on the site's performance, especially for mobile users and users in less developed parts of the world. This can affect page rankings in Google.

@Spacefish

I like the .well-known approach, as in the end it will only affect clients which really want to do consent management.

  • So "normal" clients would just request the webpage with every tracking enabled -> nothing changes
  • Clients that do support the consent protocol and have it enabled would first request the .well-known file. Depending on the response, they would show a consent box to the user, cache the consent locally, and afterwards request the page itself with the consent headers set.
  • Clients that have consent enabled and configured either know which consent the user wants to give and send it in the HTTP headers, or, if they are configured not to give any consent, just send "withdraw=*" with every request.

So there is only one case where the .well-known file needs to be requested before the page is requested:

  • If the user is new to the website (no consents stored locally)
  • And they want to explicitly be asked for consent
  • And their client supports consent management at all

Clients without consent management won't have the additional overhead of an extra header or a <link meta.. HTML tag
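
The three conditions above can be sketched as a small Python check. The class, its field names, and the function name are hypothetical; only the "withdraw=*" value comes from this comment.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ClientState:
    supports_consent: bool   # the client implements consent management at all
    wants_to_be_asked: bool  # the user wants to be asked explicitly
    # Locally cached decisions, keyed by origin (hypothetical storage format).
    stored_consents: Dict[str, str] = field(default_factory=dict)


def must_fetch_well_known_first(client: ClientState, origin: str) -> bool:
    """True only when all three conditions from the list above hold."""
    return (
        client.supports_consent
        and client.wants_to_be_asked
        and origin not in client.stored_consents
    )


first_visit = ClientState(supports_consent=True, wants_to_be_asked=True)
print(must_fetch_well_known_first(first_visit, "https://example.org"))
```

In every other case the client either sends its stored or preconfigured decision straight away or behaves like a client without consent management.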

@gb-noyb
Collaborator

gb-noyb commented Jan 5, 2022

First, to @robrwo’s earlier points above:

  • This may require access to web server configuration, which is not available to all site authors.

One can also use the equivalent <link> tag in the page instead of a Link header; see section 7.1. But presumably, if one cannot access the headers, one would go for the JavaScript-based approach in section 8 anyway, which is explicitly made to cater for such cases.

  • Software developers will need to modify their web applications to support this. Most will probably wait for plugins that support it to be written for their web frameworks, especially for a new or immature protocol.

Indeed. But note that most websites already use a plugin or third-party Consent Management Platform (CMP) to ask for consent and handle the responses. Presumably those would be the first to implement the ADPC protocol, and that seems an acceptable adoption path. In any case, this seems independent of the .well-known vs link issue.

  • This will increase the HTTP response content size, which developers will want to avoid since that can have adverse effects on the site's performance, especially for mobile users and users in less developed parts of the world. This can affect page rankings in Google.

I will comment on overhead issues separately, below.

@gb-noyb
Collaborator

gb-noyb commented Jan 5, 2022

To some specific points by @Spacefish above:

  • So "normal" clients would just request the webpage with every tracking enabled -> nothing changes

I hope you meant “disabled”; otherwise we should start again from the problem description: the situation is that websites need to ask a user’s consent before they can legally ‘track’ them. (using ‘tracking’ as shorthand for any personal data processing for which they use consent as the legal basis)

  • Clients that do support the consent protocol and have it enabled would first request the .well-known file. Depending on the response, they would show a consent box to the user, cache the consent locally, and afterwards request the page itself with the consent headers set.

Adding an extra round-trip before visiting a website sounds like something nobody is waiting for. The .well-known file could be requested in parallel though, or even after the page visit; which is similar to what happens with the link-based approach.

@gb-noyb
Collaborator

gb-noyb commented Jan 5, 2022

And overall to the above discussion: I appreciate concerns about reducing overhead; the “website obesity crisis” is still rampant, with megabytes of data being transferred from servers to clients needlessly. A link header/tag is perhaps just a hundred bytes, but I agree we still should not add a hundred bytes to every web transaction without good reason. We should especially not force such an overhead on otherwise efficient websites that weigh just a few kilobytes.

However, there seem to be some misconceptions and misguided comparisons here. Firstly, a website does not have to add this link if it does not want to ask for consent to anything; and if it does want to ask for consent, adding one link is a lot more efficient than the currently used alternative: adding HTML, JavaScript and CSS for a consent banner, which easily adds tens or hundreds of kilobytes (if anyone has measurements, feel free to share). Presumably websites supporting ADPC will still want to add this in-page banner as a fallback, but adding this one link seems insignificant in comparison. As the website can avoid loading this in-page banner for clients that support ADPC, this could in fact be a huge win in reducing overhead.

Moreover, the .well-known based approach also creates network overhead, as ADPC-based clients would make needless requests to websites that do not use ADPC. Plenty of websites have no need to ask every visitor for consent, and quite likely this number will increase as companies adapt to the new data protection regulations (GitHub’s reasoning is a great example).

In any case, thinking from the perspective that “users want to be asked for consent” seems unhelpful. Usually websites want to ask users for consent more than vice versa, and if it is not through ADPC they will use other ways to do so.

I will keep this issue open, because using a .well-known file is a valid option to consider, and as stated before we did consider this; but I hope we can have a more informed and intellectually honest discussion that compares the alternatives.

@robrwo
Author

robrwo commented Jan 5, 2022

Indeed. But note that most websites already use a plugin or third-party Consent Management Platform (CMP) to ask for consent and handle the responses.

Most? No, a lot of websites implement something for GDPR etc., but that doesn't mean they are using a plugin for it.

@robrwo
Author

robrwo commented Jan 5, 2022

Adding an extra round-trip before visiting a website sounds like something nobody is waiting for. The .well-known file could be requested in parallel though, or even after the page visit; which is similar to what happens with the link-based approach.

It isn't really an "extra round trip". HTTP/1.1 and later support using a single connection for multiple requests, which reduces most of the overhead. And that will more than make up for sending an extra link (either as a header or embedded in HTML) for every request. (Also note that embedding a link in HTML is useless for non-HTML documents.)

adding one link is a lot more efficient than the currently used alternative: adding HTML, JavaScript and CSS for a consent banner, which easily adds tens or hundreds of kilobytes (if anyone has measurements, feel free to share)

No, because websites will still need to support the banners until almost all users have upgraded to browsers that support this. And that will take years. (I still have 1-3% of users on a site I maintain who use very old web browsers. For whatever reason, users are unable or unwilling to upgrade.)

adding this one link seems insignificant in comparison.

A majority of users for most websites are using mobile devices, and a significant portion of them are on mobile networks or in countries with poor internet connectivity. Those hundred bytes can be enough to push a page into an extra packet, which means a longer delay before the page is displayed.
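
As a rough worked example of the packet argument, the Python sketch below counts TCP segments for a response with and without an extra 100 bytes. The segment payload size of 1460 bytes is an assumption (a common value for a 1500-byte MTU over IPv4), and the model ignores TLS framing, header compression, and congestion-window effects.

```python
import math

# Assumed payload bytes per TCP segment; real connections vary.
ASSUMED_MSS = 1460


def segments_needed(response_bytes: int, mss: int = ASSUMED_MSS) -> int:
    """Number of segments needed to carry response_bytes of payload."""
    return math.ceil(response_bytes / mss)


def header_adds_segment(response_bytes: int, header_bytes: int = 100,
                        mss: int = ASSUMED_MSS) -> bool:
    """True if adding header_bytes pushes the response into one more segment."""
    return (segments_needed(response_bytes + header_bytes, mss)
            > segments_needed(response_bytes, mss))


print(header_adds_segment(1400))  # 1400 -> 1500 bytes crosses the 1460 boundary
print(header_adds_segment(1000))  # 1000 -> 1100 bytes stays within one segment
```

In this simplified model the extra bytes cost a packet only when the response already ends within 100 bytes of a segment boundary.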


6 participants