Enabling Publishers to Express Preferences for AI Crawlers: An Update on the AIPREF Working Group
- Mark Nottingham AIPREF Working Group Co-chair
- Suresh Krishnan AIPREF Working Group Co-chair
30 Oct 2025
In January, the IETF chartered the AI Preferences (AIPREF) Working Group to make it easier to express how AI models should use Internet content.
By June, we reported that the group was near completion of its chartered goal to deliver specifications by September this year.
Accordingly, we made a Working Group Last Call on the documents in September. The response was not one we anticipated: several participants raised fundamental issues about the nature of the Internet-Drafts (I-Ds), forcing the group to reconsider its approach.
What We Found
Although a variety of issues were raised, a particularly relevant observation was that AI is at its heart a technique, not a purpose. So, stating preferences about whether AI can be used with a given online asset is similar to stating a preference as to whether that content can be used with (for example) the Python programming language, or a linked list.
This observation leads to the realisation that it isn’t terribly useful to let publishers state a preference for what tools a programmer might use—especially as AI becomes pervasive throughout computing, for purposes such as summarisation, accessibility, and translation. This is especially true for online search engines, which use AI pervasively throughout many parts of their "pipeline" for data, regardless of how that data is presented to end users.
However, it is still very useful and relevant to give publishers input into what those tools are used for—the purpose of the processing. That purpose might be the training of an AI model, or various presentations to users in response to a search query. Giving publishers ways to express preferences about how their assets are used in these activities allows them to achieve their goals without unnecessarily linking those preference statements to a particular technology. That helps to avoid unintentional limits on innovation—developers will be able to select the best tools for the job while still respecting publisher preferences.
A New Approach
The result (after extensive discussion and collaboration at our Zürich interim meeting) is an updated vocabulary that defines preferences for:
- Foundation Model Production: using an asset to train or fine-tune a foundation model.
- AI Output: using an asset in an AI-based system to generate outputs that are presented to clients of that system.
- Search: AI Output, so long as the output contains a link to where the data came from, and the data is only represented by verbatim excerpts (not summaries). Note that the label “search” might change.
Together, these preferences are intended to give publishers the ability to state how they’d like their assets to be used. In particular, they should allow a publisher to state that they don’t want their assets used to train foundation models (which in itself is a distinct purpose), but still want to show up in search experiences. And, if they don’t want their assets to be summarised or otherwise transformed, they can use the "search" preference to indicate this.
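As a rough illustration of how these three categories might compose, the sketch below serializes them as simple key=value tokens, in the spirit of a Content-Usage-style header. The token names (`train-ai`, `ai-output`, `search`) and the serialization are assumptions for illustration only; the Working Group draft defines the actual vocabulary and attachment mechanisms, which may differ.

```python
# Hypothetical sketch of expressing the three AIPREF preference
# categories as key=value tokens. Token names and syntax are
# assumptions for illustration; the WG draft may differ.

def build_preferences(train_ai: bool, ai_output: bool, search: bool) -> str:
    """Serialize preferences as a comma-separated list of key=y/n items."""
    prefs = {"train-ai": train_ai, "ai-output": ai_output, "search": search}
    return ", ".join(f"{key}={'y' if allowed else 'n'}"
                     for key, allowed in prefs.items())

def parse_preferences(header: str) -> dict:
    """Parse the serialized form back into a dict of booleans."""
    result = {}
    for item in header.split(","):
        key, _, value = item.strip().partition("=")
        result[key] = value == "y"
    return result

# A publisher opting out of foundation-model training and general
# AI output, but still allowing search-style presentation:
header = build_preferences(train_ai=False, ai_output=False, search=True)
print(header)  # train-ai=n, ai-output=n, search=y
```

Note how the combination above captures the scenario described in the text: no foundation-model training, no summarisation or transformation, but still discoverable via search experiences.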
See the draft vocabulary for more details. Note also that there are several other open issues on the draft, and this is only a proposal, albeit one informed by discussion to date: it does not have consensus, does not address some known problems, and has not yet been widely reviewed.
Next Steps
This approach is currently a proposal to the Working Group, set out by the editors in a draft for consideration, and it does not yet have consensus. Accordingly, we're looking for input on it, in particular from publishers, AI vendors, and others who might be affected. If you have feedback, please share it on the mailing list or at our upcoming meeting during IETF 124 in Montreal.