Report from the Tools Team Retreat 2025
- Jay Daley IETF Executive Director
- Robert Sparks Senior Director of Information Technology
16 Jun 2025
The IETF Administration LLC (IETF LLC) development team met with IETF LLC senior leadership for its annual two-day retreat, with the IETF Chair joining remotely for key sessions. This post reviews the key inputs and outputs of that retreat, and how they affect the IETF.
While publication of this post has been somewhat delayed, the Tools Team work on many of the topics covered at the retreat is well underway, and subject to ongoing discussion and review at the monthly open Tools Team calls, where the team reports to the community.
Retrospective and risk-based planning
The lead-up to this retreat was a difficult 18 months for the Tools Team. The necessary focus on infrastructure issues took longer than anticipated due to multiple unforeseen problems, leaving less time for the backlog of new feature requests that directly impact users of our systems. This has led to frustration both among those awaiting new features and within the team. The team recognised that work was needed to rebuild some trust. Coming into this retreat, those infrastructure issues had been managed down to a level where some resources had switched back to addressing the feature backlog and making visible forward progress.
The retrospective identified some key points:
- A more thorough approach to risk is needed to ensure that progress is not derailed again. This approach was carried through into the rest of the retreat.
- Communication about the work of the team needs to be improved, in particular about how the work relates to community priorities and expectations.
- The role of the team has changed significantly throughout this process, from a pure development team to a mix of developers and devops; while this was not planned, it was, in retrospect, foreseeable.
Community engagement and roadmap
The team reviewed the current approach to community engagement, focusing on two key questions:
- Is the team focusing on the things the community needs us to focus on?
- Do the mechanisms for community visibility need improvement?
On the first question, the team acknowledged that while dealing with major problems and addressing technical debt that blocks other work is of critical importance, the resources of the team must be better balanced to ensure that some resources are reserved for dealing with tangible community priorities. This should be the situation going forward, but plans were discussed as to how to maintain this balance should any new crisis emerge.
On the second question, the team recognised that the current roadmap was of low utility to the community as it did not answer the most common question: “When will my feature/change request get implemented?” The team discussed alternative ideas and took away an action to change this.
As part of this session, the team discussed how the nature of the “Tools Team” has changed over the years from a small core of contractors with multiple active volunteers developing code, to a much larger team of staff developers with volunteer code development limited to a few key gaps. It noted that the Tools Team calls began with the former structure, and migrated in content and form to work with the latter, but organically and without planning. It also noted that the concept of the “Tools Team” as an IETF Team under the GEN area no longer makes sense given the new composition and work distribution.
The team agreed an initial change to the Tools Team calls with the addition of a “big picture” section at the beginning of each Tools Team call. It also agreed to discuss further the positioning of the Tools Team as a GEN area team with those on that call and the IETF Chair as GEN Area Director.
Key work on new features
The team reviewed a number of projects that are developing new features for the community, what was outstanding, and any concerns and resource implications. These were:
- The RPC Tools modernisation. This is now the highest priority for the team as the entire RPC toolchain is no longer fit for purpose and will break badly when RFC 10K is reached in March 2026, and the IETF needs the documents to flow. The four underlying projects are all replacement projects:
  - A new underlying database and RPC workflow system (“Purple”);
  - A new front end for www.rfc-editor.org (“Red”);
  - A new errata system (a like-for-like replacement at this stage); and
  - A new visual editor (“DraftForge”) that replaces the set of command line scripts and emacs macros that the RPC currently uses to edit RFCs.
- IESG Dashboards. This has a twofold motivation:
  - Consolidate in as few places as possible all the things Area Directors (ADs) need to action; and
  - Provide decision support: insight into the document pipeline and what ADs are working on (in part so that the IETF Chair can assess current AD workload).
- Email subscription management. The community has requested multiple improvements to management of list subscriptions, which will be implemented via a new Datatracker feature. Unfortunately the Mailman 3 API appears to be very slow on our installation, leading to concerns that there are underlying errors that have not been found.
- BibXML. This service currently has major issues, with three datasets broken because the crawlers that a contractor wrote for us no longer work: IEEE (format has changed), W3C (API has changed) and NIST (format has changed). While new versions of these crawlers are open source and working, we cannot upgrade to them because we are pinned to a version of Relaton that is a year behind the version that these crawlers now use. A plan was developed to overcome this without rewriting the whole system.
- IAB Liaisons. The team discussed how to best manage projects like this with rapidly changing requirements.
Architecture
The team reviewed a number of underlying systems architecture areas that are either problems or important dependencies, and the best way forward for these:
- Blob store. One of the major issues last year was bot traffic triggering on-the-fly document generation, so a large body of work was started to replace these vulnerable processes with pre-generation and caching, some of which is still ongoing. The team refined its multi-layered architecture for blob delivery that utilises databases, cloud blob storage, and CDN distribution.
- Separating authentication from Datatracker. The current reliance on Datatracker for authentication represents a risk given that it also does many other things, and is vulnerable to the resource constraints caused by on-the-fly generation. The team reviewed a proof-of-concept standalone authentication service using Authentik and agreed on further customisation tests before it could be implemented.
- Unified search. There are multiple services that use search, often of the same or similar data but held in different repositories. This requires reimplementation of the same features. The team reviewed a proof-of-concept of a Typesense-based centralised search service and agreed to use that going forward.
- Queuing service. Currently there is a RabbitMQ service implemented inside Datatracker, but the need for queuing is now broader. The team agreed to pull this out as a first-class service running in a high-availability configuration, which cannot be achieved while it sits inside Datatracker.
- Horizontal scaling of services. The team discussed possible models that could be implemented once Datatracker has been rearchitected to allow multiple copies to run simultaneously.
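To make the blob store item above more concrete, here is a minimal sketch of the general idea of replacing on-the-fly generation with pre-generation. It is not the team's implementation: the class names, the `render_html` function and the in-memory dictionary (standing in for the database, cloud blob storage and CDN layers) are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class BlobStore:
    """Hypothetical in-memory stand-in for cloud blob storage."""
    _blobs: Dict[str, bytes] = field(default_factory=dict)

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._blobs.get(key)


class DocumentPublisher:
    """Renders a document once at publish time; requests only read blobs."""

    def __init__(self, store: BlobStore, renderer: Callable[[str], bytes]):
        self.store = store
        self.renderer = renderer
        self.render_count = 0  # tracks how often the expensive step runs

    def publish(self, name: str, source: str) -> None:
        # The expensive rendering happens here, not on the request path.
        self.render_count += 1
        self.store.put(f"{name}.html", self.renderer(source))

    def serve(self, name: str) -> Optional[bytes]:
        # A request is a plain lookup; a miss returns None instead of
        # triggering generation, so bot traffic cannot force rendering.
        return self.store.get(f"{name}.html")


def render_html(source: str) -> bytes:
    """Hypothetical renderer standing in for real document generation."""
    return f"<pre>{source}</pre>".encode("utf-8")


store = BlobStore()
publisher = DocumentPublisher(store, render_html)
publisher.publish("draft-example-00", "hello")

# Many requests, still only one render.
for _ in range(1000):
    publisher.serve("draft-example-00")
```

The point of the sketch is that the cost of generation is tied to the number of publications rather than the number of requests, which is why this approach is less exposed to heavy bot traffic.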
Operations
The team reviewed a number of areas of operations:
- Cloud infrastructure provider issues. The team identified several showstopper issues with the current cloud provider, including restrictions on exit-point IP addresses, unpredictable upgrade schedules, no option for VPC-level security and the lack of high-performance options. The team agreed to migrate most services to Azure to get around these issues.
- Cybersecurity contractor. The team agreed its approach to bidder interviews and the successful bidder has since been chosen and announced.
- EOL dependencies. The team reviewed upcoming EOL dependencies and agreed a management plan.
- Use of AI-generated code. The team reiterated its support for the policy that AI-generated code is allowed so long as the developer who checks it in takes full responsibility for it.
If you want to learn more or discuss any of these topics then please consider joining the tools-discuss mailing list and the monthly open Tools Team calls.