Lexicon publication has been a hit with developers, and powers some cool tools like lexicon.garden and UFOs. But there are some missing pieces that result in common questions like:
"What XRPC endpoints must a PDS implement? How about a relay, or a bsky AppView?"
"What record types need to be indexed to implement bsky blocks? And what is that block behavior?"
"Which AppView endpoints require authentication, and which are public?"
"How can I discover implementations and providers for existing XRPC endpoints? Or which apps use which record schemas?"
This post is going to riff on a couple different mechanisms that could provide answers to questions like these. I think that lexicon.garden already implements some similar ideas, and I wouldn't be surprised if others in the ecosystem are experimenting.
Service Interface Definitions
A new lexicon type could be specified that lists a bundle of XRPC endpoints and record types that together form a service "interface". Actual software and deployments might compose multiple interfaces. Metadata in the interface would indicate which level of auth each endpoint requires (public, user authentication, admin authentication).
Interfaces would be useful as a form of developer documentation (which could be rendered in lexicon explorers), and could also help with scoping code generation. Tools like lexgen (in Go) or the lex CLI (in TypeScript) could generate API clients and server stubs scoped to specific interfaces. Automated testing tools could verify that implementations actually include all the listed endpoints. Sync and indexing tools like tap could include all the relevant record types.
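To make that a bit more concrete, here is a minimal sketch of what an interface declaration and a conformance check could look like. Everything about the shape is an assumption on my part: the field names, the AuthLevel values, and the com.example.interface.pdsSubset identifier are hypothetical, and the three listed endpoints are only an illustrative subset, not a real definition of the PDS role.

```typescript
// Hypothetical shape for a service interface declaration. These field names
// are assumptions for illustration, not part of any published lexicon spec.
type AuthLevel = "public" | "user" | "admin";

interface ServiceInterfaceDef {
  id: string; // NSID (or record key) naming the interface
  description?: string;
  endpoints: Record<string, AuthLevel>; // XRPC endpoint NSID -> required auth
  records: string[]; // record type NSIDs the service is expected to handle
}

// Illustrative subset only; this is not a real definition of the PDS role.
const examplePdsInterface: ServiceInterfaceDef = {
  id: "com.example.interface.pdsSubset", // hypothetical identifier
  endpoints: {
    "com.atproto.server.describeServer": "public",
    "com.atproto.sync.getRepo": "public",
    "com.atproto.repo.createRecord": "user",
  },
  records: [],
};

// Endpoints the interface lists but the implementation does not provide.
function missingEndpoints(def: ServiceInterfaceDef, implemented: string[]): string[] {
  return Object.keys(def.endpoints)
    .filter((nsid) => implemented.indexOf(nsid) === -1)
    .sort();
}

// Endpoints in the interface that require the given auth level.
function endpointsWithAuth(def: ServiceInterfaceDef, level: AuthLevel): string[] {
  return Object.keys(def.endpoints)
    .filter((nsid) => def.endpoints[nsid] === level)
    .sort();
}

// Usage: a server that only implements describeServer is missing the rest.
const missingFromExample = missingEndpoints(examplePdsInterface, [
  "com.atproto.server.describeServer",
]);
console.log(missingFromExample);
```

A testing tool could run something like missingEndpoints against the list of endpoints a server actually responds to, and a code generator could use endpointsWithAuth to decide which client methods need credentials.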
A handful of com.atproto.* interfaces could describe protocol-level network roles like "PDS" or "relay", though such definitions should reflect what is written in the specifications, not vice versa. Lexicon designers could publish their own interface declarations. Interfaces would have some similarities to permission sets, but have different purposes and are not redundant. For example, interfaces could freely reference definitions across NSID hierarchies.
Do interface declarations actually need to be a lexicon type, or could they just be a record type? There are some aesthetic advantages to having NSIDs instead of full AT URIs, but maybe we should keep the lexicon language small and simple, and just have these be simple records.
Modality Definitions
There is a fuzzy protocol concept called "app modalities": sets of record types that can be combined to provide an application experience. In these early days, app modalities are often clustered under a single NSID hierarchy. Eg, tech.tokimeki.poll.poll and tech.tokimeki.poll.vote presumably combine together to provide a basic "polls" app experience. But as lexicons get reused between apps, things are likely to get more complex: maybe Spark would integrate Tokimeki polls as part of their short-form video experience. An "app modality" declaration could declare a set of record types (NSIDs), and provide some readable documentation about how they are expected to be combined. For example, how app.bsky.graph.block records are expected to be implemented in thread views.
A modality definition could optionally indicate that one record type is a "signaling" record, the presence of which indicates whether an account intentionally participates in the modality or not. This could help users "deactivate" individual modalities without dropping from the overall network, and make modality-specific backfill clearer.
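Here is a rough sketch of how a signaling record could drive participation and backfill decisions. The ModalityDef shape, the com.example.modality.polls identifier, and the com.example.poll.participant signaling record are all hypothetical; only the two tech.tokimeki.poll.* NSIDs come from the example above. The list of collections present in a repo is assumed to come from somewhere else (eg, a repo description call or a local index).

```typescript
// Hypothetical modality declaration; field names are assumptions.
interface ModalityDef {
  id: string;
  records: string[]; // record type NSIDs that make up the modality
  signalingRecord?: string; // optional; expected to be one of `records`
}

const pollsModality: ModalityDef = {
  id: "com.example.modality.polls", // hypothetical identifier
  records: [
    "tech.tokimeki.poll.poll",
    "tech.tokimeki.poll.vote",
    "com.example.poll.participant", // hypothetical signaling record type
  ],
  signalingRecord: "com.example.poll.participant",
};

type Participation = "active" | "inactive" | "unknown";

// Decide whether an account participates, based on which collections exist
// in its repo. Without a declared signaling record we cannot tell.
function participationOf(modality: ModalityDef, repoCollections: string[]): Participation {
  if (modality.signalingRecord === undefined) return "unknown";
  return repoCollections.indexOf(modality.signalingRecord) !== -1 ? "active" : "inactive";
}

// Collections worth backfilling for this account and modality. Accounts that
// have removed the signaling record are skipped entirely.
function collectionsToBackfill(modality: ModalityDef, repoCollections: string[]): string[] {
  if (participationOf(modality, repoCollections) === "inactive") return [];
  return modality.records.filter((nsid) => repoCollections.indexOf(nsid) !== -1);
}

// Usage: an account with old votes but no signaling record is left alone.
console.log(collectionsToBackfill(pollsModality, ["app.bsky.feed.post", "tech.tokimeki.poll.vote"]));
```

In this sketch, deleting the signaling record is what "deactivates" the modality: the votes are still in the repo, but an indexer following this convention would stop treating them as live.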
Modalities are tightly coupled with the concept of "space types" from the permissioned spaces proposal, which will require an NSID and thus probably a new lexicon type. They also overlap a lot with the "service interface" concept mentioned above, though with differing emphasis on XRPC endpoints vs record types. I think it could make sense to reimplement an app modality using the same record types and behaviors as another project, but not implement all the same XRPC endpoints. Overall, I can't really imagine three distinct new lexicon types; these are all pretty similar.
Schema Docs
Lexicon schemas can include "description" text fields on both individual fields and overall declarations, and these can be used for basic developer-facing documentation. But these are pretty rudimentary: they don't support internationalization, and don't support flexible rich text rendering. It would be great to link official longer-form "README" docs to lexicons.
I think a pragmatic way to do this would be "sidecar" records published in the same atproto repo as the schemas themselves. These don't need to be com.atproto.*; they could be community.lexicon.*, garden.lexicon.*, or something else. They could include longer-form richtext directly, or reference text blobs (eg, markdown). They could include example/demo records (something lexicon.garden already supports).
In some cases it makes sense to attach docs to a specific record or endpoint schema. For example, the app.bsky.feed.searchPosts endpoint could have several paragraphs or even a full table describing the supported syntax for the q query string parameter. In other cases, it might make sense to attach longer documentation to a "modality" or "interface" schema (described above). It might also be helpful to have a way to publish group-level documentation records. Eg, a schema doc with the record key tools.ozone.set. (ending with period) or tools.ozone.set._group (not a valid NSID) would describe the "group" or "directory" of lexicons with that prefix.
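As a sketch of how a docs viewer might look these up: given an NSID, try a doc record keyed by the NSID itself, then fall back to group-level docs for each parent prefix. This assumes the _group suffix convention (one of the two options floated above), a hypothetical SchemaDoc shape, and that docs have already been fetched into a map keyed by record key. Stopping at two segments is also just my assumption.

```typescript
// Hypothetical sidecar doc record; field names are assumptions.
interface SchemaDoc {
  lang: string; // language tag for the text
  markdown: string; // longer-form documentation
}

// Candidate record keys for docs about an NSID, most specific first.
// Group keys use the "<prefix>._group" convention and stop at two segments.
function docRecordKeys(nsid: string): string[] {
  const segments = nsid.split(".");
  const keys = [nsid];
  for (let n = segments.length - 1; n >= 2; n--) {
    keys.push(segments.slice(0, n).join(".") + "._group");
  }
  return keys;
}

// Return the most specific doc available, or undefined if there is none.
function findDoc(nsid: string, docsByKey: Record<string, SchemaDoc>): SchemaDoc | undefined {
  for (const key of docRecordKeys(nsid)) {
    const doc = docsByKey[key];
    if (doc !== undefined) return doc;
  }
  return undefined;
}

// Usage: only a group-level doc exists, so it is used for the endpoint.
const exampleDocs: Record<string, SchemaDoc> = {
  "tools.ozone.set._group": { lang: "en", markdown: "Overview of the set lexicons." },
};
console.log(docRecordKeys("tools.ozone.set.addValues"));
console.log(findDoc("tools.ozone.set.addValues", exampleDocs));
```

Internationalization could fit the same pattern by keeping several SchemaDoc entries per key, one per language, and picking by the reader's preference.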
Project Declarations
All of the above abstractions would be ways to document and talk about abstraction layers. But it is also helpful to be able to declare concrete implementations and deployed services. When you discover a new lexicon endpoint, you often want to know which servers you could actually make calls to! Or you might want to discover which client apps actually implement a given app modality.
I think the natural way to do this would be one or more record types for describing projects, service deployments, and user-facing apps. These could be indexed and cross-referenced in developer docs, and aggregated into app or service directories. They can provide metadata linking to service DIDs, API hostnames, project homepages, mobile apps, open source repos, individual leaders/developers, fundraising links, etc.
You could say "this service deployment (DID and URL) implements these service interfaces". Or "this Android client app implements these two app modalities". Or "this open source server code implements these API endpoints".
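Those three statements could look something like the following sketch. The ProjectDeclaration shape, the kind values, and every identifier, DID, and URL here are hypothetical placeholders; the point is just that a directory could answer "who implements X?" with a simple filter over indexed declarations.

```typescript
// Hypothetical project declaration record; field names are assumptions.
interface ProjectDeclaration {
  title: string;
  kind: "service" | "clientApp" | "sourceCode";
  serviceDid?: string; // for deployed services
  url?: string; // API base URL, homepage, or source repo
  interfaces: string[]; // service interfaces implemented
  modalities: string[]; // app modalities implemented
}

// Placeholder data: none of these projects, DIDs, or URLs are real.
const exampleDeclarations: ProjectDeclaration[] = [
  {
    title: "Example Polls API",
    kind: "service",
    serviceDid: "did:web:polls.example.com",
    url: "https://polls.example.com",
    interfaces: ["com.example.interface.polls"],
    modalities: [],
  },
  {
    title: "Example Android Client",
    kind: "clientApp",
    url: "https://app.example.com",
    interfaces: [],
    modalities: ["com.example.modality.polls", "com.example.modality.video"],
  },
];

// Declarations claiming to implement a given service interface.
function implementersOfInterface(decls: ProjectDeclaration[], interfaceId: string): ProjectDeclaration[] {
  return decls.filter((d) => d.interfaces.indexOf(interfaceId) !== -1);
}

// Declarations claiming to implement a given app modality.
function implementersOfModality(decls: ProjectDeclaration[], modalityId: string): ProjectDeclaration[] {
  return decls.filter((d) => d.modalities.indexOf(modalityId) !== -1);
}

// Usage: which servers could I actually call for the polls interface?
const pollProviders = implementersOfInterface(exampleDeclarations, "com.example.interface.polls");
console.log(pollProviders.map((d) => d.url));
```

Note that these are self-published claims. Nothing in this sketch verifies that a declared service actually implements what it says.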
If a project declaration is published in the same atproto repo as specific lexicon schemas, it could be indicated as the "original" project (or some other carefully chosen term). There could be some ethos tension around this: schemas are abstractions for interoperability, and in some sense everybody implementing schemas should be on a level playing field. On the other hand, control of lexicon schemas (via NSID domain control) is a real form of power, and it seems pragmatic to make the relationship between published lexicon and an actual project explicit.
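The "same repo" check is simple enough to sketch: compare the repo part of the two AT URIs. This assumes both URIs use the DID form of the authority (not a handle), and the com.example.project.declaration collection and the example DID are hypothetical. Published lexicon schemas live in the com.atproto.lexicon.schema collection, with the NSID as the record key.

```typescript
// Extract the authority (repo identifier) from an AT URI, or null if the
// string does not look like one. Assumes the authority is a DID.
function repoOf(atUri: string): string | null {
  const prefix = "at://";
  if (atUri.slice(0, prefix.length) !== prefix) return null;
  const rest = atUri.slice(prefix.length);
  const slash = rest.indexOf("/");
  const authority = slash === -1 ? rest : rest.slice(0, slash);
  return authority.length > 0 ? authority : null;
}

// True when the project declaration and the schema record live in the same
// repo, which is the condition suggested for the "original" label.
function isOriginalProject(declarationUri: string, schemaUri: string): boolean {
  const declRepo = repoOf(declarationUri);
  const schemaRepo = repoOf(schemaUri);
  return declRepo !== null && declRepo === schemaRepo;
}

// Usage with placeholder URIs (the DID and collection names are made up).
console.log(
  isOriginalProject(
    "at://did:web:lexicons.example.com/com.example.project.declaration/self",
    "at://did:web:lexicons.example.com/com.atproto.lexicon.schema/com.example.poll.vote",
  ),
);
```

A stricter version would also confirm that the schema record is the one NSID resolution actually points to, since anybody can put a record with any key in their own repo.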
One of the things that could drop out of this would be protocol-native "project directories" and "app stores" (a concept which has come up a few times). A big part of those comes down to curation and signaling maturity: whether a project has adoption, has been abandoned, actually does what it says, is an experiment looking for contributors, etc. Anybody can publish any of these declarations or schemas at any time. If the only audience is developers, the incentives for abuse are low; but being a distribution and promotion channel for end-users would mean the need for baseline moderation and vetting is higher.
As with schema docs, I don't think these record types need to be protocol-defined or live under the com.atproto.* namespace. Convergence on a common record type (or at least a small number of variations) would be good, and I expect that would happen organically.