
This is a set of notes devoted to a hack for carrying structured metadata over IRC. The protocol is a collaboration between Nathaniel Filardo and Glenn Willen, with input from many others on Freenode’s #cslounge (we love you all).

There is a list of implementations of various subsets of the IRCIE universe at Implementations. If somebody just sent you here, you might start looking there. Note that not all implementations support all meta-data, but ideally they should understand each other’s common subset just fine.

The technicalities of the encoding are discussed in Encoding. That page also contains the authoritative list of “well known type” identifiers for use with IRCIE.

Currently, there are definitions for a few types of data:
  • Instances – Add instance labels to messages, making IRC a little more like Zephyr.
  • Miscellaneous Metadata – Some small things, including robot indicators.

Encoding

Message Protocol

Invisible Coding

Here’s the hack, which we lovingly refer to as “IRC invisible (en)coding” (IRCIE). One uses the characters ^B, ^C, ^O, ^V, and ^_, which are IRC formatting control characters, to encode various data in a way that most clients won’t ever see.

The mapping is as follows:

Character  IRC Control meaning  ASCII       Value
^B         Bold                 0x02 (STX)  0
^C         Set color            0x03 (ETX)  1
^O         Kill formatting      0x0F (SI)   2
^V         Reverse video        0x16 (SYN)  3
^_         Underline            0x1F (US)   4

We avoid the use of ^Cdd,dd (the color formatting control sequence), as some clients have a bad habit of dropping the ^C and printing the digits instead. Further, we avoid the use of space and tab, as these may become visible when interspersed with the formatting control characters, which would defeat the invisibility of the encoding. We do not use ^F (“blink”) for reasons long since forgotten – perhaps there was a misbehaving popular client?
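As a minimal sketch of the mapping table above, the five control characters can be treated as base-5 digits. The names SYMBOLS, digits_to_symbols, symbols_to_digits and caret are illustrative only and are not part of IRCIE.

```python
# Index in this string is the base-5 digit from the mapping table.
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_


def digits_to_symbols(digits):
    """Turn a sequence of base-5 digits (0..4) into invisible control characters."""
    return "".join(SYMBOLS[d] for d in digits)


def symbols_to_digits(text):
    """Inverse mapping; raises ValueError on any character outside the table."""
    digits = []
    for ch in text:
        idx = SYMBOLS.find(ch)
        if idx < 0:
            raise ValueError(f"not an IRCIE symbol: {ch!r}")
        digits.append(idx)
    return digits


def caret(text):
    """Render control characters in caret notation (^B, ^_) for debugging."""
    return "".join("^" + chr(ord(c) + 64) if ord(c) < 32 else c for c in text)
```

The caret helper is handy when logging, since the raw symbols are by design invisible in most terminals and clients.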

Framing

All protocol messages are embedded at the end of an IRC message. This makes for easier parsing and simplifies the protocol design (there is no risk of accidentally “finding” a message in the middle of another message, for example). However, for compatibility with many real-world clients (as opposed to the specification) which expect CTCP messages to span an entire message, we allow protocol messages at the “logical end” of IRC messages such as CTCP ACTION. See Examples below.

For maximal extensibility, we start by defining an outer protocol which allows for multiple inner protocols to coexist, using type and length coding; we hope this will allow for compatibility, should future protocol designers (or future versions of this protocol) want to use the same invisible-character trick that we use here.

We signal the start of a protocol message with the lead-in character ^O, followed by the empty string as our type tag, followed by the tag terminator ^O. Type/version tags should consist of sequences of characters other than ^O. The effect of this is to sandwich a series of formatting control codes between two kill formatting codes, an extremely unlikely thing to be generated accidentally. We consider it virtually certain that no legitimate IRC message would have such strings of characters without intervening printable text; we therefore see this as an ideal lead-in sequence for embedded messages, and get the possibility of adding more message types for free.

Next we encode the length of the message in bytes, using the “L” code described below. The encoded length covers only the message itself (i.e. the part after the length field, up to but not including the final ^O). This length is sometimes called the “meta length” or just “MetaL”.

Next comes the message itself, in the TLV format described below.

Finally, a message MUST end with ^O which is outside the TLV encodings. (This resets unaware clients’ idea of the current text style, which will be totally confused by this point. This is a courtesy to potentially buggy clients which fail to reset formatting codes at the end of a line, as well as another convenient marker unlikely to be sent by normal clients.)
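The framing steps above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are made up for this example, the length helper follows the L encoding as it can be read off the examples on this page, and a CTCP message is recognized simply by a leading and trailing ^A.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4
KILL = "\x0f"                      # ^O: lead-in, tag terminator and footer


def encode_length(n):
    """L-encode n (0..779): one prefix symbol, then a 1..4 symbol suffix."""
    if n < 0:
        raise ValueError("length must be in 0..779")
    offset = 0  # smallest value that needs a suffix of the current width
    for width in range(1, 5):
        if n - offset < 5 ** width:
            rest = n - offset
            digits = [(rest // 5 ** i) % 5 for i in reversed(range(width))]
            return SYMBOLS[width - 1] + "".join(SYMBOLS[d] for d in digits)
        offset += 5 ** width
    raise ValueError("length must be in 0..779")


def wrap(payload):
    """Frame an already TLV-encoded payload: ^O, empty type tag, ^O, MetaL, payload, ^O."""
    return KILL + KILL + encode_length(len(payload)) + payload + KILL


def append_frame(irc_text, payload):
    """Attach a frame at the logical end of a message (inside a closing CTCP ^A)."""
    frame = wrap(payload)
    is_ctcp = len(irc_text) > 1 and irc_text.startswith("\x01") and irc_text.endswith("\x01")
    if is_ctcp:
        return irc_text[:-1] + frame + "\x01"
    return irc_text + frame
```

For example, wrapping the 13-symbol record for the instance label “test” yields a 19-symbol frame, matching the first protocol example later on this page.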

TLV Encoding

Entries inside the protocol stream are TLV encoded: each record carries a two-symbol base-5 type (25 possibilities, the so-called “T encoding”), followed by its length in a length-prefixed, modified base-5 encoding (“L encoding”). The remainder of the record is then interpreted according to its type tag.
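A small sketch of the two-symbol type field, assuming most-significant digit first (which is what the examples on this page show, e.g. ^C^B for type 5). The function names are hypothetical.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4


def encode_type(value):
    """T-encode a record type (0..24) as two base-5 symbols, high digit first."""
    if not 0 <= value < 25:
        raise ValueError("type must be in 0..24")
    return SYMBOLS[value // 5] + SYMBOLS[value % 5]


def decode_type(pair):
    """Decode a two-symbol type field; raises ValueError on foreign characters."""
    if len(pair) != 2:
        raise ValueError("a type field is exactly two symbols")
    high, low = SYMBOLS.index(pair[0]), SYMBOLS.index(pair[1])
    return high * 5 + low
```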

Error Handling

A client receiving a message with a record type it does not understand MUST discard that record and MUST process all other records in the message, if any. Clients MAY inform the user of unrecognized messages but SHOULD allow the user to disable this notification easily.

A client receiving a malformed record SHOULD assume that it has mistakenly identified this protocol and cease processing the message; recovery would be difficult due to the lack of (supposed) length information. If the corruption is only visible late in the stream, after one or more records have already been processed, clients SHOULD inform their users of the error but again SHOULD allow the user to disable this notification easily. Clients MAY cease processing this protocol for a given channel after such an error, again informing the user of their decisions.
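One way a client might apply these rules is sketched below. The names Malformed, iter_records and dispatch are illustrative, and how errors are reported to the user is left to the caller. Because records are yielded one at a time, anything parsed before a late error has already been handled when the error surfaces.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4


class Malformed(Exception):
    """The payload does not parse as a TLV stream."""


def _digit(ch):
    d = SYMBOLS.find(ch)
    if d < 0:
        raise Malformed(f"unexpected character {ch!r}")
    return d


def read_length(data, pos):
    """Decode one L-encoded length at pos; return (value, next position)."""
    if pos >= len(data):
        raise Malformed("truncated length")
    width = _digit(data[pos]) + 1
    if width == 5:
        raise Malformed("reserved length prefix")
    suffix = data[pos + 1 : pos + 1 + width]
    if len(suffix) != width:
        raise Malformed("truncated length")
    value = 0
    for ch in suffix:
        value = value * 5 + _digit(ch)
    return value + sum(5 ** k for k in range(1, width)), pos + 1 + width


def iter_records(payload):
    """Yield (type, value) pairs from a TLV payload (the part MetaL covers)."""
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise Malformed("truncated type")
        rtype = _digit(payload[pos]) * 5 + _digit(payload[pos + 1])
        length, pos = read_length(payload, pos + 2)
        value = payload[pos : pos + length]
        if len(value) != length:
            raise Malformed("record overruns payload")
        pos += length
        yield rtype, value


def dispatch(payload, handlers):
    """Run handlers over a payload; return (unknown types, error text or None)."""
    unknown, handled = [], 0
    try:
        for rtype, value in iter_records(payload):
            handler = handlers.get(rtype)
            if handler is None:
                unknown.append(rtype)  # discard the record, keep going
                continue
            handler(value)
            handled += 1
    except Malformed as exc:
        return unknown, f"{exc} after {handled} record(s)"
    return unknown, None
```

The caller can use the returned list of unknown types and the error text to drive the optional, user-disableable notifications described above.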

L encoding

L encodings have two components: a fixed one-symbol prefix and a variable-length suffix. The single symbol is the biased length of the suffix, expressed as a T-encoded value. Note that we reserve the maximal value for future expansion.

The protocol defined here uses a bias value of 1, so that a “0” for the first byte of the L encoding indicates that the next one (1) symbol is to be consumed as part of the length, etc.

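This scheme supports all length values ranging from 0 (which is two bytes long, namely ^B^B) to 779 (which is five bytes long, namely ^V^_^_^_^_). The base-5 suffix is “modified” in that each suffix width counts up from the first value a shorter suffix cannot express: one-symbol suffixes cover 0 to 4, two-symbol suffixes 5 to 29, three-symbol suffixes 30 to 154, and four-symbol suffixes 155 to 779. This is considered sufficient for all practical uses of this protocol. In particular, 779 is more than the usual line lengths tolerated by some ircds.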

This encoding was chosen due to its bias for short sequences to have short length fields while still being relatively easy to parse.
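The L encoding can be sketched as a pair of functions. This follows the reading that is consistent with the values quoted on this page (0 is ^B^B, 13 is ^C^C^V, 779 is ^V^_^_^_^_); the function names are illustrative.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4


def encode_length(n):
    """L-encode n: a prefix symbol (suffix width minus the bias of 1), then the suffix."""
    if n < 0:
        raise ValueError("length must be in 0..779")
    offset = 0  # smallest value that needs a suffix of the current width
    for width in range(1, 5):  # a prefix of ^_ (width 5) is reserved
        if n - offset < 5 ** width:
            rest = n - offset
            digits = [(rest // 5 ** i) % 5 for i in reversed(range(width))]
            return SYMBOLS[width - 1] + "".join(SYMBOLS[d] for d in digits)
        offset += 5 ** width
    raise ValueError("length must be in 0..779")


def decode_length(text):
    """Return (value, symbols consumed) for the L-encoded field at the start of text."""
    width = SYMBOLS.index(text[0]) + 1
    if width == 5:
        raise ValueError("reserved length prefix")
    suffix = text[1 : 1 + width]
    if len(suffix) != width:
        raise ValueError("truncated length")
    value = 0
    for ch in suffix:
        value = value * 5 + SYMBOLS.index(ch)
    offset = sum(5 ** k for k in range(1, width))
    return value + offset, 1 + width
```

Short lengths stay short: anything up to 4 costs two symbols, and only lengths of 155 or more need the full five.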

Assigned Types

Currently the following message types are assigned:

Value  Purpose                                             See
0-2    RESERVED IETFNG PROTOCOL EXTENSIONS
3      Head-of-frame Flags (special handling)              Miscellaneous Metadata
4      Continuation Flags                                  Miscellaneous Metadata
5      Instance label, absolute encoding, Huffman table 1  Instances
6      Teledildonics Protocol Transport
7-14   RESERVED, GLOBAL ASSIGNMENT
15     OTR advertisement message                           Miscellaneous Metadata
16     Miscellaneous Message Flags (deprecated)            Miscellaneous Metadata
17     MS Comic Chat Data (Re-)Encoding                    ComicChat
18-19  RESERVED, GLOBAL ASSIGNMENT
20-24  EXPERIMENTAL, LOCAL ASSIGNMENT

Infelicities

Note that some ircds will actively filter out formatting codes if channel operators request said behavior, rendering this protocol unable to function. This is unfortunate, but perhaps understandable given the potential for abuse. One may be able to convince server devs that messages such as ours which do not in fact alter the formatting may be worth passing, but that seems somewhat unlikely.

Instances

What Are Instances?

Instances are essentially “threads” of conversation. The name comes from Zephyr where messages are routed by a triplet of “recipient”, “class”, and “instance”. Essentially, IRC has only private messages without classes or instances and public classes (channels).

Design Goals?

First, we partition the set of clients into “aware” and “unaware” by whether or not (respectively) they are playing this game. We want the following behaviors:

  • Unaware clients remain unaware of instances, and users receive all messages on all instances within the channel.
  • Unaware client users do not see any real difference in messages (no control characters, ugly strings, etc.)
  • Aware clients may filter (“punt” and “unpunt”) instances, as well as send messages on them.

Instance Labels

Huffman Table 1

This is a 5-ary Huffman tree whose output is to be T encoded:

( ( rsoit )
  ( gb<>- )
  ( mane. )
  ( ( Ch()= )
    ( U@HG# )
    ( &j+NB )
    ( MFL;: )
    ( ^~Q?Z ) )
  ( ( 'ufp/ )
    ( ldcv_ )
    ( STARE )
    ( I O ( wWkqx ) ( DPyXY ) ( KVJz" ) )
    ( ( 01234 ) ( 56789 ) ( %*,|! ) ( `$\{} ) ( [] ) ) ) )

This is to say that, as examples, r encodes as 00, I encodes as 430, and , encodes as 4422.
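A sketch of how the table might be used in code: the tree is transcribed as nested lists (an inner list is a subtree, a one-character string is a leaf) and a codebook is derived by walking it. TREE, CODEBOOK, encode_label and decode_label are names chosen for this example.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4

# Huffman Table 1, transcribed from the tree above.
TREE = [
    list("rsoit"),
    list("gb<>-"),
    list("mane."),
    [list("Ch()="), list("U@HG#"), list("&j+NB"), list("MFL;:"), list("^~Q?Z")],
    [
        list("'ufp/"),
        list("ldcv_"),
        list("STARE"),
        ["I", "O", list("wWkqx"), list("DPyXY"), list('KVJz"')],
        [list("01234"), list("56789"), list("%*,|!"), list("`$\\{}"), list("[]")],
    ],
]


def build_codebook(node, prefix=""):
    """Map every leaf character to its path through the tree, as a digit string."""
    if isinstance(node, str):
        return {node: prefix}
    book = {}
    for digit, child in enumerate(node):
        book.update(build_codebook(child, prefix + str(digit)))
    return book


CODEBOOK = build_codebook(TREE)


def encode_label(label):
    """Huffman-code a label and emit it as symbols (KeyError for e.g. a space)."""
    digits = "".join(CODEBOOK[ch] for ch in label)
    return "".join(SYMBOLS[int(d)] for d in digits)


def decode_label(symbols):
    """Walk the tree symbol by symbol, restarting at the root after each leaf."""
    out, node = [], TREE
    for ch in symbols:
        node = node[SYMBOLS.index(ch)]
        if isinstance(node, str):
            out.append(node)
            node = TREE
    if node is not TREE:
        raise ValueError("label ends in the middle of a code")
    return "".join(out)
```

The table covers the 94 printable non-space ASCII characters, so a label containing a space or non-ASCII text cannot be expressed with it.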

Instance Continuation Messages

Empty instance declarations are used to mean “see last instance tag” rather than “no instance” (which can be obtained by leaving out an instance tag altogether). If a client finds that it must split a message, this gives a much shorter, fixed-width protocol message which may be placed on subsequent lines. Clients which see an Instance Continuation Message but have not seen an Instance Label previously (as might be the case after joining a channel) SHOULD display the message as if it had no instance and MAY indicate to the user that this downgrade has occurred.

Note that while it is in theory possible to use an ICM when multiple independent messages are sent on the same instance, clients SHOULD NOT use this functionality, as it increases the risk that an aware client will be unable to correctly route an instanced message. Clients MUST NOT use an ICM if more than 60 seconds have elapsed since sending a message with an Instance Label or if they have witnessed a JOIN.

Note that if an Instance Continuation Message and an Instance Label appear in the same protocol message, aware clients MUST consider the label authoritative and MAY inform the user that there was an error in the message. Conforming clients MUST NOT generate messages with both ICM and IL entries.
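The ICM rules above might be expressed roughly as follows. This is a sketch with assumptions: instance records are represented as already-decoded label strings (the empty string standing for an ICM), the “last instance tag” is tracked by the caller, and the JOIN condition is read as “a JOIN seen since the last labelled message”. All names are hypothetical.

```python
ICM_WINDOW_SECONDS = 60


def resolve_instance(label_values, last_label):
    """Decide which instance a received message belongs to.

    label_values: decoded instance records in the message ("" means an ICM).
    last_label:   the last Instance Label seen earlier, or None.
    Returns (instance or None, warn), where warn marks a downgrade or an error.
    """
    labels = [value for value in label_values if value]
    has_icm = any(value == "" for value in label_values)
    if labels:
        # A real label is authoritative; an ICM alongside it is an error.
        return labels[0], has_icm
    if has_icm:
        # Nothing to continue: show the message as uninstanced, flag the downgrade.
        return last_label, last_label is None
    return None, False


def may_send_icm(now, last_label_sent_at, join_seen_since_label):
    """Sender-side check: an ICM is only allowed shortly after a labelled message."""
    if last_label_sent_at is None or join_seen_since_label:
        return False
    return now - last_label_sent_at <= ICM_WINDOW_SECONDS
```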

Protocol Examples

  1. To send the instance label test, the full message works out to be

    Header ^O^O
    MetaL ^C^C^V (13)
    First Record
      Type ^C^B (5)
      Length ^C^B^V (8)
      Value ^B^_^O^V^B^C^B^_
    Footer ^O

    for a total of 19 bytes sent (11 of which are protocol overhead).

  2. A CTCP ACTION message such as ^AACTION barfs on the floor.^A may be instance tagged with label “test” by sending ^AACTION barfs on the floor.^O^O^C^C^V^C^B^C^B^V^B^_^O^V^B^C^B^_^O^A.
  3. An instance continuation message is rendered as

    Header ^O^O
    MetaL ^B^_ (4)
    First Record
      Type ^C^B (5)
      Length ^B^B (0)
      Value  
    Footer ^O
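Going the other way, a receiving client has to find the frame at the logical end of a line. The sketch below is one possible approach, not the method of any particular implementation: it tries each ^O^O it finds, and accepts the first candidate whose MetaL accounts exactly for the rest of the line. The names read_length and split_message are made up for this example.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4
LEAD_IN = "\x0f\x0f"               # ^O, empty type tag, ^O


def read_length(text, pos):
    """Decode an L-encoded length at pos; return (value, next pos) or None."""
    if pos >= len(text):
        return None
    width = SYMBOLS.find(text[pos]) + 1
    if width in (0, 5):  # not a symbol, or the reserved prefix
        return None
    suffix = text[pos + 1 : pos + 1 + width]
    if len(suffix) != width or any(c not in SYMBOLS for c in suffix):
        return None
    value = 0
    for c in suffix:
        value = value * 5 + SYMBOLS.index(c)
    return value + sum(5 ** k for k in range(1, width)), pos + 1 + width


def split_message(text):
    """Split an IRC message into (visible text, TLV payload or None)."""
    is_ctcp = len(text) > 1 and text.startswith("\x01") and text.endswith("\x01")
    tail = "\x01" if is_ctcp else ""
    core = text[: len(text) - len(tail)]
    if not core.endswith("\x0f"):
        return text, None
    start = core.find(LEAD_IN)
    while start != -1:
        parsed = read_length(core, start + 2)
        if parsed is not None:
            length, body = parsed
            payload = core[body : body + length]
            fits = body + length + 1 == len(core)
            if fits and all(c in SYMBOLS for c in payload):
                return core[:start] + tail, payload
        start = core.find(LEAD_IN, start + 1)
    return text, None
```

Applied to the tagged CTCP ACTION from example 2, this returns the original ^AACTION barfs on the floor.^A together with the 13-symbol record.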

Miscellaneous Metadata

Head Of Frame Flags

Sometimes, all you want is a regex. Head-of-frame flags are intended to be scanned by agents not interested in decoding the entirety of IRCIE (as well as those which are). They MUST immediately follow the sigil and metaL header fields, and MUST be repeated at the head of every fragment (see below) of a message. HOF flags’ definitions violate the encoding abstraction and are encoded directly as symbols in the stream, to facilitate this parsing. Fields are filled in “left to right” (“most significant position” first); fields not present read as their defaulted value.

Position  Value  Signal
0         0      Not a bot.
          1      Bot/automated message.
          2-4    Bot flag: reserved.

Automated response flags are there to help bots avoid chatter-spam (responding to other bots, possibly circularly). Concretely, bots not following IRCIE should produce ^O^O^C^B^B^B^V^B^C^C^O and should use this regex to recognize bot flags encoded in arbitrary IRCIE TLV streams: ^O^O(^B|^C.|^O..|^V...|^_....).^B^V(^B|^C.|^O..|^V...|^_....).[^C^O^V^_].*^O$.
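In the regex above, ^O and friends are the control characters themselves, not anchors. A transcription into Python's re module might look like the sketch below; the constant and function names are illustrative, and non-capturing groups are used in place of the original capturing ones.

```python
import re

# One L-encoded field: a prefix symbol, then as many symbols as it announces.
L_FIELD = "(?:\x02|\x03.|\x0f..|\x16...|\x1f....)."

BOT_FLAG = re.compile(
    "\x0f\x0f" + L_FIELD      # lead-in, empty type tag, MetaL
    + "\x02\x16" + L_FIELD    # record type 3 (head-of-frame flags) and its length
    + "[\x03\x0f\x16\x1f]"    # position 0 is anything other than ^B ("not a bot")
    + ".*\x0f$"               # rest of the frame, closing ^O at end of line
)

# The fixed string quoted above for bots that do not otherwise speak IRCIE.
BOT_SUFFIX = "\x0f\x0f\x03\x02\x02\x02\x16\x02\x03\x03\x0f"


def is_bot_message(text):
    """True if the line carries a head-of-frame flag marking it as automated."""
    return BOT_FLAG.search(text) is not None
```

Note that, as written, the pattern expects the closing ^O to be the last character of the line, so a frame placed before the closing ^A of a CTCP message would need the ^A stripped first.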

OTR Advertisement Messages

Reserved for use with Off The Record messaging when being used on an IRC transport. For compatibility with the extant OTR protocol, the payload of this message is a list of OTR versions to support. These values are formed by T-encoding, using two bytes each, and the list is formed by concatenation. (Note that some of the older OTR documentation does not admit that there are multiple tags!)

List value Signal
0 RESERVED
1 OTR V 1
2 OTR V 2
3-24 RESERVED

For example:

Header ^O^O
MetaL ^C^B^V (8)
First Record
  Type ^V^B (15)
  Length ^B^_ (4)
  Value ^B^O^B^C (2,1)
Footer ^O
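The example above can be reproduced with a short sketch. The helper names are hypothetical, and the builder refuses the reserved list values rather than guessing at their meaning.

```python
SYMBOLS = "\x02\x03\x0f\x16\x1f"  # ^B ^C ^O ^V ^_ stand for digits 0..4
OTR_TYPE = 15


def t_encode(value):
    """Two base-5 symbols, most significant first."""
    return SYMBOLS[value // 5] + SYMBOLS[value % 5]


def l_encode(n):
    """L encoding for 0..779: prefix symbol plus offset base-5 suffix."""
    if n < 0:
        raise ValueError("length must be in 0..779")
    offset = 0
    for width in range(1, 5):
        if n - offset < 5 ** width:
            rest = n - offset
            digits = [(rest // 5 ** i) % 5 for i in reversed(range(width))]
            return SYMBOLS[width - 1] + "".join(SYMBOLS[d] for d in digits)
        offset += 5 ** width
    raise ValueError("length must be in 0..779")


def otr_advertisement(versions):
    """Build a complete frame advertising the given OTR versions (1 and/or 2)."""
    for v in versions:
        if v not in (1, 2):
            raise ValueError(f"OTR list value {v} is reserved")
    value = "".join(t_encode(v) for v in versions)
    record = t_encode(OTR_TYPE) + l_encode(len(value)) + value
    return "\x0f\x0f" + l_encode(len(record)) + record + "\x0f"


def otr_versions(value):
    """Decode the payload of an OTR advertisement record into a list of values."""
    if len(value) % 2:
        raise ValueError("payload must be a whole number of two-symbol entries")
    return [
        SYMBOLS.index(value[i]) * 5 + SYMBOLS.index(value[i + 1])
        for i in range(0, len(value), 2)
    ]
```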

Miscellaneous Message Flags

The message flags type is reserved for carrying a bitset for flagged attributes of the message to which it is attached. This may be thought of as an infinite bitstring transmitted without leading zeros. The value is encoded numerically using T-encoding with a leading 1 which is ignored (to signal the leftmost boundary of bits to be interpreted). Clients receiving bits beyond their interpretation are obligated to ignore these bits.

No MMFs are defined at present; the sole user of this feature was moved to the Head-of-frame flags above. Nonetheless, the type is reserved should we ever want it back.

Continuation Flags

A single character flag, marking fragmented IRC messages. Each IRC message may contain AT MOST one continuation flag message. The messages are T-encoded as follows; other values are reserved.

Value Meaning
0 Begin a split message set
1 Continue a split message set
2 End a split message set

When processing a stream of messages, a “begin split message set” indicates that the transmitting client has had to fragment the IRC line in order to satisfy the IRC server. It will then produce a stream of zero or more “continues split”-tagged messages followed by exactly one “end split” message.

The semantics of processing Continuation Flags are as if the message had not been fragmented. That is, the client will accumulate lines in a fragmented set, and will then parse all other IRCIE messages contained in the accumulated sets. Head-of-frame flags MUST be identical across fragments and are only parsed once after reassembly.

Client processing must also treat any message without a continuation flag as having been preceded by an end if the sending client has a begin outstanding (that is, a message that is either IRCIE-untagged or IRCIE-tagged without continuation flags, arriving while a continue or an end is expected, should cause the client to pretend that it has seen an end prior to the triggering message). The same behavior is expected if a client drops from the server before sending the ending message.

Clients should avoid changing IRC nicknames while emitting continuations, but receiving clients MAY choose to defensively track such changes.

It is an error to send continue or end messages without first sending a beginning, and clients SHOULD discard the continuation flags in this case.
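The reassembly rules in this section can be sketched as a small per-sender state machine. This is illustrative only: the class and method names are made up, fragments are kept as a plain list of line texts, and two cases the text leaves open (a second begin while a set is still open, and reserved flag values) are handled here by assumption as an implicit end and as “no flag” respectively.

```python
BEGIN, CONTINUE, END = 0, 1, 2


class Reassembler:
    """Collects fragmented lines per sender and releases complete messages."""

    def __init__(self):
        self.pending = {}  # sender -> list of fragments received so far

    def feed(self, sender, text, flag=None):
        """Process one line; return the list of messages it completes."""
        done = []
        is_open = sender in self.pending
        if flag == CONTINUE and is_open:
            self.pending[sender].append(text)
        elif flag == END and is_open:
            done.append(self.pending.pop(sender) + [text])
        elif flag == BEGIN:
            if is_open:  # assumption: a new begin implies the old set ended
                done.append(self.pending.pop(sender))
            self.pending[sender] = [text]
        else:
            # No flag, a reserved value, or a stray continue/end whose flag is
            # discarded: close any open set, then deliver the line on its own.
            if is_open:
                done.append(self.pending.pop(sender))
            done.append([text])
        return done

    def drop(self, sender):
        """Sender left the server: pretend an end was seen, if a set was open."""
        return self.pending.pop(sender, None)
```

A caller would join the fragments of each returned message and only then parse the other IRCIE records, so that the result is the same as if the message had never been split.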

Implementations

Commentary vs. CTCP

This protocol is awful, in a sense. It duplicates a huge amount of machinery (signaling, message framing, dispatch) already present for CTCP (see http://www.irchelp.org/irchelp/rfc/ctcpspec.html and http://www.invlogic.com/irc/ctcp.html) and in a sense ought not exist. However, as of this writing, most IRC clients do not properly parse CTCP and choke if either multiple CTCP messages are sent in one IRC line (which is legal) or if the CTCP message does not span the entire IRC line (also legal). Further, clients tend to be bad about ignoring CTCPs they don’t understand, coughing up user-visible noise instead.

If IRC clients were to properly understand CTCP, the instance labels might be encoded by the introduction of a new command which means “the next IRC message or CTCP ACTION from this client will be on instance …” and send that at the head of the IRC line in question. A similar CTCP may be defined for instance continuation messages with the same semantics as the message defined here. Done correctly, such a CTCP vocabulary would be more flexible than the protocol defined here, allowing one IRC line to carry multiple messages bound for different instances.