Wait wait wait, what do you mean XMPP over TLS?
Okay, so there is a good chance that you don't know what XMPP is, unless you like reading RFCs or you have recently had to implement some sort of chat feature.
XMPP is an XML-based protocol for sending and receiving messages, as simple as that.
Streams
XMPP is just a set of simple rules for defining an XML stream. Yeah, you heard me, a stream. If you haven't run into XML streams yet, one is basically an XML document, but for things that need to go over the network.
Wait, but I have never seen xml:// in my browser. How do those streams, you know, stream?
Well for example XMPP supports:
- WebSocket
- QUIC
- TCP
- TCP/TLS
That's all nice and good until you realize that, over TCP, it's a raw byte stream: there is no framing.
Wait wait wait a moment here, what the fuck is framing and why are you dying out here for it?
Framing
If you know a bit about how strings work in C, you might know that a string like "HEY" sits in memory as your data, H, E, Y, followed by a special character, \0. That character is the null terminator.
The computer reads one byte at a time and stops as soon as it hits the \0.
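A minimal Python sketch of that read loop, just to make the idea concrete. The function name `read_c_string` and the sample bytes are made up for illustration; real C does this on raw memory, not on a Python `bytes` object.

```python
def read_c_string(memory: bytes, start: int = 0) -> bytes:
    """Collect bytes from `start` until the first null byte (the terminator)."""
    result = bytearray()
    position = start
    # Keep walking forward until we see \0. If there is no \0 at all,
    # Python raises IndexError; C would just keep reading past the end.
    while memory[position] != 0:
        result.append(memory[position])
        position += 1
    return bytes(result)


memory = b"HEY\x00leftover bytes"
print(read_c_string(memory))  # b'HEY'
```

Everything after the \0 is ignored, which is exactly the point: the marker tells the reader where to stop.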
Okay but, like, why you telling me about C you fuc-
When dealing with networking, we also need a way to tell the computer to stop reading.
Apart from including a marker, like C does with its null terminator (\0), we can also tell the computer the number of bytes to read up front, like HTTP does with its Content-Length header.
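As a rough sketch of the length-based approach, here is a tiny reader that skips the headers, picks up Content-Length, and then reads exactly that many bytes. The function name `read_http_body` and the sample response are made up for illustration, and this is nowhere near a full HTTP parser.

```python
import io


def read_http_body(stream) -> bytes:
    """Read headers up to the blank line, then exactly Content-Length bytes."""
    length = 0
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):
            break  # a blank line (or end of input) ends the headers
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return stream.read(length)


raw = b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nHEYnext response starts here"
print(read_http_body(io.BytesIO(raw)))  # b'HEY'
```

The reader never looks at the bytes after the body, because the header already told it how far to go.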
These are two ways of telling the computer where to stop reading, whether from memory or from the network, so that it doesn't continue forever.
The problem
Let me talk you little sh-
We just established that XMPP runs over several transports and, weirdly enough, what framing is. So what's the deal?
Neither XMPP/TCP nor XMPP/TCP/TLS has FRAMING.
Wait, didn't you just say that meant reading forever?
About that... You remember that XMPP is just an XML stream, right?
A lot of XML parsers are stream parsers: they read the XML as it arrives and emit events like start-tag, start-attribute, and end-tag. That means they don't need the whole document to start parsing; they just keep processing forever.
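To get a feel for that, here is a small sketch using Python's standard `xml.etree.ElementTree.XMLPullParser`, fed in awkwardly split chunks. The chunk contents and the `stream_events` helper are made up for illustration. This particular parser has no separate start-attribute event (attributes arrive with the start event), and the document is closed at the end only so the sketch terminates cleanly; a real XMPP stream would stay open.

```python
from xml.etree.ElementTree import XMLPullParser


def stream_events(chunks):
    """Feed chunks to a pull parser and collect (event, tag) pairs."""
    parser = XMLPullParser(events=("start", "end"))
    seen = []
    for chunk in chunks:
        parser.feed(chunk)
        # Drain whatever events the parser could produce so far.
        seen.extend((event, elem.tag) for event, elem in parser.read_events())
    parser.close()  # flush anything the parser was still holding back
    seen.extend((event, elem.tag) for event, elem in parser.read_events())
    return seen


# Tags are deliberately cut in half, the way a TCP read might cut them.
chunks = ["<stream><mess", "age to='bob'>hi</mes", "sage></stream>"]
for event, tag in stream_events(chunks):
    print(event, tag)
```

The parser copes with tags split across chunks on its own. When exactly each event shows up depends on the underlying parser, so the sketch only relies on their order.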
Yet, my problem is that I need to get the whole "frame".
That's a bit selfish of you not gonna lie
The solution
My technical solution went as follows:
Best case
- Pull data from the peer
- Check if the data is a valid XML fragment
- Data is valid
- Process it
Large fragment
- Pull data from the peer
- Check if the data is a valid XML fragment
- Data is invalid
- Push to a buffer
- Pull more data from the peer
- Push the data to the buffer
- Check if the buffer is valid XML
- Data is valid
- Process it
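Both cases above can be sketched in a few lines of Python. This is not my actual implementation, just an illustration under some loud assumptions: `FragmentBuffer` and `is_complete_fragment` are hypothetical names, "valid fragment" is taken to mean "parses as one well-formed element", and the sample stanzas use no namespace prefixes. It also ignores the stream's opening tag and the case where two fragments arrive in a single read.

```python
import xml.etree.ElementTree as ET


def is_complete_fragment(data: bytes) -> bool:
    """Return True if `data` parses as one well-formed XML element."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


class FragmentBuffer:
    """Hold incomplete data until it adds up to a valid fragment."""

    def __init__(self):
        self._buffer = b""

    def push(self, data: bytes):
        """Add data; return the whole fragment, or None if more is needed."""
        candidate = self._buffer + data
        if is_complete_fragment(candidate):
            self._buffer = b""  # fragment done, start fresh
            return candidate
        self._buffer = candidate  # keep it and wait for the next read
        return None


buf = FragmentBuffer()
# Best case: everything arrives in one read.
print(buf.push(b"<message to='bob'><body>hi</body></message>"))
# Large fragment: the first read is invalid on its own, so it gets buffered.
print(buf.push(b"<message to='bob'><bo"))  # None
print(buf.push(b"dy>hi</body></message>"))
```

Re-parsing the whole buffer on every read is wasteful for big fragments, which is part of why this approach gets tedious fast.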
Was it that hard?
Getting the buffering, the async reading from two streams at once, the timely processing of data, and the returning of the right data all working was very tedious. But I have learned my lesson: I won't fall into these pitfalls if I ever make a protocol.
Conclusion
Please add framing to your protocols, and please please don't try to buffer the protocols that don't have framing.