
Why can't "[the brackets] “[...] be transmitted over news wires.”"?


Because they use(d) a reduced character set that included parentheses but not brackets.

See https://en.wikipedia.org/wiki/Baudot_code#ITA_2_and_US-TTY

There was a time before ASCII, you know...
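To make the limitation concrete, here is a minimal sketch of how a 5-bit code with shift states works. The two tables and the shift code values are hypothetical placeholders for illustration, not the real ITA2 assignments (see the Wikipedia link for those). The point is only that 5 bits give 32 patterns, so a second "figures" page is reached through a shift code, and anything on neither page simply has no encoding.

```python
# Toy 5-bit shifted code. The tables below are hypothetical,
# NOT the real ITA2 assignments.
LTRS, FIGS = 31, 27                      # shift codes (assumed values)
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # letters page: codes 0..26
FIGURES = "0123456789()-.,:?'/ "         # figures page: parentheses, no brackets
TABLES = {LTRS: LETTERS, FIGS: FIGURES}

def encode(text):
    """Return a list of 5-bit code values, inserting shift codes as needed."""
    shift = LTRS  # assume the receiver starts on the letters page
    out = []
    for ch in text.upper():
        if ch not in TABLES[shift]:
            other = FIGS if shift == LTRS else LTRS
            if ch not in TABLES[other]:
                raise ValueError(f"no code for {ch!r}")
            shift = other
            out.append(shift)  # tell the receiver to switch pages
        out.append(TABLES[shift].index(ch))
    return out

def decode(codes):
    """Inverse of encode: track the current page while reading codes."""
    shift = LTRS
    out = []
    for c in codes:
        if c in (LTRS, FIGS):
            shift = c
        else:
            out.append(TABLES[shift][c])
    return "".join(out)
```

With these toy tables, `encode("AP (1)")` round-trips through `decode`, while `encode("[")` raises `ValueError` because no page has a slot for it.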


But this only punts the question forward: why do they still use this reduced character set, when most (if not all) of their data is transmitted over the internet?


I have no idea whether they still use this reduced character set. But whether or not they do, this could still affect the protocol or stylebook, for the usual sorts of reasons when legacy systems are involved:

* News wires are large, old distributed networks. There may still be old equipment attached to them that natively uses this encoding; or modern software may still be using the compatible encoding because it was never practical or cost-effective to have a "flag day" to switch over.

* Even if the technical limitation has been lifted, the convention may live on as part of the folklore of the field.


I am curious too. After some research I assume the relevant standard is IPTC 7901, but I don't see a mention of excluded characters in the body of the newswire:

https://iptc.org/standards/iptc-7901/

Perhaps the convention of a limited character set dates back to the telegraph era.


I can't give a concrete reason, but I suspect they might want to be able to transmit over channels with comparatively little bandwidth/much noise. Shannon taught us that more reliable communication in the face of low bandwidth and/or high noise is only possible with a smaller symbol set.


I could see that in some extraordinarily niche scenarios, but if someone has the bandwidth to be transmitting full news articles then they have the bandwidth to use a whole 7 bits per character. And they should be compressing it too, at which point you don't need to restrict less common characters.


The set of people at the wire transmitting full articles might be different from the set publishing brief facts from low-bandwidth locations. (But it is convenient if they use the same infrastructure for it!)

I'm not convinced they would be compressing it. When you have the bandwidth for a full article, size is probably not going to be the problem, and when you don't have the bandwidth, compression is probably counterproductive.


Compression of text is basically never counterproductive as long as you use a suitable design.

Even a static Huffman tree would help, and has no overhead. An entropy encoder would do very well and only cost a few bits to flush.
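A minimal sketch of the static Huffman idea, assuming both ends agree on a fixed frequency table ahead of time so that no tree is sent with the message. The table `STATIC_FREQ` is a made-up, rough approximation of English letter weights, not taken from any wire standard.

```python
import heapq
from itertools import count

# Hypothetical fixed frequency table, agreed on in advance by both ends.
STATIC_FREQ = {
    " ": 18, "e": 10, "t": 7, "a": 6, "o": 6, "i": 6, "n": 6, "s": 5,
    "h": 5, "r": 5, "d": 3, "l": 3, "u": 2, "c": 2, "m": 2, "w": 2,
    "f": 2, "g": 2, "y": 2, "p": 1, "b": 1, "v": 1, "k": 1, ".": 1,
}

def build_code(freq):
    """Build a prefix-free code by repeatedly merging the two lightest subtrees."""
    tie = count()  # unique tiebreaker so the dicts are never compared
    heap = [(w, next(tie), {sym: ""}) for sym, w in sorted(freq.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

CODE = build_code(STATIC_FREQ)
DECODE = {bits: sym for sym, bits in CODE.items()}

def encode(text):
    """Concatenate the codewords; returns a string of '0'/'1' for readability."""
    return "".join(CODE[ch] for ch in text)

def decode(bits):
    """Read bits until they match a codeword (unambiguous because the code is prefix-free)."""
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in DECODE:
            out.append(DECODE[cur])
            cur = ""
    return "".join(out)
```

For a message like `"the wire sent the news."`, `len(encode(msg))` comes out below `7 * len(msg)`, and `decode(encode(msg))` returns the original text. A character missing from the table raises `KeyError` here; a real design would need an escape code for those.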


Compression removes redundancy. It's literally the definition of compression! And reduced redundancy is always bad when reliable transmission is a priority over small size.


If your news wire starts mangling data, you need to resend it, not guess.

And I bet you can get more reliability out of compressing and then adding error correction bits.
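A rough sketch of the "compress, then add error correction" pipeline, using `zlib` for compression and a triple-repetition code with majority voting as a stand-in for a real error-correcting code. The function names are made up for this example, and the repetition code is chosen only because it is easy to read; an actual link would use something far more efficient.

```python
import zlib

def protect(data: bytes) -> bytes:
    """Repeat every byte three times (a deliberately simple error-correcting code)."""
    return bytes(b for byte in data for b in (byte, byte, byte))

def recover(coded: bytes) -> bytes:
    """Bitwise majority vote over each group of three bytes."""
    out = bytearray()
    for i in range(0, len(coded), 3):
        a, b, c = coded[i:i + 3]
        out.append((a & b) | (a & c) | (b & c))
    return bytes(out)

def send(text: str) -> bytes:
    """Compress first, then add redundancy back in a controlled way."""
    return protect(zlib.compress(text.encode("utf-8")))

def receive(coded: bytes) -> str:
    """Correct errors first, then decompress."""
    return zlib.decompress(recover(coded)).decode("utf-8")
```

If one byte of the output of `send` is corrupted in transit, `receive` still returns the original text, because the other two copies outvote it. Two corrupted copies of the same byte would defeat this toy code, and very short messages may not shrink under `zlib` at all.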


Bit of a late reply, but resending it is a form of redundancy, and so is adding error correction bits. It is cheaper to do these things (they require less bandwidth) when you start with a limited character set.


How do journalists using a non-Latin-based alphabet do it?



