Programming Library Conventions

Java: Readers and Writers work with Unicode characters, while InputStreams and OutputStreams work with raw bytes.

A side study of mine is how developers write (and organize) libraries and then [inadequately] document them.

Consistency is a good thing, and while I’ve never seen the following fact explicitly pointed out, it does represent some extra thought on the part of the Java library authors.

With the realization that applications are not just for US English speakers, Unicode support is becoming mandatory. ASCII defines only 128 characters, and even a full 8-bit byte can distinguish only 256 values, whereas Unicode aims to cover the characters of every writing system.

Java’s strings are sequences of Unicode characters (held internally as 16-bit UTF-16 code units), not bytes. When a string is stored or transmitted, each character is encoded as one or more bytes, and how many depends on the encoding. This is why the storage size of the encoded representation is not necessarily the same as the string’s length.
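To make the length-versus-size point concrete, here is a small sketch. The class and method names (StringSizes, utf8Bytes, utf16Bytes) are my own, made up for illustration; only String.length(), String.codePointCount() and String.getBytes(Charset) come from the standard library.

```java
import java.nio.charset.StandardCharsets;

class StringSizes {
    // Bytes needed to store the string when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed when encoded as UTF-16 (big-endian, no byte-order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file pure ASCII:
        // A, e-acute, euro sign, and an emoji outside the 16-bit range.
        String[] samples = {"A", "\u00e9", "\u20ac", "\uD83D\uDE00"};
        for (String s : samples) {
            System.out.printf("U+%04X  length=%d  codePoints=%d  utf8=%d  utf16=%d%n",
                    s.codePointAt(0),
                    s.length(),                      // UTF-16 code units
                    s.codePointCount(0, s.length()), // Unicode code points
                    utf8Bytes(s),
                    utf16Bytes(s));
        }
    }
}
```

Note that length() counts UTF-16 code units, so the emoji reports a length of 2 even though it is a single code point, while the euro sign has a length of 1 but takes 3 bytes in UTF-8.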

In the Java libraries, anything named as a Reader or Writer (FileReader, BufferedWriter, and so on) works with content in terms of characters.

Anything named as an InputStream or OutputStream (FileInputStream, BufferedOutputStream, and so on) works with content in terms of raw bytes.
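The split shows up clearly when the same data is read both ways. The sketch below is only an illustration: the class BytesVersusChars and its two helper methods are hypothetical names of mine, and it assumes the bytes are UTF-8. InputStreamReader is the standard bridge that decodes a byte stream into characters.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BytesVersusChars {
    // InputStream.read() hands back one raw byte per call.
    static int countBytes(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reader.read() hands back one decoded char per call.
    // Assumption: the data is UTF-8; pass a different Charset otherwise.
    static int countChars(byte[] data) {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int n = 0;
            while (reader.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "naive cafe" with two accented letters, written with escapes.
        byte[] data = "na\u00efve caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println("bytes read by InputStream: " + countBytes(data));
        System.out.println("chars read by Reader:      " + countChars(data));
    }
}
```

The two accented letters each occupy two bytes in UTF-8, so the stream sees 12 bytes where the reader sees 10 characters.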

Knowing that this is how the library is sliced up makes it much easier to find the routine you’re looking for.
