Non-Latin Character Sets

An obvious desideratum for an electronic publication of the Vita is the ability to represent the various non-Latin texts (Greek, Armenian, Georgian, and Slavonic) in machine-readable form. At present there are numerous ways to do this on any particular PC or Macintosh (and, to a lesser degree, on Unix machines), but none of these machines achieves this representation in a standardized fashion. Indeed, the manner of encoding and presenting any single font can vary widely depending not only on the type of computer one is using but also on the font software being employed. This is because most computers represent each character with 7 or 8 bits (a single byte). To put the matter in more general terms, one is limited to at most 256 different characters at any one time (128 in the 7-bit case). The Latin alphabet is almost always a fixed element in this situation, but the placement of the foreign characters across the remaining "open" code positions is often unique to any given piece of software. Thus Latin-based texts can be ported from one computer to another without any problems, but one can rarely, if ever, say the same for non-Latin texts.
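The portability problem described above can be made concrete with a small sketch. It decodes the same single byte under three 8-bit code pages that ship with Python (ISO 8859-1, 8859-7, and 8859-5). These three are chosen purely for illustration; they are not the encodings used for the Vita texts, and the function name `decode_everywhere` is hypothetical.

```python
# Illustrative only: these code pages are an assumption made for the example,
# not the encodings actually used in preparing the Vita texts.
CODE_PAGES = ["latin_1", "iso8859_7", "iso8859_5"]  # Western European, Greek, Cyrillic


def decode_everywhere(byte_value):
    """Decode one byte value under each code page and return the results by name."""
    raw = bytes([byte_value])
    return {name: raw.decode(name) for name in CODE_PAGES}


if __name__ == "__main__":
    # 0x41 lies in the shared 7-bit range; 0xE1 lies in the "open" upper half.
    for value in (0x41, 0xE1):
        for name, char in decode_everywhere(value).items():
            print(f"byte 0x{value:02X} under {name}: {char!r} (U+{ord(char):04X})")
```

The byte 0x41 decodes to the Latin letter A under all three code pages, while 0xE1 decodes to a different letter under each one, which is why a non-Latin text cannot be moved safely unless the receiving software happens to assume the same table.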

The development of Unicode, a 16-bit convention for encoding character sets, allows a computer to process over 65,000 characters at any one time. The convention is intended to handle every known character set and, perhaps just as important, to do so in a uniform and standardized manner.
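As a rough illustration of what a single standardized numbering provides, the sketch below looks up one letter from each of the four scripts named earlier, using Python's standard `unicodedata` module. The sample letters and the names `SAMPLES`, `SIXTEEN_BIT_LIMIT`, and `describe` are choices made for this example only. The 16-bit limit follows the description in the text; the Unicode standard has since been extended beyond 16 bits, though these sample letters all fall within the original range.

```python
import unicodedata

# One sample letter per script (an arbitrary choice for illustration).
SAMPLES = {
    "Greek": "\u03b1",     # small alpha
    "Armenian": "\u0561",  # small ayb
    "Georgian": "\u10d0",  # an
    "Cyrillic": "\u0430",  # small a, the script used for Slavonic
}

# A 16-bit convention offers 2**16 = 65,536 distinct code positions.
SIXTEEN_BIT_LIMIT = 2 ** 16


def describe(char):
    """Return the numeric code point and the standard name of a character."""
    return ord(char), unicodedata.name(char)


if __name__ == "__main__":
    for script, char in SAMPLES.items():
        code_point, name = describe(char)
        fits = code_point < SIXTEEN_BIT_LIMIT
        print(f"{script}: U+{code_point:04X} {name} (fits in 16 bits: {fits})")
```

Each letter has exactly one number and one name regardless of the machine or font software in use, in contrast to the per-software placement that the 8-bit schemes require.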

The encoding of non-Latin character sets according to uniform standards will allow the textual data we prepare to be used by scholars everywhere, just as present-day ASCII conventions allow such portability for Latin characters. Of course, one drawback at present is that few software tools exist for working with Unicode, but this is quickly changing. Most likely, far sooner than anyone would have imagined, such tools will become widely available for microcomputer applications.

All of the texts prepared for this edition of the Vita have been converted to Unicode. Once personal computers are able to make use of this encoding, all the texts we have assembled will be universally usable.
