GT.M Version 5.2-000 Technical Bulletin
Introduction
Starting with the V5.2-000 release, GT.M provides support for Unicode™ version 5.0.0. Releases of the Unicode™ Standard and releases of ISO/IEC-10646 track each other (see http://www.unicode.org/faq/unicode_iso.html for more information).
The objective of this technical bulletin is to describe the enhancements to GT.M language features and utility programs in V5.2-000 using practical examples, discussion summaries, and best practices.
An understanding of Unicode™ and GT.M is a prerequisite to using the Unicode™-related features of GT.M. For information on Unicode™, refer to:
- The Unicode Consortium (http://www.unicode.org) develops standards in the area of internationalization, including defining the behavior and relationships between characters in Unicode.
- The Wikipedia entry on Unicode™ (http://en.wikipedia.org/wiki/Unicode) is an excellent resource on encodings, glyphs, coded character sets, code-points, surrogate characters, collation, UTF-8, and so on.
This technical bulletin has five parts:
- Theory of Operation: This section explains the philosophy behind support for Unicode™ in GT.M and summarizes the enhancements that support it. In particular, it explains that the GT.M database engine is unchanged and that the Unicode™-related functionality of GT.M is simply another way to interpret the strings of bytes stored in the database files.
- M Language: This section covers the enhancements to M Language Commands and String Processing Functions, and explains how GT.M works with the UTF-8 character set. It describes Unicode™ strings, I/O, and so on. Together with Theory of Operation and Utility Programs, this section provides the information application developers need to develop applications using Unicode™.
- Utility Programs: This section covers changes in MUPIP, DSE, and LKE.
- Discussion and Best Practices: This section discusses best practices for data interchange between the M character set and UTF-8, the limitations and maximums of V5.2-000, and ten rules for designing and developing Unicode™-based applications for deployment on GT.M.
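The central idea in Theory of Operation, that the stored bytes do not change and only their interpretation does, can be sketched outside GT.M. The following is a minimal illustration in Python, not GT.M code. The sample string and variable names are assumptions chosen for the example. It shows the same byte string measured once as bytes and once as UTF-8 characters.

```python
# Illustration only: this is Python, not GT.M, and the sample value is hypothetical.
# The bytes below stand in for a string as it might be stored in a database file.
raw = "Grüße".encode("utf-8")

# Byte-oriented interpretation: every byte counts as one unit.
byte_len = len(raw)

# UTF-8 interpretation: the same bytes, read as Unicode characters.
char_len = len(raw.decode("utf-8"))

print(byte_len, char_len)
```

The stored bytes are identical in both cases. Only the rule used to divide them into characters differs, which is why the two lengths disagree for any string that contains non-ASCII characters.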