Returns its first argument as a string converted to a different encoding. The two-argument form changes the case of the string within its character set. The three-argument form converts the string from one encoding scheme to another.
The format for the $ZCONVERT() function is:
$ZCO[NVERT](expr1,expr2[,expr3])
The first expression is the string to convert. If the expression contains a code-point value that is not in the character set, $ZCONVERT() generates a run-time error.
In the two-argument form, the second expression specifies a code that determines the form of the result. In the three-argument form, the second expression specifies a code that controls the character set interpretation of the first argument. If the second expression does not evaluate to one of the codes defined for the number of arguments supplied, $ZCONVERT() generates a run-time error.
The valid (case insensitive) character codes for expr2 in the two-argument form are:
U converts the string to UPPER-CASE. "UPPER-CASE" refers to words where all the characters are converted to their "capital letter" equivalents. $ZCONVERT() leaves characters already in "capital letter" form unchanged.
L converts the string to lower-case. "lower-case" refers to words where all the letters are converted to their "small letter" equivalents. $ZCONVERT() leaves characters that are already in lower-case, or that have no lower-case equivalent, unchanged.
T converts the string to title case. "Title case" refers to a string with the first character of each word in upper-case and the remaining characters in lower-case. $ZCONVERT() leaves characters that already conform to "Title case" unchanged.
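As a quick illustration of the three codes, the following direct-mode session applies each one to the same string. The variable name s and the sample text are assumptions chosen for this sketch; the text is plain ASCII, so the results should not depend on whether the process runs in M mode or UTF-8 mode. The second call passes a lower-case "l" because the codes are case insensitive.

```mumps
GTM>set s="hello WORLD"
GTM>write $zconvert(s,"U")
HELLO WORLD
GTM>write $zconvert(s,"l")
hello world
GTM>write $zconvert(s,"T")
Hello World
```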
The optional third expression specifies a code that determines the character set of the result. If the expression does not evaluate to one of the defined codes, $ZCONVERT() generates a run-time error.
In the three-argument form, $ZCONVERT() interprets its first argument as encoded as specified by its second argument and returns a string reflecting the conversion of the first argument to the encoding of the third argument. When the second or third expression specifies "W-1252", UTF-16LE and UTF-16BE are not supported as the other encoding.
The valid (case insensitive) codes for character set encoding for expr2 and expr3 in the three-argument form are:
"UTF-8"-- a multi-byte variable length Unicode® encoding form.
"UTF-16LE"-- a multi-byte 16-bit Unicode® encoding form in little-endian; not supported for "M" or "W-1252" input or output.
"UTF-16BE"-- a multi-byte 16-bit Unicode® encoding form in big-endian; not supported for "M" or "W-1252" input or output.
"UTF-16"-- a multi-byte 16-bit Unicode® encoding form which uses the same byte order (endianness) as the current system.
"W-1252"-- a single-byte 8-bit character encoding. It's an extension to ASCII used primarily in Microsoft environments.
"M"-- a single-byte 8-bit character encoding. In $ZCONVERT, 'M' corresponds to 'W-1252'.
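The following sketch shows a round trip through the three-argument form. It assumes a process running in UTF-8 mode; the variable names u8 and u16 and the sample text are hypothetical. The second call spells the codes in lower case, which the case-insensitive matching described above should accept, and the final comparison writes 1 when the string converted back matches the original.

```mumps
GTM>set u8="Hello"
GTM>set u16=$zconvert(u8,"UTF-8","UTF-16LE")
GTM>write $zconvert(u16,"utf-16le","utf-8")=u8
1
```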
Warning: When $gtm_chset is set to UTF-8, using the "M" or "W-1252" code to specify a one-byte input or output encoding requires care in the multi-byte environment. Use caution in choosing between character- and byte-oriented functions in the surrounding code, such as between $CHAR() and $ZCHAR(). The BADCHAR setting is also a factor to keep in mind.
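As one illustration of why the choice between character- and byte-oriented functions matters, the sketch below assumes a process running in UTF-8 mode; the variable name euro is hypothetical. $ZCHAR(128) builds the single byte 0x80, which W-1252 defines as the euro sign, whereas $CHAR(128) in UTF-8 mode would instead build the multi-byte UTF-8 encoding of code point 128. After conversion, $LENGTH() counts one character while $ZLENGTH() counts the three bytes of its UTF-8 encoding; converting back yields a single byte again.

```mumps
GTM>set euro=$zconvert($zchar(128),"W-1252","UTF-8")
GTM>write $length(euro)," ",$zlength(euro)
1 3
GTM>write $zlength($zconvert(euro,"UTF-8","W-1252"))
1
```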
Note: When UTF-8 mode is enabled, GT.M uses the ICU Library to perform case conversion. As mentioned in the Theory of Operation section, the case conversion of strings occurs according to UTF-8 code-point values. This may not be the linguistically or culturally correct case conversion, for example, for names in telephone directories. Therefore, application developers must ensure that the actual case conversion is linguistically and culturally correct for their specific needs. In M mode, the two-argument form of the $ZCONVERT() function does not use the ICU Library to perform case conversion.
Example:
GTM>write $zconvert("Happy New Year","U")
HAPPY NEW YEAR
Example:
GTM>Write $zconvert("HAPPY NEW YEAR","T")
Happy New Year
Example:
GTM>Set T8="主要雨在西班牙停留在平原"
GTM>Write $Length(T8)
12
GTM>Set T16=$zconvert(T8,"UTF-8","UTF-16LE")
GTM>Write $length(T16)
%GTM-E-BADCHAR, $ZCHAR(129,137,232,150) is not a valid character in the UTF-8 encoding form
GTM>Set T16=$ZCOnvert(T16,"UTF-16LE","UTF-8")
GTM>Write $length(T16)
9
GTM>set WTOUTF8=$zconvert($ZCHAR(128),"W-1252","UTF-8")
GTM>write WTOUTF8
€
GTM>set UTF8TOW=$zconvert(WTOUTF8,"utf-8","M")
GTM>write UTF8TOW
?
In the above example, the $LENGTH() function triggers a BADCHAR error because, in UTF-8 mode, it accepts only UTF-8 encoded strings as its argument, and T16 holds UTF-16LE encoded bytes at that point.
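Where the size of a UTF-16 result is needed, a byte-oriented function can be used instead. The sketch below assumes UTF-8 mode and the same twelve-character string as the example above. $ZLENGTH() counts bytes without interpreting them as UTF-8, so it should report two bytes for each of the twelve characters rather than raising BADCHAR.

```mumps
GTM>set T16=$zconvert("主要雨在西班牙停留在平原","UTF-8","UTF-16LE")
GTM>write $zlength(T16)
24
```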