7
SQL and PL/SQL Programming in a Global Environment

This chapter contains information useful for SQL programming in a globalization support environment. It includes the following topics:

Locale-Dependent SQL Functions with Optional NLS Parameters

All SQL functions whose behavior depends on globalization support conventions allow NLS parameters to be specified. These functions are:

TO_CHAR
TO_DATE
TO_NUMBER
NLS_UPPER
NLS_LOWER
NLS_INITCAP
NLSSORT

Explicitly specifying the optional NLS parameters for these functions enables the functions to be evaluated independently of the session's NLS parameters. This feature can be important for SQL statements that contain numbers and dates as string literals.

For example, the following query is evaluated correctly if the language specified for dates is AMERICAN:

SELECT last_name FROM employees WHERE hire_date > '01-JAN-1999';

Such a query can be made independent of the current date language by using a statement similar to the following:

SELECT last_name FROM employees WHERE hire_date > 
TO_DATE('01-JAN-1999','DD-MON-YYYY', 'NLS_DATE_LANGUAGE = AMERICAN');

In this way, SQL statements that are independent of the session language can be defined where necessary. Such statements are necessary when string literals appear in SQL statements in views, CHECK constraints, or triggers.

All character functions support both single-byte and multibyte characters. Except where explicitly stated, character functions operate character by character, rather than byte by byte.

The rest of this section includes the following topics:

Default Values for NLS Parameters in SQL Functions

When SQL functions evaluate views and triggers, default values from the current session are used for the NLS function parameters. When SQL functions evaluate CHECK constraints, they use the default values that were specified for the NLS parameters when the database was created.

Specifying NLS Parameters in SQL Functions

NLS parameters are specified in SQL functions as follows:

'parameter = value'

For example:

'NLS_DATE_LANGUAGE = AMERICAN'

The following NLS parameters can be specified in SQL functions:

NLS_DATE_LANGUAGE
NLS_NUMERIC_CHARACTERS
NLS_CURRENCY
NLS_ISO_CURRENCY
NLS_SORT

Table 7-1 shows which NLS parameters are valid for specific SQL functions.

Table 7-1 SQL Functions and Their Valid NLS Parameters

SQL Function	Valid NLS Parameters
`TO_DATE`	`NLS_DATE_LANGUAGE NLS_CALENDAR`
`TO_NUMBER`	`NLS_NUMERIC_CHARACTERS NLS_CURRENCY NLS_DUAL_CURRENCY NLS_ISO_CURRENCY`
`TO_CHAR`	`NLS_DATE_LANGUAGE NLS_NUMERIC_CHARACTERS NLS_CURRENCY NLS_ISO_CURRENCY NLS_DUAL_CURRENCY NLS_CALENDAR`
`TO_NCHAR`	`NLS_DATE_LANGUAGE NLS_NUMERIC_CHARACTERS NLS_CURRENCY NLS_ISO_CURRENCY NLS_DUAL_CURRENCY NLS_CALENDAR`
`NLS_UPPER`	`NLS_SORT`
`NLS_LOWER`	`NLS_SORT`
`NLS_INITCAP`	`NLS_SORT`
`NLSSORT`	`NLS_SORT`

The following examples show how to use NLS parameters in SQL functions:

TO_DATE ('1-JAN-99', 'DD-MON-YY',
   'nls_date_language = American')

TO_CHAR (hire_date, 'DD/MON/YYYY',
   'nls_date_language = French')

TO_NUMBER ('13.000,00', '99G999D99',
   'nls_numeric_characters = '',.''')

TO_CHAR (salary, '9G999D99L', 'nls_numeric_characters = '',.''
   nls_currency = '' Dfl''')

TO_CHAR (salary, '9G999D99C', 'nls_numeric_characters = ''.,''
   nls_iso_currency = Japan')

NLS_UPPER (last_name, 'nls_sort = Swiss')

NLSSORT (last_name, 'nls_sort = German')

Note:

In some languages, some lowercase characters correspond to more than one uppercase character or vice versa. As a result, the length of the output from the NLS_UPPER, NLS_LOWER, and NLS_INITCAP functions can differ from the length of the input.

Unacceptable NLS Parameters in SQL Functions

The following NLS parameters are not accepted in SQL functions except for NLSSORT:

NLS_LANGUAGE
NLS_TERRITORY
NLS_DATE_FORMAT

NLS_DATE_FORMAT is not accepted as a parameter because it can interfere with required format masks. A date format must always be specified if an NLS parameter is in a TO_CHAR or TO_DATE function. As a result, NLS_DATE_FORMAT is not a valid NLS parameter for the TO_CHAR or TO_DATE functions.

If NLS_LANGUAGE or NLS_TERRITORY is specified in the TO_CHAR, TO_NUMBER, or TO_DATE functions, then a format mask must also be specified as the second parameter of the function. For example, the following specification is legal:

TO_CHAR (hire_date, 'DD/MON/YYYY', 'nls_date_language = French')

The following specification is illegal because there is no format mask:

TO_CHAR (hire_date, 'nls_date_language = French')

The following specification is illegal because the format mask is not specified as the second parameter of the function:

TO_CHAR (hire_date, 'nls_date_language = French', 'DD/MON/YY')

Other Locale-Dependent SQL Functions

This section includes the following topics:

The CONVERT Function

The CONVERT function enables conversion of character data between character sets.

The CONVERT function converts the binary representation of a character string in one character set to another. It uses exactly the same technique as conversion between database and client character sets. Hence, it uses replacement characters and has the same limitations.

The syntax for CONVERT is as follows:

CONVERT(char, dest_char_set[, source_char_set])

source_char_set is the source character set and dest_char_set is the destination character set. If the source_char_set parameter is not specified, then it defaults to the database character set.

In client/server environments that use different character sets, use the TRANSLATE ...USING function to perform conversions instead of CONVERT. The TRANSLATE...USING function must be used if either the client or the server has NCHAR or NVARCHAR2 data.

See Also:

Oracle9i SQL Reference for more information about the CONVERT function and the TRANSLATE...USING function
"Character Set Conversion Support" for character set encodings that are used only for the CONVERT function

SQL Functions for Different Length Semantics

Oracle9i provides SQL functions that work in accordance with different length semantics. There are three groups of such SQL functions: SUBSTR, LENGTH, and INSTR. Each function in a group is based on a different kind of length semantics and is distinguished by the character or number appended to the function name.The members of each group of functions is distinguished by the character or number that is appended to the function's name. For example, SUBSTRB is based on byte semantics.

The SUBSTR functions return a requested portion of a substring. The LENGTH functions return the length of a string. The INSTR functions search for a substring in a string.

The SUBSTR functions calculate the length of a string differently. Table 7-1 summarizes the calculation methods.

Table 7-2 How the SUBSTR Functions Calculate the Length of a String

Function	Calculation Method
`SUBSTR`	Calculates the length of a string in characters based on the length semantics associated with the character set of the datatype. For example, AL32UTF8 characters are calculated in UCS-4 code units. UTF8 and AL16UTF16 characters are calculated in UCS-2 code units. A supplementary character is counted as one character in AL32UTF8 and as two characters in UTF8 and AL16UTF16. Because `VARCHAR` and `NVARCHAR` may use different character sets, `SUBSTR` may give different results for different datatypes even if two strings are identical. If your application requires consistency, then use `SUBSTR2` or `SUBSTR4` to force all semantic calculations to be UCS-2 or UCS-4, respectively.
`SUBSTRB`	Calculates the length of a string in bytes
`SUBSTR2`	Calculates the length of a string in UCS-2 code units, which is compliant with Java strings and Windows client environments. Characters are represented in UCS-2 or 16-bit Unicode values. Supplementary characters are counted as two code units.
`SUBSTR4`	Calculates the length of a string in UCS-4 code units. Characters are represented in UCS-4 or 32-bit Unicode values. Supplementary characters are counted as one code unit.
`SUBSTRC`	Calculates the length of a string in Unicode complete characters. Supplementary characters and composite characters are counted as one character.

The LENGTH and INSTR functions calculate string length in the same way, according to the character or number added to the function name.

The following examples demonstrate the differences between SUBSTR and SUBSTRB on a database whose character set is AL32UTF8.

For the string Fußball, the following statement returns a substring that is 4 characters long, beginning with the second character:

SELECT SUBSTR ('Fußball', 2 , 4) SUBSTR FROM dual;

SUBS
----
ußba

For the string Fußball, the following statement returns a substring 4 bytes long, beginning with the second byte:

SELECT SUBSTRB ('Fußball', 2 , 4) SUBSTRB FROM dual;

SUB
---

ußb

See Also:

Oracle9i SQL Reference for more information about the SUBSTR, LENGTH, and INSTR functions

LIKE Conditions for Different Length Semantics

The LIKE conditions specify a test that uses pattern-matching. The equality operator (=) exactly matches one character value to another, but the LIKE conditions match a portion of one character value to another by searching the first value for the pattern specified by the second.

LIKE calculates the length of strings in characters using the length semantics associated with the input character set. The LIKE2, LIKE4, and LIKEC conditions are summarized in Table 7-3.

Table 7-3 LIKE Conditions

Function	Description
`LIKE2`	Use when characters are represented in UCS-2 semantics. A supplementary character is considered as two code units.
`LIKE4`	Use when characters are represented in UCS-4 semantics. A supplementary character is considered as one code unit.
`LIKEC`	Use when characters are represented in Unicode complete character semantics. A composed character is treated as one code unit.

There is no LIKEB condition.

Character Set SQL Functions

Two SQL functions, NLS_CHARSET_NAME and NLS_CHARSET_ID, can convert between character set ID numbers and character set names. They are used by programs that need to determine character set ID numbers for binding variables through OCI.

Another SQL function, NLS_CHARSET_DECL_LEN, returns the length of an NCHAR column.

This section includes the following topics:

Converting from Character Set Number to Character Set Name

The NLS_CHARSET_NAME(n) function returns the name of the character set corresponding to ID number n. The function returns NULL if n is not a recognized character set ID value.

Converting from Character Set Name to Character Set Number

NLS_CHARSET_ID(text) returns the character set ID corresponding to the name specified by text. text is defined as a run-time VARCHAR2 quantity, a character set name. Values for text can be NLSRTL names that resolve to character sets that are not the database character set or the national character set.

If the value CHAR_CS is entered for text, then the function returns the ID of the server's database character set. If the value NCHAR_CS is entered for text, then the function returns the ID of the server's national character set. The function returns NULL if text is not a recognized name.

Note:

The value for text must be entered in uppercase characters.

Returning the Length of an NCHAR Column

NLS_CHARSET_DECL_LEN(BYTECNT, CSID) returns the declaration length in number of characters for an NCHAR column. BYTECNT is the byte length of the column. CSID is the character set ID of the column.

The NLSSORT Function

The NLSSORT function enables you to use any linguistic sort for an ORDER BY clause. It replaces a character string with the equivalent sort string used by the linguistic sort mechanism so that sorting the replacement strings produces the desired sorting sequence. For a binary sort, the sort string is the same as the input string.

The kind of linguistic sort used by an ORDER BY clause is determined by the NLS_SORT session parameter, but it can be overridden by explicitly using the NLSSORT function.

Example 7-1 specifies a German sort with the NLS_SORT session parameter.

Example 7-1 Specifying a German Sort with the NLS_SORT Session Parameter

ALTER SESSION SET NLS_SORT = GERMAN;
SELECT * FROM table1
ORDER BY column1;

Example 7-2 Specifying a French Sort with the NLSSORT Function

This example first sets the NLS_SORT session parameter to German, but the NLSSORT function overrides it by specifying a French sort.

ALTER SESSION SET NLS_SORT = GERMAN;
SELECT * FROM table1
ORDER BY NLSSORT(column1, 'NLS_SORT=FRENCH');

The WHERE clause uses binary comparison rather than linguistic comparison by default, but this can be overridden by using the NLSSORT function in the WHERE clause.

Example 7-3 Making a Linguistic Comparison with the WHERE Clause

ALTER SESSION SET NLS_COMP = ANSI;
SELECT * FROM table1
WHERE NLSSORT(column1, 'NLS_SORT=FRENCH')>


NLSSORT(column2, 'NLS_SORT=FRENCH');

Setting the NLS_COMP session parameter to ANSI causes the NLS_SORT value to be used in the WHERE clause.

The rest of this section contains the following topics:

NLSSORT Syntax

There are four ways to use NLSSORT:

NLSSORT(), which relies on the NLS_SORT parameter
NLSSORT(column1, 'NLS_SORT=xxxx')
NLSSORT(column1, 'NLS_LANG=xxxx')
NLSSORT(column1, 'NLS_LANGUAGE=xxxx')

The NLS_LANG parameter of the NLSSORT function is not the same as the NLS_LANG client environment setting. In the NLSSORT function, NLS_LANG specifies the abbreviated language name, such as US for American or PL for Polish. For example:

SELECT * FROM table1
ORDER BY NLSSORT(column1, 'NLS_LANG=PL');

Comparing Strings in a WHERE Clause

NLSSORT enables applications to perform string matching that follows alphabetic conventions. Normally, character strings in a WHERE clause are compared by using the binary values of the characters. One character is considered greater than another character if it has a greater binary value in the database character set. Because the sequence of characters based on their binary values might not match the alphabetic sequence for a language, such comparisons may not follow alphabetic conventions. For example, if a column (column1) contains the values ABC, ABZ, BCD, and ÄBC in the ISO 8859-1 8-bit character set, the following query returns both BCD and ÄBC because Ä has a higher numeric value than B:

SELECT column1 FROM table1 WHERE column1 > 'B';

In German, Ä is sorted alphabetically before B, but in Swedish, Ä is sorted after Z. Linguistic comparisons can be made by using NLSSORT in the WHERE clause:

WHERE NLSSORT(col) comparison_operator NLSSORT(comparison_string)

Note that NLSSORT must be on both sides of the comparison operator. For example:

SELECT column1 FROM table1 WHERE NLSSORT(column1) > NLSSORT('B');

If a German linguistic sort has been set, then the statement does not return strings beginning with Ä because Ä comes before B in the German alphabet. If a Swedish linguistic sort has been set, then strings beginning with Ä are returned because Ä comes after Z in the Swedish alphabet.

Using the NLS_COMP Parameter to Simplify Comparisons in the WHERE Clause

Comparison in the WHERE clause or PL/SQL blocks is binary by default. Using the NLSSORT function for linguistic comparison can be tedious, especially when the linguistic sort has already been specified in the NLS_SORT session parameter. You can use the NLS_COMP parameter to indicate that the comparisons in a WHERE clause or in PL/SQL blocks must be linguistic according to the NLS_SORT session parameter.

Note:

The NLS_COMP parameter does not affect comparison behavior for partitioned tables. String comparisons that are based on a VALUES LESS THAN partition are always binary.

See Also:

"NLS_COMP"

Controlling an ORDER BY Clause

If a linguistic sort is in use, then ORDER BY clauses use an implicit NLSSORT on character data. The sort mechanism (linguistic or binary) for an ORDER BY clause is transparent to the application. However, if the NLSSORT function is explicitly specified in an ORDER BY clause, then the implicit NLSSORT is not done.

If a linguistic sort has been defined by the NLS_SORT session parameter, then an ORDER BY clause in an application uses an implicit NLSSORT function. If you specify an explicit NLSSORT function, then it overrides the implicit NLSSORT function.

When the sort mechanism has been defined as linguistic, the NLSSORT function is usually unnecessary in an ORDER BY clause.

When the sort mechanism either defaults or is defined as binary, then a query like the following uses a binary sort:

SELECT last_name FROM employees 
ORDER BY last_name;

A German linguistic sort can be obtained as follows:

SELECT last_name FROM employees
ORDER BY NLSSORT(last_name, 'NLS_SORT = GERMAN');

Miscellaneous Topics for SQL and PL/SQL Programming in a Global Environment

This section contains the following topics:

SQL Date Format Masks
Calculating Week Numbers
SQL Numeric Format Masks
The Concatenation Operator
Loading External BFILE Data into LOBs

See Also:
Oracle9i SQL Reference for a complete description of format masks

SQL Date Format Masks

Several format masks are provided with the TO_CHAR, TO_DATE, and TO_NUMBER functions.

The RM (Roman Month) format element returns a month as a Roman numeral. You can specify either upper case or lower case by using RM or rm. For example, for the date 7 Sep 1998, DD-rm-YYYY returns 07-ix-1998 and DD-RM-YYYY returns 07-IX-1998.

Note that the MON and DY format masks explicitly support month and day abbreviations that may not be three characters in length. For example, the abbreviations "Lu" and "Ma" can be specified for the French "Lundi" and "Mardi", respectively.

Calculating Week Numbers

The week numbers returned by the WW format mask are calculated according to the following algorithm: int(dayOfYear+6)/7. This algorithm does not follow the ISO standard (2015, 1992-06-15).

To support the ISO standard, the IW format element is provided. It returns the ISO week number. In addition, the I, IY, IYY, and IYYY format elements, equivalent in behavior to the Y, YY, YYY, and YYYY format elements, return the year relating to the ISO week number.

In the ISO standard, the year relating to an ISO week number can be different from the calendar year. For example, 1st Jan 1988 is in ISO week number 53 of 1987. A week always starts on a Monday and ends on a Sunday. The week number is determined according the following rules:

If January 1 falls on a Friday, Saturday, or Sunday, then the week including January 1 is the last week of the previous year, because most of the days in the week belong to the previous year.
If January 1 falls on a Monday, Tuesday, Wednesday, or Thursday, then the week is the first week of the new year, because most of the days in the week belong to the new year.

For example, January 1, 1991, is a Tuesday, so Monday, December 31, 1990, to Sunday, January 6, 1991, is in week 1. Thus, the ISO week number and year for December 31, 1990, is 1, 1991. To get the ISO week number, use the IW format mask for the week number and one of the IY formats for the year.

SQL Numeric Format Masks

Several additional format elements are provided for formatting numbers:

D (decimal) returns the decimal point character.
G (group) returns the group separator.
L (local currency) returns the local currency symbol.
C (international currency) returns the ISO currency symbol.
RN (Roman numeral) returns the number as its Roman numeral equivalent.

For Roman numerals, you can specify either upper case or lower case, using RN or rn, respectively. The number being converted must be an integer in the range 1 to 3999.

The Concatenation Operator

If the database character set replaces the vertical bar | with a national character, then all SQL statements that use the concatenation operator (encoded as ASCII 124) will fail. For example, creating a procedure fails because it generates a recursive SQL statement that uses concatenation. When you use a 7-bit replacement character set such as D7DEC, F7DEC, or SF7ASCII for the database character set, then the national character which replaces the vertical bar is not allowed in object names because the vertical bar is interpreted as the concatenation operator.

The user can use a 7-bit replacement character set if the database character set is the same or compatible, that is, if both character sets replace the vertical bar with the same national character.

Loading External BFILE Data into LOBs

The DBMS_LOB PL/SQL package can load external BFILE data into LOBs. Previous releases of Oracle did not perform character set conversion before loading the binary data into CLOBs or NCLOBs. Thus the BFILE data had to be in the same character set as the database or national character set to work properly. The APIs that are introduced in Oracle9i Release 2 (9.2) allow the user to specify the character set ID of the BFILE data by using a new parameter. The APIs convert the data from the specified BFILE character set into the database character set for CLOBs or the national character set for NCLOBs. The loading takes place on the server because BFILE data is not supported on the client.

Use DBMS_LOB.LOADBLOBFROMFILE to load to BLOBs.
Use DBMS_LOB.LOADCLOBFROMFILE for load to CLOBs and NCLOBs.
See Also:

Oracle9i Supplied PL/SQL Packages and Types Reference

Oracle9i Application Developer's Guide - Large Objects (LOBs)

7 SQL and PL/SQL Programming in a Global Environment

Locale-Dependent SQL Functions with Optional NLS Parameters

Default Values for NLS Parameters in SQL Functions

Specifying NLS Parameters in SQL Functions

Table 7-1 SQL Functions and Their Valid NLS Parameters

Unacceptable NLS Parameters in SQL Functions

Other Locale-Dependent SQL Functions

The CONVERT Function

SQL Functions for Different Length Semantics

Table 7-2 How the SUBSTR Functions Calculate the Length of a String

LIKE Conditions for Different Length Semantics

Table 7-3 LIKE Conditions

Character Set SQL Functions

Converting from Character Set Number to Character Set Name

Converting from Character Set Name to Character Set Number

Returning the Length of an NCHAR Column

The NLSSORT Function

Example 7-1 Specifying a German Sort with the NLS_SORT Session Parameter

Example 7-2 Specifying a French Sort with the NLSSORT Function

Example 7-3 Making a Linguistic Comparison with the WHERE Clause

NLSSORT Syntax

Comparing Strings in a WHERE Clause

Using the NLS_COMP Parameter to Simplify Comparisons in the WHERE Clause

Controlling an ORDER BY Clause

Miscellaneous Topics for SQL and PL/SQL Programming in a Global Environment

SQL Date Format Masks

Calculating Week Numbers

SQL Numeric Format Masks

The Concatenation Operator

Loading External BFILE Data into LOBs

7
SQL and PL/SQL Programming in a Global Environment