Is there any "native" collation?

When upgrading a Navision native Database to MSSQL-Server naturally the question arises, what collation to use. The fifty-fifty joker leaves the Windows collations at least, since the SQL-Collations are not recommended. In order to have the application run as close as possible in the same way it did and the users notice as few changes as possible one should try to get (nearly) the same sort order of Code- and Text-Columns as before.

So, what is the collation (depending on the Country-Version of course), that matches as close as possible the sorting order in the native database for Code- and/or Text-Fields for the german DE-Version?

If this question is not as simple to answer as it looks like, where can I find information about the sorting order in the native database and/or the (collation dependent) order in SQL-Server?

The collation choice may impact how text columns sort (case-sensitive vs. case-insensitive) however it will not impact Code columns (the exception may be extend character in the codepage). Code columns have no lower case characters. In the native database a code column is sorted depending on the data it contains. If the column values contain all numeric data then they are treated and sorted as an integer else the are sorted as text. In SQL the Code columns are Varchar columns and sorted as such.

The Windows collation for:

“Afrikaans, Basque, Catalan, Dutch, English, Faeroese, German, Indonesian, Italian, Portuguese”

should give you close behaviour with your native database, checking also the Case and Accent senesitive boxes.

The biggest difference will be for Code fields that, by default, do not order as in the native database since there is no categorization of pure integer values and alpha-numeric values. This is not a collation issue but a data-type issue in using the SQL Varchar data type for code fields. It can be overcome by changing the Code field SQL Data Type property to Variant - but that can be a big change and you need to determine if it is justified for you.

Thanks for the answers, but there seems to be a little disconcern about the sql-type varchar collation-dependence. In MSDN one can read:

“Objects that use char or varchar are assigned the default collation of the database, …”,

which I interpret in such a way that Navision Code fields (varchar respectively) “suffer” from collation… - indeed I have tested this with two records in a table with a single code field primary key having the values ‘URLAUB’ and ‘URL-SONDER’ respectively. There results annother sort order if one chooses “Afrikaans, Basque, Catalan,…” windows-collation as if one chooses “Western-European dictionary sort order, code page 1252, case-sensitive, accent-sensitive (51)” sql-collation, e.g. - the first collation gives the same sorting as in native Navision, namely URLAUB first and URL-SONDER second, the latter sql-collation gives the reversed sorting…

Nevertheless the choice “Afrikaans, Basque, Catalan,…” seems to be indeed a good approximation to the “native” sort order (besides the sorting of purely numeric code-values of course)

The collation setting that you choose from Navision is a database-wide setting that will affect all character data types in the database - meaning Text and Code fields from Navision’s view. Navision does not make use of column-level collations, although this is the granularity provided by SQL Server.

It is generally true that ordering and comparison will differ between SQL and Windows collations, for the same code page and case/accent sensitivity. Although you have found that the Windows collation is closer to what you would like, I know that in many cases SQL collations have been prefered for one reason or another.

In general, Windows collations are a better choice because they use the operating system’s API for comparisons (or a snapshot of it at least) which in turn, many common desktop applications also use - so it is more standard. Added to this, SQL collations are likely to have a shorter future.

I was told be several people that it was better to use the SQL collations, especially if you had a mixed setup with many different character sets in the database (ie. mixing latin, thai and chinese characters). What’s your comment to that?

I have replied to your other post about collations, but basically the point is that since both are codepage dependent they suffer from the same code page mismatch problems where extended characters will not convert as expected if you mix code pages.

SQL collations are simpler in terms of the dictionary rules incorporated into them, such as the equality of charcters such as “Å” being equal to “AA” in some dictionaries, and so on, and therefore more closely match the NAV Native database server (which cannot enforce such rules).

Converslely, it is usually desirable to include such rules in your data comparison and ordering, and Windows collations achieve this because they are standard OS collations, as I said, used by applications in general. So I would always go for a Windows collation unless I had a specific scenario that prevented me from doing that - since the character mappings are not the same between the two, I don’t doubt that a SQL collation avoids a specific character mapping issue that a windows collation has, which affects a particular locale and data that is interesting to you.

Another point that comes to mind - NAV itself has to do some tricks when using a SQL collation. It generates all character strings internally as binary-comparable values in order to compare correctly in the client, which are often much larger than the original character strings (e.g. a 10 character string can become over 100 bytes). This is because there is no OS support to do such comparisons. For windows collations, it uses the OS CompareString() function which is much more efficient.

If I was using a database with Chinese/Japense/Korean characters, which are double-byte (when not Unicode), then its quite another situation since there is no native support for those characters and it does not make sense to talk about dictionary comparison/ordering at all. Therefore both a SQL or Windows collation are essentially the same - you can choose a binary flavour of them, which removes dictionay rules altogether - still I would choose a Windows binary collation.

Ok, I know we have two collation threads running right now. So I would like this thread to continue and the other to stop.

Here you partly answered my question in the other thread (http://dynamicsuser.net/forums/t/20642.aspx). Except what would you do if you in same database as above also had to support Danish, Greek and Turkish?

Well, Erik, you are asking for a one size fits all, and there really isn’t one for your example which draws from code page 1252, 1253 and 1254! Each allow representation of different characters (and dictionary rules). You are looking for a Unicode solution which is not there yet.

Since I don’t know the character contents of code pages in general - I suppose my answer would be to experiment with one of the code pages to see which is the best choice based on usage, probably in this case either of the Greek or Turkish would be best. The problem is you will never be able to represent all characters of the three languages because that is by definition the problem with code pages.