Changing collation in a SQL database

What’s the best way to change the collation in a SQL 2005 database? Using the alter database from the NAV client or directly from SQL management studio?

I tried to do it from the NAV client, but approx. when it reach 69% complete then it stopped, as the transation log file could not grow more. The database was 18 GB and the log file had grown to 78 GB! (It was empty when I started)

How about doing a navision backup and create a new db with diff collation and restore the backup?

That also crossed my mind, but it takes quite a lot of time to restore an 18 GB database… And as our operation is basically 24/7 then I’m looking for a way to minimize the time as much as possible.

You could disable all the keys. and enable them after collation change. You can also change the recovery model to simple Simple if that helps.

Well recovery model was already changed to simple… And disabling / enabling keys is the main reason why restoring takes so much time…

As I have seen if I want to do the collation change from the SQL Management Studio, then I must first do it on the database level, and then for all individual tables in the database. Is that true?

yes, a db has a collation and then every table has its own collation.

It’s a big sql statement to change all the tables collation, I would rather do it from navision.

The other option is to do it in a small db. Turn on SQL profiler and capture all the sql statements, then do them from enterprise manager in test environment on another db, then in prod.

If you just wanted to change the collation of the Master db you could do from within SQL but on the databases I recommend NAV backup and restore to an newly created db (with same collation in master for optimal performance). 20GB should’t take more than 3-5 hours to restore having the right hardware (backup probably 1-2 hours) and performing the restore directly from the SQL server.

Well actually creating a NAV backup is not an option. A large number of different errors are coming up when trying to do a NAV backup! I think it’s due to the fact that we are running a lot of different character sets in the database already.

Yes it is not possible to do navision backup of companies where record have been entered with other codepages if windows collation.

KR

Kim

Just another reason to change to SQL collation!

Well I better gonna add some more space for that tlog file!

Using a SQL collation does not avoid codepage mismatch issues.

Both SQL collations and Windows collations (the latter as single-byte, which NAV utilizes) depend on a selected code page which by default must match between the client and server when accessing a database.

The problem of having characters that were generated by a machine running a certain code page when doing a backup or export, then restoring/importing on a machine with a different code page is common to both collation types.

It is not even a “database collation” concept - merely a code page concept. It is the same if you edit a text file with an OEM editor using, e.g. code page 437, then give that to someone opening it with Notepad using ANSI 1252. Extended characters will be different.

Since NAV does not utilize either collation types as Unicode, you need to ensure that code pages are fixed, or use characters that do not suffer under conversion.

Adding a different drive with a 100GB transaction log file did the trick. The collation is not set to the same for both server and db! And it’s possible to run the NAV backup again.

Ok, how would you then setup the server/database to allow have both ex. Danish, Greek, Turkish, Thai and Chinese characters (but in different companies and different Citrix front ends)?

The question was partly answered in http://dynamicsuser.net/forums/t/20849.aspx. Please continue the debate about what collation to use in that thread. This thread was mainly a technical issue regarding basic task of chaning the collation.

I think Erik wanted to refer to this thread: http://dynamicsuser.net/forums/t/20679.aspx which is how it can be configured allowing different languages and codepages in one central db.

That will certainly allow all Scandinavian characters and Western European, but not Greek and Turkish too.

Also, it does not capture the more complex dictionary rules I mentioned, even for the Scandinavian languages, because SQL collations are very much lacking here.

For example, “AA” is order between “A” and “B”. In the Danish dictionary, this is actually ordered after “Z” because it is equal to “Å” which is last in the alphabet. The Windows collation for Danish/Norwegian does capture this and other similar language rules.

The arguments presented there against windows collations (around extended characters converting incorrectly) are equally applicable to SQL collations, and SQL collations are inferior in representing all language rules. This is really why windows collations were introduced - to align with the OS and fully represent all language rules - for a particular locale. But they, nor SQL collations, do not unify code pages and those issues - Unicode does.

Hi Dean,

I don’t understand why you say that will not work. I have had one database the last 5 years including of course western european and US languages but also include Korean, Chinese and Turkish. It works fine with the SQL collation as described but if using windows collation it gives the noted errors (refer to the other thread).

So without being SQL expert I have tested real life examples and know what works and what does not work … I assume this is due to the different handling of codepages for SQL and Windows collation (also refer to previous thread where I refer to MS article.

I assume this database is also an OEM850 code page? Are all connecting clients using the same code page to connect or have you disabled the “Validate Code Page” check box?

I think these discussions are less about SQL and more about code page issues in general.

I don’t doubt your system is working; but we need to define working and be clear what you are doing here - if you are mixing these languages then you are not using the dictionary aforded by the collation itself, but just using it as a storage repository of bytes. (In this scenario I would choose instead a binary windows collation, as I mentioned). I am saying that there is no single code page that will provide the dictionary comparisons and ordering of these three languages correctly. I have also seen an installation using Japanese (a DBCS code page) that used a binary windows collation and therefore had no ordering semantics at all - however the binary order approximately matched the grouping of the double-byte characters in Japanese, so it was a reasonable compromise. Almost any collation could be used in that case - and I expect that’s also true of your example. You are not actually using the collation for anything, but if that meets your needs, its fine.

I don’t believe the error your are referring to in the thread is due to a windows collation but due to a code page mismatch. C/SIDE uses OEM characters for almost everything so any files (e.g. a backup) it writes are using the OEM of the machine that client is running on (e.g. 850). If you hand that file to a machine not using 850 (e.g. 437) - you will get conversion errors during the restore. This is the case for both windows and SQL collations, and therefore I don’t see this as an argument for SQL collations.

Finally, it is absolutely right that SQL collations and Windows collations are very different, as the article you refer to is stating. This is why I would always recommend a Windows collation - they capture the full semantics of the language - assuming that is important for an installation - and SQL collations do not. They are also, in terms of a single-byte system, future-proof and have dispensed with the OEM code pages altogether.

I have been asking around the last 5 years to different MS employees to find an optimal collation.

What we want is:

  • a single database / datastore for all countries (and of course validate codepage is deactivated)

  • a collation that supports the sorting of most languages (however recommend e.g. turkish, chines, korean users not to use their local characters in codefields)

  • no errros when making navision database tests and navision backups

I my scenario I have a central database with SQL collation 850 and western european/US clients as well as special clients for Chinese, Turkish and Korean. With my standard european client I can make backups and restore of all companies into testdatabases etc. without problems.

However if I perform exactly the same with the only difference that the database has windows collation Danish/Norweigan I receive following error when performing database test of trying to make a navision backup of my Chinese, Turkish and Korean companies "There is an error in the database structure …….” I have even tried then to backup the chinese company with a chinese client but still error.

That’s why I have concluded that I need to use SQL collation when having a central datastore because of the difference in the handling of non-unicode characters in Windows and SQL collations as described in the MS article.

However I am still searching for advices about optimal choise of collation (that supports most languages) for central database storing data for all countries around the world.

Hi Kim,

There is no single choice of collation for the scenario you have, as long as code pages are involved (and NAV currently is only a code page based product). This is why there are so many different collations for each locale-specific language rules. If sorting and comparison of data is not important, meaning you can tolerate inaccuracies, then the choice of collation is really arbitrary too.

The “Validate Code Page” setting was introduced to aid your kind of scenario, and is a back door - it is explicitly breaking the contract between client and server about character conversions by avoiding the code page check which drives the ANSI<–>OEM translation on the client machine.

If you use it, you are accepting possible conversion problems (or avoiding characters that have those problems, which must be difficult running Chinese, Turkish etc all in a code page of 255 characters) and also relying upon, in your case for example, the scandinavian comparison and sorting rules being applied to Chinese and so on - which is not predictable.

It is great that this setting is working for you. However, I would not generalize this into a recommendation for everyone, either to use SQL collations or to disable the “Validate Code Page” option.