Page 1 of 1

Delphi (XE) + MySQL + Latin1 Troubles and Solutions

Posted: 18.03.2011, 16:56
by Lidael
In the last two weeks i moved an old Turbo 2006 project do Delphi XE with Zeos 7.0 Alpha from SVN.
The project uses Zeos to connect to an MySQL Database with charset latin1. All worked fine, until i realized that sending data to the database will scramble any special chars from latin1 (here: ÄÖÜäöüß). But ready already correctly stored special chars is not a problem. After some reading, and doing some tests i found out was was going on:

Regardles of the current codepage in the connection, zeos encodes all strings as UTF8. Ok, lets just set the codepage of the connection to utf8 and let the database handle the conversation from utf8 to latin1. Now Saving works fine, but the data fetched from the server is now also utf8 encoded (trying to set Params like character-set_results didn't worked), but Zeos won't do the UTF8 decoding. Shit.

Ok lets do a quick try with an utf8 databse and charset=utf8. woot, all works like a breeze. Getting some information about converting the database to utf8 brings out that i either can do it with only one command per table, but then maybe having changed fieldtypes and sizes, but that was not what i want, or changing each field in every table one by one. Hell no.
Although i wasn't quite sure about the other applications (PHP, ...) accessing these database will work without any modifications.

After some searching here in the forums i found the following solution:
1.) from http://zeos.firmos.at/viewtopic.php?p=12092#12092

In File src/dbc/ZDbcCache.pas in TZRowAccessor.SetUnicodeString after Line 2127 (FBuffer.Columns[FColumnOffsets[ColumnIndex - 1]] := 0;) add the following

Code: Select all

if DetectUTF8Encoding(Value) = etUTF8 then
  Value := UTF8Decode(Value);
(You need to add the Unit WideStrUtils)

That solves all CHAR and VARCHAR-Fields, but not TEXT Fields

2.) from http://zeos.firmos.at/viewtopic.php?p=12834#12834

In File src/dbc/ZDbcMySqlResultSet.pas in TZMySQLResultSet.GetBlob(ColumnIndex: Integer) changing from:

Code: Select all

function TZMySQLResultSet.GetBlob(ColumnIndex: Integer): IZBlob;
var
  Stream: TStream;
begin
{$IFNDEF DISABLE_CHECKING}
  CheckBlobColumn(ColumnIndex);
{$ENDIF}
  Stream := nil;
  try
    if not IsNull(ColumnIndex) then
    begin
      Stream := TStringStream.Create(UTF8ToString(GetString(ColumnIndex)));
      Result := TZAbstractBlob.CreateWithStream(Stream)
    end
    else
      Result := TZAbstractBlob.CreateWithStream(nil);
  finally
    if Assigned(Stream) then
      Stream.Free;
  end;
end;
to:

Code: Select all

function TZMySQLResultSet.GetBlob(ColumnIndex: Integer): IZBlob;
var
  Stream: TStream;
  Value: AnsiString;
begin
{$IFNDEF DISABLE_CHECKING}
  CheckBlobColumn(ColumnIndex);
{$ENDIF}
  Stream := nil;
  try
    if not IsNull(ColumnIndex) then
    begin
      Value := GetString(ColumnIndex);
      if DetectUTF8Encoding(Value) = etUTF8 then
        Value := UTF8Decode(Value);
      Stream := TStringStream.Create(Value);
      Result := TZAbstractBlob.CreateWithStream(Stream)
    end
    else
      Result := TZAbstractBlob.CreateWithStream(nil);
  finally
    if Assigned(Stream) then
      Stream.Free;
  end;
end;
Thats all to get it working with a latin1 (maybe even other types) and connection.Property charset=utf8.
According to my little test programm it even won't break utf8 databases.

Hope this helps someone migrating to 7

EDIT: its codepage=UTF8 NOT charset=UTF8, sry

Posted: 22.06.2011, 09:20
by Lilive
Hello
Thanks for your post. It helped me very much :!:
I have modified the source code of Zeoslib like above and I have recompiled Zeoslib.

but it don't work with ZConnection1.Property charset=utf8.
The special chars from latin1 aren't displayed

It work fine with ZConnection1.Property Codepage=utf8.
all special chars from latin1 are correctly displayed when they are send from the MySQL DB. :)

When I sends a request with special chars from latin1,

Code: Select all

'SELECT name FROM t_family WHERE name LIKE "%bébé%"'
i have this error :

Code: Select all

'SQL Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like''
I solved this problem with the mention of character set in the request :

Code: Select all

'SELECT name FROM t_family WHERE name LIKE _latin1"%bébé%"'
And it work fine 8) .

My config : DEPHI XE
+ ZEOSLIB_TESTING_REV878 modified like above
+ MySQL 5.1.57 with character set latin1 collate latin1_swedish_ci by default

Posted: 24.06.2011, 12:24
by Lidael
Lilive wrote:Hello
but it don't work with ZConnection1.Property charset=utf8.
The special chars from latin1 aren't displayed

It work fine with ZConnection1.Property Codepage=utf8.
all special chars from latin1 are correctly displayed when they are send from the MySQL DB. :)
Oh, my fault. Its codepage, not charset, i will edit it in the initial post.

My settings are:
compression=true
codepage=UTF8
MYSQL_OPT_RECONNECT=true

Delphi XE (Prof), ZEOS-SVN trunk Rev 875, MySQL 5.1.41, collate, latin_general_ci or latin_swedish_ci (default).

I just had a LIKE query test with %é% and a latin1_general_ci Table AND Column, and i didn't got the error.

I'm not sure, if it it something introduced in ZEOS after Rev 875, or something related to the mysql server (or dll?) version, but mine is older, so this shouldn't be the problem. And i'm sure the collation of the single colums is also right.
The codepage should normally only tell mysql in which chartset the communication client<->server is handled, and mysql should do the needed conversation on the server side on its own.

Posted: 25.06.2011, 18:26
by Lilive
Hello,
Oh, my fault. Its codepage, not charset, i will edit it in the initial post.
Errare humanum est ... :wink:

To the question of characters sent in a query to MySQL, I made tests.

Indeed it seems that it works, as you say, if you send a single character like "é" or "e" for example.

Code: Select all

SELECT name FROM t_table WHERE name LIKE '%ë%'

this code is accepted and returns rows containing the characters "e é è ê ë..."

but it does not work properly if we add other characters.

for example

Code: Select all

SELECT name FROM t_table WHERE name LIKE '%noë%'

This code is accepted without error but returns a MySQL result absurd. :shock:

Then there is the error message if we add one more character:

Code: Select all

SELECT name FROM t_table WHERE name like '%noël%'
This code causes the display of the following error message :

Code: Select all

'SQL Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like'' 
Currently the only way I have find to write a query that returns a valid result is the insertion of _LATIN1 before the string:

Code: Select all

SELECT name FROM t_table WHERE name LIKE _LATIN1'%noël%'
Still good Sunday 8)

Posted: 27.06.2011, 08:47
by Lilive
I return to the property codepage = utf8 for TZConnection component.
Opening the connection with empty properties ZConnection, then performing a query ZQuery.SQL = SHOW VARIABLES
This gives the following values:
(Remember, I use Delphi XE + ZEOSLIB_TESTING_REV878 modified like above + MySQL 5.1.57 with character set latin1 collate latin1_swedish_ci by default.

Code: Select all

character_set_client = latin1 
character_set_connection = latin1 
character_set_database = latin1 
character_set_filesystem = binary 
character_set_results = latin1 
character_set_server = latin1 
character_set_system = utf8 
collation_connection = latin1_swedish_ci 
collation_database = latin1_swedish_ci 
collation_server = latin1_swedish_ci 
After reopening the connection with ZConnection.properties codepage = utf8,
by performing a query ZQuery.SQL = SHOW VARIABLES, we obtain the following values:

Code: Select all

character_set_client = utf8 
character_set_connection = utf8 
character_set_database = latin1 
character_set_filesystem = binary 
character_set_results = utf8 
character_set_server = latin1 
character_set_system = utf8 
collation_connection = utf8_general_ci 
collation_database = latin1_swedish_ci 
collation_server = latin1_swedish_ci 
codepage = utf8 equivalent to the "CHARACTER_SETS SET utf8".
it does not change anything on the MySQL database and applies only to the session initiated by TZConnection.
This lasts as TZConnection.connected = true.
Thus, "SET CHARACTER_SET utf8" tells the server, “future incoming messages from this client are in character set utf8.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use a SELECT statement.)
Also according to the manual of MySQL this conversion is slow ...

Concerning the sending of special characters latin1 I found a slightly different behavior between design mode and a compiled executable.
In design mode, you can only use the insertion of an introducer _LATIN1 before the chain as I wrote above.

With the following code:

Code: Select all

TForm1.FormCreate procedure (Sender: TObject); 
begin 
  with Form1.ZConnection1 do begin 
  Connected: = False; 
  Database: = 'MyBase'; 
  HostName = 'localhost'; 
  Password = 'secret'; 
  Port: = 3306; 
  Properties.Clear; 
  Properties.Add ('codepage = utf8'); 
  Connected: = True; 
  end; 
end; 

procedure TForm1.Button1Click (Sender: TObject); 
begin 
  with Form1.ZReadOnlyQuery1 do begin 
    Active: = False; 
    SQL.Clear; 
    SQL.Add ('SELECT name FROM t_family WHERE name LIKE'%% '+ + Form1.Edit1.Text'%%"'); 
    Open; 
    end; 
end; 
If written in the Edit1 "noel" (without accent) MySQL returns a valid result
If written in the Edit1 "ë", "noë" or "noël" was the error 'SQL Error: Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like''

By altering the request :

Code: Select all

SQL.Add ('SELECT name FROM WHERE name LIKE t_family _LATIN1"%%' + + Form1.Edit1.Text'%%"'); 
Hopefully

but it can also encode the string from the Edit1 in UTF8 :

Code: Select all

SQL.Add ('SELECT name FROM t_family WHERE name LIKE '%% '+ UTF8Encode(Form1.Edit1.Text) +'%%"'); 
and it works fine 8)

Here, with the changes to the source code for Zeoslib by Lidael and some guidance above, can handle read/write a MySQL database whose tables are encoded LATIN1, from Delphi XE and Zeoslib components in version 7. :D

Concerning the UTF-Problems

Posted: 04.10.2011, 15:07
by EgonHugeist