[patch_done] Delphi 2010 + MySQL + UTF8

The alpha/beta tester's forum for ZeosLib 7.0.x series

Report problems concerning our Delphi 2009+ version and new Zeoslib 7.0 features here.

This is a forum that will be removed once the 7.X version goes into stable!!

Moderators: gto, EgonHugeist, olehs

User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Post by aperger »

Hi mdaems,

Sorry for the late answer, but I have no time to do tests with MySQL.


I do test according to your idea (ZDbcMySql.pas):

Code: Select all

    { Sets a client codepage. }
    if FClientCodePage <> '' then
    begin
      {$IFDEF DELPHI12_UP}
      SQL := PAnsiChar(UTF8String(Format('SET NAMES ''%s''', [FClientCodePage])));
      {$ELSE}
      SQL := PAnsiChar(Format('SET NAMES ''%s''', [FClientCodePage]));
      {$ENDIF}
      FPlainDriver.ExecQuery(FHandle, SQL);
      CheckMySQLError(FPlainDriver, FHandle, lcExecute, SQL);
      DriverManager.LogMessage(lcExecute, FPlainDriver.GetProtocol, SQL);
    end;
And I set the collation property after the connection. It looks me OK. BUT but but…

Code: Select all

procedure TDM.SetPgSQLProperties;
begin
{$IFDEF LINUX}
  self.conDB.Properties.Values['codepage']:=codepage;
  self.conDB.Properties.Values['client_encoding']:=codepage;
{$ELSE}
  {$IFDEF UNICODE}
  if DefCodePageID = 65001 then begin
    codepage:='utf8';
  end else if DefCodePageID = 1250 then begin
    codepage:='WIN1250';
  end else begin
    codepage:='latin2';
  end;
  self.conDB.Properties.Values['codepage']:='UTF8';
  self.conDB.Properties.Values['client_encoding']:=codepage;
  {$ELSE}
  codepage:='latin2';
  self.conDB.Properties.Values['codepage']:=codepage;
  self.conDB.Properties.Values['client_encoding']:=codepage;
  {$ENDIF}
{$ENDIF}
end;


procedure TDM.SetMySQLProperties;
begin
{$IFDEF LINUX}
  self.conDB.Properties.Values['codepage']:=codepage;
  self.conDB.Properties.Values['client_encoding']:=codepage;
{$ELSE}
  {$IFDEF UNICODE}
  if DefCodePageID = 65001 then begin
    codepage:='utf8';
  end else if DefCodePageID = 1250 then begin
    codepage:='CP1250';
  end else begin
    codepage:='latin2';
  end;
  self.conDB.Properties.Values['codepage']:=codepage;
  self.conDB.Properties.Values['client_encoding']:=codepage;
  {$ELSE}
  codepage:='latin2';
  self.conDB.Properties.Values['codepage']:=codepage;
  self.conDB.Properties.Values['client_encoding']:=codepage;
  {$ENDIF}
{$ENDIF}
end;
I should set ‘UTF8’ in codepage parameter for PostgreSQL otherwise I got wrong characters. With MySQL database it is not working I should set codepage parameter according to value of ‘DefCodePageID:=DefaultSystemCodePage’;

I think we should set the same parameters for all of the databases, or the usage of these parameters are different at different SQL servers!?



Other….

I have a storage and invoice system (compiled with D2010) which based on ZeOS DBO.
This version of the program is running with PostgreSQL in production environment without any problem. But if I do test with MySQL, I have problems with BLOB fields. I can read back images from the database with ZeOS 7.0 (latest SVN TRUNK version) but I could not save correct images into the database. I think the fill of the BLOB filed parameters are wrong. I can you help me to fix it:

Code: Select all

      stAsciiStream, stUnicodeStream, stBinaryStream:
        begin
          TempBlob := DefVarManager.GetAsInterface(Value) as IZBlob;
          if not TempBlob.IsEmpty then
            Result := GetEscapeString(TempBlob.GetString)
          else
            Result := 'NULL';
Best regards
Attila
User avatar
mdaems
Zeos Project Manager
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium
Contact:

Post by mdaems »

Hi Attila,

I committed the change of 'Set charset' to 'set names' for the mysql dbc driver. At least this should be better according to the documentation.

Concerning the 'inter-database codepage compatibility' I must admit I don't have a clue what that all is about. Using a western european windows system and keyboard until D2009 I didn't have to think about it.

For the storage of blobs in mysql + D2010 I'm afraid the problem is a missing string conversion or one conversion too much in the GetEscapestring function or the Tempblob.getstring function.
I owe you one when you can find it... As long as my D2009- and fpc versions keep working I accept every reasonable patch!

Mark
Image
User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Post by aperger »

Hi Mark,

As I could find this "BUG?", I will inform you...
Thanks for the commit.

Attila
User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Test Program: BLOB fileds

Post by aperger »

Hi All,

I have created a small test program to show what is my problem with characher sets.

Here is my code snipet (attched ZIP contains the full source):

Code: Select all

  // Why the encoding is working in different way at different protocols (DB type)?
  // It is worg!
  if ContainsText(connBlobTest.Protocol, 'postgresql') then begin
    connBlobTest.Properties.Clear;
    connBlobTest.Properties.Values['client_encoding']:='WIN1250';
		connBlobTest.Properties.Values['codepage']:='UTF8';
  end else
  if ContainsText(connBlobTest.Protocol, 'mysql') then begin
    connBlobTest.Properties.Clear;
    connBlobTest.Properties.Values['client_encoding']:='UTF8';
		connBlobTest.Properties.Values['codepage']:='CP1250';
  end;
Can anybody help me?
You do not have the required permissions to view the files attached to this post.
User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Post by aperger »

The BLOB problem is NOT solved, only the PostgreSQL parts is working, why???
You do not have the required permissions to view the files attached to this post.
ibrahim
Fresh Boarder
Fresh Boarder
Posts: 12
Joined: 29.05.2009, 14:16

Post by ibrahim »

Hi to you all,
I opened a topic about Firebird-Delphi2009 and UTF8.
I noticed today this topic and since I have very similar results as those described here, I made a try creating a MySql database similar to my Firebird one.
Here are my conclusions/observations

1- in the TZAbstractRODataset.GetFieldData function changing the line
WStrCopy(Buffer, PWideChar(RowAccessor.GetUnicodeString(ColumnIndex, Result)));

to

WStrCopy(Buffer, PWideChar(UTF8Decode(RowAccessor.GetUnicodeString(ColumnIndex, Result))));

Resolves field displaying with both databases. Since, as pointed here in this topic, we can't know what effect this may have with other compilers I suggest to use (can it work?)

{$IFDEF UNICODE}
WStrCopy(Buffer, PWideChar(UTF8Decode(RowAccessor.GetUnicodeString(ColumnIndex, Result))));
{$ELSE}
WStrCopy(Buffer, PWideChar(RowAccessor.GetUnicodeString(ColumnIndex, Result)));
{$ENDIF}

2- With Firebird I can use TZquery to insert data with ExecSQL using the following sql statement
INSERT INTO LANGUAGES VALUES (-1,'انجليزي')

and I get my record inserted without any problem.
The same thing doesn't work with MySql. I get '???????' instead in the second field.

3-Any insert made with TTable will result in all charecters transformed in '?'.

I downloaded your example and I will try it with Delphi 2009.

I will be glad to make any test you may need.

Thank you all.

Ibrahim
ibrahim
Fresh Boarder
Fresh Boarder
Posts: 12
Joined: 29.05.2009, 14:16

Post by ibrahim »

Hi to you all,

I made a lot of debug today and I think/hope that I found a solution for the UTF8 problem with MySql.
This solved both reading and writing to database.

I made a lot of debug and I noticed that the problems come out when you cast a WideString to an AnsiString and pass it as a parameter.
In this case the passed string gets filled with '?' characters.

I think and hope that what I found can help figure out the rest of problems about UTF8 strings.

The changes I mad are

1- In the TZRowAccessor.SetUnicodeString procedure and after

Code: Select all

	FBuffer.Columns[FColumnOffsets[ColumnIndex - 1]] := 0;
I added the following

Code: Select all

        if DetectUTF8Encoding(Value) = etUTF8 then
           Value := UTF8Decode(Value);
This solves the read problem of UTF8 fields for both MySql and Firebird

The following two modifications solved the write problem to the database

2- In TZGenericCachedResolver.FillStatement procedure
I changed the following code

Code: Select all

     stString:
        Statement.SetString(I + 1, RowAccessor.GetString(ColumnIndex, WasNull));
      stUnicodeString:
        Statement.SetUnicodeString(I + 1,
          RowAccessor.GetUnicodeString(ColumnIndex, WasNull));
To the following one

Code: Select all

      {$IFDEF DELPHI12_UP}
      stUnicodeString,stString:
        Statement.SetUnicodeString(I + 1,
          RowAccessor.GetUnicodeString(ColumnIndex, WasNull));
      {$ELSE}
      stString:
        Statement.SetString(I + 1, RowAccessor.GetString(ColumnIndex, WasNull));
      stUnicodeString:
        Statement.SetUnicodeString(I + 1,
          RowAccessor.GetUnicodeString(ColumnIndex, WasNull));
      {$ENDIF}

3- In TZMySQLEmulatedPreparedStatement.GetEscapeString procedure
I changed the following code

Code: Select all

 {$IFDEF DELPHI12_UP}
    BufferLen := FPlainDriver.GetEscapeString(Buffer, PAnsiChar(AnsiString(Value)), Length(Value))
  else
    BufferLen := FPlainDriver.GetRealEscapeString(FHandle, Buffer, PAnsiChar(AnsiString(Value)), Length(Value));   
  {$ELSE}

To the following one

Code: Select all

  {$IFDEF DELPHI12_UP}
    BufferLen := FPlainDriver.GetEscapeString(Buffer, PAnsiChar(AnsiString(Value)), Length(Value))
  else      
      {$IFDEF UNICODE}
          BufferLen := FPlainDriver.GetRealEscapeString(FHandle, Buffer, PAnsiChar(UTF8Encode(Value)), Length(Value)*2);
      {$ELSE}
          BufferLen := FPlainDriver.GetRealEscapeString(FHandle, Buffer, PAnsiChar(AnsiString(Value)), Length(Value));
      {$ENDIF}


I will try to debug the firebird insert/modify question.
Post here if you have any idea that may save me some time debug.

Ibrahim
User avatar
mdaems
Zeos Project Manager
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium
Contact:

Post by mdaems »

ibrahim,

Can you provide full diff output or just a zip of the files you changed? I can try to pull them through the test suite to check if they don't break tests and even if they solve test failures for D2009+.

Mark
Image
User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Post by aperger »

Thanx everybody to work on this problem!

I think "ibrahim"'s modification is close to solution of "KDeveloper2".

On additional point: What is the meaning of these connection parameters? And what is the different between 'client_encoding' and 'codepage'?

Code: Select all

connBlobTest.Properties.Values['client_encoding']:='WIN1250';
connBlobTest.Properties.Values['codepage']:='UTF8';
These parameters should work on the same way for all of the database types!

Attila
ibrahim
Fresh Boarder
Fresh Boarder
Posts: 12
Joined: 29.05.2009, 14:16

Post by ibrahim »

Hi Mark,
I'm sorry but I read your post just today.
Here is in attachment the files I changed.
You will find //IBRAHIM comment in them where I changed the code.
I included also an Interbas/Firebid file that helps to solve the UFT8 problems with firebird/interbase as I wrote in this topic
http://zeos.firmos.at/viewtopic.php?t=2960

Ibrahim
You do not have the required permissions to view the files attached to this post.
User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Basic PATCH for MYSQL

Post by aperger »

Hi mdaems,

I have created a PATCH for MySQL/PostgreSQL (and maybe for other types, too!) to handle the character sets in correct way.
I think in MySQL parts contains a BUG, because there was not identified the type of character based fields correctly, because the character set property was by-passed (FieldHandle points to MYSQL_FIELD).
Why we need to identify the type of a column twice?
• in the TZMySQLResultSet.Open function (called by ConvertMySQLHandleToSQLType, it is correct NOW)
• in TZMySQLDatabaseMetadata.UncachedGetColumns function (called by ConvertMySQLTypeToSQLType).

These two functions should provide the same results! At the moment is not working properly, because the character based filed type should changing by the parameters like:
connBlobTest.Properties.Values['client_encoding']:='UTF8';
connBlobTest.Properties.Values['codepage']:= 'UTF8';


Please check my solution/plan (I do not use any IFDEF for MySQL or Delphi version but it should work)

Thanks
You do not have the required permissions to view the files attached to this post.
cnliou
Zeos Dev Team
Zeos Dev Team
Posts: 31
Joined: 11.11.2005, 12:18

Post by cnliou »

I am getting the feeling that the UNICODE issues are cross databases. What we actually need is perhaps a generic UNICODE implementation for all drivers.

File "multibyte.c" in psqlODBC is an invaluable resource for us in terms of its quality.

Please also refer to bug ID# 228:

http://zeosbugs.firmos.at/view.php?id=228

and

http://zeos.firmos.at/viewtopic.php?t=2960

Regards,

CN
User avatar
mdaems
Zeos Project Manager
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium
Contact:

Post by mdaems »

aperger,

I didn't try yet, but since what mysql version nforation_schema.collations exists?

Mark
Image
User avatar
aperger
Expert Boarder
Expert Boarder
Posts: 129
Joined: 24.08.2005, 08:24
Location: Veszprém
Contact:

Post by aperger »

Hi Mark,

I use 5.1.50 version of MySQL.

Regards,
Attila
User avatar
mdaems
Zeos Project Manager
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium
Contact:

Post by mdaems »

aperger,
I just tried your patch. (I had to change ContainsText to AnsiContainsText in one place for Delphi7 and Lazarus)
Applying the test suite to it proves the patch is quite safe.
Dbc and component test are not broken. Only 2 Bugreport tests failed:

Code: Select all


  5) Test894367: ETestFailure
     at 
      "mysql5/mysql-5: expected: <1> but was: <24>"
  6) Test894367: ETestFailure
     at 
      "mysqld5/mysqld-5: expected: <1> but was: <24>"
  7) Test894367: ETestFailure
     at 
      "mysql5p/mysql-5: expected: <1> but was: <24>"
  8) Test920589: ETestFailure
     at 
      "mysql5/mysql-5: expected: <1> but was: <24>"
  9) Test920589: ETestFailure
     at 
      "mysqld5/mysqld-5: expected: <1> but was: <24>"
 10) Test920589: ETestFailure
     at 
      "mysql5p/mysql-5: expected: <1> but was: <24>"
I must admit :I didn't look into that issue yet. Can you do it?
Here's the code:

Code: Select all

{**
  Runs a test for bug report #894367
  Wrong processing of queries with non-unique field names.
}
procedure TZTestCompMySQLBugReport.Test894367;
var
  Query: TZQuery;
begin
//  if SkipClosed then Exit;

  Query := TZQuery.Create(nil);
  try
    Query.Connection := Connection;
    Query.SQL.Text := 'SELECT a.fld1, b.fld2, 1 + 2 as fld2, a.fld2,'
      + ' c.fld1, b.fld1, c.fld2, ''xyz'' as fld1'
      + ' FROM table894367a as a, table894367b as b, table894367c as c';
    Query.Open;

    CheckEquals(Ord(ftString), Ord(Query.Fields[0].DataType));
    CheckEquals(Ord(ftFloat), Ord(Query.Fields[1].DataType));
    CheckEquals(Ord(ftLargeInt), Ord(Query.Fields[2].DataType));
    CheckEquals(Ord(ftBoolean), Ord(Query.Fields[3].DataType));
    CheckEquals(Ord(ftBlob), Ord(Query.Fields[4].DataType));
    CheckEquals(Ord(ftInteger), Ord(Query.Fields[5].DataType));
    CheckEquals(Ord(ftLargeInt), Ord(Query.Fields[6].DataType));
    CheckEquals(Ord(ftString), Ord(Query.Fields[7].DataType));

    Query.Close;

    Query.SQL.Text := 'SELECT a.*, 1 + 2 as fld2, b.*,'
      + ' c.fld1, c.fld2, ''xyz'' as fld1'
      + ' FROM table894367a as a, table894367b as b, table894367c as c';
    Query.Open;

    CheckEquals(Ord(ftString), Ord(Query.Fields[0].DataType));
    CheckEquals(Ord(ftBoolean), Ord(Query.Fields[1].DataType));
    CheckEquals(Ord(ftLargeInt), Ord(Query.Fields[2].DataType));
    CheckEquals(Ord(ftInteger), Ord(Query.Fields[3].DataType));
    CheckEquals(Ord(ftFloat), Ord(Query.Fields[4].DataType));
    CheckEquals(Ord(ftBlob), Ord(Query.Fields[5].DataType));
    CheckEquals(Ord(ftLargeInt), Ord(Query.Fields[6].DataType));
    CheckEquals(Ord(ftString), Ord(Query.Fields[7].DataType));

    Query.Close;
  finally
    Query.Free;
  end;
end;

{**
  Runs a test for bug report #920589
  Bug in method TZGenericSQLQuoteState.NextToken.
}
procedure TZTestCompMySQLBugReport.Test920589;
var
  Query: TZQuery;
begin
  if SkipClosed then Exit;

  Query := TZQuery.Create(nil);
  try
    Query.Connection := Connection;

    Query.SQL.Text := 'SELECT "aa\"aa"';
    Query.Open;

    CheckEquals(Ord(ftString), Ord(Query.Fields[0].DataType));
    CheckEquals('aa"aa', Query.Fields[0].AsString);

    Query.Close;
    
    Query.SQL.Text := 'insert delayed into log_sql'
      + ' (datum, uzivid, lockid, sqlcommand)'
      + ' values (now(), 3, 11952,'
      + ' "update global set value = \"21.3.2004 18:50:17\"'
      + ' where name = \"DAY_JOB_LAST_RUN\"")';
  finally
    Query.Free;
  end;
end;
The tests faled for Lazarus, D7, D10 and D11, not for D12.
Seems there's a problem finding the right datatypes in these extreme cases.

The 'SHOW FULL COLUMNS' you're using is supported for mysql4.1 and up, so this is OK for Zeoslib 7.

Can you provide some feedback?

Mark
Image
Locked