Problem with transcoding to UTF8

Posted: 12.03.2013, 21:00
by marsupilami
Hello Zeos Team,

I had some problems when trying to create procedures in a Firebird database. I put the following code in a TZSqlProcessor:

Code: Select all

create procedure testproc as
begin
  /* this is a test for umlauts äöüß */
end ^
If the ClientCodepage of the TZConnection is set to UTF8, only the part up to the whitespace after the word "umlauts" gets into the database. Everything from the ä to the end is missing.
If the ClientCodepage is set to WIN1252, everything works as expected. So it seems that the transcoding to UTF8 does not work correctly?

My environment is Delphi XE2 and Zeos 7.0.3. Edit: the database is Firebird 2.1.5, and so is the protocol in Zeos.

If it is needed I can add a small sample program to this thread for demonstrating the problem.

Best regards,

Jan

Posted: 13.03.2013, 22:05
by EgonHugeist
marsupilami,

An example would be great! We have some tests which execute such scripts with accented characters, but it is possible that there is a remaining issue I don't know about.

Does it happen only if you have commented a line, or does it always happen?

Posted: 14.03.2013, 18:30
by marsupilami
Hello Michael,

It seems to happen only in comments. My current script looks like this:

Code: Select all

insert into testtable (ID, TEXT) values (1, 'this is a test for umlauts äöüß') ^

create procedure testproc as
begin
  insert into testtable (ID, TEXT) values (1, 'this is a second test for umlauts äöüß');
  /* this is the third test for umlauts äöüß */
  insert into testtable (ID, TEXT) values (1, 'this is the fourth test for umlauts äöüß');
end ^
The insert in the first line is done and the first line of the procedure also makes it into the database. But starting from the first umlaut in the comment, everything else is missing in the database. This only happens with UTF8. If I switch the ClientCodepage to WIN1252, everything works as expected. For your reference, I added my current test application.

Best regards,

Jan
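
One possible reading of this symptom, offered only as an assumption and not confirmed by anything in the posts so far: the comment text reaches the server as single-byte Windows-1252 bytes while the connection expects UTF-8. The following self-contained sketch (Free Pascal 3.x or a Unicode Delphi assumed; `IsContinuationByte` and `Win1252Bytes` are illustrative names, not part of Zeos) shows why such bytes do not form valid UTF-8:

```pascal
program Utf8BytesSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // The four test characters as Windows-1252 bytes (one byte per character)
  Win1252Bytes: array[0..3] of Byte = ($E4, $F6, $FC, $DF);

// Illustrative helper, not part of Zeos: True if B has the bit
// pattern 10xxxxxx that every UTF-8 continuation byte must have.
function IsContinuationByte(B: Byte): Boolean;
begin
  Result := (B and $C0) = $80;
end;

var
  U: UnicodeString;
  Utf8: RawByteString;
  I: Integer;
begin
  // Proper transcoding: U+00E4 (a-umlaut) becomes the two bytes C3 A4.
  U := WideChar($00E4);
  Utf8 := UTF8Encode(U);
  for I := 1 to Length(Utf8) do
    Write(IntToHex(Ord(Utf8[I]), 2), ' ');
  WriteLn;

  // Untranscoded bytes: $E4 announces a three-byte sequence, so the
  // next byte would have to be a continuation byte. $F6 is not one.
  WriteLn('E4 starts a 3-byte sequence: ', (Win1252Bytes[0] and $F0) = $E0);
  WriteLn('F6 is a continuation byte: ', IsContinuationByte(Win1252Bytes[1]));
end.
```

The sketch only demonstrates that the Windows-1252 byte sequence is malformed as UTF-8. How Firebird or the client library reacts to such input (error, truncation, or something else) is not shown here.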

Posted: 21.03.2013, 19:21
by marsupilami
Hello Zeos Team,

It seems I have found the bug. In ZDbcStatement.pas there is a function TZAbstractStatement.GetEncodedSQL. In that function there is the line

ttWord, ttQuotedIdentifier, ttKeyword:

Just add ttComment to it. For me this fixes the problem. A diff -uN is included in this post.
Best regards,

Jan
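
The idea behind this one-line change can be sketched in isolation. This is a minimal illustration, not the Zeos source: `TTokenType` is a hypothetical stand-in enum that reuses the token names from the post, and `IsTranscoded` is an invented helper that reports whether a token would take the branch that converts to the client code page. The `IncludeComments` flag models the state before and after adding ttComment to the case label.

```pascal
program CommentTranscodeSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical stand-in for the tokenizer's token types
  TTokenType = (ttWord, ttQuoted, ttQuotedIdentifier, ttKeyword,
                ttComment, ttWhitespace, ttSymbol);

// Illustrative helper: True when a token of this type would be
// converted to the client code page. IncludeComments = False models
// the unpatched case label, True models the label with ttComment added.
function IsTranscoded(TokenType: TTokenType; IncludeComments: Boolean): Boolean;
begin
  case TokenType of
    ttQuoted, ttWord, ttQuotedIdentifier, ttKeyword:
      Result := True;
    ttComment:
      Result := IncludeComments;
    else
      Result := False;
  end;
end;

begin
  WriteLn('Unpatched, comment transcoded: ', IsTranscoded(ttComment, False));
  WriteLn('Patched, comment transcoded:   ', IsTranscoded(ttComment, True));
  WriteLn('Keyword transcoded either way: ', IsTranscoded(ttKeyword, False));
end.
```

In the real function the patched label would simply list ttComment alongside ttWord, ttQuotedIdentifier and ttKeyword, so that comment text follows the same conversion path as those tokens.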

Posted: 20.05.2013, 16:33
by EgonHugeist
marsupilami,

Hi Jan. My house move is done now; sorry for the delay. I still have to prepare a few things in the background, and then I will start checking each forum hint and patch.

Please be patient.

Posted: 23.05.2013, 11:09
by marsupilami
Hello Michael :)

I guessed something like that was going on. I will be patient :)
Best regards,

Jan

Posted: 24.05.2013, 09:07
by EgonHugeist
marsupilami,

Jan, it seems your current revision is a little out of date. My code, which matches current SVN:

Code: Select all

function TZAbstractStatement.GetEncodedSQL(const SQL: {$IF defined(FPC) and defined(WITH_RAWBYTESTRING)}ZAnsiString{$ELSE}String{$IFEND}): ZAnsiString;
var
  SQLTokens: TZTokenDynArray;
  i: Integer;
begin
  if GetConnection.AutoEncodeStrings then
  begin
    Result := ''; //init for FPC
    SQLTokens := GetConnection.GetDriver.GetTokenizer.TokenizeEscapeBufferToList(SQL); //Disassembles the Query
    for i := Low(SQLTokens) to high(SQLTokens) do  //Assembles the Query
    begin
      case (SQLTokens[i].TokenType) of
        ttEscape:
          Result := Result + {$IFDEF UNICODE}ZPlainString{$ENDIF}(SQLTokens[i].Value);
        ttQuoted,
        ttWord, ttQuotedIdentifier, ttKeyword:
          Result := Result + ZPlainString(SQLTokens[i].Value);
        else
          Result := Result + ZAnsiString(SQLTokens[i].Value);
      end;
    end;
  end
  else
    {$IFDEF UNICODE}
    Result := ZPlainString(SQL);
    {$ELSE}
    Result := SQL;
    {$ENDIF}
end;
As you can see, your report is already handled. Confirmed?

Posted: 26.05.2013, 22:06
by marsupilami
Hello Michael,

Erm, I did not check any source code outside of this thread this evening. But the source code in your post still does not handle ttComment the same way that ttWord, ttQuotedIdentifier and ttKeyword are handled, so most probably it does not convert comments to UTF8 yet?
Or am I getting something wrong here? I will check the current SVN tomorrow.
My patch was made against the current download version.
Best regards,

Jan

Posted: 26.05.2013, 22:44
by EgonHugeist
marsupilami,

I was wrong (I must have been blind ;) ). Patch done: R2267 \testing7.1 (SVN)