Full Unicode/Ansi-Support in /testing branch
Posted: 25.02.2012, 23:48
Hey guys,
I made some internal changes [s]in my private testing-egonhugeist branch[/s]...
I want to inform you about my little changes and the new behavior of Zeos7.
First: where to get the stuff?
Use TortoiseSVN with URL:
http://svn.code.sf.net/p/zeoslib/code-0 ... es/testing
Use the "how to" from MDaems on http://zeos.firmos.at/viewtopic.php?t=841
Russian description: http://zeos.firmos.at/viewtopic.php?p=15003#15003
Second: sorry for my English.
1. TZConnection: I've implemented a property for the ClientCodePage (Connection-CharacterSet) which automatically rearranges the ZConnection.Properties and does not use metadata information. This also sets the client-side CharacterSet on the server and informs Zeos7 about the encoding the server expects or sends.
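A minimal sketch of point 1, assuming the testing branch is installed. Only the ClientCodePage property comes from this post; the helper name OpenUtf8Connection and the value 'UTF8' are assumptions for illustration, and the valid character set names depend on the protocol you use.

```pascal
unit ClientCodePageSketch;

interface

uses
  ZConnection;

// Hypothetical helper: creates a connection whose client character set
// is chosen before connecting. The caller owns the returned object.
function OpenUtf8Connection(const AProtocol, ADatabase: String): TZConnection;

implementation

function OpenUtf8Connection(const AProtocol, ADatabase: String): TZConnection;
begin
  Result := TZConnection.Create(nil);
  try
    Result.Protocol := AProtocol;     // a protocol name your Zeos build lists
    Result.Database := ADatabase;
    Result.ClientCodePage := 'UTF8';  // assumed name, depends on the protocol
    Result.Connect;
  except
    Result.Free;
    raise;
  end;
end;

end.
```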
2. TZConnection.PreprepareSQL: Boolean:
This is a query disassemble/reassemble step for you. Let me explain.
The purpose of this property:
If you connect with Delphi and choose a Unicode CharacterSet (TZConnection.ClientCodePage: String), then without this property you must encode all your string values manually in your code.
An example:
For Delphi7-2007/FPC you have to use:
ZQuery1.SQL.Text := 'insert into table1 (Field1,..) values (''' + AnsiToUTF8('öäüßÄÜÖ') + ''',...)';
For Delphi2009-XE2 you have to use:
ZQuery1.SQL.Text := 'insert into table1 (Field1,..) values (''' + UTF8Encode('öäüßÄÜÖ') + ''',...)';
If TZConnection.PreprepareSQL = True then you do not need this any longer. ZQuery1.SQL.Text := 'insert into table1 (Field1,..) values (''öäüßÄÜÖ'',...)'; works perfectly.
For Delphi7-2007 I'm able to detect whether your string values are already encoded.
For Delphi2009-XE2 I have to pass. I don't know a 100% safe way to detect 1-byte encoded strings inside a UnicodeString. So I have to say: be careful with existing sources!
The current state has ONE problem: if you use CharacterSet/Collation-specific prefixes in your statements, expect trouble. This is the second reason why I left the PreprepareSQL step optional. Now you can port FPC projects to Ansi Delphi compilers and on to Unicode Delphi compilers. I see no real problems here....
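To illustrate why such detection is possible on 1-byte strings but stays a heuristic, here is a sketch of a UTF-8 well-formedness check on an AnsiString. This is not the code Zeos uses; LooksLikeUtf8 is a hypothetical helper that relies only on the RTL. It skips some finer checks (for example overlong 3-byte forms), and an Ansi string can by coincidence be valid UTF-8, so treat a True result as "probably already encoded".

```pascal
program Utf8DetectSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Returns True when S contains at least one multi-byte sequence and every
// sequence is structurally valid UTF-8. Pure ASCII returns False, because
// there is nothing to decide: it is identical in both encodings.
function LooksLikeUtf8(const S: AnsiString): Boolean;
var
  I, Len, Trail: Integer;
  B: Byte;
  SawMultiByte: Boolean;
begin
  Result := False;
  SawMultiByte := False;
  Len := Length(S);
  I := 1;
  while I <= Len do
  begin
    B := Ord(S[I]);
    case B of
      $00..$7F: Trail := 0;  // ASCII
      $C2..$DF: Trail := 1;  // lead byte of a 2-byte sequence
      $E0..$EF: Trail := 2;  // lead byte of a 3-byte sequence
      $F0..$F4: Trail := 3;  // lead byte of a 4-byte sequence
    else
      Exit;                  // stray continuation byte or invalid lead byte
    end;
    if I + Trail > Len then
      Exit;                  // sequence is cut off at the end of the string
    Inc(I);
    while Trail > 0 do
    begin
      if (Ord(S[I]) and $C0) <> $80 then
        Exit;                // expected a continuation byte (10xxxxxx)
      Inc(I);
      Dec(Trail);
      SawMultiByte := True;
    end;
  end;
  Result := SawMultiByte;
end;

var
  Utf8Sample, AnsiSample: AnsiString;
begin
  // Build the samples byte by byte so the source file encoding does not matter.
  SetLength(Utf8Sample, 2);
  Utf8Sample[1] := AnsiChar($C3);  // 'ö' in UTF-8 is C3 B6
  Utf8Sample[2] := AnsiChar($B6);
  SetLength(AnsiSample, 1);
  AnsiSample[1] := AnsiChar($F6);  // 'ö' in Latin-1 / CP1252 is F6

  WriteLn(LooksLikeUtf8(Utf8Sample));  // TRUE
  WriteLn(LooksLikeUtf8(AnsiSample));  // FALSE
end.
```

Inside a UnicodeString the bytes are no longer visible as such, which is why the same trick does not carry over to Delphi2009-XE2.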
[s]Also I've implemented a property called ClientCodePageOptions. [/s]
3. Four new exported ZConnection functions:
function GetBinaryEscapeStringFromString(const BinaryString: AnsiString): String; overload;
function GetBinaryEscapeStringFromStream(const Stream: TStream): String; overload;
function GetBinaryEscapeStringFromFile(const FileName: String): String; overload;
function GetAnsiEscapeString(const Ansi: AnsiString): String;
Details: These functions mark their result so that my query preprepare step ignores it in further translations. You can also read in binary files to add them to your SQL statements. You can mark your own strings and streams, or let me prepare your file data as a detectable string.
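A usage sketch for the functions above, assuming they are reachable as methods of TZConnection with the signatures listed. The helper name InsertFileAsBlob and the column name BlobField are hypothetical. Whether the returned string already carries its own quoting is not stated here, so the sketch inserts it as-is.

```pascal
unit BlobEscapeSketch;

interface

uses
  ZConnection, ZDataset;

// Hypothetical helper: stores the content of a file in a binary column.
procedure InsertFileAsBlob(Conn: TZConnection; Query: TZQuery;
  const FileName: String);

implementation

procedure InsertFileAsBlob(Conn: TZConnection; Query: TZQuery;
  const FileName: String);
begin
  Conn.PreprepareSQL := True;  // the preprepare step should skip the marked part
  Query.Connection := Conn;
  Query.SQL.Text :=
    'insert into table1 (BlobField) values (' +
    Conn.GetBinaryEscapeStringFromFile(FileName) + ')';
  Query.ExecSQL;
end;

end.
```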
4. Changed GetString behavior of the internal IZResultSets to:
function GetString/GetStringByName(const ColumnName: string): String; The DBC layer and the component layer have the same string types now, whatever "String" maps to in your compiler.
I changed this because of some registered bug reports from people who use UTF8 connections and want to compare the results with source-code constants or Ansi files, for example. Now I prepare the data delivered by the Plaindriver in an Ansi-comparable format, which should make things easier for you. If you want to compare something in UTF8, you can use the same function and set the parameter CharEncoding to ceUTF8. I hope this will save you development time too. (Definition in ZCompatibility.pas.)
5. Property TZConnection.UTF8StringAsWideField: Boolean;
Only interesting for D7-D2007/FPC without Lazarus, and it depends on the chosen character set.
TZConnection.UTF8StringAsWideField = False:
Zeos doesn't use TWideString fields for Unicode data. If you have UTF8 controls like http://sourceforge.net/projects/utf8vcl/ or TNT, then you can enjoy real Unicode.
TZConnection.UTF8StringAsWideField = True:
Zeos now assumes TWideFields for Unicode string values. Problem: the Ansi Delphi compilers boil all WideString chars down to characters compatible with the OS Ansi code page, because the Delphi standard controls use only AnsiStrings. For D7-2007 this means automatic data loss, or some characters are displayed as '?'. This is not caused by Zeos...
6. New Zeos.inc defines:
{$DEFINE WITH_CHAR_CONTROL} enables the Char-Control-System. Currently disabled.
7. I also created a new generic object called "TAbstractCodePagedInterfacedObject":
It stores the currently chosen connection parameters, so you can open different connections with different ClientCodePages. This object also exports 3 little string-helper functions which are full of compiler directives. Here I've unified all the string handling Zeos needs internally. This means that, for string translations, compiler directives are no longer needed anywhere in the Zeos sources except in these functions. To stabilize the behavior of these functions I also need the TZCharEncoding types. You can find them in the unit ZCompatibility.pas.
function ZDbcString(const Ansi: AnsiString; const Encoding: TZCharEncoding = ceDefault): String;
function ZPlainString(const AStr: String; const Encoding: TZCharEncoding = ceDefault): AnsiString;
function ZStringW(const ws: WideString; const Encoding: TZCharEncoding = ceDefault): String;
Remember that a full-Unicode IDE handles a String as UnicodeString (a type like WideString).
These functions are now the central internal point for avoiding data loss. They do the critical Ansi-to-(Unicode)String translations and back, depending on the chosen client-side CharacterSet.
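To give an idea of what "full of compiler directives" means, here is a rough sketch of a ZDbcString-like helper. It is not the Zeos implementation: SketchDbcString and TSketchEncoding are hypothetical stand-ins, and the real functions take the connection's code page into account. The sketch only shows how one function can hide the difference between Ansi and Unicode compilers.

```pascal
program DbcStringSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical stand-in for TZCharEncoding
  TSketchEncoding = (seAnsi, seUTF8);

// Turns the raw bytes delivered by a driver into the compiler's "String".
function SketchDbcString(const Raw: AnsiString;
  Encoding: TSketchEncoding): String;
begin
{$IFDEF UNICODE}
  // Unicode compilers: String is UnicodeString, so decode the bytes.
  if Encoding = seUTF8 then
    Result := UTF8ToString(Raw)
  else
    Result := String(Raw);
{$ELSE}
  // Ansi compilers: String is AnsiString, so UTF-8 becomes the OS code page.
  if Encoding = seUTF8 then
    Result := Utf8ToAnsi(Raw)
  else
    Result := Raw;
{$ENDIF}
end;

begin
  WriteLn(SketchDbcString('abc', seUTF8));  // abc
end.
```

Callers never see the directives; they pass raw bytes and get whatever "String" is on their compiler.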
9. Changed bytea behavior of PG9+:
Since PostgreSQL 9.x the encode/decode behavior for bytea has changed. I've included this step in my patch here. (Now also available in the testing branch.)
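For background: PostgreSQL 9.0 switched the default bytea output to the hex format, which is "\x" followed by two hex digits per byte. The sketch below decodes such a value with RTL-only code. It is not the code from the patch; TryDecodeHexBytea and HexNibble are hypothetical helpers.

```pascal
program ByteaHexSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

// Returns the value of one hex digit, or -1 if C is not a hex digit.
function HexNibble(C: AnsiChar): Integer;
begin
  case C of
    '0'..'9': Result := Ord(C) - Ord('0');
    'a'..'f': Result := Ord(C) - Ord('a') + 10;
    'A'..'F': Result := Ord(C) - Ord('A') + 10;
  else
    Result := -1;
  end;
end;

// Returns True and fills Data when Value is in hex format:
// '\x' followed by an even number of hex digits.
function TryDecodeHexBytea(const Value: AnsiString;
  out Data: AnsiString): Boolean;
var
  I, Hi, Lo, Count: Integer;
begin
  Result := False;
  Data := '';
  if (Length(Value) < 2) or (Value[1] <> '\') or (Value[2] <> 'x') then
    Exit;                       // not the hex format
  if Odd(Length(Value) - 2) then
    Exit;                       // a byte needs exactly two digits
  Count := (Length(Value) - 2) div 2;
  SetLength(Data, Count);
  for I := 0 to Count - 1 do
  begin
    Hi := HexNibble(Value[3 + 2 * I]);
    Lo := HexNibble(Value[4 + 2 * I]);
    if (Hi < 0) or (Lo < 0) then
    begin
      Data := '';
      Exit;                     // invalid digit
    end;
    Data[I + 1] := AnsiChar(Hi * 16 + Lo);
  end;
  Result := True;
end;

var
  Decoded: AnsiString;
begin
  if TryDecodeHexBytea('\x48656c6c6f', Decoded) then
    WriteLn(Decoded);           // Hello
end.
```

A value that does not start with "\x" is left alone, so the older escape format can still go through a separate code path.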
10. Virtual CharacterSet "ZUTF8AsAnsi for SQLite":
Available for D7-D2007 and FPC without Lazarus. For all other IDEs we don't need this.
Here I start from the premise that your SQLite database is not used as a real Unicode database. This character set makes it possible to use SQLite with your standard Ansi controls. You don't need the TNT components any longer. Simply set TZConnection.PreprepareSQL := True and you can use SQLite as an Ansi database.
These steps and the changed behavior are hopefully ready and tested for SQLite, FireBird, PostgreSQL, MySQL, ASA12, MSSQL and Oracle.
To our Delphi users: All Unicode field types are chosen depending on the selected codepage. This means that if you choose Latin1 or another Ansi codepage, the string/stream field types are TString-/TMemoField. If you choose a Unicode codepage, the string/stream field types are TWideString-/WideMemoField...
To our D12UP users: All statements should work now. It doesn't matter whether we have a full-Unicode IDE or not. You can also use codepages other than UTF8. My modifications should make this possible without problems. I also did a lot of work to avoid possible data loss. (Currently 60 compiler warnings left.)
To all Zeos7 users: I need testers, bug reporters, developers who are interested in reliable further development of Zeos7, and some votes for the changes!
Best regards
Egonhugeist
btw: Some ideas (Firebird) for the internal changes are from marsupilami, so kudos, Jan!
And an even bigger THANK YOU to Mark for spending your rare time being my tester and "second pair of eyes"!
I attached my little patch. It shows the changes against the latest testing branch.