When will the new ZEOS 7 version for Delphi 2010 be released?
I'm on it
ab, your framework seems impressive!
And it looks like you are doing (or have done) the same thing we're discussing here.
Something to discuss:
- Should the internal encoding be UTF-8 or UCS-2 (UTF-16 LE, the Delphi and Windows default)?
UTF-8 can give us almost ANSI speed (when using basic chars) while enabling Unicode when necessary, and will probably need the fewest conversions when sending data to and getting data from databases. (Many DBs use UTF-8 internally because of space usage.)
The conversion to UCS-2 on Unicode-enabled environments will be needed when sending/getting data to/from the component layer (done in memory and only once for read-only operations, so probably faster). It will require the use of an external library, like this one from ab, which seems very good (FPC support is a plus). After some tests, this is the way to go, in my opinion.
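To make the size trade-off concrete, here is a minimal sketch in Python (not Delphi, and not part of Zeos or SynCommons) that compares how many bytes the same text takes in UTF-8 and in UTF-16 LE. The sample strings and the helper name `encoded_sizes` are made up for illustration.

```python
# Compare the storage cost of UTF-8 and UTF-16 LE for a few sample strings.
# The samples are illustrative only; real database content will differ.
samples = {
    "ascii": "SELECT name FROM customer",
    "latin": "Br\u00fcssel, S\u00e3o Paulo",
    "cjk": "\u6570\u636e\u5e93",
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for label, text in samples.items():
    utf8_len, utf16_len = encoded_sizes(text)
    print(f"{label}: {len(text)} chars, "
          f"UTF-8 {utf8_len} bytes, UTF-16 LE {utf16_len} bytes")

# Converting between the two encodings is a lossless round trip.
for text in samples.values():
    via_utf8 = text.encode("utf-8").decode("utf-8")
    assert via_utf8.encode("utf-16-le").decode("utf-16-le") == text
```

Plain ASCII text, which covers most SQL, takes half the space in UTF-8, while CJK text is actually smaller in UTF-16, so the gain depends on the data.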
I'll dig deep into the code after writing more here, as I'm a bit out of the loop.
tygrys, let's keep in touch here for now? I don't use Skype yet, but I can install it, no problem. If we can talk, I can even improve my poor English! Meanwhile, I haven't received your PM yet.
The main reason for using UTF-8 is to maintain compatibility with compilers older than Delphi 2009.
UTF-8 is the easiest way to have code compile and run fast with Delphi versions prior to 2009. The WideString implementation is much slower there (on Windows it is a COM BSTR with no reference counting), and is not as native as AnsiString. The AnsiString type is handled well by both Ansi-Delphi and Unicode-Delphi compilers.
I've coded some dedicated UTF-8 functions (with a lot of speed optimization) inside the SynCommons unit, so they can be used with both Ansi-Delphi and Unicode-Delphi compilers, with no penalty.
So here is my first guess:
- low level access to the DB engines will use UTF-8, from the Client side (with conversion from a CharSet if the Client doesn't handle UTF-8, like Firebird 1.5);
- storage will use either UCS2 or UTF-8, on the Server side, depending on the preferred way of each DB;
- ZDBC will use UTF-8 encoding;
- DB components will use string encoding, i.e. AnsiString or UnicodeString depending on the Delphi compiler version used.
UTF-8 to Ansi/Unicode String conversion is very fast, especially with the functions available in my SynCommons unit.
So using UTF-8 in ZDBC won't be noticeable with high-level VCL components, whether they use AnsiString or UnicodeString.
In my framework, I'll use ZDBC only. So help will be needed for testing higher level DB components.
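The layering in the list above can be sketched roughly as follows. This is Python rather than Delphi, and the function names `client_to_zdbc` and `zdbc_to_component` are hypothetical, not Zeos or SynCommons APIs; cp1252 just stands in for whatever charset a non-UTF-8 client uses.

```python
def client_to_zdbc(raw: bytes, client_charset: str = "utf-8") -> bytes:
    """Low-level access: normalise bytes from the client library to UTF-8."""
    if client_charset.lower() in ("utf-8", "utf8"):
        return raw  # already UTF-8, nothing to convert
    # Client does not speak UTF-8: convert from its charset once, here.
    return raw.decode(client_charset).encode("utf-8")


def zdbc_to_component(utf8_value: bytes) -> str:
    """Component layer: convert the UTF-8 value to the native string type."""
    return utf8_value.decode("utf-8")


# Bytes as a cp1252-only client would deliver them (0xE9 is e-acute).
legacy_bytes = b"Caf\xe9"
utf8_value = client_to_zdbc(legacy_bytes, "cp1252")
print(utf8_value)                     # UTF-8 bytes held at the ZDBC level
print(zdbc_to_component(utf8_value))  # native string for the components
```

The point of the design is that the charset conversion happens at one boundary only; everything between the client library and the components sees UTF-8.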
No estimate yet, but I guess I'll make a first coding rush this weekend.
So we'll be able to guess how fast it could be done.
I'll probably test it first using the SQLite engine (no server to install).
Then I would need help from people with database access.
I'll need the Oracle implementation for my next project. So I'll probably install Oracle Express Edition. But I have no plans to test MSSQL, Firebird or MySQL myself.
Here is my plan:
1. Review of the existing code, to check whether the string schema I propose makes sense, which new properties need to be added, and what needs to be modified.
2. Code refactoring of the Zeos Core.
3. Code refactoring of the SQLite driver.
4. Code refactoring of the ZDBC Core.
5. Testing this code refactoring (I'll try to use unit testing).
6. Make corrections according to testing.
7. Code refactoring of other DB drivers (starting with the Oracle one).
8. Add tests and corrections.
At first it will use our SynCommons unit for string types and UTF-8 handling, which has been proven to work well with Delphi 6 up to Delphi XE.
ab wrote: If only I could come with you to see Rammstein... it's one of my dreams to go and see one of their concerts.
I'm using only fossil as SCM for now...
Where could I find hints on accessing the Zeos repository?
SVN Tutorial by Mark: http://zeos.firmos.at/viewtopic.php?t=841
Be sure to use the testing branch, the cutting edge one (for better and for worse).
- mdaems
- Zeos Project Manager
ab,
You get a big Hi from me too. For the moment I'm a little busy with other stuff too.
gto,
My weekend will be a little wild too. I'll be responsible for the Red Cross people at the following 'music' event: EuroMillions Groove City. So no coding time for me either.
ab,
Please make setting up the zeoslib test suite your first job this weekend. Use SQLite or possibly Oracle. Don't start with MSAccess, I think it'll be a nightmare. If you can get the automated build system working, please do so. You only need Apache Ant to use it, no more dependencies.
This topic is quite recent: http://zeos.firmos.at/viewtopic.php?t=2951
If you get the suite running and more than 80% of the tests pass, save the log files of one run as your baseline (I call these the 'reflogs'). That way you can compare the effect of your changes on the test suite using software like WinMerge.
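As a rough illustration of the 'reflogs' idea, the comparison done by hand in WinMerge could also be scripted. The sketch below uses Python's difflib; the log lines are invented, since the real log format of the test suite is not shown in this thread.

```python
import difflib


def compare_with_reflog(reflog_lines, new_lines):
    """Return unified-diff lines between the baseline log and a new run."""
    return list(difflib.unified_diff(
        reflog_lines, new_lines,
        fromfile="reflog", tofile="current", lineterm=""))


def changed_lines(diff_lines):
    """Keep only added/removed lines, dropping the '---'/'+++' headers."""
    return [line for line in diff_lines
            if line.startswith(("+", "-"))
            and not line.startswith(("+++", "---"))]


# Hypothetical log content: one test fixed, one regression introduced.
reflog = ["TestConnect: OK", "TestUnicodeInsert: FAILED", "TestBlob: OK"]
current = ["TestConnect: OK", "TestUnicodeInsert: OK", "TestBlob: FAILED"]

for line in changed_lines(compare_with_reflog(reflog, current)):
    print(line)
```

An empty result means the run matches the baseline; any `-`/`+` pair shows a test whose outcome changed since the reflog was saved.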
Then, I think the dbc data storage will be your next target, doing the necessary changes to use the right data storage format. (If I remember correctly this is mainly situated in the ZDbcCache unit, but I might be completely wrong about that.)
Normally this change can be done without affecting the current functionality of zeoslib, as all interaction with the cache should happen using interface methods.
After that we'll have to talk about the way we handle plain database connections (= at dbc and plain level). Will we provide a way to send a preferred charset to the database driver for the connection, or will zeoslib force each driver to use only one charset? The latter solution will be easiest to implement, I think. But I'm not sure if that's the best solution in all cases.
When allowing the users to choose the connection charset, would that influence the internal cache storage format needed?
Then the next thing to look at is probably the dbc resultset object and its methods. And probably reworking data fetching for the SQLite database driver to allow fetching non-UTF-8 encoded data from that database WITHOUT affecting the way data is fetched from other database drivers.
By that time we should feel comfortable enough to look at DbcStatements and sqlstring+(blob?) parameter handling, I hope.
Practically: I suppose you'd like to get SVN update rights soon. If patches come the way gto and I like them, this can be arranged. However, it's not something we can administer ourselves, as it's done by the firm sponsoring our repository server. So I prefer to use a short 'trial' period to see how things work out.
During that period you'll have to send the changes in relatively small portions as SVN diffs (preferably use the create patch feature of TortoiseSVN).
Just send them to a (new) specialised thread on this or the user patches forum.
Please try to organise the work step by step.
For instance, when using my proposal in this post: first the patch for the internal cache (including possible changes to 'core' to support this change). Then gto, andrevanzuydam or I can test-run-and-apply this patch. And then we can move on.
If we don't work this way we'll end up with one huge proposal from one independent developer, and applying that will become a huge risk (both practical and functional).
Concerning the branching/merge policy I use:
- the testing branch can accept all patches that do not terribly break the test suite. Please provide all patches using this branch.
- after at least a week, the changes that do not break the trunk test suite at all can be merged to trunk. So if changes to the testing branch cause some known small issues, they'll have to be fixed before merging to trunk.
Do you think we can work together like this?
Mark
BTW, don't hesitate to contact me using one of the means you can find in my profile. You can use them any time I'm online; I'll honestly tell you when it doesn't really fit my schedule.
Mark
Let's start
I installed Ant on my computer. I really don't find Java applications that easy to deploy. Such huge downloads for a command-line tool.
http://blog.synopse.info/post/2010/09/2 ... i-paradise
I checked out the test branch, and will begin my code audit.
I'll then try to make patches... But I guess the code refactoring will be huge, because almost all string types would have to be modified under my proposal.
So here is my proposal:
I'll work locally, creating a branch using fossil, not committing anything to the official SVN. This is a matter of minutes.
I'll run all supplied tests, in order to avoid most regressions, with SQLite3 at first.
About the CharSet for connection to the Servers, IMHO it should NOT be fixed in the code. It will depend on the user's preferred language. WinAnsi-1252 could fit most US/European people, but there are a lot of other language sets!
My guess is that UTF-8 should be the default, if supported. Then UCS2. Neither loses any data during conversion.
Then a custom charset, on request, for DBs which do not support UTF-8 or UCS2.
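The fallback order described above (UTF-8 if supported, then UCS2, then a custom charset on request) could be sketched like this in Python. The names `pick_connection_charset` and `is_lossless` are hypothetical, and UCS2 is approximated here by Python's utf-16-le codec.

```python
# Preference order from the post: UTF-8 first, then UCS2 (utf-16-le here).
PREFERRED = ("utf-8", "utf-16-le")


def pick_connection_charset(supported, custom=None):
    """Pick the first preferred charset the DB supports, else the custom one."""
    supported_lower = {name.lower() for name in supported}
    for name in PREFERRED:
        if name in supported_lower:
            return name
    if custom is None:
        raise ValueError("no Unicode charset supported and no custom charset given")
    return custom


def is_lossless(text, charset):
    """True if text survives an encode/decode round trip in this charset."""
    try:
        return text.encode(charset).decode(charset) == text
    except UnicodeEncodeError:
        return False


print(pick_connection_charset(["cp1252", "UTF-8"]))
print(pick_connection_charset(["cp1252"], custom="cp1252"))
# A Cyrillic letter cannot be stored in Windows-1252 without loss.
print(is_lossless("\u0416", "cp1252"), is_lossless("\u0416", "utf-8"))
```

The round-trip check shows why a single-byte charset such as WinAnsi-1252 only makes sense as an explicit, per-connection choice: text outside its repertoire is lost, while both Unicode encodings keep it.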
I did some code refactoring this weekend.
It would be pointless to post the modifications as patches.
I did a HUGE rewrite, or code refactoring, in order to properly handle UTF-8 as the root "string" type of the ZEOS core.
As I stated earlier, I used our SynCommons library for low-level handling. I.e. the RawUTF8 type is used instead of string, everywhere in the code. Only access to the VCL/RTL (like components or TStrings/TStringList) will involve a conversion to string. All conversions are very fast. Even charset conversion and date and time handling are done using fast and proven asm from SynCommons.
This will result in a very deep code refactoring of Zeos.
Perhaps some of you may be disappointed, because it will change a lot of code.
So far, I've processed ZClasses, ZMessages, ZCompatibility, ZFunctions, ZCollections, ZVariant, ZSysUtils.
Still to go!
No problem at all with big code changes. I wanted to change the Zeos code deeply some time ago but left that alone.
mdaems: I can handle the big change against the SVN tree, making standard patches! Don't worry.
I think we can join efforts on that, now that I'm "less busy" than before. My cache optimizations will get some attention too!
ab, when you feel OK sending your changes, let us know here. You can host them or send them by e-mail, and I'll make the patches and stuff.
[]'s!
I just want to be sure my changes are efficient.
So perhaps I'll make a self-hosted repository (like a fork).
Then you'll be able to take a look at that and vote if it's worth merging it to the trunk.
I don't claim to have done the definitive work on your great library.
I'm just trying to make changes in order to have it working with my mORMot framework.
But the result with UTF-8 at the lowest level of Zeos looks promising. Using UTF-8 makes it fully compatible with Delphi 6 up to Delphi XE, with no {$ifdef UNICODE} in the code and such.
I've uploaded the Core subdirectory. It works purely with UTF-8 encoding. There is no string declaration any more in those units. It calls and uses the types and functions available in SynCommons.pas to make it ready for Delphi 6 up to XE.
It compiles with Delphi 6/7. I didn't test it with newer versions. Some "inline" statements will probably fail.
It's only a start. But I've done a lot of code refactoring and rewriting. Perhaps I changed 3/4 of the code...
I didn't run the test suite yet, because the test suite has to be refactored too: it relies on the string type, not on UTF-8...
See http://synopse.info/fossil/info/7810952a31