Thoughts about TZCachedResultSet implementation

Posted: 13.11.2015, 12:06
by sftf
Why TZCachedResultSet exists
============================
The DB native client (library) gets data from the DB server in its own format and
places it in a contiguous memory block. The client application (Zeoslib) gets a pointer
to this memory block. The app can parse and show data from this memory area,
but it cannot modify the data in place, since the block was allocated by the DB client's code.
The DB client may or may not have functions to insert/update/delete rows directly in the received data.
For example, libpq has no such functions: all that can be done is
to free the memory block by calling libpq's function 'PQclear'.
This is where TZCachedResultSet comes in. It copies data from the memory block
into its own single format, and all manipulations of the data occur on this second copy.
So, once all rows have been copied by TZCachedResultSet, the native data is no longer needed.

How the current TZCachedResultSet works
=======================================
If the statement's concurrency is 'rcUpdatable', a 'TZCachedResultSet' is created; it copies rows from the native
resultset (NRS) into memory, and an 'IZResultSet' is returned to manage this data copy.
Copying occurs automatically when we move to some row: the records up to that row that are not in 'TZCachedResultSet' yet are copied.
So, for example, if we move to the last row, all rows from the NRS will be copied.
The data copy (the cached result set, CRS) occupies more memory than the NRS, since the native format (at least for Postgres)
is more compact and contiguous.
As a result, in the worst case (when all rows are copied) the resulting memory footprint at least doubles:
memory footprint of NRS + memory footprint of CRS.
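The copy-on-move behavior described above can be sketched roughly as follows. This is only an illustration in Python, not Zeoslib code: the class 'LazyCachedResultSet' and its methods are hypothetical names, and a tuple of tuples stands in for the read-only native block.

```python
class LazyCachedResultSet:
    """Hypothetical stand-in for a cached result set; not Zeoslib code."""

    def __init__(self, native_rows):
        # native_rows plays the role of the read-only native result set (NRS).
        self._native = tuple(tuple(row) for row in native_rows)
        # Editable copies, filled on demand as the cursor moves.
        self._cache = []

    def move_to(self, row_no):
        """Move to a 0-based row, copying every not-yet-cached row up to it."""
        if not 0 <= row_no < len(self._native):
            raise IndexError(row_no)
        while len(self._cache) <= row_no:
            self._cache.append(list(self._native[len(self._cache)]))
        return self._cache[row_no]

    def last(self):
        """Moving to the last row forces a copy of the whole NRS."""
        return self.move_to(len(self._native) - 1)

    def cached_count(self):
        return len(self._cache)
```

After 'last()' every row exists twice, once in the native data and once in the cache, which is the doubling mentioned above.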

Alternative TZCachedResultSet implementation
============================================
An alternative implementation of TZCachedResultSet is possible.
It could combine the NRS (FNativeResultSet in the picture) with a slightly modified TZCachedResultSet implementation (FCache in the picture) via an additional indirection layer.
[Attached picture: cachedresultset.png]
The new implementation maps rows to the NRS or FCache via its own list of rows (FRowList):
- unmodified rows are mapped directly to the corresponding NRS rows;
- a new row is added to FCache and mapped directly to it;
- before an update, the NRS row is copied into FCache and remapped to its new FCache location,
and the update occurs on this copy;
- before a delete, the NRS row is copied into FCache and remapped to its new FCache location,
and is then deleted from FCache and FRowList;
- rows already in FCache are updated/deleted as in the original 'TZAbstractCachedResultSet'.
All operations on rows are mapped to the corresponding NRS/FCache class methods.
Posting postponed updates to the server should work as is.
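A minimal sketch of this row-list indirection, assuming a Python stand-in rather than the real Delphi classes. The names 'MappedResultSet', '_row_list' and '_cache' are hypothetical and only mirror FRowList and FCache; clearing a deleted cache slot instead of compacting the cache is a simplification of this sketch.

```python
NATIVE, CACHE = "native", "cache"


class MappedResultSet:
    """Hypothetical sketch of the FRowList indirection; not Zeoslib code."""

    def __init__(self, native_rows):
        self._native = tuple(tuple(row) for row in native_rows)  # read-only NRS
        self._cache = []                                          # plays FCache
        # Every row starts out mapped to its native counterpart (plays FRowList).
        self._row_list = [(NATIVE, i) for i in range(len(self._native))]

    def __len__(self):
        return len(self._row_list)

    def get(self, row_no):
        where, idx = self._row_list[row_no]
        source = self._native if where == NATIVE else self._cache
        return tuple(source[idx])

    def _ensure_cached(self, row_no):
        """Copy a native row into the cache and remap it; return the cache index."""
        where, idx = self._row_list[row_no]
        if where == NATIVE:
            self._cache.append(list(self._native[idx]))
            idx = len(self._cache) - 1
            self._row_list[row_no] = (CACHE, idx)
        return idx

    def insert(self, values):
        self._cache.append(list(values))
        self._row_list.append((CACHE, len(self._cache) - 1))

    def update(self, row_no, column, value):
        self._cache[self._ensure_cached(row_no)][column] = value

    def delete(self, row_no):
        idx = self._ensure_cached(row_no)
        self._cache[idx] = None      # slot cleared, other indices stay valid
        del self._row_list[row_no]

    def cached_count(self):
        return sum(1 for row in self._cache if row is not None)
```

Reading an untouched row never copies anything; only inserts, updates and deletes make the cache grow.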

There is one challenge: the 'CompareRows' function.
The original function compares rows in the same format: 'TZRowBuffer'.
In the alternative implementation there are three cases:
- TZRowBuffer <=> TZRowBuffer;
- NRS row <=> TZRowBuffer;
- NRS row <=> NRS row.
So for the compare function I converted the NRS row to a TZRowBuffer on the fly, but this slowed
comparison down about 20 times (I compared the first row with the other 37,000 rows, 40 times).
I had no choice but to cache the converted rows, which effectively means permanently
copying them from the NRS to FCache.
And this breaks the initial assumptions about handling row copies...
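The trade-off can be illustrated with a small sketch that converts a native row once and then reuses the converted copy for later comparisons. This is an assumption-laden Python illustration, not the actual 'CompareRows' code: 'RowComparer' is a hypothetical name, and the "number|name" text rows are a made-up native format.

```python
class RowComparer:
    """Hypothetical sketch: compare native rows via cached conversions."""

    def __init__(self, native_rows):
        # Made-up native format: "number|name" text rows.
        self._native = list(native_rows)
        # Native index -> converted row (the role TZRowBuffer plays in the post).
        self._converted = {}
        self.conversions = 0

    def _to_buffer(self, idx):
        row = self._converted.get(idx)
        if row is None:
            number, name = self._native[idx].split("|", 1)
            row = (int(number), name)
            self._converted[idx] = row   # this is the permanent extra copy
            self.conversions += 1
        return row

    def compare(self, a, b):
        """Return -1, 0 or 1, like a classic compare function."""
        left, right = self._to_buffer(a), self._to_buffer(b)
        return (left > right) - (left < right)
```

Each row is converted at most once, which avoids repeated conversion cost, but every compared row ends up stored twice.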

Still, the new implementation has a significantly smaller memory footprint, with
one exception: sorting. When sorting occurs, the compared rows are copied into FCache
and the memory footprint grows.
Also (and this is expected behavior), each updated/deleted NRS row is copied into FCache and increases the memory footprint.

Now I have a working implementation based on 7.1.4-stable, and I guess I can do it for 7.2.

Re: Thoughts about TZCachedResultSet implementation

Posted: 13.11.2015, 14:20
by EgonHugeist
Hi sftf,

think about this:

Your approach would work only for NativeResultSets with a scrollable cursor.

Most plain drivers support only forward-only cursors. PostgreSQL and MySQL solve the scrollable and transactional (retaining) issues by moving all selected data into their own libs.
Oracle has supported such a cursor since V10 too (not implemented yet, I had no time), same as ADO/OleDB/ODBC,
but all these protocols cache the data on the client side to make the cursor scrollable. FireBird/ASA/SQLite/DBLIB(TDS) do NOT support scrollable cursors.
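To illustrate the point about cursors: a forward-only cursor can only hand out the next row, so mapping a row number back to a native row needs some client-side buffer. The Python sketch below is hypothetical ('ForwardOnlyCursor' and 'BufferedScrollableCursor' are invented names, not any driver's API) and assumes rows are tuples.

```python
class ForwardOnlyCursor:
    """Hypothetical forward-only cursor: rows can be fetched once, in order."""

    def __init__(self, rows):
        self._it = iter(rows)

    def fetch_next(self):
        # Returns None when the cursor is exhausted.
        return next(self._it, None)


class BufferedScrollableCursor:
    """Makes a forward-only cursor scrollable by keeping every fetched row."""

    def __init__(self, cursor):
        self._cursor = cursor
        self._seen = []   # client-side buffer of rows fetched so far

    def absolute(self, row_no):
        """Return the 0-based row, or None if the result has fewer rows."""
        if row_no < 0:
            raise IndexError(row_no)
        while len(self._seen) <= row_no:
            row = self._cursor.fetch_next()
            if row is None:
                return None
            self._seen.append(row)
        return self._seen[row_no]
```

Without the buffer there is nothing for a row list to point back into, which is why the mapping idea needs a scrollable native cursor.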

Please STOP developing with 7.1. If you want to help, just use the \testing-7.3 branch (this is definitely a behavior change and can't be applied to a component state >= Beta).

note:

I had such thoughts too. But because of the performance issues (like you've pointed out), too much work, and finally it being error-prone, I simply put them in the trash.

But don't hesitate to start. Keep the code optional.

And before doing anything: think about my thoughts.

Re: Thoughts about TZCachedResultSet implementation

Posted: 13.11.2015, 17:10
by sftf
Hi, EgonHugeist.

Ok. I completely forgot about forward-only cursors.
Could you tell me what the goals/todo for 7.3 are?

Re: Thoughts about TZCachedResultSet implementation

Posted: 04.12.2015, 13:34
by EgonHugeist
The code has changed a lot.
Initial goals of:
- 7.2: high performance, even if not everything is ready. However, 7.2 is at least twice as fast as 7.1 in most test areas.
It includes the first batch bindings (dbc layer only) and block cursor modes.
- 7.3: the same goals as 7.2, but with new drivers I wrote, such as "OleDB", which uses Microsoft OleDB and includes FPC support, so FPC makes the leap to Ole and ADO (on Windows, of course). Next (committed today) are "odbc_a" and "odbc_w", which are self-explanatory and platform-portable, except perhaps for the _a/_w suffixes: they force pure raw (single-byte encoded) strings and pure Unicode (UCS2 or UTF-16 encoded) strings, respectively.

Since 7.2 we outperform known component libs like UniDAC (in all areas) and FireDAC as well (with direct access, and, if my newly introduced TZField descendants are enabled, the same goes for FD).

However: new ideas are patched against 7.3, whereas bugfixing happens against 7.2->trunk->7.3.
I don't feel encouraged to continue maintaining 7.1, even if most of the code (that made 7.1 stable) was written by myself. In the meantime the code differs too much (and I HAVE NO TIME to do that).