Load BLOB fields on-demand?

The forum for ZeosLib 7.2 Report problems. Ask for help, post proposals for the new version and Zeoslib 7.2 features here. This is a forum that will be edited once the 7.2.x version goes into RC/stable!!

My personal intention for 7.2 is to speed up the internals as optimal a possible for all IDE's. Hope you can help?! Have fun with testing 7.2
Post Reply
User avatar
aehimself
Zeos Dev Team
Zeos Dev Team
Posts: 766
Joined: 18.11.2018, 17:37
Location: Hungary

Load BLOB fields on-demand?

Post by aehimself »

I am using a patched version of Zeos 7.2.4 (r5698, reports as 7.2.5-rc) with multiple databases (MySQL, Oracle and MsSQL mainly). I am writing a basic database browser application where you can enter a query and a DBGrid will display all the information.
The data link is: TZConnection <- TZQuery <- TDataSet <- TDBGrid

When I execute a simple SELECT * FROM MYTABLE:

[2019.07.31 18:57:58.166] Opening dataset...
[2019.07.31 18:57:58.166] SQL trace: Statement 6 : SELECT ID, XMLBLOB, SERIAL, LASTCHANGED, CREATED, MODIFIER, CREATOR, C_EXTERNALID, SUBSCREATOR, SUBSLASTMODIFIER FROM MYTABLE
[2019.07.31 18:57:58.182] SQL trace: Statement 6
[2019.07.31 19:57:17.548] Formatting grid...
[2019.07.31 19:57:19.825] Query returned 355 662 record(s) in 59,36 minute(s)

...if I leave the only BLOB field out:

[2019.07.31 18:57:28.141] Opening dataset...
[2019.07.31 18:57:28.141] SQL trace: Statement 5 : SELECT ID, SERIAL, LASTCHANGED, CREATED, MODIFIER, CREATOR, C_EXTERNALID, SUBSCREATOR, SUBSLASTMODIFIER FROM MYTABLE
[2019.07.31 18:57:28.265] SQL trace: Statement 5
[2019.07.31 18:57:37.820] Formatting grid...
[2019.07.31 18:57:39.820] Query returned 355 662 record(s) in 11,97 second(s)

Is there any way to force the blob fields to load only when accessed? I read something about TDataSetProvider.Options poFetchBlobsOnDemand but I can not seem to make it work:

TZConnection
^
TZQuery <- TDataSet <- TDBGrid
^
TDataSetProvider

Is there a way in Zeos to make this happen? If not, how exactly should I link the components together to make it work with TDataSetProvider?

P.s.: Sorry if it's a noob question, I don't have experience with data-aware components (this is my first project with them, actually).
Delphi 12.1, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmariadb.dll 3.3.8
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.13
- MSSQL 2012, 2019; sybdb.dll FreeTDS_2435
- SQLite 3.45.2
marsupilami
Platinum Boarder
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Load BLOB fields on-demand?

Post by marsupilami »

Hello aehimself,

do you develop that tool for a specific target database? I am asking because I started a similar project.

Anyway - TDataSetProvider will not help you there because it isn't Zeos. This functionality has to be provided by the dbc layer driver. Some of them probably already can do that and others can't because the way the database works. The dblib driver is obe example where I am sure, it can't work. Firebird might work. For the other drivers I would have to do some research.
This is the reason why I am asking this question - which database do you want to use?

Best regards,

Jan
marsupilami
Platinum Boarder
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Load BLOB fields on-demand?

Post by marsupilami »

aehimself wrote:(MySQL, Oracle and MsSQL mainly)
Sorry, I overlooked that part when I wrote my first answer. So - here is what I know:
  • MySQL: I seem to remember that MySQL loads the whole result set into the client access library. I can only assume that blob fields will be transferred from the server to the client and stored in memory. If this is true, there is no chance for delaying the loading of blobs.
  • Oracle: Honestly I have no clue. I would have to check back with EgonHugeist.
  • MsSQL:
    • dblib: If you use the dblib driver (FreeTDS, mssql), there is no chance of delaying the loading of blobs because this driver caches all data in memory. This is because we might need to query for additional metadata from the database and cannot do that before all the data that we asked for is transferred from the server.
    • ADO: Here your mileage may vary. As far as I understand the TDS protocol, one cannot delay the loading of BLOBs. But the ADO driver for MS SQL Server might use some methods that allow it to only transfer rows, Zeos really fetches. So you might have a situation where the first rows of a result set are loaded (including BLOBs) while the other rows are not (yet) fetched from the server.
The problem is - these things are the point of view from the DBC layer. The Component layer (TZQuery, TZReadOnlyQuery) might enforce the use of caches. Most databases will return a forward only, read only cursor. Since datasets are expected to be scrollable, Zeos will add an internal cache that holds rows that already were fetched from the server to ensure, we can scroll backwards to them. This cache will chace the blob fields as soon as the row of a result set gets accessed the first time.

As a rule of thumb - if you want to have a chance of using delayed loading of rows, use a read only query. Also ask your users to not query fields which they don't need.
User avatar
aehimself
Zeos Dev Team
Zeos Dev Team
Posts: 766
Joined: 18.11.2018, 17:37
Location: Hungary

Re: Load BLOB fields on-demand?

Post by aehimself »

Well, at least this answers why I found no article about this on the Internet :)
I do understand that data transfer and components are on different layers and might not be able to change the others behavior, I was just hoping there's a parameter which I can set and will work similar to ZConnection's .Ping method.

Of course minimizing the downloaded fields / nr. of records is the best solution, but noone can stop someone from entering simply "SELECT * FROM HUGETABLE" and then to complain that the application is frozen for an hour :D

Should I move all the database components in a separate thread? That way the main application window wouldn't be frozen but I'm worried how Zeos (and the data transfer itself) would react on a TerminateThread if someone presses the cancel button. Plus I doubt if it's a good practice to "feed" a VCL component some data from a separate thread...
How exactly this should be handled? :)
Delphi 12.1, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmariadb.dll 3.3.8
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.13
- MSSQL 2012, 2019; sybdb.dll FreeTDS_2435
- SQLite 3.45.2
marsupilami
Platinum Boarder
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Load BLOB fields on-demand?

Post by marsupilami »

aehimself wrote:Well, at least this answers why I found no article about this on the Internet :)
I do understand that data transfer and components are on different layers and might not be able to change the others behavior, I was just hoping there's a parameter which I can set and will work similar to ZConnection's .Ping method.
As I said - this highly depends on the DBMS used. Firebird/Interbase could handle a situation like that because they have a separate API for BLOBs. Blobs have internal IDs there. On PostgreSQL one could use the "old" way of having BLOBs with OIDs. I assume that could solve the problem there. But as far as I remember, PostgreSQL encourages the use of the TOAST facility. Also OID-Blobs don't work well with autocommit on PostgreSQL. And for MS SQL / Sybase - I am pretty sure that they cannot handle a situation like this. I assume, their wire protocol (TDS) simply doesn't allow to not transfer BLOBs if they are queried.
aehimself wrote:Of course minimizing the downloaded fields / nr. of records is the best solution, but noone can stop someone from entering simply "SELECT * FROM HUGETABLE" and then to complain that the application is frozen for an hour :D
I do understand you. But unfortunately sometimes users have to pay for doing stupid things - like asking for tons of data ;)
aehimself wrote:Should I move all the database components in a separate thread? That way the main application window wouldn't be frozen but I'm worried how Zeos (and the data transfer itself) would react on a TerminateThread if someone presses the cancel button. Plus I doubt if it's a good practice to "feed" a VCL component some data from a separate thread...
How exactly this should be handled? :)
Duh - hmm - well - Zeos is meant to be used in a synchronous, blocking way. That means it doesn't have any concept of cancelling an operation. Also we do have the requirement that Zeos can be used with threads - yes. But each thread needs to have it's own connection. So the most easy thing you could do is to employ a scheme like this:
  • Have TZConnection and TZQuery in the main thread. Prepare everything.
  • Disconnect all other components, like TDataSource etc. from TZQuery.
  • Have a separate thread open the query and maybe make sure, it fetches all data (TZQuery.FetchAll).
  • End the separate thread, hand over control of TZQuery and the TZConnection to the main thread.
  • reconnect TZQuery to TDataSource and the like.
This would allow you to still have a responsive main thread, while data is fetched. But still there is no way to cancel a running query and reclaim the connection. You have to wait until the thread finishes it's execution before you can do anything on TZQuery or TZConnection. Everything else will most probably break something and you will get access violations and the like.

I really would like to have asynchronous data access components that allow the main thread to be responsive while data gets loaded in the background. But I am not sure if Delphis TDataSet component can work like this. Also currently there is no driver in Zeos that is prepared to use asynchronous calls for databases.

Best regards,

Jan
marsupilami
Platinum Boarder
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Load BLOB fields on-demand?

Post by marsupilami »

Hello,

I stand to be corrected - to some degree: There is an option for delayed loading of blobs: TZQuery and TZReadonlyQuery have the doCachedLobs flag in the options property. If that flag gets set, LOBs (BLOBs and CLOBs) can be loaded on demand on some databases. The only databases supporting this currently are Sybase Adaptive Server Anywhere, PostgreSQL when using OID BLOBs and Interbase and Firebird. Other drivers don't support this flag and will ignore it silently.

Best regards,

Jan
User avatar
aehimself
Zeos Dev Team
Zeos Dev Team
Posts: 766
Joined: 18.11.2018, 17:37
Location: Hungary

Re: Load BLOB fields on-demand?

Post by aehimself »

marsupilami wrote:So the most easy thing you could do is to employ a scheme like this:
  • Have TZConnection and TZQuery in the main thread. Prepare everything.
  • Disconnect all other components, like TDataSource etc. from TZQuery.
  • Have a separate thread open the query and maybe make sure, it fetches all data (TZQuery.FetchAll).
  • End the separate thread, hand over control of TZQuery and the TZConnection to the main thread.
  • reconnect TZQuery to TDataSource and the like.
It's been a while and I was trying to implement this. After tons and tons of access violations I re-read your suggestion and I had to realize I was attempting it the wrong way. This solution works perfectly, but I have to be extremely careful on when I am allowing the next operation to start. I found it the easiest if I flip a property to true before any action and false after. The setter method if the incoming value is true simply waits for the value to be false before allowing the flow to continue.

I created a TThread descendant object. In the constructor you have to pass a Procedure Of Object, which will be called by the thread (SQLConnection.Connect, SQLQuery.Open, etc).
There is also a Class Procedure named .Open where you pass a TZAbstractRODataSet and it creates and executes the .Open method of it. It only passes back the flow to the calling method once the thread is done and re-raises any exceptions happened (so the calling methods exception handler can act as we desire).
I'm not saying the solution is perfect, but if anyone wants to give it a try, feel free to!

And again, @marsupilami, thank you for the suggestion! If you guys think it worths an extra 2k in the package, it also could be implemented in the component itself. I would be really happy to finally contribute :)
You do not have the required permissions to view the files attached to this post.
Delphi 12.1, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmariadb.dll 3.3.8
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.13
- MSSQL 2012, 2019; sybdb.dll FreeTDS_2435
- SQLite 3.45.2
User avatar
aehimself
Zeos Dev Team
Zeos Dev Team
Posts: 766
Joined: 18.11.2018, 17:37
Location: Hungary

Re: Load BLOB fields on-demand?

Post by aehimself »

P.s.: I forgot to add the .FetchAll to the .Open class procedure, which is a big-big mistake! While opening the dataset works fine, upon accessing any properties (like .RecordCount) starts the fetch in the main thread, causing the main application to freeze again.
Feel free to add it yourself, or download the updated unit at https://www.aecentral.org/Public/uSQLInThread.zip
Delphi 12.1, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmariadb.dll 3.3.8
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.13
- MSSQL 2012, 2019; sybdb.dll FreeTDS_2435
- SQLite 3.45.2
marsupilami
Platinum Boarder
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Load BLOB fields on-demand?

Post by marsupilami »

Hello,

I will definitly take a look on your code, as soon as time permits.
aehimself wrote:P.s.: I forgot to add the .FetchAll to the .Open class procedure, which is a big-big mistake! While opening the dataset works fine, upon accessing any properties (like .RecordCount) starts the fetch in the main thread, causing the main application to freeze again.
That seems strange to me. I see no reason for the program to freeze, if usage of the connection and of the query object are transferred back to the main thread. I wonder how this affects protocols that support delayed loading of BLOBs like Firebird.

Best regards,

Jan
User avatar
aehimself
Zeos Dev Team
Zeos Dev Team
Posts: 766
Joined: 18.11.2018, 17:37
Location: Hungary

Re: Load BLOB fields on-demand?

Post by aehimself »

marsupilami wrote:That seems strange to me. I see no reason for the program to freeze, if usage of the connection and of the query object are transferred back to the main thread.
It's kind of logical. Application code looks like this:

Code: Select all

 TSQLInThread.Open(query);
 maxrecords := query.RecordCount;
RecordCount sends us to TZAbstractRODataset.GetRecordCount, which will call TZAbstractRODataset.FetchRows if the dataset is not uni-directional. There is a cycle there:

Code: Select all

    if RowCount = 0 then
    begin
      while FetchOneRow do;
      Result := True;
    end
And since at this point the executing thread is the VCL again... well, we know what happens. The problem is solved if .FetchAll is called in TSQLInThread.Open. That will take care of fetching in the background thread, so the above cycle will exit immediately as there is nothing else to fetch.
marsupilami wrote:I wonder how this affects protocols that support delayed loading of BLOBs like Firebird.
Now this is something I can not test. My assumption is that there will be a slight delay when accessing the field (as the data is being downloaded) otherwise it will work... but anything can happen :)
Delphi 12.1, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmariadb.dll 3.3.8
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.13
- MSSQL 2012, 2019; sybdb.dll FreeTDS_2435
- SQLite 3.45.2
Post Reply