
TZQuery fetch control

Posted: 09.12.2014, 17:58
by sftf
Hi!
Can TZQuery or TZReadOnlyQuery fetch records on demand, in configurable portions (a given number of records)?

Something like http://docwiki.embarcadero.com/RADStudi ... t_Fetching:
"FireDAC is fetching rowsets according to the FetchOptions.Mode property:
fmOnDemand--the rowset is automatically fetched when the dataset is trying to move the current position beyond the last fetched record."

'select * from table limit XXX' is not an option.

I checked FetchRow with a select from a table with 37700 records (PostgreSQL).
FetchRow does not seem to do what it should: with FetchRow := 0 or FetchRow := 10 the program loads all rows from the DB into application memory, regardless of the FetchRow value.
This is shown by the application's memory consumption right after TZQuery.Open: it is the same with FetchRow := 0 and FetchRow := 10.
FetchRow has only one visible effect - after TZQuery.Open the DBGrid shows only FetchRow rows, but all records have already been loaded from the server into memory.

The next evidence about FetchRow comes from the sources (ZEOSDBO-7.1.4-stable.zip).
ZQuery.Open calls
TZAbstractRODataset.InternalOpen, where all records are loaded:

Code: Select all

    ...
    { Creates an SQL statement and resultsets }
    if not FUseCurrentStatment then
      if FSQL.StatementCount > 0 then
        ResultSet := CreateResultSet(FSQL.Statements[0].SQL, -1)  // LOADS ALL RECORDS (memory consumption jumps to max)
Next

Code: Select all

function TZAbstractRODataset.CreateResultSet(const SQL: string;  MaxRows: Integer): IZResultSet;
where MaxRows is not at all about fetching portion by portion:

Code: Select all

    if MaxRows > 0 then
      Statement.SetMaxRows(MaxRows);
  ...
      Result := Statement.ExecuteQueryPrepared; // after execution  memory consumption pops to max
It only sets a hard limit on the maximum number of records that can be read into the dataset.

Conclusion:
TZAbstractRODataset.InternalOpen loads all records from the DB into the dataset, regardless of the FetchRow value.
Am I wrong, or is there perhaps an option for this?

Re: TZQuery fetch control

Posted: 10.12.2014, 22:57
by EgonHugeist
We are NOT FireDac.

AFAIK FetchRow limits only the cached rows of the ResultSets. But we currently do not take control of LIMIT and OFFSET. Feel free to implement and share your findings!

What's making me curious: FetchRow=10 should load 10 rows only, not all. Which driver are you using?

Re: TZQuery fetch control

Posted: 11.12.2014, 10:06
by sftf
"Which driver are you using?"
You mean the database driver?
PostgreSQL 9.3 with the 'postgresql-9' protocol.

Sorry about the FireDAC reference - I mentioned it only as an example, to clarify my question.

Now I read http://www.postgresql.org/docs/9.3/stat ... -mode.html:
"Ordinarily, libpq collects a SQL command's entire result and returns it to the application as a single PGresult...
for commands that return a large number of rows...applications can use PQsendQuery and PQgetResult in single-row mode...
To enter single-row mode, call PQsetSingleRowMode...
"

Since no call to PQsetSingleRowMode exists in the 7.1.4-stable source code, I conclude that TZQuery (actually libpq) loads the entire rowset from the server at once.

I'm not absolutely sure about my conclusions, so I am asking about TZQuery's fetching capabilities.
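For illustration, the single-row-mode flow described in those docs would look roughly like this at the libpq level (a hypothetical Pascal sketch against the libpq C API - PQsendQuery, PQsetSingleRowMode, PQgetResult, PQresultStatus, PQclear - with HandleRow a made-up handler; nothing like this exists in 7.1.4-stable):

Code: Select all

```pascal
// Hypothetical sketch of libpq single-row mode; NOT ZeosLib code.
// Assumes bindings with the usual libpq C signatures and constants.
procedure FetchSingleRowMode(Handle: PPGconn);
var
  Res: PPGresult;
begin
  if PQsendQuery(Handle, 'select * from big_table') <> 1 then
    Exit; // dispatch failed - see PQerrorMessage
  // Must be called right after PQsendQuery, before the first PQgetResult:
  PQsetSingleRowMode(Handle);
  Res := PQgetResult(Handle);
  while Res <> nil do
  begin
    case PQresultStatus(Res) of
      PGRES_SINGLE_TUPLE: HandleRow(Res); // one tuple per PGresult (hypothetical handler)
      PGRES_TUPLES_OK: ;                  // zero-row result: end of this rowset
    end;
    PQclear(Res);               // every PGresult must be freed
    Res := PQgetResult(Handle); // nil => command fully complete
  end;
end;
```

So instead of one big PGresult from PQexec, libpq hands back one PGresult per row, which is exactly the "portion by portion" behavior missing from TZQuery.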

Re: TZQuery fetch control

Posted: 12.12.2014, 17:49
by sftf
Further study showed the following.
The call "stack" is:

Code: Select all

procedure TZAbstractRODataset.InternalOpen;
  ResultSet := CreateResultSet(FSQL.Statements[0].SQL, FetchRow)

Code: Select all

function TZAbstractRODataset.CreateResultSet(const SQL: string;
  MaxRows: Integer): IZResultSet;
  Result := Statement.ExecuteQueryPrepared;

Code: Select all

function TZPostgreSQLCAPIPreparedStatement.ExecuteQueryPrepared: IZResultSet;
  QueryHandle := ExectuteInternal(ASQL, SSQL, lcExecute);

Code: Select all

function TZPostgreSQLCAPIPreparedStatement.ExectuteInternal(const SQL: RawByteString;
  const LogSQL: String; const LoggingCategory: TZLoggingCategory): PZPostgreSQLResult;
  Result := FPlainDriver.ExecuteQuery(FConnectionHandle, PAnsiChar(SQL));

Code: Select all

function TZPostgreSQLBaseDriver.ExecuteQuery(
  Handle: PZPostgreSQLConnect; Query: PAnsiChar): PZPostgreSQLResult;
begin
  Result := POSTGRESQL_API.PQexec(Handle, Query); // load whole rowset
end;
where

Code: Select all

TZPOSTGRESQL_API = record
  ...
  PQexec:          TPQexec;
  ...
end;

TPQexec          = function(Handle: PPGconn; Query: PAnsiChar): PPGresult; cdecl;

Next I read
http://www.postgresql.org/docs/9.3/stat ... async.html
The PQexec function is adequate for submitting commands in normal, synchronous applications.
...
PQexec always collects the command's entire result, buffering it in a single
PGresult. While this simplifies error-handling logic for the application, it
can be impractical for results containing many rows.

PQsendQuery
Submits a command to the server without waiting for the result(s). 1 is
returned if the command was successfully dispatched and 0 if not (in which
case, use PQerrorMessage to get more information about the failure).
So currently ZEOS uses synchronous interaction with PostgreSQL, since

Code: Select all

function TZPostgreSQLBaseDriver.SendQuery(Handle: PZPostgreSQLConnect; Query: PAnsiChar): Integer;
begin
  Result := POSTGRESQL_API.PQsendQuery(Handle, Query);
end;
is never used in the source.
And finally: TZQuery always loads the whole rowset, and in the current version of ZEOS it is impossible to control this.

Now I guess developing async interaction with PostgreSQL will be a tedious task...

Re: TZQuery fetch control

Posted: 11.01.2015, 09:02
by EgonHugeist
Hi,

Seems like I need a PG update :) I'm still using 9.1, and Mark 8.4, AFAIK.

I have to admit I didn't know about the possibility to do that! But this sounds great. Well, let's see what I can do. Performance might be interesting too!

I'll give you a second reply once this is implemented (*7.2+*).

Re: TZQuery fetch control

Posted: 11.01.2015, 17:12
by sftf
I've recently started trying to implement the async functions and single-row mode in Zeos myself.
So far I have implemented PQsendQueryPrepared at the dbc level and "async open + single-row mode + get first row" in TZAbstractRODataset.

But during this hack I realized that IZResultSet and the other classes assume that the whole result is retrieved from the backend at once, not portion by portion (row by row) - there is no resultset implementation that supports a bunch of PGresults as one set.
Without this, one is forced to accumulate result rows at the application level by copying them to, for example, a TBufDataset, which is far too slow and eats extra memory.

The same issue bothers me when I try to implement controlled fetching from the backend manually with "declare cursor ... fetch count", since I am forced to collect the retrieved rows by copying them from the TZQuery to a TBufDataset after each "fetch count".
Another inconsistency I encountered: with a query opened via PQexec/PQexecPrepared we have the whole resultset after open and may iterate through all rows immediately, but with PQsendQuery/PQsendQueryPrepared in single-row mode it is not clear how to implement retrieving the remaining rows/PGresults, since we have to get them explicitly.

Moreover, after executing PQsendQuery/PQsendQueryPrepared it is not allowed to execute another PQsendQuery/PQsendQueryPrepared on the same connection until all PGresults have been received. This forces us to open one connection per query and is not fully compatible with the T(SQL/Z)Query working logic: with PQsendQuery instead of PQexec, after TZQuery.Open we cannot open another query until TZQuery.Close, or we have to open a new connection for every other TZQuery.
http://www.postgresql.org/message-id/13 ... 11@mail.ru

If I understand correctly, TZPostgreSQLStatement is used to retrieve the resultset and TZPostgreSQLResultSet is used to access it, and both assume only one PZPostgreSQLResult at a time.

So now I am thinking about implementing a kind of TZPostgreSQLStatement and TZPostgreSQLResultSet which can handle and work with multiple PZPostgreSQLResults (a list of them) as one IZResultSet.
This should allow accumulating PZPostgreSQLResults (PGresults) in single-row mode at this level of abstraction, get rid of the extra copying, and maybe help with accumulating rows from "declare cursor ... fetch count" at higher levels.
I would definitely like to know what experienced developers think about this.
Another question: should such PostgreSQL-specific functionality be exposed to higher levels like TZAbstractRODataset?
The dirty modified 7.1.4 src at https://yadi.sk/d/cM9Xswq3du9qu is only an example.

Re: TZQuery fetch control

Posted: 11.01.2015, 17:39
by EgonHugeist
Today i updated to PG9.4.

Like you, I started an approach to get this running. On my side it works in between, but I have some new test failures that I had no time to check yet (maybe tonight...).
Things I haven't tested:
if single-row mode was set once, is it possible to open a new PGRES_TUPLES_OK result again?

Regarding the memory consumption of updateable ResultSets, I think a little trick could help:
An updatable RS is a cached RS. It needs to be filled row by row. Once this is done and the "native" RS returns Next=False, we could close the handle with PQclear. This frees the memory in libpq.


Note: working side by side is nice for improving and checking different ideas. BUT please do NOT use 7.1. Switch to the latest SVN version of \testing-7.2.

Re: TZQuery fetch control

Posted: 11.01.2015, 23:20
by EgonHugeist
For the moment I have to pass.

The async functions work nicely at the DBC level, but with updateable RS, where the meta-information is retrieved, I failed.
Maybe you will have more success?

Did commit a patch: R3594 \testing-7.2

What I'm doing:
First of all, the TZDataSet.FetchRow property is handled as expected for the cached ResultSets.
When the MaxRow limit is reached and NativeRS.Next = False, I close the query handle and free the memory in libpq.dll, because the data is already cached on our side. Forget about TBufDataset...

What I couldn't resolve: the TZReadOnlyQuery memory for this case. :x:

Re: TZQuery fetch control

Posted: 12.01.2015, 13:09
by sftf
EgonHugeist wrote:For the moment i have to pass.
The Async functions work nice with DBC but with updateable RS where meta-informations are retrieved i failed.
Maybe you have more success?
I don't fully understand your problems with updateable RS in this case... are you talking about the PGresult's metadata (field definitions)?
If you look at my modified src, I introduced 3 new members in TZPostgreSQLStatement:

Code: Select all

function TZPostgreSQLStatement.SendQuery(const SQL: RawByteString): boolean;
var
  ConnectionHandle: PZPostgreSQLConnect;
begin
  Result := False;
  ConnectionHandle := GetConnectionHandle();
  ASQL := SQL; //Preprepares the SQL and Sets the AnsiSQL
  Result := FPlainDriver.SendQuery(ConnectionHandle, PAnsiChar(ASQL)) = 1;
  DriverManager.LogMessage(lcExecute, FPlainDriver.GetProtocol, LogSQL);
end;

function TZPostgreSQLStatement.SetSingleRowMode: Boolean;
begin
  Result := FPlainDriver.SetSingleRowMode(GetConnectionHandle) = 1;
end;

function TZPostgreSQLStatement.GetResultSet: IZResultSet;
begin
  Result := CreateResultSet(LogSQL, FPlainDriver.GetResult(GetConnectionHandle))
end;                                                                            

And PG's PQsendQuery does not return a PGresult but 0/1, so we can't get any metadata about the resultset until we call PQgetResult.
In my function TZPostgreSQLCAPIPreparedStatement.ExecuteQueryPreparedAsync: IZResultSet I force single-row mode only for testing - single-row mode should of course be a property, since it is not directly related to the async functionality.
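To illustrate the metadata point at the libpq level (a hypothetical Pascal sketch against the libpq C API; AddFieldDef is a made-up helper, and the prepared-statement name is invented): the field definitions live in the PGresult itself, so the first PQgetResult call is needed before fields/columns can be defined.

Code: Select all

```pascal
// PQsendQueryPrepared only reports dispatch success (1) or failure (0);
// column metadata (names, types) lives in the PGresult, so the first
// PQgetResult call is required before fields/columns can be defined.
procedure OpenAsyncAndDefineFields(Handle: PPGconn);
var
  FirstRes: PPGresult;
  I: Integer;
begin
  if PQsendQueryPrepared(Handle, 'stmt1', 0, nil, nil, nil, 0) <> 1 then
    Exit; // dispatch failed - see PQerrorMessage
  PQsetSingleRowMode(Handle);
  FirstRes := PQgetResult(Handle); // blocks until the first row arrives
  if FirstRes <> nil then
    for I := 0 to PQnfields(FirstRes) - 1 do
      AddFieldDef(PQfname(FirstRes, I), PQftype(FirstRes, I)); // hypothetical helper
end;
```

This is why ExecuteQueryPreparedAsync has to fetch at least the first PGresult before InternalOpen can create the fields.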

Ok, here is what works for me:

Code: Select all

procedure xxx;
var
  query: TZQuery;
  fld, tup: integer;
begin
  query := TZQuery.Create(self);
  query.Connection := ZConnection1;
  query.SQL.Clear;
  query.SQL.Add('select * from contracts where sk = :p1');
  query.Prepare;
  query.ParamByName('p1').Value := 400;
  query.OpenAsync;
  for fld := 0 to query.FieldCount - 1 do
    ShowMessage(query.Fields[fld].FieldName);

  query.First;
  ShowMessage(IntToStr(query.RecordCount));
  while not query.Eof do begin
    for fld := 0 to query.FieldCount - 1 do
      ShowMessage(query.Fields[fld].AsString);
    query.Next;
  end;
end;
and in the example above the calls are:

Code: Select all

procedure TZAbstractRODataset.OpenAsync;
begin
  if not Active then begin
    if Statement is TZPostgreSQLCAPIPreparedStatement then begin
        FOpenAsync := True;
        IsUniDirectional :=True;
    end;
    Open;
  end;
end;      
then

Code: Select all

procedure TZAbstractRODataset.InternalOpen;
var
  ColumnList: TObjectList;
  I: Integer;
begin  
...
      if FSQL.StatementCount> 0 then
        ResultSet := CreateResultSet(FSQL.Statements[0].SQL, -1)
      else
...
   if DefaultFields and not FRefreshInProgress then
    begin
      CreateFields;   
...
end;
next

Code: Select all

function TZAbstractRODataset.CreateResultSet(const SQL: string;
  MaxRows: Integer): IZResultSet;
begin
     ...
      if FOpenAsync then
        Result := Statement.ExecuteQueryPreparedAsync //<-- introduced by me
      else
        Result := Statement.ExecuteQueryPrepared;
     ...
Then SendQuery is followed immediately by getting a result, since TZAbstractRODataset.CreateResultSet must return some data: after ResultSet := CreateResultSet(FSQL.Statements[0].SQL, -1), TZAbstractRODataset.InternalOpen needs the ResultSet immediately to define the fields/columns.

Code: Select all

function TZPostgreSQLCAPIPreparedStatement.ExecuteQueryPreparedAsync: IZResultSet; // gapsf
begin
  if SendQueryPrepared then begin
    if SetSingleRowMode then // <-- here I silently force single-row mode, but shouldn't
      Result := GetResultSet // return only one (first) row, not whole rowset
    else
      Result := CreateResultSet(GetResultSet); // <-- maybe I'm wrong with CreateResultSet() since GetResultSet already calls CreateResultSet
  end
  else
    Result := nil;
    inherited ExecuteQueryPreparedAsync;
end;
EgonHugeist wrote: Did commit a patch: R3594 \testing-7.2

What i'm doing:
first of all TZDataSet.FetchRow property is handled as expected for the Cached-ResultSets.
if MaxRow limit is reached, the NativeRS.Next = False i close the query-handle and free the memory in libpq.dll because the data already is cached on our side.
I can't find this in the source code... and I can't figure out what you mean here...
EgonHugeist wrote: Forgett about TBufDataset...
Consider this as an example:

Code: Select all

type
  txview = class(TWinControl)
  private
    cursorname: string;
    datasource: tdatasource;
    query: TZReadOnlyQuery;
    ds : TbufDataSet;
    dbgrid : TDBGrid;
    allfetched : boolean;

...
implementation
...
procedure copyrecords(src, dst : TDataSet);
var
  fld : Integer;
  cr : TBookmark;
begin
    cr := dst.GetBookmark;
    src.DisableControls;
    dst.DisableControls;
    src.first;
    while not src.eof do begin
      dst.Append;
      for fld := 0 to src.FieldCount - 1 do
           dst.Fields[fld].Value := src.Fields[fld].Value;
      src.next;
    end;
    dst.GotoBookmark(cr);
    src.EnableControls;
    dst.EnableControls;
end;

procedure txview.fetch_view(DataSet: TDataSet);
begin
    if ds.Eof and not allfetched then begin
        ds.DisableControls;
        ds.BeforeScroll := nil;
        query.Refresh;
        if query.RecordCount > 0 then begin
            copyrecords(query,ds);
            ds.BeforeScroll := @fetch_view;
        end
        else begin
            allfetched := True;
            showmessage(inttostr(ds.RecordCount));
            query.SQL.Clear;
            query.SQL.Add('close '+cursorname);
            query.ExecSQL;

            query.SQL.Clear;
            query.SQL.Add('end');
            query.ExecSQL;
        end;
        ds.EnableControls;
        dbgrid.Refresh;
//        showmessage(inttostr(ds.RecordCount));
    end;
end;

procedure txview.update_view(conn: TZConnection; sql : string);
var
  fld : Integer;
  guid : TGuid;
begin
    qupdate := TZUpdateSQL.Create(self);
    query.UpdateObject := qupdate;
    query.Connection := conn;

    CreateGUID(guid);
    cursorname := StringReplace(GUIDToString(guid),'{','B',[]);
    cursorname := StringReplace(cursorname,'}','E',[]);
    cursorname := StringReplace(cursorname,'-','',[rfReplaceAll]);
    query.SQL.Clear;
    query.SQL.Add('begin');
    query.ExecSQL;

    query.SQL.Clear;
    query.SQL.Add('declare '+cursorname+' cursor for '+sql);
    query.ExecSQL;

    query.SQL.Clear;
    query.SQL.Add('fetch forward 100 in '+cursorname);
    query.Prepare;
    query.open;

    datasource.DataSet := query;
    dbgrid.DataSource := datasource;
    query.BeforeScroll := @fetch_view;
    dbgrid.Parent := self;
end;                                            
Here I have to copy the records somewhere - to another dataset - to accumulate them as the user scrolls through the DBGrid,
since T(Z*)/SQLQuery can't accumulate records from repeated Open/Refresh calls.
It is exactly the same situation when using PostgreSQL's single-row mode at the lower levels of ZeosLib, since currently IZStatement in conjunction with IZResultSet can't accumulate records from consecutive PQgetResult calls to provide them to the higher levels as one IZResultSet.
Moreover, with PQsendQuery it is possible to send several different SQL statements at once and retrieve their results with a series of PQgetResult calls - each call returns the result of the next SQL statement sent earlier.
But currently ZeosLib and the FPC database layer as a whole are not suited for that.

Summarizing: doing this consistently requires far more thinking/tries/work, and the cornerstone is an IZResultSet implementation which can handle the results of several PQgetResult calls as one piece.
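To make the idea concrete, such a resultset could keep a list of accumulated PGresult handles and let Next walk across PGresult boundaries. A purely hypothetical Pascal sketch (the class name and FetchNextPGresult are made up, not working code):

Code: Select all

```pascal
type
  // Hypothetical sketch: a resultset that presents several single-row
  // PGresults (from consecutive PQgetResult calls) as ONE set.
  TZPostgreSQLMultiResultSet = class
  private
    FResults: TList;        // accumulated PZPostgreSQLResult handles
    FCurrentIndex: Integer; // index of the current PGresult (= current row)
  public
    function Next: Boolean;
  end;

function TZPostgreSQLMultiResultSet.Next: Boolean;
begin
  // In single-row mode each PGresult holds exactly one tuple, so moving
  // to the next row means moving to the next accumulated PGresult,
  // fetching a new one from libpq when the list is exhausted.
  if FCurrentIndex < FResults.Count - 1 then
  begin
    Inc(FCurrentIndex);
    Result := True;
  end
  else
    Result := FetchNextPGresult; // hypothetical: calls PQgetResult and appends
end;
```

The same list-of-results structure could also hold the rowsets coming back from repeated "fetch count" calls on a server cursor, which would remove the copying to TBufDataset at the application level.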
EgonHugeist wrote: What i couldn't resolve: The TZReadOnlyQuery-Memory for this case.
What is wrong with the memory in TZReadOnlyQuery?

Re: TZQuery fetch control

Posted: 12.01.2015, 23:26
by EgonHugeist
Well, it seems like we're talking about different things.

No time for a detailed answer... I got it running in a different way. See the attached zip file (made against the latest 7.2 from SVN; it includes only the changed PG files!).

Note: why am I not surprised that when push comes to shove with PostgreSQL, a new feature is born :?:
Fetching performance goes nuts with this idea - it reaches just 15% of the generic PG_tuples results:

{
"Engine": "ZEOS PostgreSQL Anync",
"CreateTableTime": "175.50ms",
"NumberOfElements": 5000,
"InsertTime": "2.48s",
"InsertRate": 2014,
"InsertBatchTime": "161.03ms",
"InsertBatchRate": 31049,
"InsertTransactionTime": "772.09ms",
"InsertTransactionRate": 6475,
"InsertBatchTransactionTime": "134.05ms",
"InsertBatchTransactionRate": 37298,
"ReadOneByOneTime": "2.87s",
"ReadOneByOneRate": 1737,
"ReadAllVirtualTime": "333.21ms",
"ReadAllVirtualRate": 15005,
"ReadAllDirectTime": "302.52ms",
"ReadAllDirectRate": 16527,
"ClientCloseTime": "12.42ms"
}
vs. the generic version:
{
"Engine": "ZEOS PostgreSQL",
"CreateTableTime": "104.84ms",
"NumberOfElements": 5000,
"InsertTime": "2.54s",
"InsertRate": 1963,
"InsertBatchTime": "166.32ms",
"InsertBatchRate": 30060,
"InsertTransactionTime": "739.45ms",
"InsertTransactionRate": 6761,
"InsertBatchTransactionTime": "127.93ms",
"InsertBatchTransactionRate": 39081,
"ReadOneByOneTime": "716.59ms",
"ReadOneByOneRate": 6977,
"ReadAllVirtualTime": "56.86ms",
"ReadAllVirtualRate": 87935,
"ReadAllDirectTime": "37.23ms",
"ReadAllDirectRate": 134282,
"ClientCloseTime": "12.94ms"
}

Annoying, really...
That's why I bound your idea to a MaxRow param of the PG statement.

Async updates aren't handled yet.
Again: please stop the 7.1 approaches. There we have a behavior lock, so I can't apply your patches.

A detailed answer will come later... No time for all your suggestions/questions yet.

Re: TZQuery fetch control

Posted: 11.10.2015, 18:52
by sftf
EgonHugeist wrote: Accordingly the memory consumptions of updateable ResultSets i think a little trick could help:
A updatable RS is a cached-RS. This needs to be filled row by row. If this is done and the "native" RS returns Next=False we could close the handle with PGclear. This frees the memory in libpq.
Yes, it is possible to free the NativeResultSet after all rows are fetched into the CachedResultSet.
Soon I'll post my thoughts about more memory-efficient usage of TZCachedResultSet.

As for libpq's 'Asynchronous Command Processing' and 'Single-row mode' - I'm not interested in this option any more.
I considered these two features for implementing 'paging'/'fetch control' from the backend (PostgreSQL, actually), but:
- these features are specific to PostgreSQL and may have different implementations in other databases;
- only one async query is allowed at a time per connection in libpq.dll, and moreover this is a limitation of the current PostgreSQL backend protocol (http://postgresql.nabble.com/libpq-why- ... l#a5832807). It is not very handy, since it requires an extra connection for each simultaneous async query;
- the ZeosLib architecture is blocking, and all classes are written with the assumption that queries return all their rows at once, not N rows at a time;
- the FPC DataSet layer is written with this assumption too: after DataSet.Open there is no way to get extra rows into it from the opened or another statement.
Now I think it is not worth it.

So I plan to implement 'paging'/'fetch control' at the application level with the help of server-side cursors.

Re: TZQuery fetch control

Posted: 13.10.2015, 18:58
by marsupilami
Hello sftf,

I am curious - I did not actually understand what you want to use this feature for, or how you wanted to use it. For me something like that only makes sense in a forward-only dataset. What did you want to do? There are some use cases where I think about having a forward-only (read-only?) dataset, so I can work on all rows of big tables without having to load them into my application all at once and without having to implement some paging algorithm myself...

Best regards,

Jan

Re: TZQuery fetch control

Posted: 14.10.2015, 10:41
by sftf
I am trying to rewrite our internal FoxPro 2.6 DOS app as a 2-tier application: client (Lazarus) and server (PostgreSQL). It has a table like 'contracts', currently with about 32000 records, and, for example, a 'clients' table. Our staff are used to being able to view all contracts (as well as other parts of the app's data) as a (possibly large) list and scroll through it as needed.

Of course, the user should have the ability to search for a contract or customer - that will be implemented with 'select ... where ...'.
But in reality staff people often do not even remember how the customer is named in our system. So they start scrolling through the list of ALL customers to recognize the customer's record. It really helps, because people are used to it...

Also, in the old app, staff people are used to seeing some data when they open customers or contracts. So if the app simply does not load contracts at all (so that search would be the only option), or loads only part of them to speed up the UI, people start complaining that "all customers are gone somewhere", or "I don't see any contract in our program", or "there was such a customer here in the list but now I don't see it".

The 'contracts' table is about 15 MB, and the result of a 'select * from contracts' query is transferred from the server in about 1.5 seconds over a 100 Mb network, creating a visible delay before the data is displayed.
Over slower networks it would take many seconds, but working over the Internet is not the main use case for this app.
That's it - in the app I would like to be able to "open and show" a relatively large dataset instantly, without noticeable delays, and scroll through it in the UI back and forward. I would also like to be able to process a relatively large number of records on the client, just in case.

One way to simulate an instant open is to load the data in small portions, as needed.
For example, the first 100 records are loaded and shown - the user sees them and starts doing what they want to do. When they try to go to record 101, the next 100 records are loaded (from the cursor) and appended to the already loaded records, and so on until the cursor is exhausted.
This also potentially saves traffic if the user stops on some record.