
Feature question

Posted: 19.09.2008, 11:02
by yeohray
I am currently evaluating Zeoslib and have a question. Is the following possible using zeoslib?

- user runs a query that retrieves 1000 rows
- zeoslib retrieves first 10 rows
- my app processes each row, does some calculation etc
- when my app tries to retrieve the 11th row, zeoslib then retrieves the next batch of 10 rows
- my app again processes each row, and this loop continues until all 1000 rows have been processed
- my app will not need to navigate backwards, so there isn't a need to store all 1000 rows in memory

My aim basically is to minimise memory requirements, as my users could potentially retrieve tens of thousands of rows. I just need to look at each row only once, from beginning to end, that's all. Most data access libraries I've evaluated retrieve the entire result set first.
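
To make the intent concrete, here is roughly the loop I have in mind. This is only a hypothetical sketch: Query, big_table and the amount column are made-up names, and the "fetch only a small batch" behaviour is exactly what I'm asking about.

Code:

  // Hypothetical sketch of the desired forward-only pass.
  // Query is assumed to be a TZReadOnlyQuery (or TZQuery) on an open connection.
  Query.SQL.Text := 'SELECT id, amount FROM big_table';
  Query.Open;
  try
    while not Query.Eof do
    begin
      Total := Total + Query.FieldByName('amount').AsFloat;  // per-row processing
      Query.Next;  // ideally this fetches only the next small batch of rows
    end;
  finally
    Query.Close;
  end;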

Thanks in advance.


Ray Mond

Posted: 19.09.2008, 12:24
by mdaems
Hi,

As far as I know this isn't completely possible using the components.

Incremental fetching has just been added by Patyi, but it isn't included in the Zeoslib 6.6 version. That version is very easy to patch, though. See http://fisheye2.atlassian.com/changelog/zeos/?cs=407 for what must be done.

This does not free the memory of the formerly processed rows, however. This is because a TDataset is supposed to be searchable, browsable, and so on, so all read rows are cached.

You might be able to do this using the dbc interface directly, however. I've just had a quick look at the resultset classes and my first impression is that they don't necessarily cache the results. This depends on the database used and the settings on the statement object.
See for example the code in ZDBcMysqlStatement.pas:

Code:

  if (GetResultSetConcurrency <> rcReadOnly) or (FUseResult
    and (GetResultSetType <> rtForwardOnly)) then
  begin
    CachedResolver := TZMySQLCachedResolver.Create(FPlainDriver, FHandle, Self,
      NativeResultSet.GetMetaData);
    CachedResultSet := TZCachedResultSet.Create(NativeResultSet, SQL,
      CachedResolver);
    CachedResultSet.SetConcurrency(GetResultSetConcurrency);
    Result := CachedResultSet;   // <-- CACHED
  end else
    Result := NativeResultSet;   // <-- UNCACHED
Unfortunately there's not much documentation available on the use of the dbc interfaces, but for the basic functionality have a look at the dbc project in the examples directory.
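
For example, a forward-only loop through the dbc layer could look roughly like the sketch below. This is from memory only: the connection URL format, the 'useresult' statement parameter, GetConnectionWithLogin and CreateStatementWithParams are assumptions you should check against the dbc example project, and big_table with its columns is made up.

Code:

  uses
    Classes, ZDbcIntfs, ZDbcMySql;  // ZDbcMySql registers the MySQL dbc driver

  procedure ProcessForwardOnly;
  var
    Connection: IZConnection;
    Statement: IZStatement;
    ResultSet: IZResultSet;
    Info: TStringList;
    Total: Double;
  begin
    Total := 0;
    // URL format and login call are from memory - check the dbc example project
    Connection := DriverManager.GetConnectionWithLogin(
      'zdbc:mysql://localhost:3306/testdb', 'user', 'password');
    Info := TStringList.Create;
    try
      Info.Values['useresult'] := 'true';  // assumption: maps to FUseResult above
      Statement := Connection.CreateStatementWithParams(Info);
    finally
      Info.Free;
    end;
    Statement.SetResultSetType(rtForwardOnly);
    Statement.SetResultSetConcurrency(rcReadOnly);  // stay on the UNCACHED branch
    ResultSet := Statement.ExecuteQuery('SELECT id, amount FROM big_table');
    while ResultSet.Next do
      Total := Total + ResultSet.GetDouble(2);  // process one row at a time
    ResultSet.Close;
    Connection.Close;
  end;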

Mark

BTW: it would be nice if you could post your conclusions, and perhaps a code sample, here.

Posted: 19.09.2008, 13:39
by yeohray
Thanks for your detailed explanation. I'll post a code sample if I manage to work around the caching issue.

Ray Mond

Posted: 20.09.2008, 04:51
by yeohray
I took a look at the mysql interface, and it isn't easy. There is an option to use the mysql_use_result function, which fetches each row individually, but the rest of the library is designed without support for it, i.e. it tries to retrieve column definitions for the result set, and you can't do that while you haven't yet retrieved all the rows from the result set with the mysql_fetch_row function. It would appear you would need to retrieve all the other supporting stuff (column definitions, index definitions, etc.) from another connection.
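
For reference, the raw libmysql pattern I mean is roughly the following. It is only a simplified sketch: the declarations are my own minimal translations of the C API, not the Zeoslib plain-driver wrappers, and the table/column names are made up.

Code:

  { Simplified, hand-written declarations for the C API. The library name
    differs per platform (libmysql.dll, libmysqlclient.so). }
  type
    PMySQL = Pointer;                            // MYSQL*
    PMySQLRes = Pointer;                         // MYSQL_RES*
    TMySQLRow = array[0..255] of PAnsiChar;      // view onto MYSQL_ROW (char**)
    PMySQLRow = ^TMySQLRow;

  function mysql_real_query(Handle: PMySQL; const Query: PAnsiChar;
    Len: LongWord): Integer; cdecl; external 'libmysql.dll';
  function mysql_use_result(Handle: PMySQL): PMySQLRes; cdecl; external 'libmysql.dll';
  function mysql_fetch_row(Res: PMySQLRes): PMySQLRow; cdecl; external 'libmysql.dll';
  procedure mysql_free_result(Res: PMySQLRes); cdecl; external 'libmysql.dll';

  procedure StreamRows(Handle: PMySQL);
  var
    Res: PMySQLRes;
    Row: PMySQLRow;
    Query: AnsiString;
  begin
    Query := 'SELECT id, amount FROM big_table';
    if mysql_real_query(Handle, PAnsiChar(Query), Length(Query)) <> 0 then
      Exit;
    Res := mysql_use_result(Handle);  // rows are streamed, not buffered client-side
    if Res = nil then
      Exit;
    try
      Row := mysql_fetch_row(Res);
      while Row <> nil do
      begin
        { only this single row (Row^[0], Row^[1], ...) is in client memory }
        Row := mysql_fetch_row(Res);
      end;
    finally
      // until every row has been fetched and the result freed, this connection
      // cannot run any other statement - including metadata queries
      mysql_free_result(Res);
    end;
  end;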

I guess the ability to retrieve each row without buffering it needs to be built in from the ground up.


Ray Mond

Posted: 20.09.2008, 10:34
by mdaems
That's more a libmysql issue than a zeoslib caching problem, I think. But I agree, it would be nice if this problem could be avoided.

No plans to do this yet, however. If you could find a way to do the column metadata retrieval before the actual SQL is executed, that would help, I believe. Maybe you could file a feature request to solve this mysql_use_result problem anyway (in the feature requests forum). Even better, if you have a working (but failing) sample: add it to the bug tracker, including the sample (don't forget the data creation script).

As a workaround you could try to force the metadata queries to be fired before the actual query is sent. If the query tables and columns are determined dynamically this would be difficult, however. For static queries with only variable WHERE clauses it could be done by simply sending a 'where 1=2' variant of the query at the start of your program, for example:
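
A sketch of what I mean. MetaQuery is assumed to be a TZQuery on the same connection; the orders table and its columns are made up, and whether the metadata cached by the first Open is actually reused by the later query is something you would have to verify.

Code:

  // Fire the metadata-only variant first: same columns, zero rows.
  MetaQuery.SQL.Text := 'SELECT id, amount FROM orders WHERE 1=2';
  MetaQuery.Open;   // returns no rows, but lets the metadata queries run now
  MetaQuery.Close;

  // Then run the real query, with only the WHERE clause varying.
  MetaQuery.SQL.Text := 'SELECT id, amount FROM orders WHERE order_date >= :d';
  MetaQuery.ParamByName('d').AsDateTime := EncodeDate(2008, 1, 1);
  MetaQuery.Open;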

Mark

Posted: 20.09.2008, 15:57
by yeohray
Thanks for the suggestions. I should also have added that it made no difference whether the native result set was used: the memory consumption of my app still increases linearly with the amount of data retrieved. So there are actually two issues to address:

- use mysql_use_result and mysql_fetch_row to retrieve data incrementally
- store only one row in memory at any one time, and allow user access to the row data

By the looks of it, it's not an easy change. Also, I don't really know if other users would find it useful.

Ray Mond