Page 1 of 2
PostgreSQL bug with array field
Posted: 12.04.2022, 12:01
by stoffman
Hi,
In the database I work with there is a field defined as varchar(60)[] (which means it is an array field) when selecting it I get the following error:
"Cannot access blob record in column 1 with type String"
The error comes from TZRowAccessor.GetBlob function, it checks for the length of the column and expect it to be <= 0 otherwise it fails. However, the column does have a length which is 60 ...
Code: Select all
stString: if FColumnLengths[ColumnIndex] <= 0 then begin
CP := FColumnCodePages[ColumnIndex];
Result := TZRowAccessorRawByteStringLob.CreateWithDataAddess(PZVarLenDataRef(TempBlob), CP, ConSettings, FOpenLobStreams);
end else goto Fail;
Shouldn't the field be stArray? and not stString?
It seems that the cause of the error is accessing the field using AsString function (the field type is ftMemo)
Yoni
Re: PostgreSQL bug with array field
Posted: 19.04.2022, 17:08
by marsupilami
Hello Yoni,
soo - I was in the process of telling you that Zeos doesn't support array columns. But it seems that I am wrong. There is some support for array columns but I am not sure if Egonhugeist ever completed that task.
Could you maybe put together a test case (sql script, test program)? Or even better - maybe you want to dig into the code yourself? The postgres driver is pretty clear. The API is pretty good compared to what other RDBMS do.
Best regards,
Jan
Re: PostgreSQL bug with array field
Posted: 21.04.2022, 14:09
by MJFShark
I wonder if this is an effect of changing the Pg driver mode to binary. In string mode, I believe it would return a csv-like representation of the array data. I have not tested that; it's just based on my reading up on the issue. In binary mode you'd have to construct that string using the structures and info in Pg's array.h file. If it seems like doing that (basically just returning it as a string instead of stArray) is the way to go, I can take a look, though my time may be limited, and I have no prior experience with this particular aspect of Pg.
The stack overflow question I was reading:
https://stackoverflow.com/a/4832674/113128
-Mark
Re: PostgreSQL bug with array field
Posted: 22.04.2022, 18:02
by stoffman
@marsupilami - Can you point me to the code Egonhugeist wrote?
As a side note, Firefird/Inerbase also supports array fields. Does its driver support arrays?
@MJFShark - Thanks! I'll use that as a blueprint to the fix
Re: PostgreSQL bug with array field
Posted: 24.04.2022, 11:27
by marsupilami
MJFShark wrote: ↑21.04.2022, 14:09
I wonder if this is an effect of changing the Pg driver mode to binary.
I think, it is. I seem to remember that Zeos treated every unknown field type as stString (=fsString). This allowed to represent data just the way PostgreSQL sent it to Zeos. And I assume that it also allowed updating data as long as the same data format was used. The downside is that every data type that wasn't string and had a TField-descandant (numeric, double precision, ...) had to be converted to and from string which made the driver slower than it needed to be in most cases.
stoffman wrote: ↑22.04.2022, 18:02
@marsupilami - Can you point me to the code Egonhugeist wrote?
Hmm - that code is all over Zeos
Honestly - i did a search for "stArray" in the Zeos code. There are some basic things there:
- ZDatasetParam.pas: The new TZParam seems to have new methods for handling arrays. I assume that they are not meant for array parameters but they are a starting point. Currently we use them for bulk DML, I think.
- ZDbcCache.pas: The TZRowAccessor knows how to do some reservations if a column is of type stArray in its constructor. This is important because the TZRowAccessor is one of the central classes in Zeos for handling in memory data.
- ZDbcFirebirdResultet.pas: the new firebird driver knows how to tell the outside world that a column is an array column.
- ZDbcInterbase6Utils.pas: Also the od ISC API driver for Firebird / Interbase knows how to tell the ouside world that a column is an array column.
But I think these are only some slight preparations. As far as I can see there is no array support in IZResultSet, which is the base interface for all database result sets. If we want full array column support the first step is to extend this interface to support arrays and then to adjust all implementations in Zeos to at least generate some error if one of the new methods gets called and if the driver doesn't support array columns.
Since the ZDBC API is based on the JDBC API one would think of going the Java way to just add methods like GetArray and SetArray. Problem here is: Java has an Array class, whereas Delphi doesn't. So maybe we would have to implement our own TZArray class and use that. Maybe it makes sense to have a good look at the RowAccessor before deciding this because it at least seems to have some ideas on how to handle arrays there.
Extending Zeos to be able to handle array columns correctly seems to be a major extension of the Zeos internals.
stoffman wrote: ↑22.04.2022, 18:02
As a side note, Firefird/Inerbase also supports array fields. Does its driver support arrays?
Honestly - I don't know. Seeing all the facts above, I assume that no - these drivers don't support arrays - because honestly I don't know how they would make them accessible with the current IZResultSet API.
Re: PostgreSQL bug with array field
Posted: 24.04.2022, 13:30
by MJFShark
My proposal would be to keep the datatype used for arrays as an stStream and not use stArray for this. Doing so requires two changes:
#1: if ColumnType is a stream in TZPostgreSQLResultSet.Open set its precision to zero. It will only be non-zero in cases where the array element is a varchar(#)[] type. This appears to fix the "Cannot access blob record in column 1 with type String" error.
#2: Write the code to get Pg binary arrays as a string, matching the results that you get when you use pglib in string mode. I've actually done this for several of the basic types (integers, floats, and varchar/text.) I've only tested it with one and two dimension arrays and the code is definitely not production ready.
-Mark
Re: PostgreSQL bug with array field
Posted: 25.04.2022, 08:55
by Fr0sT
According to
https://firebirdsql.org/file/documentat ... erence.pdf there's some support of arrays in FB but they're based on BLOBS anyway and mainly make sense if used in SQL. From what FB devs have said, nobody cares about supporting them in client libs because of little use and no real benefit. In case of urgent requirement, there's always PSomeArray(Field.Data)^ cast.
Re: PostgreSQL bug with array field
Posted: 26.04.2022, 08:43
by marsupilami
@Mark: I am willing to accept any patches that improve the situation. Getting these things to work the same way as they worked on Zeos 7.2 is a good step. How do you handle writing array data in these situations?
@Fr0st: I seem to remember somethimg similar. The question is do we want to keep ignoring array fields in Firebird?
Re: PostgreSQL bug with array field
Posted: 26.04.2022, 08:57
by stoffman
@marsupilami @Mark @Fr0st
Text vs. Binary / Correctnes vs. Speed
Following this thread and another one I opened it is obvious that the current implemention of the Postgresql driver has bugs while interfacing with other components (ZMemTable, Grids etc..). With your help it was pinned down to the way the binary data is being parsed into pascal aware types.
After researching the issue for the last couple of days, it looks like this parsing issue cannot be
fully resolved. As there are many types, just for example:
- User defined Compound types
- Arrays
- Range types
- Geometries (from PostGis)
The nice people of PostgreSQL are fully aware of this issue! the wire protocol supports the mixing of both text and binary data for the same result set. But (and this is a big but) libpq.dll doesn't. It only supports text or binary.
Native drivers (like in .NET/JAVA) do support mixed mode. But unfortently this is not feasable in our case to write a native driver. While others that depends on the client library default to text mode (like psycopg for python)
I bealive that something should work correctly before it will work fast. So it should seriously be considered to revert back to the text protocol.
That being said, I'll check over the coming weeks what does it mean to
keep the binary mode: how to covert types like polygon and ranges, and what to do with unknown types (any suggestions would be great)
Thank you all
.NET driver and the excat conversation we are just having:
https://github.com/npgsql/npgsql/issues/447
https://github.com/npgsql/npgsql/issues/370
A 10 years old effort to add mixed mode to libpq (never happen) :
https://www.postgresql.org/message-id/D ... wien.gv.at
PQexecParams - see the resultFormat param:
https://www.postgresql.org/docs/current ... ELECT-INFO
Re: PostgreSQL bug with array field
Posted: 26.04.2022, 12:47
by MJFShark
Good points by all and it's definitely a shame that pglib doesn't support mixed mode. I don't personally use the more esoteric types of the various databases (varray in Oracle is another one off the top of my head.) Note that I believe you can reset the pg driver to string mode using "BinaryWireResultMode=False". I think using binary mode wasn't just about performance, but more about dealing with converting the PQgetValue strings into the other formats properly. My code that I mentioned above is just to convert the binary PQgetvalue for arrays to a formatted string to match the non-binary results (for example "{{1,2},{3,4}}".) I did finish the code with some help from Peter Below. It would have to be extended to handle more types and to correctly dquote/escape the data values. I did nothing at all related to writing arrays.
Re: PostgreSQL bug with array field
Posted: 26.04.2022, 13:02
by marsupilami
stoffman wrote: ↑26.04.2022, 08:57
I bealive that something should work correctly before it will work fast. So it should seriously be considered to revert back to the text protocol.
Well - no. There are several reasons why that is a bad idea. Yes - the implementation for using binary data types has limitations. Such as not supporting certain data types. But it works for the data types that fit 95% of all use cases and has better overall performance when it comes to these 95%. I don't see a reason to force a slower solution upon these 95% for the sake of supporting more or less rare edge cases and unusual data types.
My proposal is different: We add a parameter that tells the PostgreSQL statement wether to request a binary result or a text result. We also (re)add the text processing resultset, that was used in Zeos 7.2, if it isn't in the code anymore. This allows people who don't fit into the 95% to use text results if they need to. Also it enables us to improve the binary result set so it can handle more data types in the future. And it keeps the performance improvements for the majority of the users.
Re: PostgreSQL bug with array field
Posted: 26.04.2022, 13:23
by marsupilami
MJFShark wrote: ↑26.04.2022, 12:47
Note that I believe you can reset the pg driver to string mode using "BinaryWireResultMode=False".
I didn't know about that parameter. Will have to add that to the documentation. This also invalidates my previous suggestion about porting the result set from Zeos 7.2 to Zeos 8.
MJFShark wrote: ↑26.04.2022, 12:47
I think using binary mode wasn't just about performance, but more about dealing with converting the PQgetValue strings into the other formats properly.
Binary mode was about a multitude of things: Performance, rounding errors on floating point types, representation of date and time. These are mentioned in the (not yet published) release notes for Zeos 8.0.
MJFShark wrote: ↑26.04.2022, 12:47
My code that I mentioned above is just to convert the binary PQgetvalue for arrays to a formatted string to match the non-binary results (for example "{{1,2},{3,4}}".) I did finish the code with some help from Peter Below. It would have to be extended to handle more types and to correctly dquote/escape the data values. I did nothing at all related to writing arrays.
I wonder wether it makes sense to just tell people to use text mode result sets if one wants to use arrays. Otherwise we will have to document which array types we support and which array types we don't support. Rearding writing of arrays: I seem to remember that Egonhugeist told me PostgreSQL always supports writing of string literals. So maybe we just have to send a string literal to PostgreSQL in the case of arrays?
Re: PostgreSQL bug with array field
Posted: 26.04.2022, 16:17
by MJFShark
marsupilami wrote: ↑26.04.2022, 13:23
I wonder wether it makes sense to just tell people to use text mode result sets if one wants to use arrays. Otherwise we will have to document which array types we support and which array types we don't support. Rearding writing of arrays: I seem to remember that Egonhugeist told me PostgreSQL always supports writing of string literals. So maybe we just have to send a string literal to PostgreSQL in the case of arrays?
I think that makes sense. When I experiment with some of the rarely used types I usually check out the various casting or formatting functions the database provides. Pg includes a bunch of array helper functions.
Re: PostgreSQL bug with array field
Posted: 27.04.2022, 09:36
by marsupilami
Ok - so what we have to do is disable array fields, possibly marking them as stUnknown until we have some good array support. Users should use BinaryWireResultMode=False if they want to work with arrays until that time.
@stoffman: is that something you can agree to? Could you please check if BinaryWireResultMode=False solves your problem?
Re: PostgreSQL bug with array field
Posted: 27.04.2022, 10:55
by stoffman
@marsupilami
I've just experimented with BinaryWireResultMode=False , I used ZReadOnlyQuery
1. When copying the data to ZMemTable and viewing that MemTable in a gird I get the "Cannot access blob record in column with type String"
2. When showing the data in the grid directly from the ZReadOnlyQuery it worked as expected.