Page 1 of 1

Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 21.09.2020, 21:07
by aehimself
I guess this will be more likely a metadata issue, but here we go.

MySQL 8.0.18, client version: 8.0.18, database access component version: 7.3.1-b99a7ce8

I run the query: "SHOW ENGINE INNODB STATUS"

And then, I am trying to get the "status" field by bytes (var tb = query.FieldByName('Status').AsBytes)
Instead of the 9-something thousand bytes, I get 80 in the array.

The trimming happens at TZAbstractRODataSet.GetFieldData, line 2787

Code: Select all

2784              begin
2785                blen := blen shl 1;
2786                if blen >= NativeUint(Field.DataSize-2) then
2787                  blen := NativeUint(Field.DataSize-2);
2788                Move(P^, Pointer(Buffer)^, blen);
2789                PWord(PAnsiChar(Buffer)+blen)^ := 0;
2790             end;
blen is 9792 (then becomes 19584 after the SHL), that is equal to Length(PWideChar(P)). Checking with the debugger, PWideChar(P) seems to contain all the text, so all is fine until this line.

Then, the problem comes. Field.FSize is 40, so Field.DataSize returns 82. As blen is far greater than this amount it is simply trimmed at 82 - 2 = 80 byte(s), which is 40 characters in Unicode.

I know that this is not a real table, but shouldn't the correct metadata be returned by the SQL server anyway?

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 22.09.2020, 12:21
by marsupilami
Hello aehimself,

usually I would say you are right. The SQL Server should return the correct data. Your best bet is to see wether the correct field length is set in the native result set of MySQL. Look for the creation of TZColumnInfo records or objects there.
Small warning: I don't know how the MySQL driver works internally. So there might be other places to look for...

Best regards,

Jan

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 23.09.2020, 06:03
by EgonHugeist
Hello aehimself,

on my side:
MariaDB:
field `Type` returns a length of 10 chars,
field `Name`returns a length of 512 chars,
field `Status` returns a length of 10 chars.

MySQL5.7:
field `Type` returns a length of 10 chars,
field `Name`returns a length of 512 chars,
field `Status` returns a length of 10 chars.

MySQL8.0.19:
field `Type` returns a length of 10 chars,
field `Name`returns a length of 512 chars,
field `Status` returns a length of 10 chars.

Note `Name`is blank, `Status` returns '============================',
means i can not find a field having 40 chars :(

Track this in ZDbcMySqlUtils.pas method: GetMySQLColumnInfoFromFieldHandle().

Is it possible you reopen the table more than once and the field-sizes would change? :shock:
Track this by: ZDbcMySqlStatement.pas method TZAbstractMySQLPreparedStatement.CreateResultSet

Code: Select all

function TZAbstractMySQLPreparedStatement.CreateResultSet(const SQL: string;
  BufferIndex: Integer; FieldCount: UInt): IZResultSet;
var
  CachedResolver: TZMySQLCachedResolver;
  NativeResultSet: TZAbstractMySQLResultSet;
  CachedResultSet: TZCachedResultSet;
  MYSQL_ColumnsBinding: PMYSQL_ColumnsBinding;
begin
  FLastWasOutParams := IsOutParamResult;
  if FLastWasOutParams and (FOpenResultSet <> nil) then <========== comment this line ==========<
    IZResultSet(FOpenResultSet).Close;
.......

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 24.09.2020, 21:19
by aehimself
Hello,

I think you are getting it wrong. query.FieldByName('Status').Size returns 10 at me as well. I'm talking about TField.FSize, if you go into debugging the query.FieldByName('Status').AsBytes method.

TWideStringField.AsBytes -> step into.
TWideStringField.GetValue(s) -> step into, as this already returns truncated value
TField.GetData -> step into, as this already returns truncated value
TZAbstractRODataset.GetFieldData -> Step into, as this already returns truncated value
inherited TDataSet.GetFieldData -> Step into, as this already returns truncated value
TZAbstractRODataset.GetFieldData -> Step into, as this already returns truncated value

Now we are in the method, where the trimming happens. Line 2786 calls Field.DataSize. Step into.
TWideStringField.GetDataSize calls Size. Step into.
TField.GetSize -> FSize = 10.

Step out. As Field.GetSize returned 10, so (Size + 1) * SizeOf(WideChar) will be 22. So, Size - 2 is 20.



Moment of realization. What the actual hell. Field size indeed changed, now it returns 10...? Commenting out the line you suggested or switching to 64 bit makes no difference.

The strange thing is, I open and close the dataset 3 times in my test app, result is always the same.

If I delete the two lines from ZAbstractRODataSet:

Code: Select all

2786                if blen >= NativeUint(Field.DataSize-2) then
2787                  blen := NativeUint(Field.DataSize-2);
Result is fine, but sometimes I get a dynarray release warning when I close the test application -> memory corruption happened.

Question.

If you have a ZQuery object, with the SQL "SHOW ENGINE INNODB STATUS", does this raise an exception at you?

Code: Select all

 If ZQuery1.FieldByName('Status').AsString <> TEncoding.Unicode.GetString(ZQuery1.FieldByName('Status').AsBytes) Then
   Raise Exception.Create('No');

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 25.09.2020, 06:13
by EgonHugeist
No i don't think i got you wrong. I'm just a step further to find root of evel.
I'm trying to evaluate why the hell the data in the

Code: Select all

P := FResultSet.GetPWideChar(ColumnIndex, bLen);
can be longer than the given field-Size.
You can not comment the lines you suggest, or change the "truncation" code you mention: You simply would overrun your memory and this leads to painful exceptions somewhere else as you describe.

So if you call MyField.AsWideString/AsString what's the result? (it should bypass the GetFieldData if the field is the TZUnicodeStringField) Is the result longer than 10 characters?

Edit. It is much, much longer. I've benn running into your example trap too. So i tracked the columntype mapping. Everything is ok. FieldType FIELD_TYPE_VAR_STRING retrieved, no long flags set, size is 40 (utf8mb4) so i can't see a bug there.
Note the TZFields are using the IZResultSet directly. They do bypass the slow Get/SetFielddata quirk for the main purposes. That's why you get different results. ZDBC doesn't care about fixed buffer sizes. ZDBC represents what the driver returns. Set a Breakpoint in function TZAbstractMySQLResultSet.GetPAnsiChar(ColumnIndex: Integer; out Len: NativeUInt): PAnsiChar; Then you'll see the Data is much longer. MySQL, wondering. Maybe google knows it? Honestly all we can do is overriding the GetAsBytes method in TZRawStringField and TZUnicodeStringField. That would fix your problem. But the Grids and ClientDataSet's whould see the trunkated strings only..

Question, why are you permanently dealing with the byte arrays? Is there a reason?

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 25.09.2020, 09:54
by aehimself
EgonHugeist wrote: 25.09.2020, 06:13No i don't think i got you wrong. I'm just a step further to find root of evel.
No wonder about that :) I'm still just trying...
EgonHugeist wrote: 25.09.2020, 06:13You can not comment the lines you suggest, or change the "truncation" code you mention: You simply would overrun your memory and this leads to painful exceptions somewhere else as you describe.
I am well aware of this. I just wanted to see if that part of the code isn't executed would the data be complete? The answer is yes, but - as expected - it did corrupt the memory.
EgonHugeist wrote: 25.09.2020, 06:13So if you call MyField.AsWideString/AsString what's the result? (it should bypass the GetFieldData if the field is the TZUnicodeStringField) Is the result longer than 10 characters?
Using .AsString properly returns the data. Also, I have the query connected to a DBGrid, and data shows up just fine there.
EgonHugeist wrote: 25.09.2020, 06:13Honestly all we can do is overriding the GetAsBytes method in TZRawStringField and TZUnicodeStringField. That would fix your problem. But the Grids and ClientDataSet's whould see the trunkated strings only..
So, what is the curlpit? Column size is wrongly returned by MySQL? I'm asking because the size 40 is clearly not enough for the data this query returns. Overriding the method would work, but - as a user - I would expect this not firing off, as they must be equal:

Code: Select all

 If ZQuery1.FieldByName('Status').AsString <> TEncoding.Unicode.GetString(ZQuery1.FieldByName('Status').AsBytes) Then
   Raise Exception.Create('No');
EgonHugeist wrote: 25.09.2020, 06:13Question, why are you permanently dealing with the byte arrays? Is there a reason?
I just say one thing. Character encodings. At work, there are loads of strings stored as BLOBS/CLOBS, zipped and Base64 encoded. I created a viewer, which can overcome this problem by allowing the user to select the character encoding... and to be able to do that, I must handle the data as a byte array, not as a String.

TBytes are not evil. I started to use them when I first faced a character encoding issue (UTF-8 received by my web service, obviously couldn't handle it as String) and then I changed all my custom encryption methods to pure binary too, instead of Strings. Everything became faster (no extra memory allocations) and now I properly can process non-textual data.
TBytes are not evil. They just need a bit of getting-used-to.

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 26.09.2020, 14:33
by aehimself
@Egonhugeist - you are THE MAN!
I saw this commit on GitHub so I went ahead and tested... I'm really glad to say it works as expected!

And that opened up a new issue - for me. MySQL returns line breaks as #10 only, so when I convert the byte array to UTF16 and place the result to a TMemo... everything is in one line and it lags like crazy.

Anyway, TWideString.AsBytes now works as expected!
Thank you!

Re: Zeos 7.3 + MySQL TWideString.AsBytes trimmed

Posted: 08.10.2020, 17:39
by EgonHugeist
ae,
no idea what the problem should be according the "encodings". Using TBytes has widely been advertised. Reason has been NEXT_GEN. Inbetween (XE10.4) NEXT_GEN is dead. And the logic is obsolate for me. If you need nice encoding methods. Look to Core\ZEncoding.pas

According this thread: the first culprit is MySQL. A field wich has a length of 10 shouldn't return thousands of chars.
Second, it would be possible for compilers like XE3+ (TValueBuffer has been introduced) to increase the buffersizes by making a SetLength(), but the performance cost would be to high.(the field/tatasets are getting slower and slower be each new version) and would not be possible for FPC or XE2down. So i decided to leave the compiler fields untouched (fkData Fields only) and writer faster one.

However, glad to make you happy!