
SQL-Server2019 Win10 21H2

Posted: 03.07.2022, 16:42
by nlanger
I work with Delphi 10.4 and Zeos 8.0 beta (without ZParam) on Windows 10. Up to version 1903 everything worked fine. Now the clients run Win10 version 21H2. With SQL selects that return larger amounts of data, an access violation (stack overflow) occurs on ZQuery.Close, ZQuery.Free or ZQuery.Connection := nil. However, this error only occurs with very specific data constellations.

I managed to build a test program for it.
The Zeos3a test program generates the error. Zeos3b resets the field again.

Here, too, the error sometimes only occurs when debugging (F9 in Delphi) and not without a debugger (Shift+Ctrl+F9 in Delphi).
Unfortunately, it occurs again and again at the clients, which leads to a total crash of the program. The error does not appear for hours and then suddenly occurs constantly.

I've already searched the Zeos source (using a debugger). The error occurs when freeing memory of fields and their data. e.g. in ...

..\src\dbc\ZDbcCache.pas
procedure TZRowAccessor.ClearBuffer
at
System.FreeMem(PPointer(@Buffer^.Columns[FColumnOffsets[FVarLenCols] +1])^);

and

procedure TZIndexPairList.Clear;
at
FreeMem(Items);

The error also occurs in other places if you wrap these calls in a try..except block as a test.
It seems as if Win10 has already used the memory for something else.

The test program only changes one value in the "MA" field from two digits to four digits - only then does the error occur here.
Norbert

Re: SQL-Server2019 Win10 21H2

Posted: 04.07.2022, 15:13
by marsupilami
Hello Norbert,

I can confirm the bug. It also happens on FPC. I am not sure how to prevent it though because the actual access violation happens somewhere in the memory manager.

On FPC I can work around this by using the cmem memory manager. With a modified cmem unit this also works on Delphi:

Code:

{
    This file is part of the Free Pascal run time library.
    Copyright (c) 1999 by Michael Van Canneyt, member of the
    Free Pascal development team

    Implements a memory manager that uses the C memory management.

    See the file COPYING.FPC, included in this distribution,
    for details about the copyright.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

 **********************************************************************}
unit cmem;

interface

type
  // Pointer-sized unsigned integer; LongWord is only 32 bits wide on Win64.
  PtrUInt = NativeUInt;
  PPtrUInt = ^PtrUInt;

Const

{$if defined(go32v2) or defined(wii)}
  {$define USE_STATIC_LIBC}
{$endif}

{$if defined(win32)}
  LibName = 'msvcrt';
{$elseif defined(win64)}
  LibName = 'msvcrt';
{$elseif defined(wince)}
  LibName = 'coredll';
{$elseif defined(netware)}
  LibName = 'clib';
{$elseif defined(netwlibc)}
  LibName = 'libc';
{$elseif defined(macos)}
  LibName = 'StdCLib';
{$elseif defined(beos)}
  LibName = 'root';
{$else}
  LibName = 'c';
{$endif}

{$ifdef USE_STATIC_LIBC}
  {$linklib c}
Function malloc (Size : ptruint) : Pointer;cdecl; external;
Procedure free (P : pointer); cdecl; external;
function realloc (P : Pointer; Size : ptruint) : pointer;cdecl; external;
Function calloc (unitSize,UnitCount : ptruint) : pointer;cdecl; external;
{$else not USE_STATIC_LIBC}
Function Malloc (Size : ptruint) : Pointer; cdecl; external LibName name 'malloc';
Procedure Free (P : pointer); cdecl; external LibName name 'free';
function ReAlloc (P : Pointer; Size : ptruint) : pointer; cdecl; external LibName name 'realloc';
Function CAlloc (unitSize,UnitCount : ptruint) : pointer; cdecl; external LibName name 'calloc';
{$endif not USE_STATIC_LIBC}

implementation

Function CGetMem  (Size : NativeInt) : Pointer;
begin
  Result:=Malloc(Size+sizeof(ptruint));
  if (Result <> nil) then
    begin
      Pptruint(Result)^ := size;
      inc(ptruint(Result), sizeof(ptruint));
    end;
end;

Function CFreeMem (P : pointer) : Integer;
begin
  if (p <> nil) then
    dec(ptruint(p),sizeof(ptruint));
  Free(P);
  CFreeMem:=0;
end;

Function CFreeMemSize(p:pointer;Size:ptruint):ptruint;

begin
  if size<=0 then
    exit;
  if (p <> nil) then
    begin
      if (size <> Pptruint(ptruint(p)-sizeof(ptruint))^) then
        runerror(204);
    end;
  CFreeMemSize:=CFreeMem(P);
end;

Function CAllocMem(Size : NativeInt) : Pointer;
begin
  Result:=calloc(Size+sizeof(ptruint),1);
  if (Result <> nil) then
    begin
      Pptruint(Result)^ := size;
      inc(ptruint(Result),sizeof(ptruint));
    end;
end;

Function CReAllocMem (p:pointer;Size:NativeInt):Pointer;

begin
  if size=0 then
    begin
      if p<>nil then
        begin
          dec(ptruint(p),sizeof(ptruint));
          free(p);
          p:=nil;
        end;
    end
  else
    begin
      inc(size,sizeof(ptruint));
      if p=nil then
        p:=malloc(Size)
      else
        begin
          dec(ptruint(p),sizeof(ptruint));
          p:=realloc(p,size);
        end;
      if (p <> nil) then
        begin
          Pptruint(p)^ := size-sizeof(ptruint);
          inc(ptruint(p),sizeof(ptruint));
        end;
    end;
  CReAllocMem:=p;
end;

Function CMemSize (p:pointer): ptruint;

begin
  CMemSize:=Pptruint(ptruint(p)-sizeof(ptruint))^;
end;

function CGetHeapStatus:THeapStatus;

var res: THeapStatus;

begin
  fillchar(res,sizeof(res),0);
  CGetHeapStatus:=res;
end;

Const
 CMemoryManager : TMemoryManagerEx =
    (
      GetMem : CGetmem;
      FreeMem : CFreeMem;
      ReallocMem : CReAllocMem;
      AllocMem : CAllocMem;
      RegisterExpectedMemoryLeak: nil;
      UnregisterExpectedMemoryLeak: nil;
    );

Var
  OldMemoryManager : TMemoryManagerEx;

Initialization
  GetMemoryManager (OldMemoryManager);
  SetMemoryManager (CmemoryManager);

Finalization
  SetMemoryManager (OldMemoryManager);
end.

Honestly, I don't have a clue how to fix that (yet). Maybe it makes sense to start a thread on this in DelphiPraxis and/or the Lazarus forum...
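
To try the workaround, the cmem unit would normally have to be the first entry in the project's uses clause, so that the replacement memory manager is installed before any other unit allocates memory. A minimal sketch, assuming the unit above (or FPC's stock cmem) is on the search path; the program name and the test string are made up for illustration:

```pascal
program CMemCheck;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  cmem,      // must come first so its initialization section runs first
  SysUtils;

var
  S: string;
begin
  // Every allocation from here on goes through malloc/free of the C runtime.
  S := StringOfChar('x', 1000);
  WriteLn('allocated ', Length(S), ' characters through cmem');
end.
```

If the access violation disappears with this in place, that only hides the symptom; the stray write still happens.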

Best regards,

Jan

Re: SQL-Server2019 Win10 21H2

Posted: 04.07.2022, 15:19
by marsupilami
Note: The effect of Zeos3b can be reproduced on my system by simply adding the following line after ZConn.Connect:
  ZConn.ExecuteDirect('update zwversand set ma=''Hä''');

Re: SQL-Server2019 Win10 21H2

Posted: 05.07.2022, 12:47
by marsupilami
The problem was introduced in Rev. 6366 of Zeos 7.3-testing...

Re: SQL-Server2019 Win10 21H2

Posted: 05.07.2022, 13:55
by marsupilami
So, this is the theory: in TZAbstractCachedResultSet.AfterClose we call FRowAccessor.DisposeBuffer(FUpdatedRow) to free the memory associated with that row. The buffer seems to hold a collection of pointers that point to the actual data. We traverse these pointers and free the memory that they point to.
So far, so good. It seems that the memory manager stores a pointer to vital information just in front of the memory block. It reads this in GETMEM.INC in SysFreeMem(P: Pointer):

Code:

  {Get the block header in edx}
  mov edx, [eax - 4]
So I assume that we somehow overwrite that information at some point and then things start to go wrong. Unfortunately the changes in rev. 6366 are very big, so it is not possible to pinpoint the root of this.
The cmem memory manager possibly gets around this by storing management information somewhere else.
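
The idea of a header stored directly in front of a block can be sketched in a few lines. This is not the layout the Delphi memory manager actually uses; it is a simplified illustration in the same spirit as the size field that CGetMem in the cmem unit above writes before the returned pointer. All names (AllocWithHeader, HeaderOf, FreeWithHeader) are hypothetical.

```pascal
program HeaderSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

type
  PHeader = ^NativeUInt;

// Allocates Size bytes plus a hidden size header in front of them.
function AllocWithHeader(Size: NativeUInt): Pointer;
var
  Raw: PByte;
begin
  GetMem(Raw, Size + SizeOf(NativeUInt));
  PHeader(Raw)^ := Size;                 // header lives before the user data
  Result := Raw + SizeOf(NativeUInt);    // caller only sees the data part
end;

// Returns the address of the header that belongs to a user pointer.
function HeaderOf(P: Pointer): PHeader;
begin
  Result := PHeader(PByte(P) - SizeOf(NativeUInt));
end;

// Steps back to the real start of the allocation and releases it.
procedure FreeWithHeader(P: Pointer);
var
  Raw: Pointer;
begin
  if P = nil then
    Exit;
  Raw := PByte(P) - SizeOf(NativeUInt);
  FreeMem(Raw);
end;

var
  Data: PByte;
begin
  Data := AllocWithHeader(8);
  WriteLn('header before stray write: ', HeaderOf(Data)^);
  // Simulate a write that lands just in front of the user data.
  HeaderOf(Data)^ := 0;
  WriteLn('header after stray write:  ', HeaderOf(Data)^);
  // Only our own bookkeeping was damaged here, so freeing is still safe.
  FreeWithHeader(Data);
end.
```

In this sketch the damaged header is private bookkeeping, so nothing crashes. When the overwritten bytes belong to the memory manager itself, the damage typically shows up later, at FreeMem, which matches where the access violation was observed.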

Re: SQL-Server2019 Win10 21H2

Posted: 05.07.2022, 15:23
by marsupilami
I think I fixed the problem in [r7824] for Zeos 8. Could you please update your Zeos 8 and check it?

Re: SQL-Server2019 Win10 21H2

Posted: 05.07.2022, 18:38
by aehimself
Can't wait for this to show up in Git. Maybe it fixes my Oracle UTF16 issue as well...

Re: SQL-Server2019 Win10 21H2

Posted: 06.07.2022, 08:08
by marsupilami
aehimself wrote: 05.07.2022, 18:38 Can't wait for this to show up in Git. Maybe it fixes my Oracle UTF16 issue as well...
The fix is part of commit 6700b4d for master on github. On 8.0-patches it is c1e01e. But I have doubts whether this will fix your problems. This bug is only triggered under certain circumstances:
  • A Widechar-String has to be stored in the result set.
  • The field value must not be null.
  • You replace the field value with a Widechar-string that is two times as long as the old string.
Only then will the bug trigger.

Re: SQL-Server2019 Win10 21H2

Posted: 06.07.2022, 13:34
by nlanger
Hello Jan

Yes, that solves the problem.
I installed [r7829] and the error no longer occurs, neither in the test program nor in the real, large program.

great!!!

Norbert

Re: SQL-Server2019 Win10 21H2

Posted: 06.07.2022, 13:44
by nlanger
Hello Jan

While you're at it, I have one more bug that I've tried to build myself - but I'm not sure if it's 100% accurate.

If the program runs on a laptop with WLAN, the ZConnection is always open (which makes sense), and you carry the laptop around the company and change the WLAN, or you plug in/disconnect the LAN cable, then an error occurs at the next ZQuery access. Actually you only have to do a ZConn.Close and Open and of course reactivate all ZQuery instances that were open; then it would continue without errors.

I intercepted this - is certainly not the best solution.

=> see the 3 places with {$IFDEF ReConnect}

Norbert

Re: SQL-Server2019 Win10 21H2

Posted: 06.07.2022, 14:32
by nlanger
There is a small error in the current sources.
When compiling without ZParam ...

raise EZSQLException.Create(SInvalidVarByteArray);

must not be used in line 1917 of ZDatasetUtils.pas ;-)

use => EZDatabaseError.Create()

Norbert

Re: SQL-Server2019 Win10 21H2

Posted: 06.07.2022, 15:40
by marsupilami
nlanger wrote: 06.07.2022, 13:44 Hello Jan

While you're at it, I have one more bug that I've tried to build myself - but I'm not sure if it's 100% accurate.

If the program runs on a laptop with WLAN, the ZConnection is always open (which makes sense), and you carry the laptop around the company and change the WLAN, or you plug in/disconnect the LAN cable, then an error occurs at the next ZQuery access. Actually you only have to do a ZConn.Close and Open and of course reactivate all ZQuery instances that were open; then it would continue without errors.

I intercepted this - is certainly not the best solution.

=> see the 3 places with {$IFDEF ReConnect}

Norbert
I tried to see what you do, but this file does not seem to be based on a recent version of Zeos 8? In Zeos 8 Egonhugeist introduced the OnLost event on the TZConnection. This allows Zeos to handle everything necessary (close datasets, free resources where necessary) and then gives you the chance to reconnect. Maybe this is what you need?

Re: SQL-Server2019 Win10 21H2

Posted: 06.07.2022, 15:56
by marsupilami
nlanger wrote: 06.07.2022, 14:32 There is a small error in the current sources.
When compiling without ZParam ...

raise EZSQLException.Create(SInvalidVarByteArray);

must not be used in line 1917 of ZDatasetUtils.pas ;-)

use => EZDatabaseError.Create()

Norbert
Hello Norbert,

this should be fixed in the latest tree.

Best regards,

Jan

Re: SQL-Server2019 Win10 21H2

Posted: 07.07.2022, 13:43
by nlanger
marsupilami wrote: 06.07.2022, 15:40 I tried to see what you do, but this file does not seem to be based on a recent version of Zeos 8? In Zeos 8 Egonhugeist introduced the OnLost event on the TZConnection. This allows Zeos to handle everything necessary (close datasets, free resources where necessary) and then gives you the chance to reconnect. Maybe this is what you need?

Hello Jan

No, that doesn't solve the problem. The OnLost event is not triggered if the network is gone briefly, e.g. for 10 seconds. If you then use a ZQuery again, ZConnection still reports Connected=True and access violations occur in the ZQuery.

The ZConnection does not seem to notice that TCP/IP was offline for a short time and that the connection information needs to be updated. You can try this by unplugging the LAN cable for a short time, or by deactivating the network card and activating it again after about 10 seconds.

Norbert

Re: SQL-Server2019 Win10 21H2

Posted: 07.07.2022, 17:34
by marsupilami
Hello Norbert,
nlanger wrote: 07.07.2022, 13:43 No, that doesn't solve the problem. The OnLost event is not triggered if the network is gone briefly, e.g. for 10 seconds. If you then use a ZQuery again, ZConnection still reports Connected=True and access violations occur in the ZQuery.
So - I did a test on my computer. I do get the following error message:
---------------------------
Project1
---------------------------
SQL Error: OLEDB Error
SQLState: 08S01 Native Error: 10054
Error message: Kommunikationsverbindungsfehler [English: Communication link failure]
Source: Microsoft OLE DB Driver for SQL Server
SQLServer details:
SQLState: 08S01 Native Error: 10054
Error message: TCP Provider: Eine vorhandene Verbindung wurde vom Remotehost geschlossen. [English: An existing connection was forcibly closed by the remote host.]

Source: Microsoft OLE DB Driver for SQL Server
SQLServer details:
Line: 0, Error state: 1, Severity: 16
Code: 10054 SQL: select @@SPID
---------------------------
OK
---------------------------
The OnLost event is triggered and then the exception is raised, whether I do a reconnect or not. I found a problem in the OleDB driver when doing a .ReConnect, which led to an access violation. I will commit the fix to subversion today.
But - Egonhugeist only made this work for MS SQL Server and SQLSTATE 08S01. If you don't get an exception of the type EZDatabaseConnectionLostError please let me know what type of database you use (I assume it is MS SQL Server) and which SQLState you get.
nlanger wrote: 07.07.2022, 13:43 The ZConnection does not seem to notice that TCP/IP was offline for a short time and that the connection information needs to be updated. You can try this by unplugging the LAN cable for a short time, or by deactivating the network card and activating it again after about 10 seconds.
Well this is not how these things work ;) Zeos will only be notified about the lost connection, when it tries to do some work on the connection. It isn't possible to set Active to False in advance. This leads to the following order of events: User tries to do something -> Zeos gets an error with SQLState=08S01 -> Zeos calls the OnLost handler of TZConnection -> Zeos raises an EZDatabaseConnectionLostError.
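
The order of events described above can be illustrated without a database. The following is a self-contained simulation, not Zeos code: EConnectionLostSketch stands in for EZDatabaseConnectionLostError, and LinkUp, OnLost, HandleLost and ExecuteSketch are hypothetical names invented for this sketch.

```pascal
program LostConnectionSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical stand-in for EZDatabaseConnectionLostError.
  EConnectionLostSketch = class(Exception);
  TLostHandler = procedure;

var
  LinkUp: Boolean = False;      // simulated network state, initially broken
  OnLost: TLostHandler = nil;   // simulated OnLost event

// Simulates one unit of work on the connection.
procedure ExecuteSketch(const SQL: string);
begin
  if not LinkUp then
  begin
    // The loss is only detected now, when work is attempted (SQLSTATE 08S01).
    if Assigned(OnLost) then
      OnLost();
    // The exception is raised whether or not the handler reconnected.
    raise EConnectionLostSketch.Create('SQLState: 08S01');
  end;
  WriteLn('executed: ', SQL);
end;

procedure HandleLost;
begin
  WriteLn('OnLost: reconnecting');
  LinkUp := True;               // pretend the reconnect succeeded
end;

begin
  OnLost := HandleLost;
  try
    ExecuteSketch('select 1');
  except
    on E: EConnectionLostSketch do
    begin
      WriteLn('caught: ', E.Message);
      // Retry once, now that the handler has restored the link.
      ExecuteSketch('select 1');
    end;
  end;
end.
```

The point of the sketch is that the caller still has to catch the exception and repeat the failed operation; the handler alone does not make the first call succeed.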

I will make the OleDB driver initiate a connection loss error for all drivers when they return an 08S01 SQLSTATE, because this should be the standardized error code.

Best regards,

Jan