Page 1 of 1
with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 23.09.2020, 14:02
by Hardy55
Hallo dear Zeosteam,
well since years I use your DB-Tool
and I must say it's the best on the (open-) marked.
I try to move a way from the extern database-dirvers like freeTDS etc...
and now I'm testing the OLDB connections.
=======================================
ZeosVersion : 7.3.1-beta
revision : 6829
Protocol : mssql
Freetds-1.1.6 ....x86_64... libsybdb-5.dll
Lazarus 2.0.10 r63526 FPC 3.2.0 x86_64-win64-win32/win64
Database MSSQL 2019 developer
=======================================
Resuls with freeTDS
ort
------------------------------
Mülheim
München
Illingen
Bietigheim-Bissingen
Bönnigheim
Vaihingen
Korntal-Münchingen
(10 Zeilen betroffen)
and now testing this
Protocol : OleDB
Database = Provider=SQLNCLI11.1;Persist Security Info=False;User ID=mig19;Initial Catalog=test;Data Source=MyServer\Vers19;Initial File Name="";Server SPN=""
or
Database = Provider=MSOLEDBSQL;Server=linux-kjoz;Database=test;MARS Connection=True;
get the same result
Resuls with OleDB
ort
------------------------------
M
M
Illingen
Bietigheim-Bissingen
B
Vaihingen
Korntal-M
Any ideas why the OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
best regards
Gerhard
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 24.09.2020, 05:39
by EgonHugeist
Hello Gerhard,
nope i have no clue yet. Seems like an encoding issue on either Driver or Zeos side(I'm forcing the driver for a full UTF16 conversion for all VARCHAR columns iirc).. We're using SQL-Server 2016 on a linux system using OleDB DelphiXE6 for production yet. JFYI
Let's find the bug:
1. Which column-type is used for the field? VARCHAR/NVARCHAR[?/MAX] ?
2. Do you use the new UTF8 feature of V2019?
3. What happens if you use the "SQLOLEDB.1" driver? Same result too?
4. Which default collation is used?
5. What happens if you set Your-TStringField.Transliterate := True;
6. Which ControlsCodePage is set in your TZConnection?
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 24.09.2020, 10:55
by Hardy55
Hello Michael,
sorry for my late reply!
Well i tested now the
1. Which column-type is used for the field? VARCHAR/NVARCHAR[?/MAX] ?
with both fieldtypes the same result!
2. Do you use the new UTF8 feature of V2019?
NO ---> because the production enviroment has various Verions of SQL-Server
from 2012 up to then Latest - so that option not possible.
3. What happens if you use the "SQLOLEDB.1" driver? Same result too?
Provider=SQLOLEDB.1;Persist Security Info=False;User ID=mig19;Initial Catalog=test;Data Source=MyServer\Vers19
jep the same result .
4. Which default collation is used?
Standartsprache: German
Sortierung: Latin1_General_CI_AS
5. What happens if you set Your-TStringField.Transliterate := True;
sorry but how to set ??
do you mean that >>> RawCharacterTransliterateOptions all option are set.
6. Which ControlsCodePage is set in your TZConnection?
conn1.ClientCodepage := ''; but with CP_UTF16 get the same result!
conn1.ControlsCodePage:= cCP_UTF8
Best regards
Gerahrd
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 24.09.2020, 16:52
by Hardy55
so I make some tests.
Text File
...Var daten :
Results with : wideString, unicodeString ==> M?nchen
ansiString, RawByteString, String ==> M
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 24.09.2020, 17:23
by Hardy55
sorry make a mistake ... forget the last post!
The textfile has data with a fixed record length!
when in load the file as myList.LoadFromFile('O:\TEST.DAT',TEncoding.ANSI);
It looks ok, but the result of the utf8 converting is a shifting of data away from the orginal position.
Without TEncoding.ANSI the texfile do not expand
and Results from the DATEN string
with : wideString, unicodeString ==> "M?nchen"
ansiString, RawByteString, String ==> "M"
this does not happen with the FreeTDS dirver!
sorry I have not clue
how to fix it
Best regards
Gerhard
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 25.09.2020, 06:03
by EgonHugeist
Howdy Hardy,
ahh that's clearing the things up. So it's no Zeos bug.
Gerhard, FreeTDS is raw string based wheras OleDB is a UTF16 driver (SQLServer since 2005 is W based).
Did you check what the SQL-Serverstudio shows for the values if you use FreeTDS?
However OleDB is a W driver. Thus all encodings are taken from the ControlsCodePage you're using. Sadly LCL keeps the UTF8 aproach and is doing ton's of A2W2A conversion for ever OS except the UTF8 linux targets. Means you have no other logical choice for the ControlsCodePage. Biggest proplem: LCL is silently hacking the DefaultSystemCodePage. If they wouldn't that you could set Encoding in TZRawCharacterTransliterateOptions to "encDefaultSystemCodePage" and enable the "Param"-translitation (but then all !raw! params need same encoding).
Soweit so gut. Gerhard, you just need to convert the AnsiEncoding to UTF[8/16], means, load the stream, change the encoding from Ansi to UTF8(A2W2A) if you bind the Param as ftMemo type or (much simpler) convert the data from Ansi to UTF16 and use ftWideMemo instead. Bind the parameter afterwards. That's one of the reasons why UnicodeStrings in practice are more intuitive than UTF8.
Hop it helps..
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 25.09.2020, 13:58
by Hardy55
Hallo Michael,
I decide that the TDS-Driver is the better decion for my purpose.
The whole struggele with the converting of strings between ANSI-ASCII-UTF16-UTF8 is a holy crap.
Nichts führ ungut! So is es.
So I will stay at the TDS-Driver this made more sence, but i'll keep an eye of the oledb dirver.
Thank you very much and have a nice weekend Michale.
Gerhard
Re: with OleDB is tuncate the stings with Umlaute ? etc. " ÖÄU üöä ß"
Posted: 17.12.2020, 07:16
by EgonHugeist
Hello Gerhard,
have a look @
viewtopic.php?f=50&t=129484
See
EgonHugeist wrote:
also new LoadFromFile/LoadFromStream overloads has been added. and the old ones are tagged as deprecated(encoding inconsistent). Reason was to help uses with all the db encoding quirk.
Each of the new overloads do support a CodePage. CP(1200) means UTF16, 65001 is UTF8 .... See ZEncoding.pas for a full list of codepages. If the CP = 0/zero we essume it's a binary File/Stream.
Now you can specify the codepage of the text-streams and zeos is doing the conversions for you. Is that interesting? Did that help?