Character problem using PostgreSQL

The alpha/beta tester's forum for ZeosLib 7.0.x series

yesod
Fresh Boarder
Posts: 4
Joined: 07.10.2009, 17:10


Post by yesod »

I use PostgreSQL with
ENCODING = UTF8
LC_CTYPE = French_Canada.1252

In pgAdmin the accented characters é and è seem to be saved correctly.

When I read the value in Delphi 2009 using Zeos like this:

QueryTable.Fields[1].AsString

I get faulty characters for the accents.

I've tried QueryTable.Fields[1].AsAnsiString but I get the same results.
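For readers unfamiliar with this kind of symptom, here is a small language-neutral sketch in Python (not Delphi, and not Zeos code). It assumes, without confirmation from the thread, that the faulty characters come from UTF-8 bytes being decoded as Windows-1252 on the client side.

```python
# Illustration only: the same bytes decoded with two different code pages.
# Assumption: the server sends UTF-8 and the client reads it as WIN1252.
original = "é è"

utf8_bytes = original.encode("utf-8")    # bytes as stored in a UTF8 database
misread = utf8_bytes.decode("cp1252")    # client wrongly assumes Windows-1252
correct = utf8_bytes.decode("utf-8")     # client and server agree on UTF-8

print(repr(misread))   # each accented letter turns into two characters
print(repr(correct))   # the original text comes back unchanged
```

Each accented letter occupies two bytes in UTF-8, so a one-byte code page shows two characters per accent.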

If I set

ENCODING = WIN1252
LC_CTYPE = French_Canada.1252

I can see the faulty characters in pgAdmin.

Thanks for your help
yesod
Fresh Boarder
Posts: 4
Joined: 07.10.2009, 17:10

Solved

Post by yesod »

Using ENCODING = 'UTF8' in PostgreSQL and setting

Connection.Properties.Values['codepage']:='UTF8';
Connection.Properties.Values['client_encoding']:='UTF8';

solved the problem.

Seems you cannot use WIN1252 anymore.
marcov
Senior Boarder
Posts: 95
Joined: 24.06.2010, 09:17

Post by marcov »

I did this too, but then I get exceptions on startup saying that the TDBMemo fields should be of type "ftWideMemo". If I change that, the memos are not filled correctly (it looks like opening UTF-16 text with an ASCII-only editor, in other words like T H I S I S T E X T).

Does somebody recognize this? Is it correct to change the type to ftWideMemo?

It looks like something that is already a WideString is interpreted as an AnsiString and then expanded again into a WideString.
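The effect described here can be reproduced outside Delphi. The following Python sketch is only an illustration of the suspected mechanism (UTF-16 data treated as one byte per character); it does not show what Zeos or the VCL actually do internally.

```python
# Illustration only: UTF-16 text read as if every byte were one character.
text = "THIS IS TEXT"

utf16_bytes = text.encode("utf-16-le")          # two bytes per character
as_single_byte = utf16_bytes.decode("latin-1")  # one character per byte

# Every second character is now a NUL; many controls render it as a gap.
visible = as_single_byte.replace("\x00", " ")
print(visible)
```

The string doubles in length, with a NUL after every original character, which matches the spaced-out look of the memo contents.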

Note: I'm still using the December 7.0 alpha. If there is a point in using trunk, please say so, and I'll try.
mdaems
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium

Post by mdaems »

marcov,
Using trunk isn't that bad an idea. Certainly not for experienced fpc + SVN users as you are. However, it's not certain your issue will be fixed there.

Mark
marcov
Senior Boarder
Posts: 95
Joined: 24.06.2010, 09:17

Post by marcov »

mdaems wrote: marcov,
Using trunk isn't that bad an idea. Certainly not for experienced fpc + SVN users as you are. However, it's not certain your issue will be fixed there.

I've set up a way to easily test with trunk (on a different machine).

The trouble is that I don't know my way in the DB parts of the VCL, so I can't pinpoint the exact problem.

I started with reproducing, and confirmed the problem is not fixed, but maybe my original description was wrong, so I'll try to be more precise below. I also reduced my application to one that uses a single table, with a second one via a lookup field.

First, as said, I simply changed the properties of the ZConnection to contain:

codepage=utf8
client_encoding=utf8

Then the error raised is:

Project xxx.exe raised exception class EDatabaseError with message "ZSomeQuery: Type mismatch for field "somememo", expecting Memo, actual: WideMemo"

This comes from the first check in TZAbstractRODataset.CheckFieldCompatibility.

This is a TMemoField with BlobType set to ftMemo.

When I first googled this, I got the impression that changing it to ftWideMemo should fix it, but the above reads "expecting Memo". But maybe I shouldn't assume this at this point.

So the first question becomes:

What should the field type of a memo be under a Unicode version?

Most notably, is a "BlobType=ftMemo" field truly a one-byte-per-char (AnsiString) kind of field, or is it simply in the native string type (=string, and thus UTF-16)? And does client_encoding=utf8 mean that ftMemo is all right (since it is a 1-byte encoding)?

If so, then of course the memo/widememo difference is irrelevant for Unicode Delphi, and anything gets converted to Unicode for the GUI.

It would help tremendously if somebody could give me some fixed points (what the BlobType of the field should be, and in what part to search for the conversions).
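On the side question of whether UTF-8 counts as a one-byte encoding: it is a variable-width encoding, one byte for ASCII and more for other characters. A short Python sketch (an illustration, not Delphi code; the sample characters are chosen arbitrarily) shows the byte counts next to UTF-16:

```python
# Illustration only: bytes needed per character in UTF-8 and UTF-16.
samples = ["a", "é", "€"]

sizes = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
    for ch in samples
}

for ch, (utf8_len, utf16_len) in sizes.items():
    print(ch, "UTF-8 bytes:", utf8_len, "UTF-16 bytes:", utf16_len)
```

So a field holding UTF-8 data is byte-oriented, but one byte does not always correspond to one character.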
marcov
Senior Boarder
Posts: 95
Joined: 24.06.2010, 09:17

Post by marcov »

A few hours of debugging down the line, I'm not so sure this is a bug. Rather, it is an unexpected side effect of changing the encoding on the connection.

I noticed that the dbmemo field is ansistring based, also in D2009. One needs a dbwidememo for unicode.

So I deleted the TDBMemo, and reimported the field from the DB, and it indeed became a TDBWideMemo, and the problem looks solved. So apparently the connection properties influence the way fields are imported, something to keep in mind for the future.

Probably forcing an AnsiString widget on a UTF-8 field that isn't prepared for it causes funky conversions. Maybe there is improvement to be had there (detection or an exception). Something _is_ rotten there, since plain ASCII became mangled in the process (indicating a typing mistake somewhere).

I also had a few places where I employed the "memo in grid" GetText solution, and had to fix the typecast in Text:=tdbmemo(sender).asstring here and there.

Maybe some of this experience is FAQ worthy?
mdaems
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium

Post by mdaems »

marcov,

Indeed: something _IS_ rotten in the D2009 Unicode strings thing. gto did try to do an initial conversion, but other D2009 users aren't very helpful in solving the issues they meet. And I'm not a D2009+ user.

The delete and recreate solution isn't a real problem in my opinion. It's just a logical consequence of converting from ansistrings to unicode, I think.

The mangling thing has been mentioned before. I wonder if somebody will ever take the effort to provide a solution that doesn't break behaviour on older/competing compilers.

Mark