Massive slowdowns due to SHOW TABLES and SHOW FULL COLUMNS

The forum for ZeosLib 7.2. Report problems, ask for help, and post proposals for the new version and ZeosLib 7.2 features here. This forum will be edited once the 7.2.x version goes into RC/stable!

My personal intention for 7.2 is to speed up the internals as much as possible for all IDEs. I hope you can help! Have fun testing 7.2.
dfumagalli
Fresh Boarder
Posts: 9
Joined: 18.01.2014, 11:49

Massive slowdowns due to SHOW TABLES and SHOW FULL COLUMNS

Post by dfumagalli »

Hello,

please pardon my utter ignorance about the inner mechanics of your beautiful library.

I have developed a massively large and complex Delphi XE5 application that acts as "glue" between heterogeneous remote front ends and MySQL 5 back ends.

So far it's working pretty well, but it's slow, very slow. In fact, I have to insert and delete records more often than I'd like (the back end demands it), and that's one factor.

But there's another: everywhere in my queries, ZEOS inserts some "SHOW TABLES" and "SHOW FULL COLUMNS" statements. The application works across the internet and no, I cannot make it a 3-tier DataSnap setup. :cry:

I am attaching a snippet of the typical queries generated for every master-detail record:

Code: Select all

2014-02-13 12:30:14 cat: Execute, proto: mysql-5, msg: Native SetAutoCommit Falsecall
2014-02-13 12:30:14 cat: Execute, proto: mysql-5, msg: Statement 80 : SELECT product_id, geo_zone_id FROM fw_product_to_geozone WHERE product_id = 68 AND geo_zone_id = 14 
2014-02-13 12:30:15 cat: Execute, proto: mysql-5, msg: Statement 82 : INSERT INTO fw_product_to_geozone (product_id, geo_zone_id, date_added, date_modified, user_modified) VALUES (68, 14, '2014-02-13 12:30:15', '2014-02-13 12:30:15', 'dfuma_000')
2014-02-13 12:30:16 cat: Execute, proto: mysql-5, msg: Native Commit call
2014-02-13 12:30:16 cat: Execute, proto: mysql-5, msg: Native SetAutoCommit Truecall
2014-02-13 12:30:16 cat: Execute, proto: mysql-5, msg: Statement 84 : SELECT dz.delivery_zone_id AS delivery_zone_id, dz.geo_zone_id AS geo_zone_id, dz.status AS status, lc.name AS name FROM fw_delivery_zone dz LEFT JOIN fw_localized_country lc ON (dz.visual_country_id = lc.country_id AND lc.language_id = 2) ORDER BY lc.name
2014-02-13 12:30:16 cat: Execute, proto: mysql-5, msg: Statement 85 : SHOW TABLES FROM xc_7614s LIKE 'fw_delivery_zone'
2014-02-13 12:30:17 cat: Execute, proto: mysql-5, msg: Statement 86 : SHOW FULL COLUMNS FROM xc_7614s.fw_delivery_zone LIKE '%'
2014-02-13 12:30:17 cat: Execute, proto: mysql-5, msg: Statement 87 : SHOW TABLES FROM xc_7614s LIKE 'fw_localized_country'
2014-02-13 12:30:17 cat: Execute, proto: mysql-5, msg: Statement 88 : SHOW FULL COLUMNS FROM xc_7614s.fw_localized_country LIKE '%'
While I can't really cut out some of the slow stuff going on here, I'd really love to be able to skip all those SHOW commands operating on tables containing lots of fields. I am especially wary of the 'LIKE %' bits.

Is there a way to do that? What are the downsides? I do need to show the data through ClientDataSets, but I am specifying all the fields one by one (that is, I don't send "SELECT *" queries that require metadata discovery).

If you need further information or details I'll gladly provide them.

Thanks in advance.
marsupilami
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by marsupilami »

Hello dfumagalli,

this seems to be a case similar to this one: http://zeoslib.sourceforge.net/viewtopic.php?f=20&t=11040
Most probably the commands you see are there for Zeos to get metadata information about your database. You can keep Zeos from doing these metadata queries by setting TZConnection.UseMetadata to False. This should suppress most of the queries, but it would also render all your queries read-only unless you provide suitable TZUpdateSQL components. For more information on TZConnection.UseMetadata see this thread: http://zeoslib.sourceforge.net/viewtopi ... =33&t=3590
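A minimal sketch of that setup, assuming components named ZConnection1, ZQuery1 and ZUpdateSQL1 (all names and statements here are examples, not code from the poster's application):

```pascal
// Suppress Zeos metadata queries; the query can no longer
// auto-generate INSERT/UPDATE/DELETE statements after this.
ZConnection1.UseMetadata := False;

// So a TZUpdateSQL has to supply the statements explicitly.
// The :OLD_ prefix refers to the pre-edit field values.
ZUpdateSQL1.InsertSQL.Text :=
  'INSERT INTO fw_product_to_geozone (product_id, geo_zone_id) ' +
  'VALUES (:product_id, :geo_zone_id)';
ZUpdateSQL1.DeleteSQL.Text :=
  'DELETE FROM fw_product_to_geozone ' +
  'WHERE product_id = :OLD_product_id AND geo_zone_id = :OLD_geo_zone_id';

ZQuery1.UpdateObject := ZUpdateSQL1;
ZQuery1.SQL.Text :=
  'SELECT product_id, geo_zone_id FROM fw_product_to_geozone';
ZQuery1.Open;  // no SHOW TABLES / SHOW FULL COLUMNS should be needed now
```

Whether every metadata query disappears depends on the driver, as discussed further down in this thread.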
Best regards,

Jan
dfumagalli
Fresh Boarder
Posts: 9
Joined: 18.01.2014, 11:49

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by dfumagalli »

Hello,

thank you for your reply.

I see you mention this as a known issue (or feature :P ). I'd like to add some considerations about it, because I believe they do affect ZEOS performance compared to other libraries.

1) I am already using TZReadOnlyQuery for all but one of my query components.

What could still cause this kind of "spam"?
It could be the fact that I use ZEOS queries to forward the fetched data to ClientDataSets and vice versa.
I cannot use Update SQL components; I have to run a considerable number of highly complex and dynamic queries (i.e. different fields updated per the same tables), AND do so with multithreading. So I have to use the DataSetProvider's OnBeforeUpdateRecord event and build those queries there (through a dynamic query builder I have coded).
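For readers unfamiliar with that pattern, a sketch of such a handler is below. ExecMyInsert/ExecMyUpdate/ExecMyDelete are placeholders standing in for the poster's dynamic query builder; the event signature is the standard TDataSetProvider.OnBeforeUpdateRecord one:

```pascal
// Apply the ClientDataSet delta manually, so the provider never
// generates its own SQL (and needs no metadata to do so).
procedure TDataModule1.ProviderBeforeUpdateRecord(Sender: TObject;
  SourceDS: TDataSet; DeltaDS: TCustomClientDataSet;
  UpdateKind: TUpdateKind; var Applied: Boolean);
begin
  case UpdateKind of
    ukInsert: ExecMyInsert(DeltaDS);  // hypothetical query-builder calls
    ukModify: ExecMyUpdate(DeltaDS);
    ukDelete: ExecMyDelete(DeltaDS);
  end;
  Applied := True;  // tell the provider the record is already handled
end;
```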

2) I get this "spam" well after program start, too. I would expect the schemas to be fetched only once, ideally when the connection is made, not randomly and repeatedly throughout the whole program run (despite my using read-only query components and no table components at all).

3) Why does ZEOS query the database schema at run time to begin with? I am using stored field defs; isn't the database schema only needed to provide sensible field def defaults? Why invoke it later, at run time, if not for stored procs?


Edit: reading the second link, apparently ZEOS fetches metadata whenever the "Active" property is set to True.

If this is true, then I feel there's an issue at hand. I can understand the need to manage stored procedures and other "volatile" structures, but this should not happen except when expressly requested (with a design-time property, maybe).

Why? Because in Delphi it's normal to see master-detail query chains implemented in the master's AfterScroll event as follows:

Master scrolls => AfterScroll is triggered => detail query Active = off => change the detail query's :param(s) / SQL code => detail query Active = on.
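In code, that pattern typically looks like this (component and field names are illustrative, not taken from the application):

```pascal
// Typical manual master-detail refresh in the master's AfterScroll.
// Every Close/Open cycle is a point where metadata may be re-fetched.
procedure TForm1.MasterQueryAfterScroll(DataSet: TDataSet);
begin
  DetailQuery.Close;
  DetailQuery.ParamByName('product_id').AsInteger :=
    MasterQuery.FieldByName('product_id').AsInteger;
  DetailQuery.Open;  // Active := True; SHOW TABLES etc. can fire here
end;
```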

I have 6 nested master-detail levels; every time a master record (anywhere in this hierarchy) is scrolled, ALL the detail queries get Active = False, their :params change, and then ALL of them get Active = True again, which causes the metadata to be downloaded once more.
This causes a huge influx of data even though the effective amount of data fetched for the details is very tiny.
marsupilami
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by marsupilami »

Hello dfumagalli,

usually Zeos needs to query metadata for building the queries that delete, update and modify the data in your underlying database, for executing stored procedures, and so on. As far as I know, Zeos does not cache metadata about a database by default; it fetches the metadata whenever it needs it. Which metadata has to be fetched depends on the database and on which metadata it returns with the result set.
A lot of these things depend on the author of the dbc-level database driver in Zeos. Usually it is easier to get the necessary metadata, use it, and then forget about it than to implement a caching mechanism. If your queries run on the local network, that usually is no problem; over the internet it is different, though. So, if you want to cache metadata, your best bet probably is to have a look at the mysql dbc driver and the mysql documentation to see why these queries (show tables, show full columns, ...) are run and how you can cache the fetched metadata.
Usually it makes no sense to fetch all metadata at the beginning of the connection; this would increase the time needed to initialize everything. Also, you could download a lot of metadata you don't need. Just imagine you want to work with one table in a database that has 1000 tables...
Best regards,

Jan
dfumagalli
Fresh Boarder
Posts: 9
Joined: 18.01.2014, 11:49

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by dfumagalli »

marsupilami wrote:Hello dfumagalli,

usually Zeos needs to query metadata for building the queries that delete, update and modify the data in your underlying database, for executing stored procedures, and so on. As far as I know, Zeos does not cache metadata about a database by default; it fetches the metadata whenever it needs it. Which metadata has to be fetched depends on the database and on which metadata it returns with the result set.
A lot of these things depend on the author of the dbc-level database driver in Zeos. Usually it is easier to get the necessary metadata, use it, and then forget about it than to implement a caching mechanism. If your queries run on the local network, that usually is no problem; over the internet it is different, though. So, if you want to cache metadata, your best bet probably is to have a look at the mysql dbc driver and the mysql documentation to see why these queries (show tables, show full columns, ...) are run and how you can cache the fetched metadata.
Usually it makes no sense to fetch all metadata at the beginning of the connection; this would increase the time needed to initialize everything. Also, you could download a lot of metadata you don't need. Just imagine you want to work with one table in a database that has 1000 tables...
Best regards,

Jan
Hello,

I mostly agree with your message, except for the last sentences.
You don't need to load metadata for 1000 tables; Delphi has several facilities for navigating a form's components. A metadata cache manager would only need to load metadata for the tables instantiated on a form.
Even easier: the cache manager would only need to cache metadata "just in time". It would hold a list of table names => metadata (key => value), and when a new table is needed it would just query for its metadata. This way, an AfterScroll event would cause only one metadata load, at the first Active = True; the following ones would always hit the cache and cause no further traffic.
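A rough sketch of such a just-in-time cache, using a generics dictionary. This only illustrates the idea; it is not Zeos code, and FetchColumnsFromServer is a placeholder for the single SHOW FULL COLUMNS round trip done on a cache miss:

```pascal
uses
  System.SysUtils, System.Classes, System.Generics.Collections;

type
  // Illustration only: table name => cached column list.
  TMetadataCache = class
  private
    FColumns: TObjectDictionary<string, TStringList>;
    procedure FetchColumnsFromServer(const TableName: string;
      Columns: TStringList);
  public
    constructor Create;
    destructor Destroy; override;
    function GetColumns(const TableName: string): TStringList;
  end;

constructor TMetadataCache.Create;
begin
  inherited Create;
  FColumns := TObjectDictionary<string, TStringList>.Create([doOwnsValues]);
end;

destructor TMetadataCache.Destroy;
begin
  FColumns.Free;
  inherited;
end;

procedure TMetadataCache.FetchColumnsFromServer(const TableName: string;
  Columns: TStringList);
begin
  // Placeholder: a real driver would run
  // SHOW FULL COLUMNS FROM <db>.<TableName> here and fill Columns.
end;

function TMetadataCache.GetColumns(const TableName: string): TStringList;
begin
  // Only a cache miss touches the server; every later request for the
  // same table is answered locally, so reopening a query costs nothing.
  if not FColumns.TryGetValue(TableName, Result) then
  begin
    Result := TStringList.Create;
    FetchColumnsFromServer(TableName, Result);
    FColumns.Add(TableName, Result);
  end;
end;
```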
marsupilami
Platinum Boarder
Posts: 1918
Joined: 17.01.2011, 14:17

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by marsupilami »

Hello dfumagalli,
Even easier: the cache manager would only need to cache metadata "just in time".
This is exactly what I meant; maybe I didn't express myself correctly. So, if I can help you, let me know. I am not an expert on DBC drivers, but I did some work on one or two drivers there :)
Best regards,

Jan
mdaems
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium
Contact:

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by mdaems »

Actually, there is a metadata cache built into ZeosLib, and it should only load data at first use. I haven't debugged it for a long time, so there might be caching bugs. The code should be driver independent, I believe; just the way the data is retrieved depends on the driver.
During one connection, metadata corresponding to a specific key should be retrieved only once. But reopening a connection or using multiple connections starts a new cache.
If you need this, please try to spend some time debugging. The metadata code isn't too complex for an average user, I think. Even if you can't find the reason, it would already be a great help if you could provide a small test program showing the problem. The best way to show the queries executed in a test program is to add a TZSQLMonitor component.
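A minimal sketch of such a logging setup, assuming a TZSQLMonitor named ZSQLMonitor1 dropped on the test form (the file name is an example):

```pascal
// Capture every statement Zeos sends, the same way the log snippet
// earlier in this thread was produced.
ZSQLMonitor1.FileName := 'zeos_trace.log';
ZSQLMonitor1.AutoSave := True;   // append each trace line to the file
ZSQLMonitor1.Active := True;

// ... open/scroll the queries under test; the log should show any
// SHOW TABLES / SHOW FULL COLUMNS statements that were executed.
```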

Mark
dfumagalli
Fresh Boarder
Posts: 9
Joined: 18.01.2014, 11:49

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by dfumagalli »

mdaems wrote:Actually, there is a metadata cache built into ZeosLib, and it should only load data at first use. I haven't debugged it for a long time, so there might be caching bugs. The code should be driver independent, I believe; just the way the data is retrieved depends on the driver.
During one connection, metadata corresponding to a specific key should be retrieved only once. But reopening a connection or using multiple connections starts a new cache.
If you need this, please try to spend some time debugging. The metadata code isn't too complex for an average user, I think. Even if you can't find the reason, it would already be a great help if you could provide a small test program showing the problem. The best way to show the queries executed in a test program is to add a TZSQLMonitor component.

Mark
Hello,

I think the "issue" (it's not really an issue, but the way the driver works) is that this architecture works very well in many cases but not in other, fairly common, cases.

To restrict the "domain" of the topic: this effect is best observed in master-detail queries. I have not checked whether the phenomenon happens with "automated" master-detail, where you just specify the linked master-detail field(s) in the Object Inspector. What I have checked is the case of manual master-detail, typically managed in the AfterScroll event.
There (as you well know), the detail query is closed, parameters are changed and so on, and then the detail query is reopened.
Reading your reply (quoted above), "reopening a connection" starts a new cache.

Let's think about the implications of this in my 5-6 level master-detail query chain: for every single master record that is scrolled (which happens a lot, since I am showing the queries in DB grids through ClientDataSets), the whole SHOW TABLES mumbo jumbo happens again for every detail record, which in turn triggers the next query-in-the-detail-chain's Close => Open, which triggers another SHOW TABLES, which in turn...

This creates an almost geometric scale-up of metadata querying, with huge effects on remote MySQL connections.

Basically, opening my master table and doing a very minimal record scroll (about 20 records) takes *1-2 minutes*. The queries involved have no aggregates; it's just blind browsing, usually with 1-2 fields in the WHERE clause (indexed both on the server and as stored index defs), fetching 2-4 small columns.

A quite painful optimization, removing needless reopening of detail tables in the AfterScroll, has brought that time down to 15 seconds, as the SHOW TABLES spam is greatly reduced.

The website that uses those same tables performs the same queries, plus aggregates on them, and shows a page in under 2 seconds. So there's still room for improvement :)

Would it be possible to provide an option or checkbox to prevent creating a new cache on reopen? If so, I could set all the detail queries that always return the same columns to reuse their cache. I estimate the time would go down to 5-6 seconds. Still noticeable, but end users can accept waiting 5-6 seconds; much better than 2 minutes.
mdaems
Zeos Project Manager
Posts: 2766
Joined: 20.09.2005, 15:28
Location: Brussels, Belgium
Contact:

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by mdaems »

Well, what I actually wanted to say was that there shouldn't be any metadata queries at all when "after-scroll requerying".
I think you misunderstood "reopening a connection". To requery a detail table you don't have to reopen the connection, just the query. So all queries using the same TZConnection should be using the same cache. And when you don't edit the queries, apart from the parameter values (I suppose you already use queries that have parameter placeholders on the fields referring to the master table), I don't see any reason why we would need more metadata on a second refresh. In that case, you have met a bug.
Things that can point to the causes of the bug:
- temporary tables involved? This isn't necessarily a bug, but a library weakness.
- "strange" table names. Maybe there are cache misses because the cache key is encoded differently from the way it is checked.
- identifier (table or column name) quoting rules

Mark
dfumagalli
Fresh Boarder
Posts: 9
Joined: 18.01.2014, 11:49

Re: Massive slowdowns due to SHOW TABLES and SHOW FULL COLUM

Post by dfumagalli »

mdaems wrote:Well, what I actually wanted to say was that there shouldn't be any metadata queries at all when "after-scroll requerying".
I think you misunderstood "reopening a connection". To requery a detail table you don't have to reopen the connection, just the query. So all queries using the same TZConnection should be using the same cache. And when you don't edit the queries, apart from the parameter values (I suppose you already use queries that have parameter placeholders on the fields referring to the master table), I don't see any reason why we would need more metadata on a second refresh. In that case, you have met a bug.
Things that can point to the causes of the bug:
- temporary tables involved? This isn't necessarily a bug, but a library weakness.
- "strange" table names. Maybe there are cache misses because the cache key is encoded differently from the way it is checked.
- identifier (table or column name) quoting rules

Mark
Hello,

Yes, all the components work on one TZConnection, and I don't disconnect in AfterScroll; I only perform the "Close => change parameters => Open" cycle on the TZReadOnlyQuery components.

- No temp tables. The whole startup (where the slowdown happens) is read only.
- Everything in the database is UTF-8. It's MySQL 5.6 installed on an Ubuntu server, if that suggests anything.
- I don't think I am using any quoted column names. What I do have: some queries contain commented-out bits, because requirements changed.

Here are some of the queries.
I won't post the insert and update queries because:

1) They are not used at program startup, where the slowdown happens. They execute fast as well.
2) They are so ever-changing that I had to code a "query builder", almost an ORM, that adapts to lots of conditions and situations and crafts different SQL queries every time. Those queries are then executed in TDataSetProvider.OnBeforeUpdateRecord().

Code: Select all

SELECT
  pr.product_id AS product_id,
  pr.model AS model,
  pd.name AS name,
  pr.image AS image,
  pr.status AS status,
  pr.date_added AS date_added,
  pr.date_modified AS date_modified

FROM oc_product pr
LEFT JOIN oc_product_description pd ON pr.product_id = pd.product_id AND pd.language_id = :language_id
-- WHERE
--  pr.status = 1
ORDER BY
  pd.name

Code: Select all

SELECT *
FROM oc_product_option

Code: Select all

SELECT
  option_id,
  type,
  sort_order

FROM oc_option
ORDER BY
  sort_order

Code: Select all

SELECT *
FROM oc_option_description
WHERE language_id = :language_id

Code: Select all

SELECT *
FROM oc_option_value
ORDER BY sort_order

Code: Select all

SELECT *
FROM oc_option_value_description
WHERE language_id = :language_id

Code: Select all

SELECT
po.product_id AS product_id,
po.option_id AS option_id,
od.name AS name,
op.sort_order AS sort_order

FROM oc_product_option po
INNER JOIN oc_option_description od ON (po.option_id = od.option_id)
INNER JOIN oc_option op ON (po.option_id = op.option_id)
WHERE (po.product_id = :product_id) AND (od.language_id = :language_id)
ORDER BY op.sort_order

Code: Select all

SELECT
pov.product_id AS product_id,
pov.option_id AS option_id,
pov.option_value_id AS option_value_id,
ovd.name AS name,
ov.sort_order AS sort_order

FROM oc_product_option_value pov
INNER JOIN oc_option_value ov ON (ov.option_id = pov.option_id AND ov.option_value_id = pov.option_value_id)
INNER JOIN oc_option_value_description ovd ON (ov.option_value_id = ovd.option_value_id)
WHERE (ov.option_id = :option_id) AND (ovd.language_id = :language_id) AND (pov.product_id = :product_id)
ORDER BY ov.sort_order

Code: Select all

SELECT
  dz.delivery_zone_id AS delivery_zone_id,
  dz.geo_zone_id AS geo_zone_id,
  dz.status AS status,
  lc.name AS name

FROM fw_delivery_zone dz
LEFT JOIN fw_localized_country lc ON (dz.visual_country_id = lc.country_id AND lc.language_id = :language_id)

ORDER BY lc.name

Code: Select all

SELECT
  pr.product_id AS product_id,
  ptg.geo_zone_id AS geo_zone_id,
  99 AS associated,
  pr.model AS model,
  pd.name AS name,
  pr.image AS image,
  pr.status AS status,
  ptg.date_added AS date_added,
  ptg.date_modified AS date_modified,
  ptg.user_modified AS user_modified

FROM oc_product pr
LEFT JOIN oc_product_description pd ON pr.product_id = pd.product_id AND pd.language_id = :language_id
LEFT JOIN fw_product_to_geozone ptg ON pr.product_id = ptg.product_id AND ptg.geo_zone_id = :geo_zone_id
-- WHERE
--  pr.status = 1
ORDER BY
  pd.name

Code: Select all

SELECT
po.product_option_id AS product_option_id,
po.product_id AS product_id,
po.option_id AS option_id,
od.name AS name,
op.sort_order AS sort_order

FROM oc_product_option po
INNER JOIN oc_option_description od ON (po.option_id = od.option_id)
INNER JOIN oc_option op ON (po.option_id = op.option_id)
WHERE (po.product_id = :product_id) AND (od.language_id = :language_id) AND
((po.option_id = :allowed_option_id_1) OR (po.option_id = :allowed_option_id_2))
ORDER BY op.sort_order

Code: Select all

SELECT
99 AS first_row,
pov.product_id AS product_id,
lpo.geo_zone_id AS geo_zone_id,
lpo.quantity AS quantity,
lpo.subtract AS subtract,
lpo.price AS price,
lpo.price_prefix AS price_prefix,
lpo.points AS points,
lpo.points_prefix AS points_prefix,
lpo.weight AS weight,
lpo.weight_prefix AS weight_prefix,
lpo.date_added AS date_added,
lpo.date_modified AS date_modified,
lpo.user_modified AS user_modified,
pov.product_option_value_id AS product_option_value_id,
pov.option_id AS option_id,
pov.option_value_id AS option_value_id,
ovd.name AS name,
ov.sort_order AS sort_order

FROM oc_product_option_value pov
INNER JOIN oc_option_value ov ON (ov.option_id = pov.option_id AND ov.option_value_id = pov.option_value_id)
INNER JOIN oc_option_value_description ovd ON (ov.option_value_id = ovd.option_value_id)
LEFT JOIN fw_local_product_option_value lpo ON (lpo.option_id = pov.option_id AND lpo.option_value_id = pov.option_value_id) AND (lpo.product_id = :product_id) AND (lpo.geo_zone_id = :geo_zone_id)

WHERE
(pov.product_id = :product_id) AND
(ov.option_id = :option_id) AND
(ovd.language_id = :language_id)

ORDER BY ov.sort_order