Iterate a dataset in multithreaded application
Posted: 31.01.2023, 19:13
by stoffman
Hi,
Is there a way to iterate the same dataset in 2 different threads? From the look of it, the standard dataset.eof/first/next is not designed for such a case.
Thanks,
Yoni
Re: Iterate a dataset in multithreaded application
Posted: 31.01.2023, 19:19
by aehimself
At the moment, no, there isn't.
What you can do, however, is execute the same query in two threads, each with its own dataset: the first one processes the first half of the rows and the second one processes the other half.
Re: Iterate a dataset in multithreaded application
Posted: 01.02.2023, 13:13
by marsupilami
Hello Stoffman,
I am not sure what you want to do. If the TZQuery serves as a kind of worklist for several threads, you could copy its contents into another object where you protect access with a mutex or some other lock mechanism.
You could also protect access to the TZQuery and its connection with a mutex. No other lock mechanisms are allowed in this scenario. Usually I would encapsulate it in a separate object that handles all the locking. This effectively serializes access to the TZQuery, which is necessary because a TZConnection and all objects connected to it may be accessed by only one thread at a time.
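A rough sketch of such an encapsulating object, in Python rather than Delphi for brevity. The `LockedWorklist` class and its `take_next` method are hypothetical names, and a plain iterator stands in for the TZQuery; the only thing it is meant to show is that fetching a row and advancing the cursor happen under one lock.

```python
import threading


class LockedWorklist:
    """Hypothetical wrapper: all cursor movement happens under one lock."""

    def __init__(self, rows):
        self._rows = iter(rows)  # stands in for the dataset being iterated
        self._lock = threading.Lock()

    def take_next(self):
        # Fetch-and-advance is a single step under the lock,
        # so no row is handed out twice and none is skipped.
        with self._lock:
            return next(self._rows, None)


worklist = LockedWorklist(range(1000))
seen = []
seen_lock = threading.Lock()


def worker():
    while True:
        row = worklist.take_next()
        if row is None:  # worklist exhausted
            break
        with seen_lock:
            seen.append(row)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Callers never see the lock, but access is still serialized: only the work done outside `take_next` actually runs in parallel.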
Best regards,
Jan
Re: Iterate a dataset in multithreaded application
Posted: 03.02.2023, 22:29
by stoffman
Hi Jan,
Sure I can do locks and critical sections and what not, but that is exactly what I'm trying to avoid...
I have several use cases for such a thing, but the most pressing one is showing the TZQuery in a grid while doing some calculations in a different thread (i.e. not the UI thread). The calculation is heavy and may take several seconds, and I want to keep the UI from freezing. Unfortunately, the data itself is also big, on the order of a few hundred MB, so copying it to another object is also time-consuming (so while it performs better, it still freezes the UI).
Thanks,
Stoffman
Re: Iterate a dataset in multithreaded application
Posted: 04.02.2023, 13:04
by aehimself
Why not push everything into the background, both loading and calculating? If the calculation takes a second or two, most of your time will be spent on fetching anyway, I suppose.
I'm usually dimming all the contents and showing a progress indicator while allowing the user to cancel everything:
(Attached screenshot: Capture1.PNG)
Re: Iterate a dataset in multithreaded application
Posted: 04.02.2023, 23:00
by stoffman
While your solution looks great, it is not the effect I'm looking for. I want my application to remain responsive and let the user continue working while doing the calculations.
If I could go fully multithreaded with TDataset I would, for example, start the calculations *before* the user asks for them (making good use of all the CPU cores) and present the user with the results instantly once they are requested.
Re: Iterate a dataset in multithreaded application
Posted: 04.02.2023, 23:31
by aehimself
In an ideal world there's no delay; a 30 GB file is downloaded instantly from the Internet, the latest games run flawlessly on a 2000-era Intel Celeron and Chrome is more than satisfied with 64 KB of RAM.
Our challenge as software developers is to make the "real world" acceptable for the users.
Requesting a large amount of data and allowing your user to work with it instantly is not going to happen. Move loading to a background thread and do not block anything, so the user can continue to work with the previous data.
Back to suggestions, you might want to drop the component layer and move up to DBC. That way you can have loading and processing in a background thread and have a VCL component (virtual treeview, stringgrid, etc) refresh its contents every second or so. Time spent between the first click and the arrival of the final record won't change but your application might seem faster and more responsive.
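The pattern can be sketched like this, in Python rather than Delphi and with no real database or VCL involved. All names (`load_rows`, `refresh_grid`, `row_queue`, the `DONE` sentinel) are hypothetical; the loop at the bottom stands in for a UI timer firing periodically.

```python
import queue
import threading
import time

row_queue = queue.Queue()  # thread-safe hand-off between loader and "UI"
DONE = object()            # sentinel marking the end of the result set


def load_rows(total):
    # Background thread: stands in for fetching rows through the low-level API.
    for i in range(total):
        row_queue.put({"id": i, "value": i * i})
    row_queue.put(DONE)


def refresh_grid(grid):
    # One "timer tick": drain whatever has arrived so far without blocking.
    while True:
        try:
            item = row_queue.get_nowait()
        except queue.Empty:
            return False  # nothing more right now, loader still running
        if item is DONE:
            return True   # loader has delivered the last row
        grid.append(item)


grid = []
loader = threading.Thread(target=load_rows, args=(5000,))
loader.start()

# A real UI would call refresh_grid from a timer event instead of sleeping.
while not refresh_grid(grid):
    time.sleep(0.01)
loader.join()
```

The grid only ever sees rows that were handed over through the queue, so the loader's connection stays confined to the background thread.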
Re: Iterate a dataset in multithreaded application
Posted: 06.02.2023, 08:57
by marsupilami
I don't know the exact nature of your application. But given the restrictions on threading that Zeos (and most probably every other data access layer) has, your options are limited. You have to make sure that any TZConnection object and all objects that depend on it live in one thread only at a time.
So I only see the following options:
- Lock the UI and show the user a message and maybe some kind of progress bar while you are processing the data. Downside: The user will have to wait while the processing is done.
- Do the calculations in the background and load the data a second time. Downsides: This will increase memory requirements, and it will take some time before the data is available for the calculation.
- Do the calculations in the background and use the data from the main thread. This might require some serious work, as you might not be able to use TZQuery.Locate or any other methods that move the record pointer. Otherwise your DBGrid will keep jumping around. You also have to keep threaded access to any shared resources in mind, be that the TZQuery or some other objects that transport data between threads.
- Do the calculations in the background thread and also load the data there. Update your UI when you have data for it. Downsides: This also makes the user wait. And your UI programming might get more complex because you might not be able to use database-aware components.
While writing these things down, I wondered why you don't do the calculations on the server in a stored procedure. The data is available there almost instantly. Usually there are options on the server to keep memory consumption low as well. You only need to transfer the result to the client. If the computation takes too long on the server for your main thread to wait for it, you could wait for the result in a background thread and display it when it arrives. You would still need a second connection to the database for this.
stoffman wrote: ↑04.02.2023, 23:00
If I could go fully multithreaded with TDataset I would, for example, start the calculations *before* the user asks for them (making good use of all the CPU cores) and present the user with the results instantly once they are requested.
Hmmm - isn't everybody saving energy these days? Maybe it is an option to do the calculations only if users really need them? This way you don't waste your time on solving a problem that possibly doesn't happen too often. You probably save some energy because you don't do unnecessary calculations. And your users get to appreciate the results because they have to wait for them, and possibly they will only request them when they are needed.
Re: Iterate a dataset in multithreaded application
Posted: 06.02.2023, 21:22
by stoffman
I would like to thank you both for the suggestions and the time and effort you put into this question.
No easy solution indeed.