Multithreaded application best practices
Multithreaded application best practices
Hello all,
I was lurking around the forums for long, but now the time arrived for an actual registration Therefore - first of all - let me Embrace Zeos a little bit!
I started to develop applications with database backends about 6 years ago. I quickly met the standard issues with available components (not cross-platform; not supporting all mainstream databases; buggy or simply CRAZY expensive) and then I found Zeos. Since then this became the first component I install after a dev machine meltdown (I experiment with Windows APIs a lot ). It's really good to know that there is an actively maintained suite to build our applications upon. For free. So thank you for all the extremely good work and keep it up!
A recent job change forced me to upgrade from Delphi 7 to 10.2 so I started to patch / rewrite some of my past applications. There is one which is using a small (~7-10 tables) database backend, with about 5 worker threads. All threads should be able to read (and occasionally write) all tables. In the past I solved this by using ZQuery and writing separate DLLs for each database type but this is a lot pain to maintain. Furthermore I quickly realized that by using ZTables even the data-processing performance will improve where it's the most critical so if I rebuild this from scratch - why not to implement a better way.
I read a lot about multithreading applications with databases, but I am really curious if there are actually best practices to follow?
My main problem here is that the golden rule (each thread must have it's own database connection) introduces a new issue - if a thread is updating a table, no other threads will know that (on database level) the data was changed. This can be solved by sending a signal to refresh, but (according to my prior testing) this operation can be slow. An other idea is to "lock" the table which will be modified (in the application level; with a critical section or so) and update them all from code but this feels a bit hacky if you know what I mean.
So please, do not restrain yourselves. Hit me with the best practices or ideas of a multithreaded database application!
I was lurking around the forums for long, but now the time arrived for an actual registration Therefore - first of all - let me Embrace Zeos a little bit!
I started to develop applications with database backends about 6 years ago. I quickly met the standard issues with available components (not cross-platform; not supporting all mainstream databases; buggy or simply CRAZY expensive) and then I found Zeos. Since then this became the first component I install after a dev machine meltdown (I experiment with Windows APIs a lot ). It's really good to know that there is an actively maintained suite to build our applications upon. For free. So thank you for all the extremely good work and keep it up!
A recent job change forced me to upgrade from Delphi 7 to 10.2 so I started to patch / rewrite some of my past applications. There is one which is using a small (~7-10 tables) database backend, with about 5 worker threads. All threads should be able to read (and occasionally write) all tables. In the past I solved this by using ZQuery and writing separate DLLs for each database type but this is a lot pain to maintain. Furthermore I quickly realized that by using ZTables even the data-processing performance will improve where it's the most critical so if I rebuild this from scratch - why not to implement a better way.
I read a lot about multithreading applications with databases, but I am really curious if there are actually best practices to follow?
My main problem here is that the golden rule (each thread must have it's own database connection) introduces a new issue - if a thread is updating a table, no other threads will know that (on database level) the data was changed. This can be solved by sending a signal to refresh, but (according to my prior testing) this operation can be slow. An other idea is to "lock" the table which will be modified (in the application level; with a critical section or so) and update them all from code but this feels a bit hacky if you know what I mean.
So please, do not restrain yourselves. Hit me with the best practices or ideas of a multithreaded database application!
Re: Multithreaded application best practices
Well, usually the only right way is "one connection = one thread". But if you really need multithreaded access to same objects, you'll have to sync access to them - crit sections, semaphores, events etc. Just keep in mind that even simultaneous access to two different tables is not safe if they belong to the same connection (they use the same transaction and other connection internals). The worst thing is that access conflicts are chaotic and very hard to track. So if you can't allow one-thread-at-a-time access to DB objects, you'll probably have to redesign your app's structure.
Re: Multithreaded application best practices
I got a similar idea yesterday and started working on mapping specific tables explicitly to specific threads. We'll see if I can change the business logic to allow this at the end.
- Thread T1 with one ZConnection and two ZTables (to tables A and B)
- Thread T2 with one ZConnection and four ZTables (to tables C, D, E and F)
- Thread T3 with one ZConnection and one ZTable (to table G)
All ZConnections are pointing to the database on the same host, all ZTables are using the ZConnection inside their own thread. This should be safe, even if all threads are reading / writing to all of their own tables; right?
Another thing is if I create a thread for database connections ONLY. It would publish ZTables via let's say "Function RequestTableA: TZTable;", secured with critical sections. I'll also need to create "Procedure ReleaseTableA;" which releases the lock and resets any filters etc. This would be the easiest, but this model introduces some scary deadlock opportunities; plus we already step over the boundaries of threads...
You mean same connection inside or between threads? I mean, let's say I haveFr0sT wrote:Just keep in mind that even simultaneous access to two different tables is not safe if they belong to the same connection
- Thread T1 with one ZConnection and two ZTables (to tables A and B)
- Thread T2 with one ZConnection and four ZTables (to tables C, D, E and F)
- Thread T3 with one ZConnection and one ZTable (to table G)
All ZConnections are pointing to the database on the same host, all ZTables are using the ZConnection inside their own thread. This should be safe, even if all threads are reading / writing to all of their own tables; right?
Another thing is if I create a thread for database connections ONLY. It would publish ZTables via let's say "Function RequestTableA: TZTable;", secured with critical sections. I'll also need to create "Procedure ReleaseTableA;" which releases the lock and resets any filters etc. This would be the easiest, but this model introduces some scary deadlock opportunities; plus we already step over the boundaries of threads...
Delphi 12.2, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Re: Multithreaded application best practices
Yes, as long as a thread uses its personal connection and dataset objects, Zeos is thread-safe just like a client lib is. I was talking about the case whenaehimself wrote:You mean same connection inside or between threads? I mean, let's say I have
- Thread T1 with one ZConnection and two ZTables (to tables A and B)
- Thread T2 with one ZConnection and four ZTables (to tables C, D, E and F)
- Thread T3 with one ZConnection and one ZTable (to table G)
All ZConnections are pointing to the database on the same host, all ZTables are using the ZConnection inside their own thread. This should be safe, even if all threads are reading / writing to all of their own tables; right?
- ZConnection
- Thread T1 with ZTable A linked to ZConnection
- Thread T2 with ZTable B linked to ZConnection
This is potentially not safe mainly because in Zeos a transaction is not a separate object but is included into a connection. In theory, if you just read from two tables and do not change transaction state, it should be safe but I guess no special efforts were made to ensure this.
Re: Multithreaded application best practices
Hi!
I tried to use 1 TZConnection for different TZReadOnlyQuery in different threads.
As a result, I got Access Violation in case the threads are accessing at the same time.
So I can not call a thread-protected ZConnection.
But that was before the advent of ZEOS 7.2.4. Exactly the version on which I tried not remember.
Personally, I stopped at Formula 1 thread = 1 connection.
And yet, I do not use DataSet as a data warehouse, I overtake all data into my structures (array, Tlist),
and process already in them.
My application is built like this:
1. Data Acquisition Application
1.1. threads for data downloads from the Internet (JSON)
1.2. threads load this data into the database (parsing JSON and saving in relational form)
2. processing application
2.1. processing threads within the database
2.2. data load threads from the database to the application memory - a heap (4-6 GB)
2.3. threads for processing data in heap
2.4. threads of uploading processed data to DB
2.5. threads for clean data heap of unnecessary data
All this works on the system of typical tasks.
Each thread takes the most priority task from the queue, does its job, generates tasks for other types of threads.
I tried to use 1 TZConnection for different TZReadOnlyQuery in different threads.
As a result, I got Access Violation in case the threads are accessing at the same time.
So I can not call a thread-protected ZConnection.
But that was before the advent of ZEOS 7.2.4. Exactly the version on which I tried not remember.
Personally, I stopped at Formula 1 thread = 1 connection.
And yet, I do not use DataSet as a data warehouse, I overtake all data into my structures (array, Tlist),
and process already in them.
My application is built like this:
1. Data Acquisition Application
1.1. threads for data downloads from the Internet (JSON)
1.2. threads load this data into the database (parsing JSON and saving in relational form)
2. processing application
2.1. processing threads within the database
2.2. data load threads from the database to the application memory - a heap (4-6 GB)
2.3. threads for processing data in heap
2.4. threads of uploading processed data to DB
2.5. threads for clean data heap of unnecessary data
All this works on the system of typical tasks.
Each thread takes the most priority task from the queue, does its job, generates tasks for other types of threads.
Re: Multithreaded application best practices
@ Fr0sT, thank you for the clarification; that's how I imagined would be safe.
@ Stinkard; that's the best setup I can think of considering thread safety and processing speed. If you keep your data in chunks (object, arrays) one thread can pick one, do it's job and load the results in the database. Unfortunately it not a suitable solution for me as I am really restricted by resource usage and close to real-time results appearing in the database.
Would definitely worth to re-think the internal data structure though. Once you get used to work with objects...
@ Stinkard; that's the best setup I can think of considering thread safety and processing speed. If you keep your data in chunks (object, arrays) one thread can pick one, do it's job and load the results in the database. Unfortunately it not a suitable solution for me as I am really restricted by resource usage and close to real-time results appearing in the database.
Would definitely worth to re-think the internal data structure though. Once you get used to work with objects...
Delphi 12.2, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Re: Multithreaded application best practices
Everything depends on your data flow. If your app is just a reader or writer, that's one case. I myself have an app with real-time data exchange. I have multiple isolated threads for reading and for writing owning their personal connection/query/dataset objects. If you need to access a dataset from multiple threads, that's another case.
-
- Platinum Boarder
- Posts: 1956
- Joined: 17.01.2011, 14:17
Re: Multithreaded application best practices
Hello stinkard,
for me it seems that the problem is that you need to make sure, that updated in the database doesn't overlap. If you can get to that point, you can easily have one connection per thread:
Best regards,
Jan
for me it seems that the problem is that you need to make sure, that updated in the database doesn't overlap. If you can get to that point, you can easily have one connection per thread:
As long, as the threads that parse the data don't do overlapping inserts / updates, I don't see a reason to not have one connection per thread here? Maybe you need to make sure you work with tiReadCommited to see the work of other threads as soon as they commit their work to the database? Also keeping transactions short lived might help here?stinkard wrote: 1. Data Acquisition Application
1.1. threads for data downloads from the Internet (JSON)
1.2. threads load this data into the database (parsing JSON and saving in relational form)
The same goes here: As long as threads that update the database don't do overlapping updates, there should be no problems in using a separate connection per thread? Or is there something here that I don't get?stinkard wrote:2. processing application
2.1. processing threads within the database
2.2. data load threads from the database to the application memory - a heap (4-6 GB)
2.3. threads for processing data in heap
2.4. threads of uploading processed data to DB
2.5. threads for clean data heap of unnecessary data
Best regards,
Jan
Re: Multithreaded application best practices
Hello!
Jan, I have no problems with multi threading and locks in the database.
I just shared my approach with aehimself.
In fact, I have 400 incoming info flows, each flow is placed in some separated tables.
I have data Intersection only on a level of reference tables and tables with processing tasks.
But their requests are extremely short.
After processing it turns out about 12000 tables with analytics and reports, but they are also divided into 400 parts by incoming flows.
I can apply partitioning, but, since in Postgres it caused performance degradation, the solution was to story data in separate tables.
Let's look at the results of the experiments, what happened to the performance of partitioned tables in PG 11.
Maybe I will change the approach to data storage.
While 13,500 tables do not spoil my life I do not need to have access to this data as a whole.
My main job is to process 400 data flows as fast as possible in the mode close to real time.
Jan, I have no problems with multi threading and locks in the database.
I just shared my approach with aehimself.
In fact, I have 400 incoming info flows, each flow is placed in some separated tables.
I have data Intersection only on a level of reference tables and tables with processing tasks.
But their requests are extremely short.
After processing it turns out about 12000 tables with analytics and reports, but they are also divided into 400 parts by incoming flows.
I can apply partitioning, but, since in Postgres it caused performance degradation, the solution was to story data in separate tables.
Let's look at the results of the experiments, what happened to the performance of partitioned tables in PG 11.
Maybe I will change the approach to data storage.
While 13,500 tables do not spoil my life I do not need to have access to this data as a whole.
My main job is to process 400 data flows as fast as possible in the mode close to real time.
Best regards, Max.
Re: Multithreaded application best practices
Yes, I have no overlaps, and I use 1 connection in each DB thread.marsupilami wrote: for me it seems that the problem is that you need to make sure, that updated in the database doesn't overlap. If you can get to that point, you can easily have one connection per thread:
I use default settings of isolation level. No any problem.marsupilami wrote: As long, as the threads that parse the data don't do overlapping inserts / updates, I don't see a reason to not have one connection per thread here? Maybe you need to make sure you work with tiReadCommited to see the work of other threads as soon as they commit their work to the database? Also keeping transactions short lived might help here?
I had some problems with dead locks in DB. But using explicit table locks in the correct order at the beginning of a transaction block solved this problem.
No proplems here. All is ok.marsupilami wrote: The same goes here: As long as threads that update the database don't do overlapping updates, there should be no problems in using a separate connection per thread? Or is there something here that I don't get?
Best regards, Max.
Re: Multithreaded application best practices
Yep, as far as I can see both Fr0st's and Stinkard's practices are the same. Considering the importance I guess we could say this is an other golden rule.
- 1 thread = 1 database connection
- Think on explicit table usage for threads if possible. If not, use a locking mechanism (in DB or in app level, doesn't matter) to make sure no writes will happen to the same table at a time. Here, keep in mind that intersections can cause deadlocks (details here, if unfamiliar)
Cheers to both of you guys; if anyone has any other suggestions I'm more than happy to listen
P.s.: I guess I'll go with the explicit mapping way. It's not that much effort to implement, seems to be the most safe and immediately solves my other-threads-should-refresh issue.
- 1 thread = 1 database connection
- Think on explicit table usage for threads if possible. If not, use a locking mechanism (in DB or in app level, doesn't matter) to make sure no writes will happen to the same table at a time. Here, keep in mind that intersections can cause deadlocks (details here, if unfamiliar)
Cheers to both of you guys; if anyone has any other suggestions I'm more than happy to listen
P.s.: I guess I'll go with the explicit mapping way. It's not that much effort to implement, seems to be the most safe and immediately solves my other-threads-should-refresh issue.
Delphi 12.2, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
-
- Platinum Boarder
- Posts: 1956
- Joined: 17.01.2011, 14:17
Re: Multithreaded application best practices
Sorry - I somehow misread the thread.stinkard wrote: No proplems here. All is ok.
Best regards,
Jan
Re: Multithreaded application best practices
What if I want to use a splash screen when waiting for establishing connection ?
* A splash screen should be shown first.
* Connection should be started establishing.
* Meanwhile splash screen should be able to respond user messages.
In this situation, a secondary thread should establish connection but after then, main thread should use the same only connection.
How can I achieve this ?
I guess if I call ZConnection1.Connect even in another thread and I use synchronise, my main thread is blocked.
Can I call ZConnection1.Connect on the data module in another thread if I do not touch until secondary thread ends when connection establishes (ie, if I do not connect queries and datasources) ?
Thank you all in advance.
Ömür Ölmez
Best Regards
* A splash screen should be shown first.
* Connection should be started establishing.
* Meanwhile splash screen should be able to respond user messages.
In this situation, a secondary thread should establish connection but after then, main thread should use the same only connection.
How can I achieve this ?
I guess if I call ZConnection1.Connect even in another thread and I use synchronise, my main thread is blocked.
Can I call ZConnection1.Connect on the data module in another thread if I do not touch until secondary thread ends when connection establishes (ie, if I do not connect queries and datasources) ?
Thank you all in advance.
Ömür Ölmez
Best Regards
Re: Multithreaded application best practices
The way I am handling these things is:
- Have all components on the VLC form. DB aware components are allowed and working well.
- Before any background operation, disconnect all DB aware components from their datasources
- Show your "loading" screen
- Create and execute a thread which only performs the operation you want to do (eg. call TZConnection.Connect, TZQuery.Open, etc)
- Do not do anything in the VLC thread, use a callback (e.g. TThread.OnTerminate) instead to hide your loading screen
- Reconnect all DB aware components
I'm using this method for a really long time in my desktop application. If something starts to misbehave it's because I messed something up.
If you follow this logic, you will even be able to utilize TZConnection.AbortOperation if your protocol supports it.
- Have all components on the VLC form. DB aware components are allowed and working well.
- Before any background operation, disconnect all DB aware components from their datasources
- Show your "loading" screen
- Create and execute a thread which only performs the operation you want to do (eg. call TZConnection.Connect, TZQuery.Open, etc)
- Do not do anything in the VLC thread, use a callback (e.g. TThread.OnTerminate) instead to hide your loading screen
- Reconnect all DB aware components
I'm using this method for a really long time in my desktop application. If something starts to misbehave it's because I messed something up.
If you follow this logic, you will even be able to utilize TZConnection.AbortOperation if your protocol supports it.
Delphi 12.2, Zeos 8 from latest GIT snapshot
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Using:
- MySQL server 8.0.18; libmysql.dll 8.0.40 x64 5.7.19 x68, libmariadb.dll 3.3.11
- Oracle server 11.2.0, 12.1.0, 19.0.0; oci.dll 21.15
- MSSQL 2012, 2019; sybdb.dll FreeTDS_3102
- SQLite 3.47
Re: Multithreaded application best practices
Thank you @aehimself
It works like a charm.
It works like a charm.