MantisBT - Open CASCADE
View Issue Details
0022939Open CASCADE[OCCT] OCCT:Modeling Datapublic2012-01-30 17:452012-11-16 13:16
[OCCT] 6.5.2 
[OCCT] 6.5.4[OCCT] 6.5.4 
0022939: Make B-Spline internal cache thread-safe to be used in multy-threaded mode
Now B-Spline internal cache is not thread-safe. The parallel BrepMesh algorithm generates exceptions because of this problem. It is necessary to improve the thread-safety of B-Spline cache to avoid data races in future. See issues #0022835 and 0022850 for additional information and test models.
No tags attached.
doc test_report.doc (69,120) 2012-05-03 12:59
Issue History
2012-01-30 17:45epvNew Issue
2012-01-30 17:45epvAssigned To => azn
2012-01-30 17:53abvProjectInternal => Open CASCADE
2012-02-02 10:08abvTarget Version6.5.3 => 6.5.4
2012-02-15 14:43aznNote Added: 0019574
2012-02-15 14:44aznNote Edited: 0019574bug_revision_view_page.php?bugnote_id=19574#r3463
2012-02-15 14:49aznNote Edited: 0019574bug_revision_view_page.php?bugnote_id=19574#r3464
2012-02-15 14:57aznAssigned Toazn => abv
2012-02-15 15:12abvAssigned Toabv => pdn
2012-02-15 15:12abvStatusnew => resolved
2012-02-20 14:25pdnNote Added: 0019666
2012-02-20 14:25pdnAssigned Topdn => azn
2012-02-20 14:25pdnStatusresolved => assigned
2012-03-16 17:35aznNote Added: 0020022
2012-03-16 17:35aznNote Edited: 0020022bug_revision_view_page.php?bugnote_id=20022#r3668
2012-03-16 17:36aznNote Added: 0020023
2012-03-16 17:36aznStatusassigned => resolved
2012-03-16 17:36aznNote Deleted: 0020023
2012-03-16 17:37aznAssigned Toazn => pdn
2012-03-17 18:31Roman LyginNote Added: 0020032
2012-03-18 10:53abvNote Added: 0020034
2012-03-18 10:53abvAssigned Topdn => azn
2012-03-18 10:53abvStatusresolved => assigned
2012-03-18 10:58abvNote Added: 0020035
2012-03-19 14:24aznNote Added: 0020038
2012-03-19 14:25aznAssigned Toazn => abv
2012-03-19 15:10aznNote Edited: 0020038bug_revision_view_page.php?bugnote_id=20038#r3676
2012-05-03 12:56aznRelationship addedchild of 0022835
2012-05-03 12:59aznFile Added: test_report.doc
2012-05-03 12:59aznNote Added: 0020465
2012-05-03 13:40aznStatusassigned => resolved
2012-05-03 14:15Roman LyginNote Added: 0020466
2012-05-03 14:48Roman LyginNote Added: 0020467
2012-05-03 15:25abvNote Added: 0020468
2012-05-03 15:25abvAssigned Toabv => azn
2012-05-03 15:25abvStatusresolved => assigned
2012-05-03 15:57Roman LyginNote Added: 0020469
2012-05-03 16:00abvNote Edited: 0020469bug_revision_view_page.php?bugnote_id=20469#r3880
2012-05-03 16:00abvNote Edited: 0020467bug_revision_view_page.php?bugnote_id=20467#r3882
2012-05-03 16:06abvNote Added: 0020470
2012-05-04 15:23aznNote Added: 0020476
2012-05-04 15:23aznAssigned Toazn => abv
2012-05-04 15:23aznStatusassigned => resolved
2012-05-04 15:59abvNote Added: 0020479
2012-05-04 15:59abvAssigned Toabv => azn
2012-05-04 15:59abvStatusresolved => reviewed
2012-05-04 16:10apnAssigned Toazn => apn
2012-05-05 10:06apnNote Added: 0020483
2012-05-05 10:26apnAssigned Toapn => azn
2012-05-05 10:26apnStatusreviewed => assigned
2012-05-10 15:43apnNote Edited: 0020483bug_revision_view_page.php?bugnote_id=20483#r3894
2012-05-10 15:44apnStatusassigned => resolved
2012-05-10 15:44apnStatusresolved => reviewed
2012-05-10 15:45apnStatusreviewed => tested
2012-05-10 15:45apnAssigned Toazn => bugmaster
2012-05-14 12:41aznChangeset attached => occt master 83ada95b
2012-05-14 12:41aznAssigned Tobugmaster => azn
2012-05-14 12:41aznStatustested => verified
2012-05-14 12:41aznResolutionopen => fixed
2012-11-16 13:15bugmasterFixed in Version => 6.5.4
2012-11-16 13:16bugmasterStatusverified => closed

2012-02-15 14:43   
(edited on: 2012-02-15 14:49)
Branch [^] is ready to be reviewed.

Please don't forget to call method SetReentrant(true) while testing.
This can be done by the command mpparallel 1.

2012-02-20 14:25   
Approach to bind cash to thread ID does not look feasilbe frome the industrial point of view. How the cash will be released in case of significan number of threads.

What would be pleased to see:
1) test case when 10-100 bspline surfaces is created and 10-100 threads work with them.
2) just to put siple mutext to define if this code is real bottleneck.
2012-03-16 17:35   
Now the problem is solved by mutex( [^]).
Tests is showed that the usage of mutex doesn't affect performance when operating in single threaded mode (tests measured time spent for loading of the shape and its triangulation). Also important to note that this approach can cause performance degradation for a specific face which shared the same B-spline curves or surfaces (when running in parallel mode).

Roman Lygin   
2012-03-17 18:31   
Can I get a preview of the fix before it is committed ? I reported this issue ( [^]), so obviously have some bias in it.
Beware of solutions involving mutexes. By definition, this creates a non-scalable code which will result in performance degradation for highly contended situations. To minimize this effect, a spin mutex which involves busy waiting instead of immediate going to sleep, can be used. Consider using tbb::spin_mutex. OCC Standard_Mutex does not provide busy waiting.
Another idea to consider is to have user-defined flag disabling use of caching whatsover. Avoiding using cache can be beneficial in other circumstances not just in multi-threaded configurations. For example, workloads used in Shape Healing revealed that there were >50% cache misses, so time spent for cache population was effectively wasted.
Likely, this flag should be per instance-defined. To avoid increase of data size this flag can be made bit field with other boolean flags.
Extra convenience, to have a global flag that would define default flags for all B-Splines created.
2012-03-18 10:53   
The proposed improvement protects functions rather than cache. Please re-implement this so as to have one instance of mutex per instance of the cache (i.e. mutex should be the field of the relevant class instead of static variable in the function).
2012-03-18 10:58   
Roman, I still consider protection by mutex as being better than no protection at all. It may be activated only if Standard::IsReentrant() is true; however I doubt if this would make sense taking that this will become the default mode soon.

This change does not close the issue for sure; possibility to improve further (extract cache to separate class or else) is open.
2012-03-19 14:24   
(edited on: 2012-03-19 15:10)
Now mutex is a field of class.

2012-05-03 12:59   
The branch CR22939 is ready to be reviewed.
The attached file test_report.doc describes measuring performance with mutexes and without them.
Roman Lygin   
2012-05-03 14:15   
Notes/questions/suggestions for test case:
1. Can you please add a test case source code for Test_Report.doc? I bet results strongly depend upon the algorithm (be it meshing, projection or other algorithm) and workload - in particular, the probability of contention.
2. It would also be helpful if you could try more cores than 2
3.There is basically no speed up. What is the root-cause? This can be for instance, because serial part is overwhelming (Ahmdal's law) or because the algorithm cannot concurrently run on this model. You might want to try Intel Amplifier XE in concurrency mode to analyze concurrency.
Roman Lygin   
2012-05-03 14:48   
(edited on: 2012-05-03 16:00)
Please disregard (3) above - invalid comment. Sorry for confusion.

2012-05-03 15:25   
Review remarks (minor):

1. There seems to be no need to protect methods InvalidateCache(): these methods are called only when b-spline data are modified, when parallel access to the same b-spline is invalid anyway (i.e. regardless of cache). I suggest this protection should be removed.

2. Fields myMutex and *cache could be declared mutable, in order to avoid casting 'this' to non-const pointer in const methods.

3. Please avoid changes of formatting in the places not affected by actual fix, to minimize amount of modified code

I suggest the branch will be tested after these remarks are taken into account (even if partially).
Roman Lygin   
2012-05-03 15:57   
(edited on: 2012-05-03 16:00)
1. Re Andrey's (1) above - you should always be careful when concurrently writing to shared data. You can break machine word boundaries and hence result in inconsistent value (neither of two writes). Of course, this won't apply in this case, and writing validcache=0 will produce consistent result.

2. For high contention algorithms Standard_Mutex should be bad as it will go to slip right away. In such cases spinning-like mutexes are better alternative. Consider use tbb::spin_mutex. If you want to have TBB-independent API you might want to have wrappers #if HAVE_TBB which will use system API (like Windows critical section)

2012-05-03 16:06   

On (1): my point was that these methods change fields of the b-spline, and this is not protected, thus protecting only cache in this case is useless.

On (2): I agree that in highly contended scenarios busy-waiting is a choice, however our current goal is to implement protection against very rare but possible contention. I believe that high contention on b-spline cache is quite unlikely to happen in our typical scenarios (parallel processing of faces). When we go to situation where b-spline cache contention is an issue, we will likely need to revise this approach completely.
2012-05-04 15:23   
The changes were implemented according to remarks. The branch CR22939 is ready to be reviewed.
2012-05-04 15:59   
No remarks, please test. Note that I have rebased the fix on current master and cleaned it from irrelevant changes; use branch CR22939_1 (previous branch CR22939 has been removed)
2012-05-05 10:06   
(edited on: 2012-05-10 15:43)
Dear BugMaster,
       Workbench KAS:dev:apn-22939-occt was created from git branch CR22939_1 (and apn-22939-products from svn trunk) and compiled on Linux platform.
       There are not regressions in apn-22939-products regarding to KAS:dev:products-20120415-opt:

       chl 937 A6 - non stable test case
       There are not improvements in apn-22939-products regarding to KAS:dev:products-20120415-opt
       See results in /QADisk/occttests/results/KAS/dev/apn-22939-products_04052012/lin
       See reference results in /QADisk/occttests/results/KAS/dev/products-20120415-opt_13042012/lin
       See test cases in /QADisk/occttests/tests/ED