View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0022850 | Open CASCADE | OCCT:Mesh | public | 2011-12-16 10:09 | 2012-11-16 13:18 |
Reporter | Assigned To | ||||
Priority | urgent | Severity | major | ||
Status | closed | Resolution | fixed | ||
Product Version | 6.5.2 | ||||
Target Version | 6.5.4 | Fixed in Version | 6.5.4 | ||
Summary | 0022850: Not stable fix 22735 | ||||
Description | Case chl/934/C8 is not stable to be fixed in parallel mode. Data race has been detected in the BRepMesh_FastDiscretFace::AddInShape() method. The exact reason consists in modification of shared data by the code of BRep_Builder::UpdateEdge() method called inside AddInShape(). The proposed technical solution is to create a special map containing pairs <TopoDS_Edge(TopoDS_Shape), Standard_Mutex>. This map must be initialized before parallel processing of the face triangulation (for example in BRepMesh_IncrementalMesh class. The BRepMesh_FastDiscretFace class will use the reference to this map in such case). Next it is necessary to disable concurrent access to the corresponding edge from different threads by corresponding mutex from the map before calling an UpdateEdge() method. | ||||
Tags | No tags attached. | ||||
Test case number | chl 934 C8 | ||||
parent of | 0023260 | closed | Regression: Instability in parallel incmesh on Linux. |
|
Fix has been pushed to the Git branch CR22850 Dear Andrey, Please review |
|
Some remarks: 1. Instead of defining a method AssignMapOfMutexes() and replicating twice the same block of code for filling the map, I suggest you add method that constructs that map given a shape, and use it in two places. 2. As soon as method ClearMapOfMutexes() is defined, I suggest you use it in BRepMesh_FastDiscret.cxx instead of direct Clear(), for consistency 3. In BRepMesh_FastDiscretFace::AddInShape(), please use Standard_Mutex::SentryNested to ensure that mutex gets unlocked even if exception is raised in the protected code. In order to correctly sequential execution (when mutex map is empty), use dummy instance of mutex and create sentry with doLock = false in that case. 4. Please double-check that there are no other places in the code where edges or vertices can be modified in parallel threads |
|
Added method BRepMesh_FastDiscret::CreateMutexes() which creates mutexes for sub-shapes of a given shape. Added method BRepMesh_FastDiscret::RemoveAllMutexes() - removes all created mutexes. Methods BRepMesh_FastDiscretFace::Add()and BRepMesh_FastDiscretFace::AddInShape() now accept additional argument. The argument is a reference to BRepMesh_DataMapOfShapeMutex. This map is used to prevent data race in parallel mode. Fix has been pushed to the Git branch CR22850 Dear Andrey, Please review |
|
No remarks, please test |
|
Dear BugMaster, The workbenches KAS:dev:mkv-22850-occt (GIT branch CR22850_1) KAS:dev:mkv-22850-products (GIT master) were compiled on Linux platform and tested. Regression: Not detected Improvements: Not detected Testing case: chl 934 C8 testing case was created for issue. It is OK in this branch. It is OK in current master (KAS:dev:products-20120608-opt). See results in /QADisk/occttests/results/KAS/dev/mkv-22850-products-1_18062012/lin See reference results in /QADisk/occttests/results/KAS/dev/products-20120608-opt_08062012/lin See test cases in /QADisk/occttests/tests/ED N.B. In order to launch testing case you can make use the following instructions http://doc/doku.php?id=occt:certification |
|
The following very simple test in DRAW still crashes on branch CR22840_1 under Windows 7 64-bit: pload ALL box b 1 1 1 incmesh b 0.1 1 The crash is caused by the following code fragment in BRepMesh_IncrementalMesh::Update() method: if (myInParallel) { myMesh->CreateMutexes(S, TopAbs_EDGE); #ifdef HAVE_TBB // mesh faces in parallel threads using TBB tbb::parallel_for_each (aFaces.begin(), aFaces.end(), *myMesh.operator->()); #else // alternative parallelization not yet available for (std::vector<TopoDS_Face>::iterator it(aFaces.begin()); it != aFaces.end(); it++) myMesh->Process (*it); #endif myMesh->RemoveAllMutexes(); // This line crashes at DeleteCriticalSection() API call } Access violation in DeleteCriticalSection() typically means double deletion of the critical section object. Setting a breakpoint in Standard_Mutex::~Standard_Mutex () discovers the reason. 1. Bitwise-copying a temporary Standard_Mutex to the map node and then destroying it results in the first DeleteCriticalSection() call. But critical section data is now in the map node, and the code thinks it is valid: void BRepMesh_FastDiscret::CreateMutexes(const TopoDS_Shape& theShape, const TopAbs_ShapeEnum theType) { for(TopExp_Explorer ex(theShape, theType); ex.More(); ex.Next()) { myMapOfMutexes.Bind(ex.Current(), Standard_Mutex()) ; // This temporary is destroyed immediately after copying the data into the map } } 2. When triangulation completes, myMesh->RemoveAllMutexes() kills the app by calling DeleteCriticalSection(). Conclusions: a. Make copy constructor for Standard_Mutex private to avoid unintentional bitwise copying of WinAPI data that is invalidated immediately after that. b. Try to avoid calling ~Standard_Mutex() several times by managing the mutex objects by pointers. Other ideas are welcome! |
|
I have tried a different approach: moving critical section initialization and de-initialization out of Standard_Mutex constructor/destructor: //============================================= // Standard_Mutex::Standard_Mutex //============================================= Standard_Mutex::Standard_Mutex () { } //============================================= // Standard_Mutex::~Standard_Mutex //============================================= Standard_Mutex::~Standard_Mutex () { } void Standard_Mutex::Attach() { #ifdef WNT InitializeCriticalSection( &myMutex ); #else pthread_mutex_init( &myMutex, 0 ); #endif } void Standard_Mutex::Detach() { #ifdef WNT DeleteCriticalSection( &myMutex ); #else pthread_mutex_destroy( &myMutex ); #endif } BRepMesh_FastDiscrete class was changed accordingly: //======================================================================= //function : CreateMutexes //purpose : //======================================================================= void BRepMesh_FastDiscret::CreateMutexes(const TopoDS_Shape& theShape, const TopAbs_ShapeEnum theType) { for(TopExp_Explorer ex(theShape, theType); ex.More(); ex.Next()) { myMapOfMutexes.Bind(ex.Current(), Standard_Mutex()) ; myMapOfMutexes.ChangeFind(ex.Current()).Attach(); } } //======================================================================= //function : RemoveAllMutexes //purpose : //======================================================================= void BRepMesh_FastDiscret::RemoveAllMutexes() { BRepMesh_DataMapIteratorOfDataMapOfShapeMutex anIt(myMapOfMutexes); for (; anIt.More(); anIt.Next()) { const_cast<Standard_Mutex&>(anIt.Value()).Detach(); } myMapOfMutexes.Clear(); } After this simple correction incmesh DRAW command seems to work. Of course, this is just a prototype, an it is necessary to think how to implement a robust solution instead. |
|
I would prefer the first solution: forbid copy constructor and assign operator, putting pointers to mutexes into map nodes. |
|
Redefined copy constructor and assign operator. Now they creates and initializes new instance of critical section instead of copying. Fix pushed to the branch CR22850_2 Dear Andrey, Please review. |
|
Dmitry: for consistency, assignment operator should destroy myMutex (previously initialized in constructor) instead of overriding it by new temporary object. After revising this version, I agree with Mikhail: it is better to forbid copying of mutexes as it is too uncertain and dangerous. Then you will have to store pointers in the map. I suggest that you create vector of mutexes to avoid allocating each one in the heap individually, and automate destruction. (Further we can think of separating this map and vector to dedicated tool that could be reused in other parts of OCCT.) |
|
Class TopTools_MutexForShapeProvider has been created Class contain methods: TopTools_MutexForShapeProvider::CreateMutexesForSubShapes - Creates and associates mutexes with each sub-shape of type theType in theShape. TopTools_MutexForShapeProvider::CreateMutexForShape - Creates and associates mutex with theShape TopTools_MutexForShapeProvider::GetMutex - Returns pointer to mutex associated with theShape. In case when mutex not found returns NULL. This class used in BRepMesh to avoid data race. Changes have been pushed to the Git branch CR22850_3 Dear Andrey, Please review. |
|
Additional changes has been pushed Dear Andrey, Please review. |
|
Added method RemoveAllMutexes to TopTools_MutexForShapeProvider Dear Andrey, Please review. |
|
Please make copy operator = private in Provider class, and also make copy constructor and operator = private in Mutex (recall previous version of the fix). In BRepMesh_FastDiscretFace::AddInShape(), I suggest to simplify the code where Sentry is constructed to have single call to GetMutex, as follows: // lock mutex to prevent parallel change of the same data Standard_Mutex* aMutex = theMutexProvider.GetMutex(It.Key()); Standard_Mutex::SentryNested (aMutex == NULL ? DummyMutex : *aMutex, aMutex != NULL); It could be also useful to provide constructor in SentryNested from pointer, so that it is armed only if pointer is nun-null. This then will be single line: Standard_Mutex::SentryNested (theMutexProvider.GetMutex(It.Key())); |
|
Fix for remarks have been pushed to Git Dear Andrey, Please review. |
|
The fix looks Ok, please test. Some remarks: (a) in BRepMesh_FastDiscret.cxx, comments for two new methods (at the end of the file) contain wrong names (from previous version). (b) in the comment before class header in TopTools_MutexForShapeProvider.hxx, lines ======== can probably break Doxygen generation (association of comment with class); it is better to avoid this kind of unnecessary lines |
|
Cosmetic changes according to the remark have been pushed |
|
Dear BugMaster, The workbenches KAS:dev:mkv-22850-occt-3 (GIT branch CR22850_3) KAS:dev:mkv-22850-products-3 (GIT master) were compiled on Linux platform. There is cyclic dependency: =====> DRAWEXE:exec.libs Info : ========> DRAWEXE Error : Cyclic dependency detected between: TopExp TopTools Error : Failed during execution Error : ========> DRAWEXE failed Error : Failed during execution Info : Step exec.libs Error : Step exec.libs failed Error : Step exec.tks not done : almost DRAWEXE:exec.libs failed Error : Step exec.link not done : almost DRAWEXE:exec.tks failed Info : ------------------ Process report ------------------ Info : Failed DRAWEXE (exec.libs exec.tks exec.link ) Info : ---------------------------------------------------- |
|
Replaced TopExp_Explorer with TopoDS_Iterator to avoid cyclic dependence Git branch CR22850_3 is ready to be reviewed |
|
Dmitry, I believe if iterator is used then function CreateMutexesForSubshape should be recursive; otherwise please explain how it is intended to work for theShape being face and type = edge. |
|
Sorry, it was a stupid mistake. Changes have been pushed to the git. Dear Andrey, Please review. |
|
Reviewed, please test |
|
Dear BugMaster, The workbenches KAS:dev:mkv-22850-occt-3 (GIT branch CR22850_3) KAS:dev:mkv-22850-products-3 (GIT master) were compiled on Linux platform and tested. Regression: Not detected Improvements: Not detected Testing case: chl 934 C8 testing case was created for issue. It is OK in this branch. See results in /QADisk/occttests/results/KAS/dev/mkv-22850-products-3_03072012/lin See reference results in /QADisk/occttests/results/KAS/dev/products-20120622-opt_22062012/lin See test cases in /QADisk/occttests/tests/ED N.B. In order to launch testing case you can make use the following instructions http://doc/doku.php?id=occt:certification |
occt: master d00cba63 2012-07-06 12:08:21
|
0022850: Not stable fix 22735 Class TopTools_MutexForShapeProvider has been created Class contain methods: TopTools_MutexForShapeProvider::CreateMutexesForSubShapes - Creates and associates mutexes with each sub-shape of type theType in theShape. TopTools_MutexForShapeProvider::CreateMutexForShape - Creates and associates mutex with theShape TopTools_MutexForShapeProvider::GetMutex - Returns pointer to mutex associated with theShape. In case when mutex not found returns NULL. Added method RemoveAllMutexes to TopTools_MutexForShapeProvider Assign operator in MutexProvider, constructor and operator and assign operator in Standard_Mutex now private Replaced TopExp_Explorer with TopoDS_Iterator to avoid cyclic dependence |
Affected Issues 0022850 |
|
mod - src/BRepMesh/BRepMesh_FastDiscret.cdl | Diff File | ||
mod - src/BRepMesh/BRepMesh_FastDiscret.cxx | Diff File | ||
mod - src/BRepMesh/BRepMesh_FastDiscretFace.cdl | Diff File | ||
mod - src/BRepMesh/BRepMesh_FastDiscretFace.cxx | Diff File | ||
mod - src/BRepMesh/BRepMesh_IncrementalMesh.cxx | Diff File | ||
mod - src/Standard/Standard_Mutex.hxx | Diff File | ||
add - src/TopTools/FILES | Diff File | ||
mod - src/TopTools/TopTools.cdl | Diff File | ||
add - src/TopTools/TopTools_MutexForShapeProvider.cxx | Diff File | ||
add - src/TopTools/TopTools_MutexForShapeProvider.hxx | Diff File |
Date Modified | Username | Field | Change |
---|---|---|---|
2011-12-16 10:09 |
|
New Issue | |
2011-12-16 10:09 |
|
Assigned To | => bugmaster |
2011-12-16 10:12 |
|
Assigned To | bugmaster => epv |
2011-12-16 10:12 |
|
Status | new => assigned |
2012-03-22 06:40 |
|
Target Version | 6.5.3 => 6.5.4 |
2012-04-28 12:20 | epv | Assigned To | epv => azn |
2012-05-04 19:37 |
|
Description Updated | |
2012-05-04 19:38 |
|
Assigned To | azn => abv |
2012-06-04 15:26 |
|
Assigned To | abv => ssv |
2012-06-05 12:40 | bugmaster | Project | LPKF => Open CASCADE |
2012-06-05 13:34 |
|
Assigned To | ssv => dbv |
2012-06-07 15:16 |
|
Note Added: 0020658 | |
2012-06-07 15:16 |
|
Assigned To | dbv => abv |
2012-06-07 15:16 |
|
Status | assigned => resolved |
2012-06-07 15:41 |
|
Note Added: 0020660 | |
2012-06-07 15:41 |
|
Assigned To | abv => dbv |
2012-06-07 15:41 |
|
Status | resolved => assigned |
2012-06-13 14:14 |
|
Note Added: 0020687 | |
2012-06-13 14:14 |
|
Assigned To | dbv => abv |
2012-06-13 14:14 |
|
Status | assigned => resolved |
2012-06-14 16:55 |
|
Note Added: 0020694 | |
2012-06-14 16:55 |
|
Assigned To | abv => bugmaster |
2012-06-14 16:55 |
|
Status | resolved => reviewed |
2012-06-14 18:02 |
|
Assigned To | bugmaster => mkv |
2012-06-15 14:11 |
|
Assigned To | mkv => dbv |
2012-06-15 14:12 |
|
Status | reviewed => assigned |
2012-06-18 08:43 |
|
Status | assigned => resolved |
2012-06-18 08:43 |
|
Status | resolved => reviewed |
2012-06-18 12:18 |
|
Assigned To | dbv => mkv |
2012-06-18 18:06 |
|
Note Added: 0020719 | |
2012-06-18 18:07 |
|
Test case number | => chl 934 C8 |
2012-06-18 18:07 |
|
Assigned To | mkv => bugmaster |
2012-06-18 18:07 |
|
Status | reviewed => tested |
2012-06-22 18:59 |
|
Note Added: 0020773 | |
2012-06-22 18:59 |
|
Assigned To | bugmaster => dbv |
2012-06-22 18:59 |
|
Status | tested => assigned |
2012-06-22 19:00 |
|
Note Edited: 0020773 | |
2012-06-22 19:04 |
|
Note Edited: 0020773 | |
2012-06-22 19:20 |
|
Note Edited: 0020773 | |
2012-06-22 19:43 |
|
Note Added: 0020777 | |
2012-06-22 19:46 |
|
Note Edited: 0020777 | |
2012-06-25 09:46 |
|
Note Added: 0020780 | |
2012-06-25 18:36 |
|
Note Added: 0020791 | |
2012-06-25 18:36 |
|
Assigned To | dbv => abv |
2012-06-25 18:36 |
|
Status | assigned => resolved |
2012-06-25 20:07 |
|
Note Added: 0020793 | |
2012-06-25 20:07 |
|
Assigned To | abv => dbv |
2012-06-25 20:07 |
|
Status | resolved => assigned |
2012-06-28 17:16 |
|
Note Added: 0020816 | |
2012-06-28 17:16 |
|
Assigned To | dbv => abv |
2012-06-28 17:16 |
|
Status | assigned => resolved |
2012-06-28 18:44 |
|
Note Added: 0020824 | |
2012-06-28 19:16 |
|
Note Added: 0020825 | |
2012-06-28 19:35 |
|
Note Added: 0020826 | |
2012-06-28 19:35 |
|
Assigned To | abv => dbv |
2012-06-28 19:35 |
|
Status | resolved => assigned |
2012-06-29 11:13 |
|
Note Added: 0020830 | |
2012-06-29 11:13 |
|
Assigned To | dbv => abv |
2012-06-29 11:13 |
|
Status | assigned => resolved |
2012-06-29 11:36 |
|
Note Added: 0020831 | |
2012-06-29 11:36 |
|
Status | resolved => reviewed |
2012-06-29 11:37 |
|
Assigned To | abv => bugmaster |
2012-06-29 12:01 |
|
Note Added: 0020832 | |
2012-06-29 14:28 |
|
Assigned To | bugmaster => mkv |
2012-06-29 15:53 |
|
Note Added: 0020834 | |
2012-06-29 15:54 |
|
Assigned To | mkv => dbv |
2012-06-29 15:54 |
|
Status | reviewed => assigned |
2012-07-03 11:41 |
|
Note Added: 0020845 | |
2012-07-03 11:41 |
|
Assigned To | dbv => abv |
2012-07-03 11:41 |
|
Status | assigned => resolved |
2012-07-03 12:52 |
|
Note Added: 0020846 | |
2012-07-03 12:52 |
|
Assigned To | abv => dbv |
2012-07-03 12:52 |
|
Status | resolved => assigned |
2012-07-03 14:06 |
|
Note Added: 0020847 | |
2012-07-03 14:06 |
|
Assigned To | dbv => abv |
2012-07-03 14:06 |
|
Status | assigned => resolved |
2012-07-03 15:36 |
|
Note Added: 0020848 | |
2012-07-03 15:36 |
|
Assigned To | abv => dbv |
2012-07-03 15:36 |
|
Status | resolved => reviewed |
2012-07-03 16:31 |
|
Assigned To | dbv => mkv |
2012-07-04 18:26 |
|
Note Added: 0020858 | |
2012-07-04 18:28 |
|
Assigned To | mkv => bugmaster |
2012-07-04 18:28 |
|
Status | reviewed => tested |
2012-07-10 14:33 |
|
Changeset attached | => occt master d00cba63 |
2012-07-10 14:33 |
|
Assigned To | bugmaster => dbv |
2012-07-10 14:33 |
|
Status | tested => verified |
2012-07-10 14:33 |
|
Resolution | open => fixed |
2012-07-23 13:01 |
|
Relationship added | parent of 0023260 |
2012-11-09 09:53 |
|
Category | OCCT:Visualization => OCCT:Mesh |
2012-11-16 13:13 | bugmaster | Fixed in Version | => 6.5.4 |
2012-11-16 13:18 | bugmaster | Status | verified => closed |