View Issue Details

IDProjectCategoryView StatusLast Update
0025545CommunityOCCT:Modeling Datapublic2015-05-14 15:32
ReporterRoman Lygin Assigned Tobugmaster  
PrioritynormalSeverityminor 
Status closedResolutionfixed 
Product Version6.8.0 
Target Version6.9.0Fixed in Version6.9.0 
Summary0025545: TopLoc_Location::Transformation() provokes data races
DescriptionUp to 6.8.0 TopLoc_Location::Transformation() uses delayed construction of the gp_Trsf: initially only a null pointer gets stored and the object is dynamically allocated and created upon first call to Transformation().

Therefore accessing the same location from multiple threads creates a data race if the object has not been created yet.

The fix will enforce creation of the object right away, without delay and extra pointer. Not only will this remove data race but also reduce total memory footprint (object size only vs object size + size of pointer) and increase performance (no extra dynamic allocation). This is reasonable given that the object gets created anyway, upon any access to geometry, i.e. via BRep_Tool.
Steps To ReproduceThe below code is inserted in the command OCC25545.
Test case is bugs fclasses bug25545.

/*! Tests data-race when concurrently accessing TopLoc_Location::Transformation().*/
void TopLoc_LocationTest::Multithreading()
{
    size_t n = 1000;
    std::vector<TopoDS_Shape> aShapeVec (n);
    std::vector<TopLoc_Location> aLocVec (n);
    TopoDS_Shape aShape = BRepBuilderAPI_MakeVertex (gp::Origin ());
    aShapeVec[0] = aShape;
    for (size_t i = 1; i < n; ++i) {
        gp_Trsf aT;
        aT.SetTranslation (gp_Vec (1, 0, 0));
        aLocVec[i] = aLocVec[i - 1] * aT;
        aShapeVec[i] = aShape.Moved (aLocVec[i]);
    }

    //have to use an external flag as QCOMPARE() inside lambda does not return from it
    tbb::atomic<bool> anIsRaceDetected;
    anIsRaceDetected = false;

    //concurrently process
    tbb::parallel_for (size_t (0), n, [&](size_t i) {
        if (!anIsRaceDetected) {
            const TopoDS_Vertex& aV = TopoDS::Vertex (aShapeVec[i]);
            gp_Pnt aP = BRep_Tool::Pnt (aV);
            if (aP.X () != static_cast<double> (i)) {
                anIsRaceDetected = true;
            }
        }
    }, tbb::simple_partitioner ());
    QVERIFY (!anIsRaceDetected);
}
Additional information
and documentation updates
The class TopLoc_Location is made thread-safe. Earlier the method Transformation provoked data races.
TagsNo tags attached.
Test case number

Relationships

related to 0024826 closedbugmaster Wrapping of parallelisation algorithms 

Activities

git

2014-12-04 10:34

administrator   ~0034992

Branch CR25545 has been created by Roman Lygin.

SHA-1: 7ea5c1cc3309780e07ceecf0dbba10fbcec612bc


Detailed log of new commits:

Author: Roman Lygin
Date: Thu Dec 4 10:34:20 2014 +0400

    0025545: TopLoc_Location::Transformation() provokes data races

Roman Lygin

2014-12-04 10:41

developer   ~0034993

The fix had to introduce a boolean flag TopLoc_ItemLocation::myIsComputed as otherwise it is impossible to distinguish cases when myTrsf corresponds to computed or not yet computed value. This collision caused regressions on a few cases (e.g. blend bfuseblend A8) and required a rework.

Thus, there is a footprint of 2 bytes (sizeof(short)) vs saved 4 or 8 bytes (sizeof(gp_Trsf*)). To reduce this new footprint a single int field could be used for power and boolean flag. This modification has not been done for the sake of simplicity.

Other notes:
gp_Trsf::Powered() has been made const as it was non-const by mistake in the past and thus caused a compilation error in modified TopLoc_ItemLocation ctor.

msv

2014-12-06 17:57

developer   ~0035095

Dear Roman,

It seems there is a way to go further in this solution and to get rid of myIsComputed flag. (BTW, footprint mentioned in the previous post is not 2 bytes, but 4 bytes, as Standard_Boolean is defined as unsigned int.) I have analyzed the usage of TopLoc_SListOfItemLocation class and its methods and found out that once the chain of item locations is constructed it is not changed (nor power nor datum nor tail) during its life. Yes, there are methods that can turn this statement to false. They are ChangeValue, SetValue, ChangeTail, SetTail. But these methods are not used anywhere. Therefore I propose to remove them from the interface as unused and dangerous.

So, my main idea is to move the computation of transformation into the following constructor of the class TopLoc_SListOfItemLocation:

TopLoc_SListOfItemLocation::TopLoc_SListOfItemLocation
  (const TopLoc_ItemLocation& anItem,
   const TopLoc_SListOfItemLocation& aTail);


This ctor is called only from within the methods Construct and Constructed, i.e. when new chain is constructed from an old one by putting a new item location on top. If at this moment we compute transformation for the new item location already knowing the tail of previous items this transformation will be valid till the end of life cycle of this chain.

Please, tell me if you know any obstacles on this way.

BR,
Mikhail

Roman Lygin

2014-12-06 23:55

developer   ~0035097

Dear Mikhail,

Thank you for your notes.
This was my way of thinking as well when I implemented the first version of the fix. However it did cause a regression on a few cases:
blend bfuseblend A8 A9
boolean bopcommon_simple ZF3 ZI1 ZI2
and others.
After initial triaging I isolated it to the case when there is internal location, two ones on the parent shape and then scaling. Not sure but it could be something like this:
line l 0 0 0 1 0 0
vertex v1 0 0 0
vertex v2 0 0 0
translate v2 1 0 0
edge e l v1 v2
wire w e
trotate w 0 0 0 0 1 0 90
trotate w 0 0 0 1 0 0 90
tscale w 0 0 0 10

The parent shape gets invalid transformation but when you save the shape to .brep and restore back then it gets fine. After triaging it looked like the chain of items was fine (therefore storing/restoring to/from .brep is fine) but the value in one chain node is updated more than needed.

I ran out of time to further triage that and had to fall back to a limited version as explained above. I am not certain if avoiding the flag is really feasible.

Yes, you are right that sizeof(Standard_Boolean) is 4 (I believed it was unsigned short). This decreases the gain but the new version is still better in terms of footprint and performance given that no extra dynamic allocation happens.

If you/your team have spare cycles you might want to further investigate this to reach the best possible fix. If not then I suggest that you integrate the fix and create a follow up tracker to get back to this later. I am afraid I ran out of cycles on this task.

Thank you again,
Roman

git

2014-12-07 14:35

administrator   ~0035102

Branch CR25545_1 has been created by msv.

SHA-1: d18744ea904ec351493863aedebecc357c879fac


Detailed log of new commits:

Author: Roman Lygin
Date: Thu Dec 4 10:34:20 2014 +0400

    0025545: TopLoc_Location::Transformation() provokes data races
    
    Get rid of postponed calculation of transformation.
    Remove unused methods.
    Add command OCC25545 to reproduce the bug with data races.

msv

2014-12-07 14:41

developer   ~0035103

I have reworked the patch of Roman. It has no regressions pointed by him.
Andrey, please review.

Roman Lygin

2014-12-07 16:31

developer   ~0035104

Hello Mikhail,
Many thanks for completing the patch. Great to see you managed to get rid of this extra field. Apparently I missed some obvious stuff.
Roman

abv

2014-12-08 09:41

manager   ~0035114

Mikhail, it's great that you have added DRAW command for testing the fix! Please just correct it so it can be compiled without TBB (use HAVE_TBB macro) and by compilers with no support of C++11 features like lambda functions. Then, I strongly recommend you to add a test case script -- it should be pretty easy and will allow the patch to pass without additional efforts by QA team.

git

2014-12-08 13:24

administrator   ~0035125

Branch CR25545_1 has been updated by msv.

SHA-1: a8682d99e563bd750a9aad6e50f663547a23054e


Detailed log of new commits:

Author: msv
Date: Mon Dec 8 13:23:57 2014 +0300

    - Get rid of C++11 lambda construction
    - make code compilable with no HAVE_TBB defined
    - add test case bugs/fclasses/bug25545

msv

2014-12-08 13:31

developer   ~0035126

Please review new changes.

msv

2014-12-08 13:36

developer   ~0035127

The parallel framework developed in scope of 0024826 is to be applied to the command OCC25545, so as to make it functioning in absence of TBB.

abv

2014-12-08 13:59

manager   ~0035133

No remarks, please test

apv

2014-12-10 17:58

tester   ~0035280

Dear BugMaster,

Branch CR25545_1 has been rejected due compilation errors on all platforms:
Linux:
http://jenkins-test-02.nnov.opencascade.com/user/mnt/my-views/view/CR25545_1/job/mnt-CR25545_1-master_build_occt_linux/1/parsed_console/
Windows:
http://jenkins-test-02.nnov.opencascade.com/user/mnt/my-views/view/CR25545_1/job/mnt-CR25545_1-master_build_occt_windows/1/parsed_console/
MacOS:
http://jenkins-test-02.nnov.opencascade.com/user/mnt/my-views/view/CR25545_1/job/mnt-CR25545_1-master_prepare_build_occt_MacOS/1/parsed_console/

msv

2014-12-10 19:16

developer   ~0035285

Dear apv,
The errors occur due to use of obsolete version of tbb (3.0 on win and lin; and 4.0 on macOS). Please, update environment to use tbb 4.2 or 4.3.

apv

2014-12-17 11:29

tester   ~0035444

Please, process request.

apv

2014-12-25 12:32

tester   ~0035708

Dear BugMaster,

Branch CR25545_1 from occt git-repository (and master from products git-repository) was compiled on Linux, MacOS and Windows platforms and tested.
SHA-1: a8682d99e563bd750a9aad6e50f663547a23054e

Number of compiler warnings:
occt component:
   Linux: 18 (18 on master)
   Windows: 0 (0 on master)
products component :
   Linux: 11 (11 on master)
   Windows: 1 (1 on master)

Regressions/Differences:
Not detected

Testing cases:
bugs fclasses bug25545 - OK
http://occt-tests/CR25545-1-master-occt/Debian60-64/bugs/fclasses/bug25545.html
http://occt-tests/CR25545-1-master-occt/Windows-32-VC10/bugs/fclasses/bug25545.html

Testing on Linux:
Total MEMORY difference: 364789752 / 363562240
Total CPU difference: 46090.75000000002 / 47537.4299999999

Testing on Windows:
Total MEMORY difference: 276830884 / 277181316
Total CPU difference: 33849.90625 / 41681.890625

git

2015-01-26 12:35

administrator   ~0036587

Branch CR25545_1 has been deleted by inv.

SHA-1: a8682d99e563bd750a9aad6e50f663547a23054e

git

2015-01-26 12:36

administrator   ~0036598

Branch CR25545 has been deleted by inv.

SHA-1: 7ea5c1cc3309780e07ceecf0dbba10fbcec612bc

Related Changesets

occt: master ee6bb37b

2014-12-25 15:31:11

msv


Committer: bugmaster Details Diff
0025545: TopLoc_Location::Transformation() provokes data races

Get rid of postponed calculation of transformation.
Remove unused methods.
Add command OCC25545 to reproduce the bug with data races.

- Get rid of C++11 lambda construction
- make code compilable with no HAVE_TBB defined
- add test case bugs/fclasses/bug25545
Affected Issues
0025545
mod - src/gp/gp_Trsf.cdl Diff File
mod - src/gp/gp_Trsf.lxx Diff File
mod - src/QABugs/QABugs_19.cxx Diff File
mod - src/TopLoc/TopLoc.cdl Diff File
mod - src/TopLoc/TopLoc_ItemLocation.cdl Diff File
mod - src/TopLoc/TopLoc_ItemLocation.cxx Diff File
mod - src/TopLoc/TopLoc_Location.cxx Diff File
mod - src/TopLoc/TopLoc_SListNodeOfItemLocation.cdl Diff File
mod - src/TopLoc/TopLoc_SListOfItemLocation.cdl Diff File
mod - src/TopLoc/TopLoc_SListOfItemLocation.cxx Diff File
mod - src/TopLoc/TopLoc_SListOfItemLocation.lxx Diff File
add - tests/bugs/fclasses/bug25545 Diff File

Issue History

Date Modified Username Field Change
2014-12-03 08:25 Roman Lygin New Issue
2014-12-03 08:25 Roman Lygin Assigned To => Roman Lygin
2014-12-04 10:34 git Note Added: 0034992
2014-12-04 10:41 Roman Lygin Note Added: 0034993
2014-12-04 10:41 Roman Lygin Assigned To Roman Lygin => msv
2014-12-04 10:41 Roman Lygin Status new => resolved
2014-12-06 17:57 msv Note Added: 0035095
2014-12-06 17:58 msv Assigned To msv => Roman Lygin
2014-12-06 17:58 msv Status resolved => feedback
2014-12-06 23:55 Roman Lygin Note Added: 0035097
2014-12-06 23:55 Roman Lygin Assigned To Roman Lygin => msv
2014-12-07 14:35 git Note Added: 0035102
2014-12-07 14:41 msv Note Added: 0035103
2014-12-07 14:41 msv Assigned To msv => abv
2014-12-07 14:41 msv Status feedback => resolved
2014-12-07 14:41 msv Steps to Reproduce Updated
2014-12-07 16:31 Roman Lygin Note Added: 0035104
2014-12-08 09:41 abv Note Added: 0035114
2014-12-08 09:41 abv Assigned To abv => msv
2014-12-08 09:41 abv Status resolved => assigned
2014-12-08 13:24 git Note Added: 0035125
2014-12-08 13:30 msv Steps to Reproduce Updated
2014-12-08 13:30 msv Additional Information Updated
2014-12-08 13:31 msv Note Added: 0035126
2014-12-08 13:31 msv Assigned To msv => abv
2014-12-08 13:31 msv Status assigned => resolved
2014-12-08 13:34 msv Relationship added related to 0024826
2014-12-08 13:36 msv Note Added: 0035127
2014-12-08 13:59 abv Note Added: 0035133
2014-12-08 13:59 abv Assigned To abv => bugmaster
2014-12-08 13:59 abv Status resolved => reviewed
2014-12-08 16:41 mkv Assigned To bugmaster => apv
2014-12-10 17:58 apv Note Added: 0035280
2014-12-10 17:58 apv Assigned To apv => msv
2014-12-10 17:58 apv Status reviewed => assigned
2014-12-10 19:13 msv Assigned To msv => Roman Lygin
2014-12-10 19:14 msv Assigned To Roman Lygin => apv
2014-12-10 19:16 msv Note Added: 0035285
2014-12-17 11:29 apv Note Added: 0035444
2014-12-17 11:29 apv Assigned To apv => bugmaster
2014-12-17 11:29 apv Status assigned => feedback
2014-12-25 12:32 apv Note Added: 0035708
2014-12-25 12:32 apv Status feedback => tested
2014-12-26 13:33 bugmaster Changeset attached => occt master ee6bb37b
2014-12-26 13:33 bugmaster Status tested => verified
2014-12-26 13:33 bugmaster Resolution open => fixed
2015-01-19 16:23 bugmaster Target Version 7.0.0 => 6.9.0
2015-01-26 12:35 git Note Added: 0036587
2015-01-26 12:36 git Note Added: 0036598
2015-05-14 15:29 aiv Status verified => closed
2015-05-14 15:32 aiv Fixed in Version => 6.9.0