View Issue Details

ID: 0023541
Project: Community
Category: OCCT:Coding
View Status: public
Last Update: 2016-07-14 14:38
Reporter: Roman Lygin
Assigned To: Roman Lygin
Priority: normal
Severity: minor
Status: closed
Resolution: fixed
Platform: any
OS: Linux
Product Version: 6.5.3
Target Version: 6.6.0
Fixed in Version: 6.6.0
Summary: 0023541: On Linux OCC links to release mode TBB leading to unspecified behavior
Description: Configure produces makefiles that always link to release mode TBB (libtbb.so, libtbbmalloc.so), even when OCC is built in debug mode.

This leads to unspecified behavior, e.g. sporadic crashes of the application in debug mode. Debugging reveals that memory gets corrupted when neighboring objects are de-allocated, as the TBB allocator tries to update its lists of free/allocated objects. Because release mode TBB comes without symbol files, it is impossible to debug the TBB source code. However, using gdb's 'watch' on the memory cell of interest reveals that the corruption happens only once and that the call stack leads to the TBB allocator. Release mode functions fine.
In addition, another component (CAD Exchanger SDK in this case) was linked to libtbb_debug.so and libtbbmalloc_debug.so.
So overall there is an obvious conflict of symbols when the two are mixed in one application.

Perhaps the code to fix is in configure.ac around the following lines:
         CSF_TBB_LIB="-L$tbb_lib -ltbb -ltbbmalloc"
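
A minimal sketch of what such a change could look like, written as a plain shell function rather than as the actual configure.ac code. The function name tbb_link_flags, the "yes"/"no" debug flag and the /opt/tbb/lib path are assumptions made for illustration; only the library names and the CSF_TBB_LIB variable come from this report.

```shell
# Hypothetical helper: choose TBB link flags according to the build mode.
# $1 = directory with the TBB libraries, $2 = "yes" for a debug build.
tbb_link_flags() {
  tbb_lib="$1"
  debug="$2"
  if [ "$debug" = "yes" ]; then
    # Debug build: link to the debug flavours of TBB.
    printf '%s\n' "-L$tbb_lib -ltbb_debug -ltbbmalloc_debug"
  else
    # Release build: keep the flags currently generated.
    printf '%s\n' "-L$tbb_lib -ltbb -ltbbmalloc"
  fi
}

# Example use with an assumed install path.
CSF_TBB_LIB=$(tbb_link_flags /opt/tbb/lib yes)
echo "$CSF_TBB_LIB"
```

The point of the sketch is only that the build mode has to reach the place where the TBB flags are composed; how the real configure script exposes the debug setting is not shown here.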
Steps To Reproduce: Try to configure and build OCC in debug mode, then verify the link line of TKernel and TKMesh.
Tags: No tags attached.
Test case number:
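
The link-line check from the steps to reproduce could be scripted roughly as follows. This is a sketch only: the function name uses_release_tbb and the sample link line are made up for illustration, and a real link line would have to be captured from the build output.

```shell
# Hypothetical check: does a link command line (read from stdin)
# reference the release-mode TBB libraries?
uses_release_tbb() {
  # Match -ltbb or -ltbbmalloc as whole words, so that
  # -ltbb_debug and -ltbbmalloc_debug are not reported.
  grep -Eq -- '(^|[[:space:]])-ltbb(malloc)?([[:space:]]|$)'
}

# Example link line, invented for this sketch.
link_line="g++ -shared -o libTKernel.so Standard.o -L/opt/tbb/lib -ltbb -ltbbmalloc"

if printf '%s\n' "$link_line" | uses_release_tbb; then
  echo "release-mode TBB on link line"
else
  echo "no release-mode TBB on link line"
fi
```

In a debug build of OCC, a positive result for TKernel or TKMesh would indicate the problem described in this report.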

Relationships

related to 0025972 (closed, ibs): There is no way to link occ with debug tbb library using cmake

Activities

Roman Lygin

2012-11-20 16:10

developer   ~0022312

Investigating the original issue that led to this bug report revealed that the root cause is in gcc 4.1.2. More precisely, it is in its std::basic_string implementation, which uses a static template member _S_empty_rep_storage; this leads to a crash when a string is created in one library and destroyed in another. (Curious minds can find more details on the internet.)

However, the incorrectly linked TBB allocator masked that issue, making it look like a problem with an inconsistently linked allocator. Relinking the debug OCC binaries to the debug TBB allocator (by simply renaming libtbbmalloc_debug to libtbbmalloc temporarily) helped to reveal the real issue, as the TBB allocator asserted on the first attempt to free an invalid pointer. Thus, ensuring that OCC links to the correct TBB libraries is still a must.

Meanwhile, the issue is downgraded from major/always to minor/random to reflect the severity.

abv

2012-11-20 18:17

manager   ~0022315

Sorry for the late comment: I believe that using different library names in different configurations is bad practice, precisely because it makes it possible to load several different instances of the code at runtime, and I see no good reason to follow it. On Windows, the MS VS CRT follows this approach and it is a permanent source of trouble, so why should we replicate the same troublesome approach on Linux?

Roman Lygin

2012-11-20 18:51

developer   ~0022316

Andrey,
Different names for release vs debug libraries are a separate issue. Let's not bring it in here, to avoid further complication (for the record, I am in favor of distinct names ;-)).

What needs to be done here (in 22315) is to make sure OCC links to the proper version of the 3rd party prerequisite (TBB in this case), which already follows a certain naming convention, regardless of whether anyone likes this convention or not.

Roman Lygin

2013-03-16 11:29

developer   ~0023763

One more example of the consequences of this issue: a sporadic deadlock in code which uses tbb::task_group.
The root cause is that, due to simultaneous use of both libtbb_debug.so (used by the debug version of CAD Exchanger) and libtbb.so (enforced by OCC), different instances of
static basic_tls<generic_scheduler*> theTLS;
get created. This results in two distinct tbb masters being created in the same thread, which in turn leads to a deadlock. (More details are available upon request.)
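
A rough way to spot this kind of mix before it turns into a deadlock is to scan the list of libraries an application loads for both TBB flavours. The sketch below is an assumption-laden illustration: the function name has_mixed_tbb and the sample library list are invented, and the input is expected on stdin as text in the style of ldd output.

```shell
# Hypothetical check: succeed when both release and debug TBB libraries
# appear in a list of loaded libraries read from stdin.
has_mixed_tbb() {
  libs=$(cat)
  # grep -c prints 0 and exits non-zero on no match; "|| true" keeps
  # the assignment safe under "set -e".
  release=$(printf '%s\n' "$libs" | grep -Ec 'libtbb(malloc)?\.so' || true)
  debug=$(printf '%s\n' "$libs" | grep -Ec 'libtbb(malloc)?_debug\.so' || true)
  [ "$release" -gt 0 ] && [ "$debug" -gt 0 ]
}

# Sample input, invented for this sketch.
sample='libTKernel.so => /opt/occ/lib/libTKernel.so
libtbb.so.2 => /opt/tbb/lib/libtbb.so.2
libtbb_debug.so.2 => /opt/tbb/lib/libtbb_debug.so.2'

if printf '%s\n' "$sample" | has_mixed_tbb; then
  echo "mixed release and debug TBB"
else
  echo "single TBB flavour"
fi
```

Such a check only flags the symptom; the fix remains to make each component link to the TBB flavour that matches its build mode.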

The issue investigation took me 20(!) person-hours of work (which I obviously prefer to have been spent elsewhere). If this issue leaked into the field and happened at customer site, the cost would be multiples of this, if it could ever be triaged at all.

So please DO consider this issue really severe and get it fixed! The source file to be modified is apparently outside of the git source tree, or please tell me otherwise, so I could offer you a fix.

Thank you,
Roman

P.S. The same applies to any 3rd party library which OCC links to (FreeType, etc.) and which can potentially be used by any other library/application that co-exists with OCC in one process. OCC should offer a user-defined option to choose which library to link with during its build process.

abv

2013-03-16 12:22

manager   ~0023764

Roman, if you wish to propose a fix to correct project files generation, please check WOK sources at http://git.dev.opencascade.org/gitweb/ ; clone URL is gitolite@git.dev.opencascade.org:occt-wok.git

Use scripts collect_binary.* in the root folder to rebuild WOK with your changes in the scripts (I suppose you can keep the binaries the same as you use).

Roman Lygin

2013-03-18 21:14

developer   ~0023786

Thanks, Andrey.
I did not realize WOK sources are available via another git repository (by the way, the correct url is gitolite@git.dev.opencascade.org:occt-wok, i.e. no .git suffix).
The fix has been pushed into the git repository.
Reassigned back to bugmaster.

bugmaster

2013-03-22 12:57

administrator   ~0023856

The fix has been tested and integrated into the master branch of the occt-wok repository.

Issue History

Date Modified Username Field Change
2012-11-11 14:21 Roman Lygin New Issue
2012-11-11 14:21 Roman Lygin Assigned To => bugmaster
2012-11-12 17:21 bugmaster Target Version 6.5.4 => 6.6.0
2012-11-20 16:10 Roman Lygin Note Added: 0022312
2012-11-20 16:10 Roman Lygin Severity major => minor
2012-11-20 16:10 Roman Lygin Reproducibility always => random
2012-11-20 18:17 abv Note Added: 0022315
2012-11-20 18:51 Roman Lygin Note Added: 0022316
2013-02-26 15:56 abv Target Version 6.6.0 => 6.7.0
2013-03-16 11:29 Roman Lygin Note Added: 0023763
2013-03-16 12:22 abv Note Added: 0023764
2013-03-16 12:22 abv Assigned To bugmaster => Roman Lygin
2013-03-16 12:22 abv Status new => assigned
2013-03-18 21:14 Roman Lygin Note Added: 0023786
2013-03-18 21:14 Roman Lygin Assigned To Roman Lygin => bugmaster
2013-03-18 21:14 Roman Lygin Status assigned => resolved
2013-03-18 23:12 abv Target Version 6.7.0 => 6.6.0
2013-03-22 12:55 bugmaster Status resolved => reviewed
2013-03-22 12:57 bugmaster Note Added: 0023856
2013-03-22 12:57 bugmaster Status reviewed => verified
2013-03-22 12:57 bugmaster Resolution open => fixed
2013-03-22 12:57 bugmaster Assigned To bugmaster => Roman Lygin
2013-04-23 13:35 aiv Status verified => closed
2013-04-29 15:24 aiv Fixed in Version => 6.6.0
2014-01-11 11:58 abv Category OCCT Release:BUILD => OCCT:Coding
2016-07-14 14:38 Roman Lygin Relationship added related to 0025972