View Issue Details

ID: 0025972
Project: Community
Category: OCCT:Configuration
View Status: public
Last Update: 2017-05-29 16:14
Reporter: solomin_s
Assigned To: ibs
Priority: normal
Severity: minor
Status: closed
Resolution: duplicate
Platform: Linux
OS: Ubuntu
Product Version: 6.8.0
Target Version: 7.2.0
Summary: 0025972: There is no way to link occ with debug tbb library using cmake
Description: Two bugs were found when trying to build OCC with the debug tbb library:

1. The cmake files generated by WOK link OCC with tbb libraries named tbb and tbbmalloc. The cmake configure script allows setting two variables:
3RDPARTY_TBB_LIBRARY - path to the tbb library that was found.
3RDPARTY_TBB_LIBRARY_DIR - path to the directory with the library that should be linked.

However, the tbb library that was found is ignored in the other cmake files. For example, in adm/cmake/TKernel/CMakeLists.txt:
if(USE_TBB)
  list( APPEND TKernel_USED_LIBS tbb )
endif()
if(USE_TBB)
  list( APPEND TKernel_USED_LIBS tbbmalloc )
endif()

2. The tbb libraries are not found correctly when only 3RDPARTY_TBB_LIBRARY_DIR is set. The call at tbb.cmake:120
find_library (3RDPARTY_${LIBRARY_NAME}_LIBRARY ${LIBRARY_NAME}_debug ...
is interpreted as
find_library (3RDPARTY_TBB_LIBRARY TBB_debug...

Cmake 3.1.3 (and, I suppose, older versions too) is case sensitive, so the upper-case name does not match the library file. The lower-case form
find_library (3RDPARTY_TBB_LIBRARY tbb_debug ...
works correctly.
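
The effect of the name case can be modeled outside of CMake. The sketch below is written in C++ and is only an illustration, assuming a case-sensitive file system and the usual Linux naming scheme lib<name>.so; libraryFound and the directory listing are hypothetical and are not part of CMake or OCC.

```cpp
#include <set>
#include <string>

// Simplified model of a library lookup on Linux: the requested name is turned
// into "lib<name>.so" and compared byte for byte with the files in a directory.
// Hypothetical helper, not the actual find_library implementation.
bool libraryFound(const std::set<std::string>& filesInDir, const std::string& name) {
    return filesInDir.count("lib" + name + ".so") != 0;
}
```

With a directory that contains libtbb_debug.so, the name tbb_debug is found and TBB_debug is not, which matches the behavior described in item 2.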
Steps To Reproduce:
1. Generate cmake files using WOK.
2. Run cmake and configure OCC to use the debug tbb library.
3. Check the adm/cmake/TKernel/CMakeFiles/TKernel.dir/link.txt file (it contains -ltbb -ltbbmalloc instead of the expected -ltbb_debug -ltbbmalloc_debug).
Tags: No tags attached.
Test case number

Relationships

duplicate of 0028335 closed bugmaster Open CASCADE Configuration, Cmake - 3rdparty library names present in two places and aren't sync with each other 
related to 0023541 closedRoman Lygin Community On Linux OCC links to release mode TBB leading to unspecified behavior 
related to 0027073 closedbugmaster Community No way to choose different tbb libs for release and debug using CMake 

Activities

git

2016-07-14 14:24

administrator   ~0055909

Branch CR25972 has been created by Roman Lygin.

SHA-1: 91140fac1309458a9d054a2acd2c05c123602f1c


Detailed log of new commits:

Author: Roman Lygin
Date: Thu Jul 14 14:23:55 2016 +0300

    0025972: Threre is no way to link occ with debug tbb library using cmake

Roman Lygin

2016-07-14 14:38

developer   ~0055910

This is a regression of 0023541.

As explained there, this is a severe issue leading to undefined behavior, most likely crashes or hangs. We have observed both when using tbb::parallel_for invoked from our code.

If needed, here are some details from recent analysis (based on TBB 4.2.0):

Symptom - hang when using nested parallelism with tbb::parallel_for.

The hang takes the form of an infinite loop in the destructor ~task_group_context(). There, the wrong branch is taken
because the condition
  if ( governor::is_set(my_owner) ) {
is false (although it should be true).


There is a mix of debug and release TBB libraries:

[rlygin@server scripts]$ ldd ../../CadEx/lin64/gcc4/lib/libCadExCored.so | grep tbb
    libtbb_debug.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbb_debug.so.2 (0x00007fa424855000)
    libtbb.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbb.so.2 (0x00007fa423823000)
    libtbbmalloc.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbbmalloc.so.2 (0x00007fa4235ea000)

[rlygin@server scripts]$ ldd ../../CadEx/lin64/gcc4/bin/ModelCheckerd | grep tbb
    libtbbmalloc_debug.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbbmalloc_debug.so.2 (0x00007fdc8585a000)
    libtbb.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbb.so.2 (0x00007fdc84905000)
    libtbbmalloc.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbbmalloc.so.2 (0x00007fdc846cd000)
    libtbb_debug.so.2 => /home/cadex/DevTools/tbb/4.2.0/lin64/gcc4/lib/libtbb_debug.so.2 (0x00007fdc81341000)


Due to ODR (One Definition Rule) violation there is UB (undefined behavior).
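
A mix like the one in the ldd listings above can also be detected automatically. The following is a minimal sketch, assuming the ldd output is already available as a string and that debug libraries follow the libtbb*_debug.so naming seen above; hasMixedTbb is a hypothetical helper and is not part of TBB or OCC.

```cpp
#include <set>
#include <sstream>
#include <string>

// Returns true when the given `ldd` output lists both the release and the
// debug build of the same TBB library (e.g. libtbb.so.2 and libtbb_debug.so.2).
bool hasMixedTbb(const std::string& lddOutput) {
    std::set<std::string> release, debug;
    std::istringstream lines(lddOutput);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string soname;
        fields >> soname;  // first token, e.g. "libtbb_debug.so.2"
        if (soname.rfind("libtbb", 0) != 0) continue;  // not a TBB library
        // Drop ".so" and everything after it, e.g. "libtbb_debug".
        const std::string stem = soname.substr(0, soname.find(".so"));
        const std::string suffix = "_debug";
        if (stem.size() > suffix.size() &&
            stem.compare(stem.size() - suffix.size(), suffix.size(), suffix) == 0) {
            debug.insert(stem.substr(0, stem.size() - suffix.size()));
        } else {
            release.insert(stem);
        }
    }
    // A mix exists if some library appears in both flavors.
    for (const std::string& name : debug) {
        if (release.count(name) != 0) return true;
    }
    return false;
}
```

Fed with either of the two listings above, the function reports a mix, because libtbb appears in both its release and debug flavors.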

Investigation revealed that there are two copies of governor::theTLS, which are instantiations of basic_tls<generic_scheduler*>. This may be due to the use of templates, to heavy inlining and optimization (in the release version), or perhaps to something else.

What happens during execution:
1. A scheduler is created (in generic_scheduler::create_master()) with address 0x7f86fbcea400.
2. It is recorded via theTLS.set() in governor::sign_on(). At that moment &theTLS = 0x7f86fe18be6c.
3. A task_group_context is created with the field my_owner equal to 0x7f86fbcea400 (as needed).
4. Upon exiting the scope, the destructor ~task_group_context() is called. The field my_owner is still 0x7f86fbcea400. Then the check if ( governor::is_set(my_owner) ) takes place, where is_set is
static bool is_set ( generic_scheduler* s ) { return theTLS.get() == s; }.
Inside that get(), this = 0x7f86e83690e0 (i.e. != 0x7f86fe18be6c).

Hence false is returned, and control passes to the wrong branch in the destructor, leading to the infinite loop.
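
The sequence above can be reproduced with a small model. This is only a sketch of the failure mode and not TBB code: Scheduler, TlsSlot, Governor and ownerRecognized are hypothetical stand-ins for the internals named above, and each Governor object plays the role of one loaded copy of governor::theTLS.

```cpp
// Stand-in for generic_scheduler (hypothetical, for illustration only).
struct Scheduler {};

// Models basic_tls<generic_scheduler*>: one slot holding the current scheduler.
struct TlsSlot {
    Scheduler* value = nullptr;
    void set(Scheduler* s) { value = s; }
    Scheduler* get() const { return value; }
};

// One Governor object stands for one copy of governor::theTLS,
// i.e. one of the two TBB binaries loaded into the process.
struct Governor {
    TlsSlot theTLS;
    void sign_on(Scheduler* s) { theTLS.set(s); }
    bool is_set(const Scheduler* s) const { return theTLS.get() == s; }
};

// Returns what ~task_group_context() would see when the scheduler was
// registered through `registrar` but is_set() is resolved in `checker`.
bool ownerRecognized(Governor& registrar, const Governor& checker, Scheduler* owner) {
    registrar.sign_on(owner);
    return checker.is_set(owner);
}
```

When registrar and checker are the same object, the owner is recognized. When they are two different objects, the second copy never saw the set() call, so is_set() returns false, which corresponds to the wrong branch taken in step 4.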


Applying a patch and rebuilding OCC so that it links to the debug TBB version fixes the issue.

git

2017-05-29 16:14

administrator   ~0066791

Branch CR25972 has been deleted by kgv.

SHA-1: 91140fac1309458a9d054a2acd2c05c123602f1c

Issue History

Date Modified Username Field Change
2015-03-23 15:21 solomin_s New Issue
2015-03-23 15:21 solomin_s Assigned To => bugmaster
2015-04-06 14:00 abv Target Version 6.9.0 => 7.1.0
2016-01-12 18:13 AlexanderZashivalov Relationship added related to 0027073
2016-07-14 14:24 git Note Added: 0055909
2016-07-14 14:38 Roman Lygin Note Added: 0055910
2016-07-14 14:38 Roman Lygin Status new => resolved
2016-07-14 14:38 Roman Lygin Relationship added related to 0023541
2016-11-03 17:38 abv Target Version 7.1.0 => 7.2.0
2016-12-18 10:45 kgv Assigned To bugmaster => ibs
2017-01-09 15:03 ibs Relationship added related to 0028335
2017-04-24 10:16 bugmaster Status resolved => closed
2017-04-24 10:16 bugmaster Resolution open => duplicate
2017-04-24 10:16 bugmaster Relationship deleted related to 0028335
2017-04-24 10:17 bugmaster Relationship added duplicate of 0028335
2017-05-29 16:14 git Note Added: 0066791