View Issue Details

ID: 0022719
Project: Community
Category: OCCT:Foundation Classes
View Status: public
Last Update: 2017-01-22 16:14
Reporter: szy
Assigned To: abv
Priority: normal
Severity: major
Status: feedback
Resolution: no change required
Platform: A
OS: L
Product Version: 6.5.1
Target Version: Unscheduled
Summary: 0022719: Performance problem of TCollection_AsciiString and OCCT memory manager
Description: Post from the forum: http://www.opencascade.org/org/forum/thread_21696/
The initial message about a memory leak is a false statement, but RLN raises several reasonable ideas that should be considered:
=====
2. TCollection_AsciiString could use std::string instead of the current OCC implementation from the 1990s. That could make sense, indeed. The only thing to keep in mind is that it should rather be a basic_string that accepts an allocator object. That allocator object would implement the C++ allocator interface using Standard::Allocate() and Standard::Free(). This is important to preserve the flexibility of using an arbitrary memory allocator.
Performance testing must be applied to confirm that this new implementation is actually more efficient.
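
The allocator idea in point 2 can be sketched roughly as follows. This is only a minimal illustration, not OCCT code: demo::Allocate() and demo::Free() are hypothetical stand-ins for Standard::Allocate() and Standard::Free() (here they just forward to malloc/free and count calls), and PluggableAllocator, DemoString and append_steps are names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <string>

namespace demo {
  // Counter used only to observe that the string really goes through our hooks.
  inline std::size_t& alloc_count() { static std::size_t n = 0; return n; }

  // Hypothetical stand-in for Standard::Allocate(): forwards to malloc.
  inline void* Allocate(std::size_t bytes) {
    ++alloc_count();
    void* p = std::malloc(bytes ? bytes : 1);
    if (!p) throw std::bad_alloc();
    return p;
  }

  // Hypothetical stand-in for Standard::Free(): forwards to free.
  inline void Free(void* p) { std::free(p); }
}

// Minimal C++11 allocator that routes every request through the hooks above.
// Over-aligned types are not handled; this is enough for char storage.
template <class T>
struct PluggableAllocator {
  using value_type = T;

  PluggableAllocator() noexcept = default;
  template <class U>
  PluggableAllocator(const PluggableAllocator<U>&) noexcept {}

  T* allocate(std::size_t n) {
    if (n > std::numeric_limits<std::size_t>::max() / sizeof(T)) throw std::bad_alloc();
    return static_cast<T*>(demo::Allocate(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t) noexcept { demo::Free(p); }
};

// The allocator is stateless, so all instances compare equal.
template <class T, class U>
bool operator==(const PluggableAllocator<T>&, const PluggableAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const PluggableAllocator<T>&, const PluggableAllocator<U>&) noexcept { return false; }

// basic_string parameterised with the pluggable allocator.
using DemoString = std::basic_string<char, std::char_traits<char>, PluggableAllocator<char>>;

// Appends `chunk` to an initially empty string `steps` times, like an operator+=() loop.
inline DemoString append_steps(int steps, const char* chunk) {
  DemoString s;
  for (int i = 0; i < steps; ++i) s += chunk;
  return s;
}
```

For instance, append_steps(1000, "0123456789ab") builds a 12000-character string whose storage is obtained only through demo::Allocate(). Swapping the bodies of the two hooks is then the single place where a different memory manager would be plugged in.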

3. Performance.
a. With MMGT_OPT=0 (i.e. system allocator) the test runs 100-1000x faster than with MMGT_OPT=1 (OCC allocator).
I used Intel Parallel Amplifier to understand the root cause of this huge slow-down. The root cause is that from i=3333 onward (i.e. ~97% of the time!) a special branch in the OCC allocator is used, which calls mmap() on Linux and CreateFileMapping() on Windows (see Standard_MMgrOpt::AllocMemory()). This imposes a huge overhead, because it is invoked on each call of operator+=().
This could be a hint to the OCC team to revisit the handling of large blocks.
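
To see why almost every iteration can end up in the large-block branch, here is a small simulation, assuming the string is reallocated to its exact new size on every append. All numbers are illustrative assumptions, not values taken from the forum test or from OCCT: 100000 steps, 12 bytes appended per step, and a 40000-byte large-block threshold, chosen only so that the crossover falls near the i=3333 reported above. GrowthStats and simulate_exact_fit are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Outcome of simulating a string that is reallocated to exact size on every append.
struct GrowthStats {
  std::size_t first_large_step;  // 1-based step of the first request above the threshold (0 if none)
  std::size_t large_requests;    // number of requests above the threshold
};

// Counts how many allocation requests exceed `large_threshold` bytes when
// `bytes_per_step` bytes are appended `steps` times with exact-fit growth.
// All parameters are illustrative assumptions.
inline GrowthStats simulate_exact_fit(std::size_t steps,
                                      std::size_t bytes_per_step,
                                      std::size_t large_threshold) {
  GrowthStats stats = {0, 0};
  std::size_t length = 0;
  for (std::size_t i = 1; i <= steps; ++i) {
    length += bytes_per_step;
    const std::size_t request = length + 1;  // +1 for the terminating null
    if (request > large_threshold) {
      if (stats.first_large_step == 0) stats.first_large_step = i;
      ++stats.large_requests;
    }
  }
  return stats;
}
```

With the assumed values, simulate_exact_fit(100000, 12, 40000) reports the first large request at step 3334 and 96667 large requests in total, i.e. about 97% of all appends, so any fixed per-call cost in the large-block path is paid on nearly every operator+=().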

b. With MMGT_OPT=2 (i.e. TBB allocator) the code behaves ~2x faster than with MMGT_OPT=0. It makes sense to increase n to 1000000 to have meaningful times (mine were ~13sec vs ~25sec). Thus, TBB is a winner in this case ;-).
Windows7, VS2005 SP1.
AFAICT, the reason is TBB's efficient handling and caching of large blocks: it hoards large blocks and reuses them for numerous iterations before the next larger block is allocated.


4. Custom allocators vs existing system allocators.
I fully agree with Denis' long-term vision here. At some point there will be system allocators good enough to make custom allocators not worth the effort. The only exception would likely be memory regions (allocating a large block and then deallocating it all at once). Windows 2008 vs 2003 proves the trend (I intentionally left both charts in the quoted blog post to show the difference). The problem, however, is that we are far from there yet. OS vendors still do not offer optimal allocators, especially for *threaded* scenarios. Therefore the TBB allocator, for instance, is often a winner here. In recent experiments with Salome, TBB gave an extra 3x speed-up vs the system allocator, especially on large blocks. So there is a lot of demand for that.
...
============
Tags: No tags attached.
Test case number

Activities

kgv

2017-01-13 19:30

developer   ~0062595

This issue does not look relevant anymore.

abv

2017-01-17 09:10

manager   ~0062671

Last edited: 2017-01-17 09:11

Point 2 has been addressed by the fix for #11758: we now use plain C functions for string operations in TCollection_AsciiString instead of tricky custom code.

Point 4 has come true: system allocators now behave better than the OCCT custom allocator (MMGT_OPT=1), and we use the system one by default (MMGT_OPT=0).

Note that the system allocator behaves better than TBB on this test case unless a really huge number of steps is used:

Nb. steps | System  | OCCT    | TBB       |
-----------------------------------------
100 K     | 0 sec   | 8 sec   | 0.015 sec |
1 M       | 2.6 sec | >10 min | 2.2 sec   |

However, point 3 is still relevant: if MMGT_OPT is set to 1 (OCCT "optimized" allocator), the test case becomes very slow. Setting MMGT_MMAP to 0 makes it slightly worse still. Thus the problem is still there.

git

2017-01-17 09:33

administrator   ~0062673

Branch CR22719 has been created by abv.

SHA-1: f1dd97dc8418236b583653cd9128f1c41c83de20


Detailed log of new commits:

Author: abv
Date: Tue Jan 17 09:33:45 2017 +0300

    Test command for 22719: Performance problem of TCollection_ASciiString and OCCT memory manager

Issue History

Date Modified Username Field Change
2011-09-05 10:44 szy New Issue
2011-09-05 10:44 szy Assigned To => abv
2011-09-21 11:10 bugmaster Target Version 6.5.2 => 6.5.3
2011-09-22 16:50 szy Status new => assigned
2012-02-09 09:22 abv Target Version 6.5.3 => 6.5.4
2012-02-09 09:24 abv Target Version 6.5.4 => Unscheduled
2013-06-18 17:41 abv Assigned To abv => eap
2017-01-13 19:30 kgv Note Added: 0062595
2017-01-13 19:30 kgv Assigned To eap => abv
2017-01-13 19:30 kgv Status assigned => feedback
2017-01-13 19:30 kgv Resolution open => no change required
2017-01-13 19:30 kgv Target Version Unscheduled => 7.2.0
2017-01-17 08:32 abv Description Updated
2017-01-17 09:10 abv Note Added: 0062671
2017-01-17 09:11 abv Note Edited: 0062671
2017-01-17 09:33 git Note Added: 0062673
2017-01-17 11:22 kgv Summary Performance problem of TCollection_ASciiString and OCCT memory manager => Performance problem of TCollection_AsciiString and OCCT memory manager
2017-01-22 16:14 abv Target Version 7.2.0 => Unscheduled