MantisBT - Open CASCADE
View Issue Details
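ID: 0031851
Project: Open CASCADE
Category: [OCCT] OCCT:Data Exchange
View Status: public
Date Submitted: 2020-10-14 08:53
Last Update: 2020-10-25 20:41
Reporter: abv
Assigned To: bugmaster
Priority: normal
Severity: feature
Status: verified
Resolution: fixed
Target Version: [OCCT] 7.5.0
Test case number: bugs/xde/bug31851
Summary: 0031851: Data Exchange, STEP - enable Unicode symbols in STEP export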
Description:
The current version of OCCT converts every non-ASCII Unicode symbol to '?' when exporting strings (e.g. names of shapes in an XDE document) to STEP.

This should be corrected: Unicode symbols should be written either in UTF-8 encoding (allowed by the modern edition of STEP, ISO 10303-21, see e.g. http://www.steptools.com/stds/step/IS_final_p21e3.html) or using the legacy \X2\ or \X4\ control directives (see 0028454).

Note that this change will break an approach that is non-conformant but currently usable: storing strings in a local 8-bit encoding.
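
For illustration, the legacy control-directive option mentioned above can be sketched in Python. This is not the OCCT implementation; the function name encode_step_string is hypothetical, and the sketch assumes the ISO 10303-21 string rules: \X2\ introduces groups of four hex digits, \X4\ groups of eight, \X0\ terminates either, and apostrophes and backslashes are doubled. Control characters below U+0020 are passed through unchanged, which a real writer would have to handle separately.

```python
def encode_step_string(text: str) -> str:
    """Encode text as the body of a STEP Part 21 string (without the
    enclosing apostrophes), using \\X2\\ and \\X4\\ control directives.

    Hypothetical helper for illustration only; not part of OCCT.
    """
    out = []
    run = []  # pending non-ASCII code points that fit in 16 bits

    def flush():
        # Emit all pending 16-bit code points inside one \X2\ ... \X0\ block.
        if run:
            out.append("\\X2\\" + "".join(f"{cp:04X}" for cp in run) + "\\X0\\")
            run.clear()

    for ch in text:
        cp = ord(ch)
        if 0x80 <= cp <= 0xFFFF:
            run.append(cp)
            continue
        flush()
        if cp > 0xFFFF:
            # Code points beyond the BMP need the 8-digit \X4\ form.
            out.append(f"\\X4\\{cp:08X}\\X0\\")
        elif ch == "'":
            out.append("''")      # apostrophe is doubled inside a string
        elif ch == "\\":
            out.append("\\\\")    # backslash is doubled inside a string
        else:
            out.append(ch)        # plain ASCII passes through unchanged
    flush()
    return "".join(out)


# The first two letters of the Arabic name used in the test case:
print(encode_step_string("\u0627\u0644"))  # prints \X2\06270644\X0\
```

Writing the same name as UTF-8 avoids this escaping altogether, at the cost of requiring a reader that accepts the newer edition of the standard.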
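Steps To Reproduce:
# Load the XDE commands
pload XDE
# Round-trip a STEP file with Unicode names: read, write, read back
ReadStep D [locate_data_file utf8.stp]
WriteStep D $imagedir/utf8_saved.stp
ReadStep T $imagedir/utf8_saved.stp

# The name on label 0:1:1:1 must survive the round trip unchanged
set arabic [GetName T 0:1:1:1]
if { $arabic != "\u0627\u0644\u0645\u062C\u0633\u0645" } {
   puts "Error: arabic symbols get lost"
}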
Additional Information:
To convert a Unicode string to an escaped representation suitable for a pure-ASCII Tcl script, online services can be used, for instance
https://www.freeformatter.com/java-dotnet-escape.html
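
The same conversion can also be done offline. The following is a minimal Python sketch, assuming only that the target is the \uXXXX form used in the test script above; the function name to_tcl_escapes is hypothetical. It does not escape characters that are special inside a double-quoted Tcl string (such as $, [ or \), and it rejects code points outside the Basic Multilingual Plane rather than guessing how a given Tcl version would treat them.

```python
def to_tcl_escapes(text: str) -> str:
    """Return text with every non-ASCII character written as \\uXXXX.

    Hypothetical helper for illustration; ASCII characters are copied
    unchanged, so Tcl-special characters are NOT escaped here.
    """
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)
        elif cp <= 0xFFFF:
            parts.append(f"\\u{cp:04X}")
        else:
            raise ValueError(
                f"U+{cp:06X} is outside the BMP; \\uXXXX cannot express it"
            )
    return "".join(parts)


# The Arabic name checked by the test case:
print(to_tcl_escapes("\u0627\u0644\u0645\u062C\u0633\u0645"))
# prints \u0627\u0644\u0645\u062C\u0633\u0645
```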
Tags: No tags attached.
Relationships:
related to     0028454  verified  bugmaster  Community     Data Exchange, STEP reader - names with special characters cannot be read
has duplicate  0031182  verified  gka        Community     Data Exchange - STEP export unicode name as "???"
child of       0022484  closed    bugmaster  Open CASCADE  UNICODE characters support.
Attached Files: utf8.stp (442,283 bytes) 2020-10-14 08:53
https://tracker.dev.opencascade.org/
Issue History
Date Modified     Username   Field                           Change
2020-10-14 08:53  abv        New Issue
2020-10-14 08:53  abv        Assigned To                     => gka
2020-10-14 08:53  abv        File Added: utf8.stp
2020-10-14 08:54  abv        Assigned To                     gka => abv
2020-10-14 08:54  abv        Target Version                  7.6.0* => 7.5.0
2020-10-14 08:54  abv        Relationship added              related to 0028454
2020-10-14 08:56  abv        Additional Information Updated  bug_revision_view_page.php?rev_id=23772#r23772
2020-10-14 09:11  git        Note Added: 0095946
2020-10-14 09:14  kgv        Steps to Reproduce Updated      bug_revision_view_page.php?rev_id=23774#r23774
2020-10-14 09:31  kgv        Relationship added              child of 0022484
2020-10-14 09:46  kgv        Note Added: 0095950
2020-10-14 10:00  git        Note Added: 0095953
2020-10-14 12:14  kgv        Severity                        minor => feature
2020-10-14 12:16  git        Note Added: 0095957
2020-10-14 12:18  git        Note Added: 0095958
2020-10-14 15:39  abv        Note Added: 0095977
2020-10-14 15:39  abv        Assigned To                     abv => gka
2020-10-14 15:39  abv        Status                          new => resolved
2020-10-14 15:42  gka        Note Added: 0095978
2020-10-14 15:42  gka        Assigned To                     gka => bugmaster
2020-10-14 15:42  gka        Status                          resolved => reviewed
2020-10-15 08:24  abv        Relationship added              related to 0030694
2020-10-17 13:22  bugmaster  Note Added: 0096035
2020-10-17 13:22  bugmaster  Status                          reviewed => tested
2020-10-17 13:31  bugmaster  Test case number                => bugs/xde/bug31851
2020-10-17 13:31  bugmaster  Changeset attached              => occt master ae9f4b64
2020-10-17 13:31  bugmaster  Status                          tested => verified
2020-10-17 13:31  bugmaster  Resolution                      open => fixed
2020-10-24 12:41  git        Note Added: 0096222
2020-10-24 12:41  git        Note Added: 0096223
2020-10-25 20:41  abv        Relationship added              has duplicate 0031182

Notes
(0095946)
git   
2020-10-14 09:11   
Branch CR31851 has been created by abv.

SHA-1: 760bb85212c6761ee065667df4c17b11062c58e3


Detailed log of new commits:

Author: abv
Date: Wed Oct 14 09:14:04 2020 +0300

    0031851: Data Exchange, STEP - enable Unicode symbols in STEP export
    
    Class STEPCAFControl_Writer is corrected to avoid replacing non-Ascii symbols by question marks on export to STEP.
    
    Related: DRAW commands dealing with strings in OCAF documents are corrected to pass Unicode symbols as UTF-8.
    
    Off-topic: code saving names of external STEP files in XDE and fetching them back is corrected to preserve Unicode symbols as UTF-8.
    
    Added test bugs xde bug31851
(0095950)
kgv   
2020-10-14 09:46   
Could you please remove redundant .ToCString() / .ToExtString() calls around affected lines?
(0095953)
git   
2020-10-14 10:00   
Branch CR31851 has been updated by abv.

SHA-1: 712730c1ea3f5c8956d311a6d0deee7a19214ccc


Detailed log of new commits:

Author: abv
Date: Wed Oct 14 10:03:48 2020 +0300

    # review remarks

(0095957)
git   
2020-10-14 12:16   
Branch CR31851 has been updated by abv.

SHA-1: a453c85574f19fa2f3b337bcbff3ce109de0a34b


Detailed log of new commits:

Author: abv
Date: Wed Oct 14 12:17:18 2020 +0300

    Test de step_4 E7 corrected (no more replacement of spaces by underscores in names of layers)
    
    On previous commit:
    Related: avoid replacing spaces by underscores in strings on reading from STEP.

(0095958)
git   
2020-10-14 12:18   
Branch CR31851_1 has been created by abv.

SHA-1: 5847fc2de29ea225f646b7ed84a924315655e846


Detailed log of new commits:

Author: abv
Date: Wed Oct 14 09:14:04 2020 +0300

    0031851: Data Exchange, STEP - enable Unicode symbols in STEP export
    
    Class STEPCAFControl_Writer is corrected to avoid replacing non-Ascii symbols by question marks, and spaces by underscores, on export to STEP.
    
    Related: DRAW commands dealing with strings in OCAF documents are corrected to pass Unicode symbols as UTF-8.
    
    Off-topic: code saving names of external STEP files in XDE and fetching them back is corrected to preserve Unicode symbols as UTF-8.
    
    Added test bugs xde bug31851
    
    Test de step_4 E7 corrected (no more replacement of spaces by underscores in names of layers)
(0095977)
abv   
2020-10-14 15:39   
Fix is pushed to branch CR31851 and tested; see Jenkins job CR31851-abv; please review
(0095978)
gka   
2020-10-14 15:42   
Branch CR31851_1 was reviewed
(0096035)
bugmaster   
2020-10-17 13:22   
Combination -
OCCT branch : IR-2020-10-16
master SHA - ae9f4b64cacf0df612944b3694a3bdfa5f1f29cf
a206de37fbfa0bf71bd534ae47192bbec23b8522
Products branch : IR-2020-10-16 SHA - fcb5abe005e152f7f923f4cf6c02acb07c027cdc
was compiled on Linux, MacOS and Windows platforms and tested in optimize mode.

Number of compiler warnings:
No new/fixed warnings

Regressions/Differences/Improvements:
No regressions/differences

CPU differences:
Debian80-64:
OCCT
Total CPU difference: 18027.82 / 18057.13 [-0.16%]
Products
Total CPU difference: 12174.33 / 12182.17 [-0.06%]
Windows-64-VC14:
OCCT
Total CPU difference: 19740.03125 / 19746.828125 [-0.03%]
Products
Total CPU difference: 13564.71875 / 13586.625 [-0.16%]


Image differences :
No differences that require special attention

Memory differences :
No differences that require special attention
(0096222)
git   
2020-10-24 12:41   
Branch CR31851_1 has been deleted by inv.

SHA-1: 5847fc2de29ea225f646b7ed84a924315655e846
(0096223)
git   
2020-10-24 12:41   
Branch CR31851 has been deleted by inv.

SHA-1: a453c85574f19fa2f3b337bcbff3ce109de0a34b