View Issue Details

ID: 0031670
Project: Community
Category: OCCT:Data Exchange
View Status: public
Last Update: 2020-12-02 17:13
Reporter: robertlipman
Assigned To: bugmaster
Priority: normal
Severity: minor
Status: closed
Resolution: fixed
Product Version: 7.3.0
Target Version: 7.5.0
Fixed in Version: 7.5.0
Summary: 0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
Description: Cyrillic characters in STEP files are not always correctly displayed by CAD Assistant. I cannot figure out why sometimes it works and other times it doesn't. This issue might be similar to other OCC issues regarding text characters in STEP files.
Steps To Reproduce: Two STEP files are attached in a zip file. Import them to CAD Assistant. In the Model Browser, block-russian.stp displays the correct characters. russian.stp does not.
Tags: No tags attached.
Test case number: Not required

Attached Files

  • russian-step.zip (510,774 bytes)
  • cadass_step_locale.png (66,888 bytes)
  • cadass_step_win1251.png (78,994 bytes)

Relationships

related to 0028454 | closed | bugmaster | Community | Data Exchange, STEP reader - names with special characters cannot be read
related to 0031589 | closed | robertlipman | Community | Data Exchange - unable to read STEP file containing mangled characters
related to 0014673 | closed | bugmaster | Open CASCADE | Provide true support for Unicode symbols

Activities

robertlipman

2020-07-16 23:26

reporter  

russian-step.zip (510,774 bytes)

kgv

2020-07-24 12:56

developer  

cadass_step_locale.png (66,888 bytes)

kgv

2020-07-24 12:56

developer  

cadass_step_win1251.png (78,994 bytes)

kgv

2020-07-24 13:03

developer   ~0093288

> I cannot figure out why sometimes it works and other times it doesn't
This is because block-russian.stp is encoded in UTF-8, while russian.STEP is encoded in cp1251. The STEP format does not record which text encoding was used (it is supposed to be UTF-8), and the OCCT STEP reader has neither logic for detecting encodings automatically nor a complete set of conversion tables.

So far, the only alternative to "UTF-8" that can be specified to the STEP translator is "System locale". This works only if the STEP file is opened on Windows under exactly the same locale in which it was written, and it will corrupt text in any other encoding (code pages were the legacy way to encode text files before UTF-8 came into use everywhere).
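
To illustrate the note above: the same bytes yield different text depending on which encoding the reader assumes. The following is a minimal Python sketch, not OCCT code; the function name decode_name and the sample string are assumptions made for illustration.

```python
# Illustration only: not part of OCCT. decode_name and the sample are hypothetical.

def decode_name(raw: bytes, codepage: str) -> str:
    """Decode a name read from a STEP file using an explicitly chosen code page."""
    return raw.decode(codepage)

sample = "Блок"                       # a Cyrillic name
as_utf8 = sample.encode("utf-8")      # how a UTF-8 file would store it
as_cp1251 = sample.encode("cp1251")   # how a cp1251 file would store it

# Correct code page: the text round-trips.
assert decode_name(as_utf8, "utf-8") == sample
assert decode_name(as_cp1251, "cp1251") == sample

# Wrong code page: the cp1251 bytes are not valid UTF-8 at all ...
try:
    decode_name(as_cp1251, "utf-8")
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# ... and a different single-byte locale silently produces mojibake.
mojibake = decode_name(as_cp1251, "latin-1")
```

Because nothing in the bytes themselves names the code page, the reader has to be told which one to use; guessing from the system locale only works when it happens to match the writer's.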

git

2020-09-30 20:20

administrator   ~0095504

Branch CR31670_1 has been created by dpasukhi.

SHA-1: 66b5d844113f1542fc095b4dd7a28ee36cfad48c


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support of ANSI

git

2020-09-30 20:21

administrator   ~0095505

Branch CR31670_2 has been created by dpasukhi.

SHA-1: 8d1fe9f9d48b578dffe5ec5ab598f53ba8374157


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Create code pages tables.

git

2020-09-30 20:22

administrator   ~0095506

Branch CR31670_3 has been created by dpasukhi.

SHA-1: 74de4b25616dcb8e0b4eed26474a2a423a707921


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support of ANSI

git

2020-10-02 15:52

administrator   ~0095582

Branch CR31670_1 has been deleted by dpasukhi.

SHA-1: 66b5d844113f1542fc095b4dd7a28ee36cfad48c

git

2020-10-02 15:52

administrator   ~0095583

Branch CR31670_2 has been deleted by dpasukhi.

SHA-1: 8d1fe9f9d48b578dffe5ec5ab598f53ba8374157

git

2020-10-02 15:53

administrator   ~0095584

Branch CR31670_3 has been deleted by dpasukhi.

SHA-1: 74de4b25616dcb8e0b4eed26474a2a423a707921

git

2020-10-02 15:56

administrator   ~0095586

Branch CR31670_nocpp has been created by dpasukhi.

SHA-1: 3f64f459be7dcfe1275297c042652147dc3e946a


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support of ANSI

git

2020-10-02 15:56

administrator   ~0095587

Branch CR31670_cpp11 has been created by dpasukhi.

SHA-1: 12f523fab779672c3187716d251b00a17cc0fc36


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support of ANSI encoding

git

2020-10-02 15:57

administrator   ~0095588

Branch CR31670_table has been created by dpasukhi.

SHA-1: 7e54bf5c1322b36a3c817c695a0050c88f2c4d63


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support of ANSI

git

2020-10-03 18:13

administrator   ~0095701

Branch CR31670_1 has been created by dpasukhi.

SHA-1: 9b63b0de37c9649b92cfa139cf0857b514770164


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support for converting pages from Windows encoding to UTF-8

git

2020-10-05 17:07

administrator   ~0095734

Branch CR31670_2 has been created by dpasukhi.

SHA-1: 295c52b2f6cbf8d2d94667df66199d29e09d30af


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support for converting pages from Windows encoding to Unicode

git

2020-10-05 19:20

administrator   ~0095738

Branch CR31670_2 has been updated forcibly by dpasukhi.

SHA-1: 79d3c6e37822ba6babc05019d85c0f4c1aa8620c

git

2020-10-06 12:07

administrator   ~0095753

Branch CR31670_2 has been updated forcibly by dpasukhi.

SHA-1: c63dda117bf55437cc83b846e165a1017d23cb4b

git

2020-10-06 15:41

administrator   ~0095761

Branch CR31670_3 has been created by dpasukhi.

SHA-1: e52364f53e25d3923e3d2f568534f15960e240d2


Detailed log of new commits:

Author: dpasukhi
Date: Wed Sep 30 15:54:25 2020 +0300

    0031670: Data Exchange - cp1251 Cyrillic characters in STEP file
    
    Add support for converting pages from Windows encoding to Unicode

kgv

2020-10-06 16:13

developer   ~0095763

Last edited: 2020-10-06 16:14

Please raise the patch
- OCCT branch: CR31670_3.

http://jenkins-test-12.nnov.opencascade.com:8080/view/CR31670_2-master-dpasukhi

bugmaster

2020-10-07 16:00

administrator   ~0095787

Combination -
OCCT branch : OCCT-750-BETA
master SHA - 278da162dc52c26c8cfe9d002a6f07db12405194
a206de37fbfa0bf71bd534ae47192bbec23b8522
Products branch : OCCT-750-BETA SHA - d9c364e1137eed3249e5a05befa860c708f243c0
was compiled on Linux, MacOS and Windows platforms and tested in optimize mode.

Number of compiler warnings:
No new/fixed warnings

Regressions/Differences/Improvements:
No regressions/differences

CPU differences:
Debian80-64:
OCCT
Total CPU difference: 18038.89000000012 / 18085.73000000008 [-0.26%]
Products
Total CPU difference: 12182.170000000115 / 12217.490000000116 [-0.29%]
Windows-64-VC14:
OCCT
Total CPU difference: 19726.21875 / 19713.9375 [+0.06%]
Products
Total CPU difference: 13586.625 / 13579.390625 [+0.05%]


Image differences :
No differences that require special attention

Memory differences :
No differences that require special attention

git

2020-10-08 11:01

administrator   ~0095793

Branch CR31670_3 has been deleted by inv.

SHA-1: e52364f53e25d3923e3d2f568534f15960e240d2

git

2020-10-08 11:01

administrator   ~0095794

Branch CR31670_2 has been deleted by inv.

SHA-1: c63dda117bf55437cc83b846e165a1017d23cb4b

git

2020-10-08 11:01

administrator   ~0095800

Branch CR31670_1 has been deleted by inv.

SHA-1: 9b63b0de37c9649b92cfa139cf0857b514770164

git

2020-10-08 11:01

administrator   ~0095805

Branch CR31670_nocpp has been deleted by inv.

SHA-1: 3f64f459be7dcfe1275297c042652147dc3e946a

git

2020-10-08 11:01

administrator   ~0095806

Branch CR31670_cpp11 has been deleted by inv.

SHA-1: 12f523fab779672c3187716d251b00a17cc0fc36

git

2020-10-08 11:01

administrator   ~0095807

Branch CR31670_table has been deleted by inv.

SHA-1: 7e54bf5c1322b36a3c817c695a0050c88f2c4d63

Related Changesets

occt: master baf60a87

2020-09-30 12:54:25

dpasukhi


Committer: bugmaster
0031670: Data Exchange - cp1251 Cyrillic characters in STEP file

Add support for converting pages from Windows encoding to Unicode
Affected Issues
0031670
mod - src/Resource/FILES
add - src/Resource/Resource_ANSI.pxx
mod - src/Resource/Resource_ConvertUnicode.c
rm - src/Resource/Resource_DataMapIteratorOfDataMapOfAsciiStringAsciiString.hxx
rm - src/Resource/Resource_DataMapIteratorOfDataMapOfAsciiStringExtendedString.hxx
mod - src/Resource/Resource_DataMapOfAsciiStringAsciiString.hxx
mod - src/Resource/Resource_DataMapOfAsciiStringExtendedString.hxx
mod - src/Resource/Resource_FormatType.hxx
mod - src/Resource/Resource_Manager.cxx
mod - src/Resource/Resource_Unicode.cxx
mod - src/Resource/Resource_Unicode.hxx
mod - src/STEPCAFControl/STEPCAFControl_Controller.cxx
add - tests/bugs/step/bug31670
add - tests/bugs/step/bug31670_1
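
The changeset describes converting Windows code pages to Unicode, and an earlier branch mentions code page tables. The contents of Resource_ANSI.pxx are not shown on this page, so the following Python sketch only illustrates the general technique of table-driven conversion for a single-byte code page. The names build_table and convert are hypothetical, and the table is derived from Python's own cp1251 codec rather than from the OCCT sources.

```python
# Illustration only: the table is built from Python's own cp1251 codec,
# not copied from Resource_ANSI.pxx.

def build_table(codepage: str) -> list:
    """Map each byte value 0..255 to one Unicode character (U+FFFD if undefined)."""
    return [bytes([b]).decode(codepage, errors="replace") for b in range(256)]

def convert(raw: bytes, table: list) -> str:
    """Convert single-byte encoded text to Unicode by table lookup."""
    return "".join(table[b] for b in raw)

CP1251 = build_table("cp1251")
converted = convert("Блок".encode("cp1251"), CP1251)
```

A fixed table per code page makes the result independent of the operating system locale, which is the limitation of the "System locale" option described in note ~0093288.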

Issue History

Date Modified Username Field Change
2020-07-16 23:26 robertlipman New Issue
2020-07-16 23:26 robertlipman Assigned To => gka
2020-07-16 23:26 robertlipman File Added: russian-step.zip
2020-07-24 12:38 kgv Summary Cyrillic characters in STEP file => Data Exchange - Cyrillic characters in STEP file
2020-07-24 12:40 kgv Summary Data Exchange - Cyrillic characters in STEP file => Data Exchange - Windows-1251 Cyrillic characters in STEP file
2020-07-24 12:55 kgv Relationship added related to 0028454
2020-07-24 12:56 kgv File Added: cadass_step_locale.png
2020-07-24 12:56 kgv File Added: cadass_step_win1251.png
2020-07-24 13:03 kgv Note Added: 0093288
2020-07-24 13:03 kgv Relationship added related to 0031589
2020-07-24 13:03 kgv Summary Data Exchange - Windows-1251 Cyrillic characters in STEP file => Data Exchange - cp1251 Cyrillic characters in STEP file
2020-08-26 12:12 gka Assigned To gka => dpasukhi
2020-09-30 20:20 git Note Added: 0095504
2020-09-30 20:21 git Note Added: 0095505
2020-09-30 20:22 git Note Added: 0095506
2020-10-02 15:52 git Note Added: 0095582
2020-10-02 15:52 git Note Added: 0095583
2020-10-02 15:53 git Note Added: 0095584
2020-10-02 15:56 git Note Added: 0095586
2020-10-02 15:56 git Note Added: 0095587
2020-10-02 15:57 git Note Added: 0095588
2020-10-03 15:54 abv Target Version => 7.5.0
2020-10-03 18:13 git Note Added: 0095701
2020-10-05 17:07 git Note Added: 0095734
2020-10-05 19:20 git Note Added: 0095738
2020-10-06 12:07 git Note Added: 0095753
2020-10-06 15:41 git Note Added: 0095761
2020-10-06 16:13 kgv Note Added: 0095763
2020-10-06 16:13 kgv Assigned To dpasukhi => bugmaster
2020-10-06 16:13 kgv Status new => resolved
2020-10-06 16:13 kgv Status resolved => reviewed
2020-10-06 16:14 kgv Note Edited: 0095763
2020-10-06 22:49 abv Relationship added related to 0014673
2020-10-07 16:00 bugmaster Note Added: 0095787
2020-10-07 16:00 bugmaster Status reviewed => tested
2020-10-07 16:16 bugmaster Test case number => Not required
2020-10-07 16:17 bugmaster Changeset attached => occt master baf60a87
2020-10-07 16:17 bugmaster Status tested => verified
2020-10-07 16:17 bugmaster Resolution open => fixed
2020-10-08 11:01 git Note Added: 0095793
2020-10-08 11:01 git Note Added: 0095794
2020-10-08 11:01 git Note Added: 0095800
2020-10-08 11:01 git Note Added: 0095805
2020-10-08 11:01 git Note Added: 0095806
2020-10-08 11:01 git Note Added: 0095807
2020-12-02 16:22 emo Fixed in Version => 7.5.0
2020-12-02 17:13 emo Status verified => closed