View Issue Details
ID: 0022484
Project: Open CASCADE
Category: [OCCT] OCCT:Foundation Classes
View Status: public
Date Submitted: 2011-05-10 12:40
Last Update: 2019-03-02 22:55
Reporter: epv
Assigned To: bugmaster
Priority: normal
Severity: feature
Status: closed
Resolution: fixed
Platform:
OS: All
OS Version:
Product Version:
Target Version: [OCCT] 6.8.0
Fixed in Version: [OCCT] 6.8.0
Summary: 0022484: UNICODE characters support.
Description: OpenCascade primarily uses ASCII encoding and does not fully support Unicode
characters, especially when working with files and directories. This is a real
restriction on the development of modern applications, which must also work with
non-ASCII file names. Unicode encoding support should therefore be added.
Additional information and documentation updates:
UNICODE characters support in OCCT

The main idea of the implemented improvement is that the API is kept untouched. Instead, the behavior of all functions that accept Standard_CString on input is changed, so that the strings are now assumed to be in UTF-8 encoding.

The constructor of TCollection_ExtendedString is used to convert UTF-8 strings to wide characters, which are then cast directly to wchar_t* and passed to appropriate system functions that take wchar_t* on input; for example, _wopen instead of open.
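
The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of the patch: Utf8ToUtf16 and OpenUtf8 are hypothetical names, a hand-written decoder stands in for the TCollection_ExtendedString constructor, and _wfopen is used as the wide-character system call. Malformed UTF-8 input is not diagnosed.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#ifdef _WIN32
  #include <wchar.h>
#endif

// Hypothetical helper: decodes UTF-8 bytes into UTF-16 code units.
// Assumes well-formed input; malformed sequences are not diagnosed.
std::u16string Utf8ToUtf16 (const std::string& theUtf8)
{
  static const unsigned char THE_MASKS[4] = {0x7F, 0x1F, 0x0F, 0x07};
  std::u16string aResult;
  for (std::size_t i = 0; i < theUtf8.size();)
  {
    const unsigned char aLead = static_cast<unsigned char> (theUtf8[i]);
    // Number of continuation bytes implied by the lead byte.
    const int aNbTrail = aLead < 0x80 ? 0 : aLead < 0xE0 ? 1 : aLead < 0xF0 ? 2 : 3;
    char32_t aCode = aLead & THE_MASKS[aNbTrail];
    ++i;
    for (int k = 0; k < aNbTrail && i < theUtf8.size(); ++k, ++i)
    {
      aCode = (aCode << 6) | (static_cast<unsigned char> (theUtf8[i]) & 0x3F);
    }
    if (aCode < 0x10000)
    {
      aResult.push_back (static_cast<char16_t> (aCode));
    }
    else
    {
      // Characters outside the BMP become a surrogate pair.
      aCode -= 0x10000;
      aResult.push_back (static_cast<char16_t> (0xD800 + (aCode >> 10)));
      aResult.push_back (static_cast<char16_t> (0xDC00 + (aCode & 0x3FF)));
    }
  }
  return aResult;
}

// Hypothetical wrapper: opens a file whose name is given in UTF-8.
std::FILE* OpenUtf8 (const std::string& theName, const char* theMode)
{
  assert (theMode != nullptr);
#ifdef _WIN32
  // On Windows wchar_t holds one UTF-16 code unit, so the units are copied as is.
  const std::u16string aName16 = Utf8ToUtf16 (theName);
  const std::wstring aNameW (aName16.begin(), aName16.end());
  const std::wstring aModeW (theMode, theMode + std::strlen (theMode));
  return _wfopen (aNameW.c_str(), aModeW.c_str());
#else
  // Other platforms receive the bytes unchanged.
  return std::fopen (theName.c_str(), theMode);
#endif
}
```

A caller would keep passing a narrow string, for example OpenUtf8 (aPath, "rb"), and only the interpretation of its bytes changes. Copying the code units into a std::wstring avoids casting between unrelated pointer types, at the price of one extra copy.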

Note that this change will break backward compatibility with applications which currently use filenames in extended ASCII encoding bound to the current locale. Such applications should be updated to convert such strings to UTF-8 format.

The patch has been implemented for the Windows (WNT) platform only; other platforms continue to support only ASCII encoding.

The conversion from UTF-8 to wchar_t assumes little-endian byte order. Thus, this code will not work correctly on big-endian platforms. It needs to be completed in a way similar to what is done for binary persistence (see the macro DO_INVERSE in FSD_FileHeader.hxx).
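
For illustration, one way to keep 16-bit code units independent of the host byte order is sketched below. The function names are hypothetical, and the sketch does not reproduce what DO_INVERSE actually does; it only shows the general byte-swapping idea.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when the host stores the least significant byte first.
bool IsLittleEndianHost()
{
  const std::uint16_t aProbe = 1;
  unsigned char aFirstByte = 0;
  std::memcpy (&aFirstByte, &aProbe, 1);
  return aFirstByte == 1;
}

// Swaps the two bytes of a 16-bit code unit.
std::uint16_t SwapBytes16 (std::uint16_t theValue)
{
  return static_cast<std::uint16_t> ((theValue << 8) | (theValue >> 8));
}

// Hypothetical helper: assembles a code unit from two bytes stored in
// little-endian order. Shifts work on values, not on memory layout,
// so the result is the same on any host.
std::uint16_t ReadUtf16LE (const unsigned char* theBytes)
{
  return static_cast<std::uint16_t> (theBytes[0] | (theBytes[1] << 8));
}
```

Code that copies raw memory into a 16-bit value would call SwapBytes16 when IsLittleEndianHost() returns false, whereas code written in the style of ReadUtf16LE needs no such branch.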
Tags: No tags attached.
Test case number: Will be created
Attached Files: OCC22484_v1_epv.zip (204,990 bytes) 2011-05-10 10:43

Relationships
related to 0024716 (closed, bugmaster) Open CASCADE: OSD_Path - remove excessive validity checks and allow non-ascii strings
related to 0025302 (closed, abv) Open CASCADE: Incorrect locale and unicode support in Draw console
parent of 0025369 (closed, bugmaster) Open CASCADE: Visualization, Image_AlienPixMap - handle UTF-8 names in image read/save operations on Windows
parent of 0025367 (closed, bugmaster) Open CASCADE: IGES and BRep persistence - support unicode file names on Windows
parent of 0026380 (closed, bugmaster) Open CASCADE: OSD_SharedLibrary - handle UTF-8 file paths
parent of 0027198 (closed, abv) Open CASCADE: OSD_Environment - use wide characters API on Windows
parent of 0027675 (closed, bugmaster) Open CASCADE: Foundation Classes - handle Unicode path to CSF_UnitsLexicon and CSF_UnitsDefinition on Windows
parent of 0027676 (closed, bugmaster) Open CASCADE: Foundation Classes - define Standard_ExtCharacter, Standard_Utf16Char using C++11 types char16_t
parent of 0027838 (closed, kgv) Open CASCADE: Foundation Classes - support wchar_t* input within TCollection_AsciiString and TCollection_ExtendedString
parent of 0027880 (closed, bugmaster) Open CASCADE: Samples - fix handling of Unicode paths within MFC import/export sample
parent of 0025534 (closed, bugmaster) Community: TObj_Application unicode path issue.
parent of 0028110 (closed, apn) Open CASCADE: Configuration - specify Unicode charset instead of multibyte in project files for Visual Studio
parent of 0028353 (closed, apn) Community: Samples - IESample cannot write files to paths with special characters
parent of 0028454 (assigned, gka) Community: Data Exchange - Names with Special Characters Cannot Be Read from STEP or IGES Files
parent of 0029069 (closed, apn) Community: Samples - handle UNICODE filenames within C++/CLI CSharp sample
related to 0022125 (closed, bugmaster) Open CASCADE: TCollection_ExtendedString: conversion from UTF-8 to unicode
related to 0024943 (closed, bugmaster) Open CASCADE: Port MFC sample to UNICODE for compatibility with VS2013
related to 0025308 (new, pdn) Open CASCADE: TCollection_ExtendedString, NCollection_String - merge classes for string management
related to 0026514 (closed, bugmaster) Open CASCADE: OSD_Path can not work with French symbols in file name.
related to 0028172 (closed, kgv) Community: Replace Standard_CString file path with Unicode form TCollection_ExtendedString
child of 0014673 (assigned, abv) Open CASCADE: Provide true support for Unicode symbols
Not all the children of this issue are yet resolved or closed.

Notes
(0024708)
pdn (developer)
2013-06-10 11:33

The current implementation uses char* to store UTF-8 strings. The API of the OCC file classes and methods was not modified. The string is analyzed and, if it contains UTF-8 codes, converted for use in the appropriate system functions.

The current implementation is finalized only for the Windows platform.
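
The kind of analysis mentioned in this note might look like the sketch below. It is an assumption about what "analyzed" could mean, not the check used in the patch; LooksLikeUtf8 is a hypothetical name, and only the lead/continuation byte pattern is verified (overlong forms and surrogates are not rejected).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the bytes follow the UTF-8 lead/continuation
// byte pattern. Pure ASCII input is accepted as well.
bool LooksLikeUtf8 (const std::string& theText)
{
  std::size_t i = 0;
  while (i < theText.size())
  {
    const unsigned char aLead = static_cast<unsigned char> (theText[i]);
    int aNbTrail = 0;
    if (aLead < 0x80)                { aNbTrail = 0; }
    else if ((aLead & 0xE0) == 0xC0) { aNbTrail = 1; }
    else if ((aLead & 0xF0) == 0xE0) { aNbTrail = 2; }
    else if ((aLead & 0xF8) == 0xF0) { aNbTrail = 3; }
    else                             { return false; } // stray continuation or invalid lead
    ++i;
    for (int k = 0; k < aNbTrail; ++k, ++i)
    {
      // Each continuation byte must exist and match the pattern 10xxxxxx.
      if (i >= theText.size()
       || (static_cast<unsigned char> (theText[i]) & 0xC0) != 0x80)
      {
        return false;
      }
    }
  }
  return true;
}
```

A file name in a single-byte locale encoding, such as Latin-1 "été", typically fails this check, which illustrates why such strings have to be converted to UTF-8 by the application.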
(0024728)
pdn (developer)
2013-06-11 12:19

The first version of the fix has been put into branch CR22484.

Please review and send back.
(0024739)
kgv (developer)
2013-06-13 10:01

Dear pdn,

I have the following remarks on the patch (after preliminary review).
> #ifdef WNT
The WNT macro definition is deprecated. Please use the appropriate built-in macros instead (_MSC_VER or _WIN32, depending on the situation).
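
A minimal sketch of such a guard, assuming the code depends on the target platform rather than on the compiler (PlatformName is a hypothetical function used only to make the selected branch visible):

```cpp
#include <string>

// Hypothetical helper: reports which branch the preprocessor selected.
std::string PlatformName()
{
#if defined(_WIN32)
  // _WIN32 is predefined by compilers targeting Windows, 32-bit and 64-bit alike.
  return "windows";
#else
  return "other";
#endif
}
```

_MSC_VER would be the macro to test when the code depends on the Microsoft compiler itself rather than on the Windows API.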

> myStream.open( (const wchar_t*) aWName.ToExtString(),ios::in); // ios::nocreate is not portable
Please remove irrelevant comments from copy-pasted lines.

> + TCollection_ExtendedString aFileNameW(aFileName, Standard_True);
> + TCollection_ExtendedString dirNameW(dirName);
If I understood correctly, TCollection_ExtendedString constructor parses

> +/* LD : We do not need this routine any longer : */
> +/* Dont remove a no empty directory */
> +
> +
> +#if 0
Big unidentified piece of commented code.

General remark - this patch breaks backward compatibility with applications which currently use filenames in the current locale.
(0024744)
msv (developer)
2013-06-13 12:53

Remarks:
1) Corrections are also needed in OCCT Products, such as DXF.
2) TCollection_ExtendedString. In ConvertToUnicode2B and ConvertToUnicode3B, the code is tied to little-endian byte order. It will behave inconsistently on big-endian platforms.
3) The following files contain changes not relevant to this bug. Another bug needs to be created for them, and the version here should contain only relevant changes.
BinLDrivers_DocumentRetrievalDriver.cdl
BinLDrivers_DocumentRetrievalDriver.cxx
4) Why has the commented-out routine DeleteDirectory in OSD_WNT.cxx appeared again? Please remove it.
5) The same applies to the function SetDeleteDirectoryProc.
6) Remove the declarations of the following functions in OSD_WNT_1.hxx:
DeleteDirectory
SetDeleteDirectoryProc
DirWalk
MsgBox
WNT_InitTimer
WNT_StatTimer
_debug_break
7) The changes concerning "locale" in PCDM_RetrievalDriver.cxx are incorrect (please revert to the older version).

Concerning the general remark of KGV, I agree. We need to state it clearly in the release notes.
(0032000)
git (administrator)
2014-09-23 14:05

Branch CR22484 has been updated forcibly by pdn.

SHA-1: a5aaaf3a2c80ef36b4d33c9e070213fd118d058b
(0032076)
abv (manager)
2014-09-24 12:21

Minor remarks:
- in Draw_Interpretor.cxx, local class TclUTFToLocalStringSentry becomes unused, please remove it
- change in PCDM.cdl is wrong, please revert (these two enums have been removed recently, see 0024180)
(0032078)
abv (manager)
2014-09-24 12:28

I have added a description of the patch, prepared by msv some time ago (edited), to the Additional Information field.
(0032079)
git (administrator)
2014-09-24 12:39

Branch CR22484 has been updated forcibly by pdn.

SHA-1: 5a4311328b5af737e0402999152f948a902ad8ce
(0032090)
abv (manager)
2014-09-24 14:35

No remarks, please check building on Linux, Windows 64-bit, then test
(0032129)
apv (tester)
2014-09-25 12:38

Dear BugMaster,

Branch CR22484 was compiled on Linux and the following compilation errors were detected:
http://jenkins-test-02.nnov.opencascade.com/user/mnt/my-views/view/CR22484/job/mnt-CR22484-master_build_occt_linux/1/parsed_console/

1. ../../../../src/FSD/FSD_BinaryFile.cxx:84:8: error: macro names must be identifiers

2. ../../../../src/FSD/FSD_File.cxx:76:8: error: macro names must be identifiers

Branch CR22484 was compiled on Windows and the following compilation errors were detected:
http://jenkins-test-02.nnov.opencascade.com/user/mnt/my-views/view/CR22484/job/mnt-CR22484-master_build_occt_windows/1/parsed_console/

1. ..\..\..\src\FSD\FSD_BinaryFile.cxx(84): fatal error C1016: #if[n]def expected an identifier

2. ..\..\..\src\FSD\FSD_File.cxx(76): fatal error C1016: #if[n]def expected an identifier

3. ..\..\..\src\Message\Message_MsgFile.cxx(217): fatal error C1016: #if[n]def expected an identifier

4. ..\..\..\src\LDOM\LDOMParser.cxx(138): fatal error C1016: #if[n]def expected an identifier

5. ..\..\..\src\BinLDrivers\BinLDrivers_DocumentRetrievalDriver.cxx(182): fatal error C1016: #if[n]def expected an identifier

6. ..\..\..\src\StepFile\stepread.c(86): fatal error C1016: #if[n]def expected an identifier
(0032144)
git (administrator)
2014-09-25 14:29

Branch CR22484 has been updated by pdn.

SHA-1: e3249fb86154833d355b08ea31f2e0ff9d1c5e30


Detailed log of new commits:

Author: pdn
Date: Thu Sep 25 14:29:15 2014 +0400

    Fix for compilation errors and fix for StepFile (avoid objects in pure c code)

(0032145)
pdn (developer)
2014-09-25 14:29

Fixed
(0032224)
apv (tester)
2014-09-26 13:57
edited on: 2014-09-26 13:57

Dear BugMaster,

Branch CR22484 (and products from GIT master) was compiled on Linux, MacOS and Windows platforms and tested.
SHA-1: e3249fb86154833d355b08ea31f2e0ff9d1c5e30

Number of compiler warnings:
occt component:
   Linux: 15 (15 on master)
   Windows: 0 (0 on master)
   MacOS: 193 (193 on master)
products component :
   Linux: 11 (11 on master)
   Windows: 1 (1 on master)

Regressions/Differences:
http://occt-tests/CR22484-master-occt/Windows-32-VC10/summary.html
bugs caf(015) bug170_3

Testing cases:
Not done

Testing on Linux:
Total MEMORY difference: 356246768 / 355468788
Total CPU difference: 45192.93000000016 / 44818.88000000009

Testing on Windows:
Total MEMORY difference: 245490128 / 242252872
Total CPU difference: 34173.109375 / 34303.3125

There are differences in images found by testdiff:
http://occt-tests/CR22484-master-occt/Debian60-64/diff-Debian60-64.html
http://occt-tests/CR22484-master-occt/Windows-32-VC10/diff-Windows-32-VC10.html
Pay attention to: bugs vis bug22796_2

(0032454)
git (administrator)
2014-09-30 13:23

Branch CR22484 has been updated by pdn.

SHA-1: 57a388503997f41b94212f37673af17d427bdf83


Detailed log of new commits:

Author: pdn
Date: Tue Sep 30 13:23:30 2014 +0400

    Fixes for set unicode symbols to OCAF and visualization

(0032455)
pdn (developer)
2014-09-30 13:24

Fixed, please review and retest
(0032492)
abv (manager)
2014-10-01 09:59

No remarks, please test
(0032516)
git (administrator)
2014-10-01 16:29

Branch CR22484 has been updated forcibly by apv.

SHA-1: c59407add5ab4f2225ad5fe740997438e9a7c628
(0032573)
apv (tester)
2014-10-02 14:07

Dear BugMaster,

Branch CR22484 (and products from GIT master) was compiled on Linux, MacOS and Windows platforms and tested.
SHA-1: c59407add5ab4f2225ad5fe740997438e9a7c628

Number of compiler warnings:
occt component:
   Linux: 15 (15 on master)
   Windows: 0 (0 on master)
   MacOS: 196 (196 on master)
products component :
   Linux: 11 (11 on master)
   Windows: 3 (3 on master)

Regressions/Differences:
Not detected

Testing cases:
Will be created

Testing on Linux:
Total MEMORY difference: 398316920 / 397657468
Total CPU difference: 47295.91000000001 / 46596.08000000006

Testing on Windows:
Total MEMORY difference: 279461580 / 279221480
Total CPU difference: 38884.5 / 39387.71875

There are differences in images found by testdiff:
http://occt-tests/CR22484-master-occt/Debian60-64/diff-Debian60-64.html
http://occt-tests/CR22484-master-occt/Windows-32-VC10/diff-Windows-32-VC10.html
Pay attention to: bugs vis bug22149
(0032578)
git (administrator)
2014-10-02 14:16

Branch CR22484 has been updated by abv.

SHA-1: a27569a567728ab3f99677ecb9a1656d02a22d5c


Detailed log of new commits:

Author: abv
Date: Thu Oct 2 14:15:14 2014 +0400

    Definition of Unicode symbol in test corrected

(0032579)
abv (manager)
2014-10-02 14:18

I have corrected test bugs vis bug22149, please check the image
(0032581)
apv (tester)
2014-10-02 14:50

Dear BugMaster,

The updated test case bugs vis bug22149 has been relaunched on Linux and Windows platforms. Results on both platforms are OK.

http://occt-tests/CR22484-master-occt/Debian60-64/bugs/vis/bug22149.html
http://occt-tests/CR22484-master-occt/Windows-32-VC10/bugs/vis/bug22149.html
(0033047)
shoogen (reporter)
2014-10-13 23:15

I have trouble understanding where I need to change the string handling in my program. For example, BRepTools::Write still calls ofstream.open with a Standard_CString instead of a wide string.
(0033048)
abv (manager)
2014-10-13 23:25

It seems the solution currently implemented covers only STEP, OCCT binary persistence, and message resources. Reading and writing BREP and IGES seems to still not support Unicode on Windows. Pavel, can you confirm this?
(0033058)
pdn (developer)
2014-10-14 11:52

Yes, you are right. An additional bug will be created to update IGES and BRep.
(0033446)
git (administrator)
2014-10-21 16:43

Branch CR22484 has been deleted by inv.

SHA-1: a27569a567728ab3f99677ecb9a1656d02a22d5c
(0034088)
pdn (developer)
2014-11-06 11:59

Please use the following lines to make a test:

box b 10 10 10
set s [encoding convertfrom unicode \x1F\x04\x40\x04\x35\x04\x32\x04\x35\x04\x34\x04]
stepwrite a b $s.stp
# file should be created
stepread $s.stp a *
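
To see which file name the script above produces, the byte sequence can be decoded by hand. The sketch below assumes that the Tcl "unicode" encoding reads the bytes as little-endian 16-bit code units, as it would on x86 hosts; Utf16LeBytesToUtf8 and THE_NAME_BYTES are hypothetical names, and only BMP characters are handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts little-endian 16-bit code units to UTF-8.
// Only BMP characters are handled; surrogate pairs are out of scope here.
std::string Utf16LeBytesToUtf8 (const unsigned char* theBytes, std::size_t theNbBytes)
{
  std::string aResult;
  for (std::size_t i = 0; i + 1 < theNbBytes; i += 2)
  {
    const unsigned int aCode = theBytes[i] | (theBytes[i + 1] << 8);
    if (aCode < 0x80)
    {
      aResult.push_back (static_cast<char> (aCode));
    }
    else if (aCode < 0x800)
    {
      aResult.push_back (static_cast<char> (0xC0 | (aCode >> 6)));
      aResult.push_back (static_cast<char> (0x80 | (aCode & 0x3F)));
    }
    else
    {
      aResult.push_back (static_cast<char> (0xE0 | (aCode >> 12)));
      aResult.push_back (static_cast<char> (0x80 | ((aCode >> 6) & 0x3F)));
      aResult.push_back (static_cast<char> (0x80 | (aCode & 0x3F)));
    }
  }
  return aResult;
}

// The byte sequence passed to "encoding convertfrom" in the test script.
const unsigned char THE_NAME_BYTES[] = {0x1F, 0x04, 0x40, 0x04, 0x35, 0x04,
                                        0x32, 0x04, 0x35, 0x04, 0x34, 0x04};
```

Under that assumption the six code units are U+041F, U+0440, U+0435, U+0432, U+0435 and U+0434, which spell the Cyrillic word "Превед", so the test writes and reads back a file named Превед.stp.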

Related Changesets
occt: master d9ff84e8
Timestamp: 2014-10-02 11:39:25
Author: pdn
Committer: bugmaster
0022484: UNICODE characters support

Initial UNICODE (UTF-8) characters support for OCCT file operations

Fix for compilation errors and fix for StepFile (avoid objects in pure c code)

Fixes for set unicode symbols to OCAF and visualization
mod - src/BinLDrivers/BinLDrivers_DocumentRetrievalDriver.cdl
mod - src/BinLDrivers/BinLDrivers_DocumentRetrievalDriver.cxx
mod - src/BinLDrivers/BinLDrivers_DocumentStorageDriver.cxx
mod - src/DDataStd/DDataStd_BasicCommands.cxx
mod - src/Draw/Draw_Interpretor.cxx
mod - src/FSD/FSD_BinaryFile.cxx
mod - src/FSD/FSD_CmpFile.cxx
mod - src/FSD/FSD_File.cxx
mod - src/LDOM/LDOMParser.cxx
mod - src/Message/Message_Msg.cxx
mod - src/Message/Message_MsgFile.cxx
mod - src/OSD/OSD_Directory.cxx
mod - src/OSD/OSD_DirectoryIterator.cxx
mod - src/OSD/OSD_Error.cxx
mod - src/OSD/OSD_File.cxx
mod - src/OSD/OSD_FileIterator.cxx
mod - src/OSD/OSD_FileNode.cxx
mod - src/OSD/OSD_Path.cxx
mod - src/OSD/OSD_Process.cxx
mod - src/OSD/OSD_signal_WNT.cxx
mod - src/OSD/OSD_WNT.cxx
mod - src/OSD/OSD_WNT_1.hxx
mod - src/PCDM/PCDM_ReadWriter.cxx
mod - src/PCDM/PCDM_ReadWriter_1.cxx
mod - src/PCDM/PCDM_RetrievalDriver.cxx
mod - src/StepFile/StepFile_Read.cxx
mod - src/StepFile/stepread.c
mod - src/TCollection/TCollection_AsciiString.cdl
mod - src/TCollection/TCollection_AsciiString.cxx
mod - src/TCollection/TCollection_ExtendedString.cdl
mod - src/TCollection/TCollection_ExtendedString.cxx
mod - src/UTL/UTL.cxx
mod - src/ViewerTest/ViewerTest_ObjectCommands.cxx
mod - tests/bugs/vis/bug22796_2

Issue History
Date Modified Username Field Change
2011-08-02 11:23 bugmaster Category OCCT:FDC => OCCT:Foundation Classes
2011-08-03 12:17 bugmaster Fixed in Version EMPTY =>
2011-08-03 12:17 bugmaster Target Version => 6.5.2
2011-08-03 12:17 bugmaster Description Updated
2011-09-21 12:42 bugmaster Target Version 6.5.2 => 6.5.3
2011-12-05 10:43 abv Relationship added child of 0014673
2012-02-02 10:16 abv Target Version 6.5.3 => 6.5.4
2012-10-21 11:29 abv Target Version 6.5.4 => 6.6.0
2013-02-28 17:05 abv Target Version 6.6.0 => 6.7.0
2013-06-07 20:25 pdn Assigned To bugmaster => pdn
2013-06-07 20:25 pdn Status new => assigned
2013-06-09 20:54 san Relationship added related to 0023457
2013-06-09 20:59 san Relationship deleted related to 0023457
2013-06-10 11:33 pdn Note Added: 0024708
2013-06-11 12:19 pdn Note Added: 0024728
2013-06-11 12:19 pdn Assigned To pdn => msv
2013-06-11 12:19 pdn Status assigned => resolved
2013-06-13 10:01 kgv Note Added: 0024739
2013-06-13 12:53 msv Note Added: 0024744
2013-06-13 12:53 msv Assigned To msv => pdn
2013-06-13 12:53 msv Status resolved => assigned
2013-11-06 15:10 kgv Relationship added related to 0022125
2013-11-06 15:12 kgv Target Version 6.7.0 => 6.7.1
2014-02-13 11:08 kgv Relationship added related to 0024622
2014-04-04 18:16 abv Target Version 6.7.1 => 6.8.0
2014-07-12 17:37 kgv Relationship added related to 0024943
2014-09-10 08:19 kgv Target Version 6.8.0 => 7.0.0
2014-09-23 14:05 git Note Added: 0032000
2014-09-23 14:06 pdn Target Version 7.0.0 => 6.8.0
2014-09-23 14:06 pdn Assigned To pdn => abv
2014-09-23 14:06 pdn Status assigned => resolved
2014-09-24 12:21 abv Note Added: 0032076
2014-09-24 12:21 abv Assigned To abv => pdn
2014-09-24 12:21 abv Status resolved => assigned
2014-09-24 12:28 abv Note Added: 0032078
2014-09-24 12:28 abv Additional Information Updated
2014-09-24 12:39 git Note Added: 0032079
2014-09-24 12:40 pdn Assigned To pdn => abv
2014-09-24 12:40 pdn Status assigned => resolved
2014-09-24 14:35 abv Note Added: 0032090
2014-09-24 14:35 abv Assigned To abv => bugmaster
2014-09-24 14:35 abv Status resolved => reviewed
2014-09-24 14:49 apv Assigned To bugmaster => apv
2014-09-24 15:01 kgv Relationship added related to 0024716
2014-09-25 12:38 apv Note Added: 0032129
2014-09-25 12:39 apv Assigned To apv => pdn
2014-09-25 12:39 apv Status reviewed => assigned
2014-09-25 14:29 git Note Added: 0032144
2014-09-25 14:29 pdn Note Added: 0032145
2014-09-25 14:29 pdn Assigned To pdn => abv
2014-09-25 14:29 pdn Status assigned => resolved
2014-09-25 14:30 pdn Status resolved => reviewed
2014-09-25 14:31 apv Assigned To abv => apv
2014-09-26 13:57 apv Note Added: 0032224
2014-09-26 13:57 apv Assigned To apv => pdn
2014-09-26 13:57 apv Status reviewed => assigned
2014-09-26 13:57 apv Note Edited: 0032224
2014-09-30 13:23 git Note Added: 0032454
2014-09-30 13:24 pdn Note Added: 0032455
2014-09-30 13:24 pdn Assigned To pdn => abv
2014-09-30 13:24 pdn Status assigned => resolved
2014-10-01 09:59 abv Note Added: 0032492
2014-10-01 09:59 abv Assigned To abv => bugmaster
2014-10-01 09:59 abv Status resolved => reviewed
2014-10-01 11:28 pdn Relationship added related to 0025302
2014-10-01 13:07 apv Assigned To bugmaster => apv
2014-10-01 13:12 kgv Relationship added related to 0025308
2014-10-01 16:29 git Note Added: 0032516
2014-10-02 14:04 apv Test case number => Will be created
2014-10-02 14:07 apv Note Added: 0032573
2014-10-02 14:07 apv Assigned To apv => pdn
2014-10-02 14:07 apv Status reviewed => assigned
2014-10-02 14:16 git Note Added: 0032578
2014-10-02 14:18 abv Note Added: 0032579
2014-10-02 14:18 abv Assigned To pdn => apv
2014-10-02 14:18 abv Status assigned => feedback
2014-10-02 14:50 apv Note Added: 0032581
2014-10-02 14:50 apv Assigned To apv => bugmaster
2014-10-02 14:50 apv Status feedback => tested
2014-10-03 14:07 bugmaster Changeset attached => occt master d9ff84e8
2014-10-03 14:07 bugmaster Status tested => verified
2014-10-03 14:07 bugmaster Resolution open => fixed
2014-10-13 23:15 shoogen Note Added: 0033047
2014-10-13 23:25 abv Note Added: 0033048
2014-10-14 11:52 pdn Note Added: 0033058
2014-10-14 16:44 kgv Relationship added parent of 0025369
2014-10-14 16:49 kgv Relationship added parent of 0025367
2014-10-21 16:43 git Note Added: 0033446
2014-11-06 11:59 pdn Note Added: 0034088
2014-11-11 12:42 aiv Fixed in Version => 6.8.0
2014-11-11 13:03 aiv Status verified => closed
2015-06-29 20:14 kgv Relationship added parent of 0026380
2015-08-03 20:55 kgv Relationship added related to 0026514
2016-02-22 22:09 kgv Relationship added parent of 0027198
2016-07-13 22:44 kgv Relationship added parent of 0027675
2016-07-14 10:44 kgv Relationship added parent of 0027676
2016-09-04 14:34 kgv Relationship added parent of 0027838
2016-09-19 11:48 kgv Relationship added parent of 0027880
2016-11-02 12:57 kgv Relationship added child of 0025534
2016-11-02 12:58 kgv Relationship replaced parent of 0025534
2016-11-02 12:58 kgv Relationship added parent of 0028040
2016-11-16 10:27 kgv Relationship added parent of 0028110
2016-11-29 07:55 kgv Relationship added related to 0028172
2017-01-13 16:10 kgv Relationship added parent of 0028353
2017-02-13 17:15 kgv Relationship added parent of 0028454
2018-01-29 11:18 kgv Relationship added parent of 0029069
2019-03-02 22:55 kgv Relationship added parent of 0030529

