View Issue Details

ID: 0029081
Project: Community
Category: OCCT:Foundation Classes
View Status: public
Last Update: 2019-07-17 12:38
Reporter: BenjaminBihler
Assigned To: abv
Priority: normal
Severity: minor
Status: closed
Resolution: fixed
Platform: MinGW-w64
OS: Windows
Product Version: 7.2.0
Target Version: 7.3.0
Fixed in Version: 7.3.0
Summary: 0029081: Foundation Classes, OSD_OpenStream - handle UNICODE file paths specifically in case of Mingw-w64
Description: Mingw-w64 does not provide the non-standard Microsoft extension to open ifstreams and ofstreams using wchar_t* file names.

To be able to open the respective streams without the Microsoft extension one would have to subclass std::ifstream and std::ofstream and add an open method to the child classes that accepts wchar_t* file names.

This is exactly what has been done by the Boost Filesystem library. If Boost were added as an (optional) dependency to Open CASCADE, Unicode paths would also work with Mingw-w64.
Steps To Reproduce: Not required
Tags: No tags attached.
Test case number: bugs fclasses bug22125, bugs fclasses bug25367_igs

Relationships

related to 0027585 [closed, apn] Community: It is not possible to store OCAF documents to paths with special characters in their names
related to 0022125 [closed, bugmaster] Open CASCADE: TCollection_ExtendedString: conversion from UTF-8 to unicode
related to 0025367 [closed, bugmaster] Open CASCADE: IGES and BRep persistence - support unicode file names on Windows
parent of 0030403 [closed, bugmaster] Community: Foundation Classes - Overwriting Big "BinOcaf" Files Does Not Reduce Their Size

Activities

git

2017-09-06 11:20

administrator   ~0070263

Branch CR29081 has been created by Zia ul Azam.

SHA-1: 989ba7aa828920acf73cabc4981193d2037bbcbf


Detailed log of new commits:

Author: Zia ul Azam
Date: Wed Sep 6 10:19:37 2017 +0200

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Boost file streams are used on Windows if boost is present to enable
    handling unicode paths also with Mingw-w64.

Author: Zia ul Azam
Date: Wed Sep 6 10:14:53 2017 +0200

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Added boost as an optional external library.

git

2017-09-06 17:10

administrator   ~0070281

Branch CR29081 has been updated by Zia ul Azam.

SHA-1: 0e2fade9099d4302f1208b4d7fa2d3ab888da539


Detailed log of new commits:

Author: Zia ul Azam
Date: Wed Sep 6 16:09:32 2017 +0200

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Added documentation of cmake flags used for using and installing boost.

git

2017-09-06 17:15

administrator   ~0070282

Branch CR29081 has been updated by BenjaminBihler.

SHA-1: deb861a0d8033c44fbbe4ca9dda70db60359c94a


Detailed log of new commits:

Author: Benjamin Bihler
Date: Wed Sep 6 16:14:47 2017 +0200

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Corrected boost header file path.

git

2017-09-06 17:20

administrator   ~0070283

Branch CR29081 has been updated by BenjaminBihler.

SHA-1: 6f965ae8ee9e552f8f4badccb135cbdc0eced6af


Detailed log of new commits:

Author: Benjamin Bihler
Date: Wed Sep 6 16:20:38 2017 +0200

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Harmonized capitalization.

abv

2017-10-04 22:06

manager   ~0071212

Last edited: 2017-10-05 06:41

Hello Benjamin and Zia,

Sorry for the delay in replying to this issue.

I have tried to apply these changes and, alas, I cannot make them work.

What I have is:
- Windows 10 Pro 64-bit workstation
- MinGW-W64-builds-4.3.3 downloaded from https://netcologne.dl.sourceforge.net/project/mingw-w64/Toolchains%20targetting%20Win64/Personal%20Builds/mingw-builds/7.1.0/threads-win32/sjlj/x86_64-7.1.0-release-win32-sjlj-rt_v5-rev2.7z
- Boost 1.65.1 downloaded from http://www.boost.org/users/download/ and built with MinGW (target "gcc") using instructions found at http://www.boost.org/doc/libs/1_65_1/more/getting_started/windows.html#or-build-binaries-from-source
- OCCT current master + fix (branch CR29081_1) built using CMake with generator "MinGW Makefiles"

The problem is that Boost streams do not work correctly under MinGW with either char* (UTF-8) or wchar_t* (UTF-16) names. In the end, I reduced the problem to this code:

~~~~~
  // Variant 1: narrow file name given as UTF-8 bytes
  boost::filesystem::ofstream test1("d:\\\xE6\x9C\x89\xE7\x94\xA8.var1", std::ios::out);
  if (!test1.rdbuf()->is_open()) { std::cout << "Variant 1 failed" << std::endl; }

  // Variant 2: wide file name given as UTF-16 code units
  boost::filesystem::ofstream test2(L"d:\\\x6709\x7528.var2", std::ios::out);
  if (!test2.rdbuf()->is_open()) { std::cout << "Variant 2 failed" << std::endl; }
~~~~~

Both file names represent the text "it works" translated to Traditional Chinese using Google Translate (two characters: 有用), the first one encoded in UTF-8 and the second in UTF-16.
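
As a cross-check of the two spellings, the UTF-8 bytes used in variant 1 can be decoded by hand and compared with the UTF-16 code units used in variant 2. This is only a minimal sketch: `decodeUtf8` is a hypothetical helper written for this note (it is not part of OCCT or Boost), and it assumes well-formed input without validating it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode a well-formed UTF-8 string into code points.
// Sketch only: malformed or overlong sequences are not detected.
std::vector<unsigned long> decodeUtf8(const std::string& theBytes)
{
  std::vector<unsigned long> aPoints;
  for (std::size_t i = 0; i < theBytes.size();)
  {
    const unsigned char aLead = static_cast<unsigned char>(theBytes[i]);
    unsigned long aCode = 0;
    std::size_t aLen = 1;
    if (aLead < 0x80)                { aCode = aLead; }                   // 1 byte
    else if ((aLead & 0xE0) == 0xC0) { aCode = aLead & 0x1F; aLen = 2; }  // 2 bytes
    else if ((aLead & 0xF0) == 0xE0) { aCode = aLead & 0x0F; aLen = 3; }  // 3 bytes
    else                             { aCode = aLead & 0x07; aLen = 4; }  // 4 bytes
    assert(i + aLen <= theBytes.size()); // precondition: input is not truncated
    for (std::size_t k = 1; k < aLen; ++k)
    {
      // each continuation byte contributes its low 6 bits
      aCode = (aCode << 6) | (static_cast<unsigned char>(theBytes[i + k]) & 0x3F);
    }
    aPoints.push_back(aCode);
    i += aLen;
  }
  return aPoints;
}
```

Decoding the six bytes E6 9C 89 E7 94 A8 gives the two code points U+6709 and U+7528, which are exactly the wide characters written in variant 2.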

When executed, file opening fails in variant 2. Variant 1 creates a file, but under a wrong name: the UTF-8 string is apparently interpreted as if it were encoded in the current locale.

If I set Boost locale to UTF-8, then variant 2 works just like variant 1.
To set the locale, I tried two (seemingly equivalent) approaches found in:
http://www.boost.org/doc/libs/1_62_0/libs/locale/doc/html/default_encoding_under_windows.html
https://svn.boost.org/trac10/ticket/9968

Note that the same code compiled with MSVC 10 using standard streams (i.e. replacing "boost::filesystem" with "std") produces a file with the same wrong name as MinGW in variant 1 (as expected), but produces a file with the expected correct name "有用.var2" in variant 2.

Thus, from my testing, Boost built for MinGW does not seem to support Unicode path names.

If it works for you, this must be due either to a different build of Boost or to some tricks needed to get it working that I do not know. Please share your knowledge on this.

git

2017-10-04 22:11

administrator   ~0071214

Branch CR29081_1 has been created by abv.

SHA-1: 8acc16a53934d6b96850968915d710314261a3d8


Detailed log of new commits:

Author: Zia ul Azam
Date: Wed Sep 6 11:14:53 2017 +0300

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Added boost as an optional external library.
    Added documentation of cmake flags used for using and installing boost.
    
    Boost file streams are used on Windows if boost is present to enable handling unicode paths also with Mingw-w64.

abv

2017-10-04 22:16

manager   ~0071216

Branch CR29081_1 is the same as CR29081, rebased on current master and with all commits squashed (plus some corrections for building DRAW).

Note that once the Boost headers and the relevant logic are present in OSD_OpenFile.hxx, there is no need to replicate them wherever streams are used. Instead, we can define typedefs in this header for the different kinds of streams used (e.g. Standard_OFStream for either std::ofstream or boost::filesystem::ofstream) and use them throughout the code. In this case OSD_OpenFile.hxx will be the single place where Boost is mentioned.
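
A rough sketch of what such typedefs could look like is given below. Only the name Standard_OFStream comes from the note above; the macro `HYPOTHETICAL_USE_BOOST_STREAMS`, the typedef `Standard_IFStream` and the helper `writeLine` are assumptions made up for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <fstream>

// HYPOTHETICAL_USE_BOOST_STREAMS is an illustrative macro, not an actual
// OCCT or CMake flag; it stands for "Boost was found at configuration time".
#if defined(_WIN32) && defined(HYPOTHETICAL_USE_BOOST_STREAMS)
  #include <boost/filesystem/fstream.hpp>
  typedef boost::filesystem::ifstream Standard_IFStream;
  typedef boost::filesystem::ofstream Standard_OFStream;
#else
  typedef std::ifstream Standard_IFStream;
  typedef std::ofstream Standard_OFStream;
#endif

// Client code names only the typedef, so it never mentions Boost itself.
inline bool writeLine(const char* thePath, const char* theText)
{
  assert(thePath != 0 && theText != 0); // precondition: valid C strings
  Standard_OFStream aStream(thePath);
  if (!aStream.is_open())
  {
    return false;
  }
  aStream << theText << '\n';
  aStream.flush();
  return aStream.good();
}
```

With this layout, switching the stream implementation would be a matter of changing the one header rather than every place where a file stream is created.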

BenjaminBihler

2017-10-05 12:54

developer   ~0071230

Hello Andrey,

unfortunately you are right. :-((( I have found this issue: https://svn.boost.org/trac10/ticket/5769 which seems to explain everything. Our solution seems to have worked for us since we have used special characters that are part of our current Windows code page. So the patch is partially beneficial (without Boost it was not even possible to read/write such files), but even with Boost not all unicode paths work with MinGW (they do work with Boost and MSVC).

How to continue? The current patch is still useful for us, but it is not what it has promised to be.

Benjamin

kgv

2017-10-05 13:04

developer   ~0071231

Last edited: 2017-10-05 13:06

> Our solution seems to have worked for us since we have used special characters
> that are part of our current Windows code page.
> So the patch is partially beneficial
> (without Boost it was not even possible to read/write such files),
Working with the active code page does not require Boost: the same could be achieved by patching OSD_OpenFile to convert UTF-8 to the active code page specifically for MinGW
(I think I have written about this workaround somewhere else, but maybe not).
For opening existing files, one may also use the workaround of passing the short DOS file path instead of the full path (it can also be obtained using WinAPI functions). Short file names are still generated for NTFS even on modern Windows systems, unless this is disabled in system settings.
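
The code page workaround could be sketched roughly as follows. This is an assumption about how such a conversion might be written with the WinAPI calls MultiByteToWideChar and WideCharToMultiByte, not code from any OCCT branch; `utf8ToActiveCodePage` is a hypothetical name, and on non-Windows platforms the sketch simply returns its input.

```cpp
#include <cassert>
#include <string>
#ifdef _WIN32
  #include <windows.h> // MultiByteToWideChar, WideCharToMultiByte
#endif

// Hypothetical helper: convert a UTF-8 path to the active ANSI code page.
// Characters that the code page cannot represent are lost in the process,
// which is why this only helps for names within the current code page.
std::string utf8ToActiveCodePage(const std::string& theUtf8)
{
#ifdef _WIN32
  if (theUtf8.empty())
  {
    return std::string();
  }
  // Step 1: UTF-8 -> UTF-16 (length includes the terminating NUL)
  const int aWideLen = MultiByteToWideChar(CP_UTF8, 0, theUtf8.c_str(), -1, NULL, 0);
  if (aWideLen <= 0)
  {
    return std::string();
  }
  std::wstring aWide(aWideLen, L'\0');
  MultiByteToWideChar(CP_UTF8, 0, theUtf8.c_str(), -1, &aWide[0], aWideLen);

  // Step 2: UTF-16 -> active code page
  const int aLen = WideCharToMultiByte(CP_ACP, 0, aWide.c_str(), -1, NULL, 0, NULL, NULL);
  if (aLen <= 0)
  {
    return std::string();
  }
  std::string aResult(aLen, '\0');
  const int aWritten = WideCharToMultiByte(CP_ACP, 0, aWide.c_str(), -1, &aResult[0], aLen, NULL, NULL);
  assert(aWritten == aLen); // sanity check: second call fills what the first one measured
  (void)aWritten;
  aResult.resize(aLen - 1); // drop the terminating NUL counted by the API
  return aResult;
#else
  return theUtf8; // elsewhere narrow paths are passed through unchanged
#endif
}
```

The result could then be passed to the narrow-character open() of a standard stream; names containing characters outside the active code page would still fail.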

BenjaminBihler

2017-10-05 15:16

developer   ~0071243

We will look at two other libraries that might offer a solution: https://pocoproject.org/ (Boost Software License) and https://www.guelkerdev.de/projects/pathie/ (GPL).

Or are you already sure that you wouldn't accept another optional library dependency even if it were a real solution for the Unicode path problem?

git

2017-10-05 18:27

administrator   ~0071246

Branch CR29081_2 has been created by kgv.

SHA-1: e53996ab1c6ca311ca0eb21096964b29f852ed26


Detailed log of new commits:

Author: kgv
Date: Wed Sep 6 11:14:53 2017 +0300

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    Define opencascade::std::fstream implementing wchar_t paths support
    on Mingw using __gnu_cxx::stdio_filebuf.

git

2017-10-05 20:03

administrator   ~0071247

Branch CR29081_3 has been created by kgv.

SHA-1: 788adcd33946423e199f28ba92bfb24b7146f6f7


Detailed log of new commits:

Author: kgv
Date: Wed Sep 6 11:14:53 2017 +0300

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    OSD_OpenStream now uses __gnu_cxx::stdio_filebuf extension
    for opening UNICODE files on MinGW when using C++ file streams.
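
    For readers unfamiliar with the extension, the general idea can be sketched as below. This is an illustration only and not the code of this branch: `openUtf8ForReading` is a hypothetical helper, the use of _wopen and of filebuf swap() is an assumption, and the non-Windows branch exists only so that the sketch builds with libstdc++ elsewhere. The stream is assumed not to be open yet.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <fcntl.h>               // open() on POSIX, _O_* flags on Windows
#if defined(_WIN32)
  #include <windows.h>           // MultiByteToWideChar
  #include <io.h>                // _wopen
#endif
#if defined(__GLIBCXX__)
  #include <ext/stdio_filebuf.h> // libstdc++ extension, not standard C++
#endif

// Hypothetical helper (not the OCCT function): open a file whose name is
// given in UTF-8 for reading. Returns true if the stream ends up open.
bool openUtf8ForReading(std::ifstream& theStream, const char* theUtf8Path)
{
  assert(theUtf8Path != 0); // precondition: a valid C string
  const std::ios_base::openmode aMode = std::ios_base::in | std::ios_base::binary;
#if defined(_WIN32)
  // Windows file APIs take UTF-16, so convert the name first.
  const int aLen = MultiByteToWideChar(CP_UTF8, 0, theUtf8Path, -1, NULL, 0);
  if (aLen <= 0)
  {
    theStream.setstate(std::ios_base::failbit);
    return false;
  }
  std::wstring aWide(aLen, L'\0');
  MultiByteToWideChar(CP_UTF8, 0, theUtf8Path, -1, &aWide[0], aLen);
#endif

#if defined(__GLIBCXX__)
  // libstdc++ has no open(const wchar_t*): obtain a descriptor by other
  // means and wrap it into a file buffer.
  #if defined(_WIN32)
  const int aFd = _wopen(aWide.c_str(), _O_RDONLY | _O_BINARY);
  #else
  const int aFd = open(theUtf8Path, O_RDONLY);
  #endif
  if (aFd < 0)
  {
    theStream.setstate(std::ios_base::failbit);
    return false;
  }
  __gnu_cxx::stdio_filebuf<char> aBuf(aFd, aMode);
  theStream.rdbuf()->swap(aBuf); // the stream's own buffer now owns the descriptor
  theStream.clear();
#elif defined(_WIN32)
  theStream.open(aWide.c_str(), aMode); // Microsoft extension taking wchar_t*
#else
  theStream.open(theUtf8Path, aMode);
#endif
  return theStream.is_open();
}
```

    A caller would create a default-constructed std::ifstream, pass it to the helper, and then read from it as usual; the descriptor is closed when the stream is destroyed.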

kgv

2017-10-05 20:04

developer   ~0071248

Could you please try patch in branch CR29081_3?

abv

2017-10-05 21:02

manager   ~0071249

For me, it works like a charm.

git

2017-10-06 08:12

administrator   ~0071250

Branch CR29081_3 has been updated forcibly by kgv.

SHA-1: 83b2925673dd26295234f4e3e86197bb66091077

BenjaminBihler

2017-10-06 11:13

developer   ~0071253

Kirill, you are brilliant! I cannot see any limitations, it's just great! :-)

git

2017-10-06 19:58

administrator   ~0071274

Branch CR29081_4 has been created by abv.

SHA-1: 3217b03a542e91d8fdbba357ae522bf9dac37911


Detailed log of new commits:

Author: kgv
Date: Wed Sep 6 11:14:53 2017 +0300

    0029081: With Mingw-w64 Unicode Paths Do Not Work
    
    OSD_OpenStream() now uses __gnu_cxx::stdio_filebuf extension for opening UNICODE files on MinGW when using C++ file streams.
    Variant accepting filebuf returns bool (true if succeeded and false otherwise).
    
    Checks of ofstream to be opened made via calls to low-level ofstream::rdbuf() are replaced by calls to ofstream::is_open(); state of the stream is also checked (to be good).
    Unicode name used for test file in test bugs fclasses bug22125 is described (for possibility to check it).
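
    The kind of check described above might look like the following sketch. `isUsable` is a hypothetical name chosen for illustration; the commit does not necessarily introduce such a helper.

```cpp
#include <cassert>
#include <fstream>

// Hypothetical helper: true only if the file is open and no error flag is set.
inline bool isUsable(std::ofstream& theStream)
{
  const bool isOpen = theStream.is_open();
  // ofstream::is_open() forwards to the buffer, so both spellings must agree;
  // the stream-level call is simply the less low-level one.
  assert(isOpen == theStream.rdbuf()->is_open());
  return isOpen && theStream.good();
}
```

    Checking good() in addition to is_open() also catches a stream whose file is open but which has already recorded an error.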

git

2017-10-06 20:49

administrator   ~0071275

Branch CR29081_4 has been updated forcibly by abv.

SHA-1: fc8918ad91d704b213a4c389e3640bc5626d98cf

abv

2017-10-07 07:45

manager   ~0071276

Reviewed with some amendments (for more consistent handling of possible errors) and tested; see Jenkins job CR29081-master-abv. Please consider branch CR29081_4 for integration.

bugmaster

2017-10-09 10:25

administrator   ~0071291

Combination -
OCCT branch : CR29081_4 SHA-1: fc8918ad91d704b213a4c389e3640bc5626d98cf
Products branch : master
was compiled on Linux, MacOS and Windows platforms and tested in optimized mode.

Number of compiler warnings:
No new/fixed warnings

Regressions/Differences/Improvements:
No regressions/differences

CPU differences:
No differences that require special attention

Image differences :
No differences that require special attention

Memory differences :
No differences that require special attention

git

2017-10-14 12:19

administrator   ~0071449

Branch CR29081 has been deleted by kgv.

SHA-1: 6f965ae8ee9e552f8f4badccb135cbdc0eced6af

git

2017-10-14 12:19

administrator   ~0071450

Branch CR29081_1 has been deleted by kgv.

SHA-1: 8acc16a53934d6b96850968915d710314261a3d8

git

2017-10-14 12:19

administrator   ~0071451

Branch CR29081_2 has been deleted by kgv.

SHA-1: e53996ab1c6ca311ca0eb21096964b29f852ed26

git

2017-10-14 12:19

administrator   ~0071452

Branch CR29081_3 has been deleted by kgv.

SHA-1: 83b2925673dd26295234f4e3e86197bb66091077

git

2017-10-14 12:19

administrator   ~0071453

Branch CR29081_4 has been deleted by kgv.

SHA-1: fc8918ad91d704b213a4c389e3640bc5626d98cf

Related Changesets

occt: master fc8918ad

2017-09-06 08:14:53

abv


Committer: abv
0029081: With Mingw-w64 Unicode Paths Do Not Work

OSD_OpenStream() now uses __gnu_cxx::stdio_filebuf extension for opening UNICODE files on MinGW when using C++ file streams.
Variant accepting filebuf returns bool (true if succeeded and false otherwise).

Checks of ofstream to be opened made via calls to low-level ofstream::rdbuf() are replaced by calls to ofstream::is_open(); state of the stream is also checked (to be good).
Unicode name used for test file in test bugs fclasses bug22125 is described (for possibility to check it).
Affected Issues
0029081
mod - src/BRepTools/BRepTools.cxx
mod - src/Draw/Draw_VariableCommands.cxx
mod - src/OSD/OSD_OpenFile.cxx
mod - src/OSD/OSD_OpenFile.hxx
mod - src/StepSelect/StepSelect_WorkLibrary.cxx
mod - tests/bugs/fclasses/bug22125

Issue History

Date Modified Username Field Change
2017-09-04 17:30 BenjaminBihler New Issue
2017-09-04 17:30 BenjaminBihler Assigned To => ziaulazam
2017-09-04 17:30 BenjaminBihler Relationship added related to 0027585
2017-09-06 11:20 git Note Added: 0070263
2017-09-06 17:10 git Note Added: 0070281
2017-09-06 17:15 git Note Added: 0070282
2017-09-06 17:20 git Note Added: 0070283
2017-09-06 17:21 BenjaminBihler Assigned To ziaulazam => mpv
2017-09-06 17:21 BenjaminBihler Status new => resolved
2017-09-06 17:21 BenjaminBihler Steps to Reproduce Updated
2017-09-07 11:44 mpv Assigned To mpv => abv
2017-10-01 19:21 abv Test case number => bugs fclasses bug22125
2017-10-01 19:32 abv Test case number bugs fclasses bug22125 => bugs fclasses bug22125, bugs fclasses bug25367_igs
2017-10-01 19:33 abv Relationship added related to 0022125
2017-10-01 19:33 abv Relationship added related to 0025367
2017-10-04 22:06 abv Note Added: 0071212
2017-10-04 22:07 abv Assigned To abv => BenjaminBihler
2017-10-04 22:07 abv Status resolved => assigned
2017-10-04 22:07 abv Status assigned => feedback
2017-10-04 22:11 git Note Added: 0071214
2017-10-04 22:16 abv Note Added: 0071216
2017-10-05 06:41 abv Note Edited: 0071212
2017-10-05 12:54 BenjaminBihler Note Added: 0071230
2017-10-05 12:55 BenjaminBihler Assigned To BenjaminBihler => abv
2017-10-05 13:04 kgv Note Added: 0071231
2017-10-05 13:06 kgv Note Edited: 0071231
2017-10-05 13:46 kgv Assigned To abv => BenjaminBihler
2017-10-05 15:16 BenjaminBihler Note Added: 0071243
2017-10-05 15:16 BenjaminBihler Assigned To BenjaminBihler => abv
2017-10-05 18:27 git Note Added: 0071246
2017-10-05 20:03 git Note Added: 0071247
2017-10-05 20:04 kgv Note Added: 0071248
2017-10-05 20:04 kgv Assigned To abv => BenjaminBihler
2017-10-05 21:02 abv Note Added: 0071249
2017-10-06 08:12 git Note Added: 0071250
2017-10-06 08:25 kgv Category OCCT:Application Framework => OCCT:Foundation Classes
2017-10-06 11:13 BenjaminBihler Note Added: 0071253
2017-10-06 11:14 BenjaminBihler Assigned To BenjaminBihler => abv
2017-10-06 11:14 BenjaminBihler Status feedback => resolved
2017-10-06 17:38 kgv Summary With Mingw-w64 Unicode Paths Do Not Work => Foundation Classes, OSD_OpenStream - handle UNICODE file paths specifically in case of Mingw-w64
2017-10-06 19:58 git Note Added: 0071274
2017-10-06 20:49 git Note Added: 0071275
2017-10-07 07:45 abv Note Added: 0071276
2017-10-07 07:45 abv Assigned To abv => bugmaster
2017-10-07 07:45 abv Status resolved => reviewed
2017-10-09 10:25 bugmaster Note Added: 0071291
2017-10-09 10:25 bugmaster Status reviewed => tested
2017-10-12 19:00 abv Changeset attached => occt master fc8918ad
2017-10-12 19:00 abv Assigned To bugmaster => abv
2017-10-12 19:00 abv Status tested => verified
2017-10-12 19:00 abv Resolution open => fixed
2017-10-14 12:19 git Note Added: 0071449
2017-10-14 12:19 git Note Added: 0071450
2017-10-14 12:19 git Note Added: 0071451
2017-10-14 12:19 git Note Added: 0071452
2017-10-14 12:19 git Note Added: 0071453
2018-02-20 12:59 aiv Target Version 7.4.0 => 7.3.0
2018-06-29 21:15 aiv Fixed in Version => 7.3.0
2018-06-29 21:19 aiv Status verified => closed
2019-07-17 11:32 BenjaminBihler Relationship added related to 0030403
2019-07-17 12:38 kgv Relationship replaced parent of 0030403