UNICODE
-------

Log4cplus uses the expression "UNICODE" in at least two distinct
meanings:

1. the Unicode standard as defined by the Unicode Consortium

2. the compiler's and/or C++ standard library's support for strings of
   wchar_ts and their manipulation

WCHAR_T SUPPORT
---------------

Log4cplus aims to be portable and to have as few third-party
dependencies as possible. To fulfill this goal, it has to use the
facilities offered by the operating systems and standard libraries it
runs on. To offer the best possible level of support for national
characters, it has to support usage of wchar_t and it has to use the
wchar_t support (especially on Windows) provided by the operating
system and the standard C and C++ libraries.

This approach to portability has some limitations. One of them is the
lack of support for C++ locales in various operating systems and
standard C++ libraries. Some standard C++ libraries do not support any
locales other than "C" and "POSIX". This usually means that
wchar_t <-> char conversion using the codecvt<> facet is impossible.
On such deficient platforms, log4cplus can use either standard C
locale support or iconv() (through libiconv or built-in).

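As an illustration of the standard C locale fallback mentioned above,
the following is a minimal sketch of a wchar_t -> char conversion
built on std::wcsrtombs(). It is not the code log4cplus itself uses;
the function name narrow_with_c_locale is hypothetical, and the result
depends on the C locale selected with std::setlocale().

```cpp
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical helper: convert a wide string to a narrow string using
// the current C locale (as selected by std::setlocale). Returns an
// empty string if some character cannot be represented in that locale.
std::string narrow_with_c_locale(std::wstring const & src)
{
    std::mbstate_t state = std::mbstate_t();
    wchar_t const * in = src.c_str();

    // First pass: ask how many bytes the converted string needs
    // (the terminating null byte is not counted).
    std::size_t const len = std::wcsrtombs(nullptr, &in, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return std::string();

    // Second pass: perform the actual conversion.
    std::vector<char> buf(len + 1);
    state = std::mbstate_t();
    in = src.c_str();
    std::wcsrtombs(buf.data(), &in, buf.size(), &state);
    return std::string(buf.data(), len);
}
```

Which non-ASCII characters survive this conversion is decided entirely
by the active C locale, which is why a UTF-8 based locale matters in
the sections that follow.
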
UNICODE AND FILE APPENDERS
--------------------------

Another limitation related to Unicode support is the inability to
write wchar_t messages to log files using FileAppender when they
contain national characters that do not map to any code point in a
single-byte code page. This is a problem mainly on Windows. Linux and
other *NIX systems can avoid it because they do not need to use
wchar_t interfaces to have Unicode-aware applications. They usually
(as of year 2012) use UTF-8 based locales. With proper C++ locale
setup in client applications, national characters can come through
into log files unharmed. But if applications on these systems choose
to use wchar_t strings, they face the problem as well.

*NIX
----

To support output of non-ASCII characters in wchar_t messages on *NIX
platforms, it is necessary to use a UTF-8 based locale (e.g.,
en_US.UTF-8) and to set up the global locale with a std::codecvt facet
or to imbue individual FileAppenders with that facet. The following
code can be used to get such a std::locale instance and to install it
as the global locale:

    std::locale::global (     // set global locale
        std::locale (         // using std::locale constructed from
            std::locale (),   // global locale
                              // and codecvt facet from user locale
            new std::codecvt_byname<wchar_t, char, std::mbstate_t>("")));

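The alternative of imbuing an individual output instead of changing
the global locale can be sketched as follows. A std::wofstream stands
in for a FileAppender here, so this only illustrates the standard
library mechanism, not the log4cplus API. The helper names
make_user_codecvt_locale and write_wide_line are hypothetical, and the
fallback to the unmodified global locale is an assumption made to keep
the sketch usable when the environment names a locale that is not
installed.

```cpp
#include <fstream>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: build a locale whose wchar_t <-> char conversion
// follows the user's environment (LANG, LC_ALL, ...). Falls back to a
// copy of the current global locale if that named locale is unavailable.
std::locale make_user_codecvt_locale()
{
    try {
        return std::locale(
            std::locale(),
            new std::codecvt_byname<wchar_t, char, std::mbstate_t>(""));
    } catch (std::runtime_error const &) {
        return std::locale();
    }
}

// Hypothetical helper: write one wide-character line to a file through
// a stream imbued with the given locale. Returns false on failure,
// e.g. when a character cannot be converted by the locale's facet.
bool write_wide_line(std::string const & path, std::wstring const & line,
                     std::locale const & loc)
{
    std::wofstream out;
    out.imbue(loc);           // imbue before opening the file
    out.open(path.c_str());
    out << line << L'\n';
    out.flush();
    return out.good();
}
```

Imbuing only the stream keeps the rest of the application's locale
dependent behavior (number formatting, collation) untouched, which may
be preferable in client applications that already manage their own
global locale.
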
WINDOWS
-------

Windows has historically not supported UTF-8 based locales. The above
approach will yield a std::locale instance that converts wchar_ts to
the current process' code page. Such a locale will not be able to
convert Unicode code points outside the process' code page. This is
true at least with the std::codecvt facet implemented in Visual Studio
2010. Instead, with Visual Studio 2010 and later, it is possible to
use the std::codecvt_utf8 facet:

    std::locale::global (     // set global locale
        std::locale (         // using std::locale constructed from
            std::locale (),   // global locale
                              // and codecvt_utf8 facet
            new std::codecvt_utf8<tchar, 0x10FFFF,
                static_cast<std::codecvt_mode>(std::consume_header
                    | std::little_endian)>));
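
To see what the std::codecvt_utf8 facet produces, the following
minimal sketch converts a wide string to UTF-8 bytes with
std::wstring_convert. The function name to_utf8 is hypothetical and
the sketch uses plain wchar_t rather than tchar. Note that the
<codecvt> header is deprecated since C++17, so compilers may emit
deprecation warnings, and that std::codecvt_utf8 is specified to treat
16-bit wchar_t as UCS-2 rather than UTF-16.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: encode a wide string as UTF-8 using the same
// facet type that the locale above installs.
std::string to_utf8(std::wstring const & src)
{
    std::wstring_convert<std::codecvt_utf8<wchar_t> > conv;
    return conv.to_bytes(src);
}
```

For example, to_utf8(L"\u00e9") yields the two bytes 0xC3 0xA9, which
is the UTF-8 encoding of U+00E9 and does not depend on the process'
code page.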