Wireshark mailing list archives

Re: [Wireshark-commits] rev 53819: /trunk/epan/ /trunk/epan/dissectors/: packet-gadu-gadu.c /trunk/epan/: charsets.c charsets.h proto.h tvbuff.c
From: Guy Harris <guy () alum mit edu>
Date: Sat, 7 Dec 2013 14:42:16 -0800

On Dec 7, 2013, at 2:10 AM, darkjames () wireshark org wrote:


User: darkjames
Date: 2013/12/07 10:10 AM

Add new string proto encoding for windows-1250 (ENC_WINDOWS_1250)

- Move windows-1250 to unicode encoding table to charset.c
- Add tvb_get_string_unichar2, tvb_get_stringz_unichar2 functions which recode tvb-string to UTF-8.

Note that the GLib reference documentation says of a gunichar2 that it is

        A type which can hold any UTF-16 code point[4].

with the footnote:



        [4] surrogate pairs

This means that a gunichar2 can hold either

        1) a character from the Basic Multilingual Plane (BMP) of Unicode, or

        2) one half of a surrogate pair (a single 16-bit surrogate code unit),

but never a complete character from outside the BMP, so those routines can handle only encodings that don't include characters outside the BMP.
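
To make the limitation concrete, here is a minimal sketch in C of UTF-16 encoding. It is not Wireshark or GLib code: uint16_t stands in for gunichar2 (a 16-bit unsigned type in GLib), and the names unichar2, is_surrogate and utf16_encode are made up for illustration. A BMP code point fits in one 16-bit unit; anything above U+FFFF needs two, which is why a single gunichar2 table entry cannot represent it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for GLib's gunichar2 (16-bit unsigned). */
typedef uint16_t unichar2;

/* Nonzero if u is a surrogate code unit, i.e. one half of a pair. */
static int is_surrogate(unichar2 u)
{
    return u >= 0xD800 && u <= 0xDFFF;
}

/*
 * Encode one Unicode code point as UTF-16 (hypothetical helper).
 * Returns the number of 16-bit units written to out (1 or 2),
 * or 0 if cp is not a valid Unicode scalar value.
 */
static size_t utf16_encode(uint32_t cp, unichar2 out[2])
{
    assert(out != NULL);

    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                      /* out of range or a lone surrogate */

    if (cp < 0x10000) {                /* BMP: fits in one unit */
        out[0] = (unichar2)cp;
        return 1;
    }

    cp -= 0x10000;                     /* 20 bits remain */
    out[0] = (unichar2)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (unichar2)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For example, U+0105 (a BMP character) encodes to the single unit 0x0105, while U+1F600 encodes to the two units 0xD83D 0xDE00, neither of which is a character on its own.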

This is probably true of most non-Unicode encodings, such as the ISO 8859-n encodings, so the routines are OK for those, but be careful when using them with any other encoding.
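
As a rough sketch of the kind of recoding being discussed (this is not the code from the commit; recode_to_utf8, put_utf8_bmp and the unichar2 typedef are hypothetical names, and the 0x80-entry table layout covering bytes 0x80..0xFF is an assumption), a single-byte charset can be converted to UTF-8 by looking each high byte up in a table of 16-bit values. Because each entry is one 16-bit unit, only BMP characters can come out of the table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for GLib's gunichar2 (16-bit unsigned). */
typedef uint16_t unichar2;

/* Write BMP character c as UTF-8; returns the bytes written (1 to 3). */
static size_t put_utf8_bmp(unichar2 c, unsigned char *out)
{
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (c >> 12));
    out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (c & 0x3F));
    return 3;
}

/*
 * Recode len bytes of a single-byte charset into NUL-terminated UTF-8.
 * Bytes below 0x80 are taken as ASCII; table maps bytes 0x80..0xFF.
 * out must have room for 3 * len + 1 bytes.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t recode_to_utf8(const unsigned char *in, size_t len,
                             const unichar2 table[0x80], unsigned char *out)
{
    size_t n = 0;

    assert(in != NULL && table != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        unichar2 c = in[i] < 0x80 ? in[i] : table[in[i] - 0x80];

        /* A table entry can never be a full non-BMP character; a lone
         * surrogate is replaced with U+FFFD rather than emitted. */
        if (c >= 0xD800 && c <= 0xDFFF)
            c = 0xFFFD;

        n += put_utf8_bmp(c, out + n);
    }
    out[n] = '\0';
    return n;
}
```

With a table that maps byte 0xE9 to U+00E9 and byte 0x80 to U+20AC, the input bytes 'a', 0xE9, 0x80 come out as the six UTF-8 bytes 61 C3 A9 E2 82 AC. An encoding with characters above U+FFFF would need a wider table type (or a different design) instead.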
