Wireshark mailing list archives

Re: 3GPP 23.038 encoding and string length
From: Guy Harris <guy () alum mit edu>
Date: Sat, 28 Dec 2013 14:50:39 -0800

On Dec 24, 2013, at 2:43 AM, Pascal Quantin <pascal.quantin () gmail com> wrote:

> r54428 introduced an ENC_3GPP_TS_23_038 encoding type so as to be able to use proto_tree_add_item directly instead of
> manually decoding the string with gsm_sms_char_7bit_unpack() / gsm_sms_chars_to_utf8() functions.
> While it is a very good idea (much easier to use), it raises an interesting issue. With this 7-bit encoding a
> payload of 7 bytes will hold either 7 or 8 characters. This is handled by the gsm_sms_char_7bit_unpack() function
> thanks to an extra parameter specifying the number of characters.

Presumably that's the out_length parameter (which doesn't appear to be checked before every character is written to the 
output string); the in_length parameter counts input octets, not output characters.  However, out_length appears 
primarily to be used when extracting into a fixed-length buffer, with the buffer length passed as the out_length 
parameter.
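
To make the distinction between the two counts concrete, here is a minimal sketch of a septet unpacker that checks the
output limit before every character is written.  It is not the actual gsm_sms_char_7bit_unpack() code; the name
unpack_septets() and its parameters are made up for illustration, and it assumes the usual 23.038 packing in which
septet i occupies bits 7*i through 7*i+6 of the octet stream, least-significant bit first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT the Wireshark API: unpack GSM 7-bit packed
 * septets from `in` (in_octets bytes) into `out`, writing at most
 * out_max septets.  Returns the number of septets written. */
size_t unpack_septets(const uint8_t *in, size_t in_octets,
                      uint8_t *out, size_t out_max)
{
    /* Whole septets that fit in the input octets. */
    size_t n_septets = (in_octets * 8) / 7;
    size_t i;

    /* The output bound is tested on every iteration, before the write. */
    for (i = 0; i < n_septets && i < out_max; i++) {
        size_t bit = i * 7;          /* bit offset of this septet */
        size_t byte = bit / 8;
        unsigned shift = (unsigned)(bit % 8);
        unsigned v = in[byte] >> shift;

        /* A septet starting at bit 2 or later spills into the next octet. */
        if (shift > 1 && byte + 1 < in_octets)
            v |= (unsigned)in[byte + 1] << (8 - shift);

        out[i] = (uint8_t)(v & 0x7F);
    }
    return i;
}
```

With the five octets E8 32 9B FD 06 this yields the septets for "hello"; the result is still in the 23.038 default
alphabet and would need a separate mapping step to become UTF-8.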

GSM MAP is encoded using ASN.1 BER, and USSD-String is an OCTET STRING, so BER gives its length in octets, not 
characters, and it's preceded by lengthInCharacters, giving its length in characters.

In that case, we need to make sure we don't process more than the specified number of bytes and don't process more than 
the specified number of characters.  If ({number of characters}*7 + 7)/8 > {number of bytes}, there should probably be 
an expert info reporting an error; we might want to dissect all the characters we can extract from the specified number 
of bytes, at least.  If {number of bytes} > ({number of characters}*7 + 7)/8, we might also want to warn that there are 
too many padding bytes, and dissect {number of characters} characters.  In both those cases, a "number of characters" 
count is all that needs to be passed to the string-extractor or item-adder routine; if ({number of characters}*7 + 7)/8 
> {number of bytes}, the "number of characters" count should be ({number of bytes}*8)/7 rather than {number of 
characters}.
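
As a rough sketch of that check, assuming both counts are available: the function name reconcile_lengths() and the
status enum below are invented for illustration and are not existing Wireshark code.  The caller would attach an
expert info for either non-OK status and then pass the resulting character count to the extractor.

```c
#include <stddef.h>

/* Hypothetical status codes for illustration only. */
typedef enum {
    LEN_OK,             /* byte count matches the character count */
    LEN_TOO_FEW_BYTES,  /* not enough bytes for the claimed characters */
    LEN_EXTRA_PADDING   /* more bytes than the characters require */
} len_status;

/* Compare a length in characters with a length in bytes for 7-bit
 * packed text and decide how many characters to decode. */
len_status reconcile_lengths(size_t n_chars, size_t n_bytes,
                             size_t *chars_to_decode)
{
    /* Bytes needed to hold n_chars septets, rounded up. */
    size_t needed = (n_chars * 7 + 7) / 8;

    if (needed > n_bytes) {
        /* Error case: decode only what the bytes can hold. */
        *chars_to_decode = (n_bytes * 8) / 7;
        return LEN_TOO_FEW_BYTES;
    }

    /* Enough bytes: honor the character count, flag surplus bytes. */
    *chars_to_decode = n_chars;
    return (needed < n_bytes) ? LEN_EXTRA_PADDING : LEN_OK;
}
```

For example, 10 characters claimed in 7 bytes would need 9 bytes, so only (7*8)/7 = 8 characters are decoded; 5
characters in 7 bytes need only 5 bytes, so 5 characters are decoded and the extra bytes are flagged.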

For the ETSI TS 102 223 v10.0.0/3GPP TS 11.14 v8.17.0/3GPP TS 31.111 v9.7.0 smart card stuff, however, the text string 
appears to just be a TLV, so you only get a length in bytes; presumably padding should be ignored in that case, and we 
can just use proto_tree_add_item() or tvb_get_string_enc().
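
For the bytes-only case, the arithmetic might look like the sketch below.  Both function names are made up for
illustration.  The second one flags the situation from the original question, where the byte count is a multiple of 7
and the final septet could be either a real character or padding; the byte count alone cannot tell which.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: the most septets that fit in n_bytes octets. */
size_t septets_in_octets(size_t n_bytes)
{
    return (n_bytes * 8) / 7;
}

/* Hypothetical helper: true when the last septet fills exactly the
 * final 7 bits, so it may be a character or it may be padding. */
bool last_septet_ambiguous(size_t n_bytes)
{
    return n_bytes > 0 && n_bytes % 7 == 0;
}
```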

Are there cases where only the length in characters is given?
Sent via:    Wireshark-dev mailing list <wireshark-dev () wireshark org>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request () wireshark org?subject=unsubscribe
