From cef52374b49ed74259a2db15fd941b7cfac6c3dc Mon Sep 17 00:00:00 2001
From: Jason Stephenson
Date: Wed, 10 Jun 2015 14:38:35 -0400
Subject: [PATCH] LP 1463943: Replace non-ascii Unicode with numeric entities.

Unless the client and server agree otherwise, SIP2 output should be in
ASCII or MS CP850.  Since we have no way to know if the client and
server agree otherwise, we'll just replace non-ASCII characters with
numeric entities in a similar manner to what we do with the field
delimiter.

To reproduce this bug:

1) Add an accented or other non-ASCII Unicode character to the call
   number associated with a copy.
2) Connect with a SIP client and send an item information message (64)
   to the server.
3) Your SIP client should do 1 of 3 things:
   a) It will choke on the response.
   b) It will display gibberish for the UTF-8 sequence in the
      response.
   c) It will correctly display the call number.

After applying this patch, your client should handle the same item
information message response with no trouble.  If you inspect the call
number information, the UTF-8 sequences that potentially caused
trouble before should now be replaced by XML-escaped numeric entities,
e.g. &#169; for a copyright symbol.

Signed-off-by: Jason Stephenson
---
 Sip.pm | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Sip.pm b/Sip.pm
index 35ba419..67bbed4 100644
--- a/Sip.pm
+++ b/Sip.pm
@@ -90,6 +90,13 @@ sub add_field {
         substr($value, $i, 1) = $ent;
     }
 
+    # Unless the client and server agree otherwise, SIP2 output should
+    # be in ASCII or MS CP850. Since we have no way to know if the
+    # client and server agree otherwise, we'll just replace non-ASCII
+    # characters with numeric entities in a similar manner to what we
+    # do with the field delimiter above.
+    $value =~ s/([^[:ascii:]])/'&#'.ord($1).';'/eg;
+
     # SIP2 Protocol document specifies that variable fields are from 0
     # to 255 characters in length.  We'll do a check of the field
     # length and truncate if necessary.
-- 
2.11.0
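
Note for testers: the substitution added above can be exercised outside of
Sip.pm with a small standalone script. This is only a sketch. The helper name
escape_non_ascii is hypothetical and does not exist in Sip.pm; it simply wraps
the same regular expression. The second half of the script assumes, without
confirmation from the patch, that the value may reach add_field either as
decoded characters or as raw UTF-8 bytes, and shows how the result differs
between the two.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of Sip.pm): applies the same substitution
# that the patch adds to add_field().
sub escape_non_ascii {
    my ($value) = @_;
    # Replace every character outside the ASCII range with &#N;, where N
    # is the decimal value returned by ord() for that character.
    $value =~ s/([^[:ascii:]])/'&#'.ord($1).';'/eg;
    return $value;
}

# Character strings: one entity per character.
print escape_non_ascii("\x{A9} 2015 Caf\x{E9}"), "\n";

# Plain ASCII passes through unchanged.
print escape_non_ascii("QA76.73 .P22"), "\n";

# Assumption for illustration: the same copyright symbol arriving as raw
# UTF-8 bytes (0xC2 0xA9) instead of a decoded character.
my $bytes = "\xC2\xA9";
print escape_non_ascii($bytes), "\n";                   # one entity per byte
print escape_non_ascii(decode('UTF-8', $bytes)), "\n";  # one entity per character
```

If the undecoded case produces two entities where one was expected, the value
was probably handed over as bytes rather than characters; whether that can
happen in a given SIPServer setup is not something this patch states.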