S with combining acute accent (lower and uppercase)
author Dan Scott <dan@coffeecode.net>
Fri, 29 Jul 2011 18:11:05 +0000 (14:11 -0400)
committer Dan Scott <dscott@laurentian.ca>
Tue, 7 May 2013 18:36:58 +0000 (14:36 -0400)
We can use the composed codepoint for these instead of going the
decomposed route, for more accuracy and great justice.

Signed-off-by: Dan Scott <dscott@laurentian.ca>
tools/ebooks/prep_ebook_records.py

index cf8e3c6..79b1739 100644
@@ -378,8 +378,10 @@ def clean_diacritics(field):
         # COMBINING CEDILLA
         tmpsf = tmpsf.replace(u'\xb0c', u'c\u0327')
 
-        # COMBINING ACUTE ACCENT
-        tmpsf = tmpsf.replace(u'\xd4s', u's\u0301')
+        # S WITH COMBINING ACUTE ACCENT (lowercase)
+        tmpsf = tmpsf.replace(u'\xd4s', u'\u015b')
+        # S WITH COMBINING ACUTE ACCENT (uppercase)
+        tmpsf = tmpsf.replace(u'\xd4S', u'\u015a')
 
         # COMBINING BREVE
         tmpsf = tmpsf.replace(u'\xe6i', u'i\u0306')
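
A minimal sketch of the difference between the composed and decomposed routes for this one mapping. The function names `replace_s_acute` and `replace_s_acute_decomposed` are hypothetical and exist only for illustration; in the patch itself the replacements are inline statements inside `clean_diacritics()`. The sketch assumes Python 3.3 or later, where `u''` literals are accepted.

```python
import unicodedata


def replace_s_acute(text):
    """Post-patch behaviour (illustrative): map the 0xD4 prefix plus s/S
    to a single precomposed code point."""
    text = text.replace(u'\xd4s', u'\u015b')  # LATIN SMALL LETTER S WITH ACUTE
    text = text.replace(u'\xd4S', u'\u015a')  # LATIN CAPITAL LETTER S WITH ACUTE
    return text


def replace_s_acute_decomposed(text):
    """Pre-patch behaviour (illustrative): lowercase only, emitted as the
    base letter followed by U+0301 COMBINING ACUTE ACCENT."""
    return text.replace(u'\xd4s', u's\u0301')


sample = u'\xd4S \xd4s'

composed = replace_s_acute(sample)
decomposed = replace_s_acute_decomposed(sample)

# The old route leaves the uppercase sequence untouched and yields two
# code points for the lowercase letter.
print(ascii(composed))
print(ascii(decomposed))

# NFC normalization turns the decomposed lowercase form into the same
# precomposed code point that the patch now writes directly.
print(unicodedata.normalize('NFC', u's\u0301') == u'\u015b')
```

Writing the precomposed code point directly means downstream consumers do not have to normalize before comparing strings. The neighbouring mappings (cedilla, breve) still emit decomposed sequences, so output from the function as a whole remains mixed unless it is normalized afterwards.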