unicode - Decode function tries to encode (Python)
I'm trying to print a unicode string without the encoding-specific hex in it. I'm grabbing this data from Facebook, which declares an encoding type of UTF-8 in its HTML headers. When I print the type, it says unicode, but when I try to decode it with unicode-escape, it says there is an encoding error. Why is it trying to encode when I use the decode method?
Code:

    a = 'really long string of unicode html text wont reprint'
    print type(a)
    >>> <type 'unicode'>
    print a.decode('unicode-escape')
    >>> Traceback (most recent call last):
      File "scfbp.py", line 203, in myfunctionpage
        print a.decode('unicode-escape')
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 1945: ordinal not in range(128)
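It's not the decode that's failing. It's because you are trying to display the result on the console. When you use print, it encodes the string using the output encoding, which here is ASCII, and the euro sign (u'\u20ac') has no ASCII representation. Don't use print, and it should work.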
    >>> a = u'really long string containing \\u20ac , other text'
    >>> type(a)
    <type 'unicode'>
    >>> a.decode('unicode-escape')
    u'really long string containing \u20ac , other text'
    >>> print a.decode('unicode-escape')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 30: ordinal not in range(128)
I'd recommend using IDLE or another interpreter that can output Unicode, so you won't have this problem.
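If switching consoles is not an option, one alternative is to encode the text yourself before writing it out, with an error handler that substitutes unprintable characters instead of raising. The following is only a rough sketch: the helper name to_console_bytes and its fallback to ASCII are assumptions made for illustration, not something from the original answer.

```python
import sys


def to_console_bytes(text, encoding=None):
    """Encode text for output, replacing characters the target cannot show.

    Hypothetical helper: `encoding` defaults to the console's reported
    encoding and falls back to ASCII when none is reported.
    """
    enc = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    # 'replace' substitutes '?' for unencodable characters instead of
    # raising UnicodeEncodeError.
    return text.encode(enc, 'replace')
```

In Python 2 the returned byte string can be passed straight to print. With an ASCII console the euro sign comes out as '?', which loses information but does not abort the program.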
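Update: note that this is not the same situation as having one less backslash. In that case the string already contains the real euro character, and it fails during the decode itself, though with the same error message. In Python 2, calling decode on a unicode object first encodes it to a byte string with the default ASCII codec, and that implicit encode is what raises: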
    >>> a = u'really long string containing \u20ac , other text'
    >>> type(a)
    <type 'unicode'>
    >>> a.decode('unicode-escape')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character u'\u20ac' in position 30: ordinal not in range(128)
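A possible way around that failing decode is to perform the unicode-to-bytes step explicitly, with an error handler that cannot fail, instead of leaving it to Python 2's implicit ASCII conversion. This is a minimal sketch under that assumption. The function name decode_escapes is made up for illustration, and the code is written so that it should behave the same on Python 2 and Python 3.

```python
def decode_escapes(text):
    """Turn literal \\uXXXX sequences in a unicode string into characters.

    Hypothetical helper. Real non-ASCII characters already present in
    `text` survive the round trip unchanged.
    """
    # 'backslashreplace' rewrites each non-ASCII character as an escape
    # such as \u20ac, so this explicit encode never raises.
    raw = text.encode('ascii', 'backslashreplace')
    # The byte string now holds only ASCII, so decoding the escapes is safe.
    return raw.decode('unicode-escape')
```

Both forms of the example string, with one backslash or two, should come back as the same text containing a real euro sign. One caveat: every backslash in the input is treated as the start of an escape, so this only suits text whose backslashes are meant to be escapes.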