Wednesday, May 6, 2015

Take a Unicode character from within a string and decode it

I'm currently working in Python, and I'm pulling a whole bunch of data from the net, including titles of photos. Some of the strings I'm getting contain Unicode escape sequences, and I'd like to display each one as its original character.

I know that if I type, for example,

print u'\u00a9'

it will output the right character to the terminal.

However, if I get a string such as:

string = 'Copyright \u00a9 David'

I am not sure how to pull it out.

I managed to pull out the character code with RegEx, but I don't know how to insert it back in without getting an error.

I tried:

char = \u00a9
string = 'Copyright' + u'char' + 'David'

which didn't really work.

I need a way to programmatically pull out the code (which I can do with RegEx), and then re-insert it into the original string with the u' in front of it.
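
For illustration, here is a minimal sketch of the kind of thing I'm after, assuming the string really contains a literal backslash followed by u and four hex digits. The names decode_unicode_escapes, raw and decoded are made up for this example, and the unichr shim is only there so the same code runs on both Python 2 and Python 3. It does not try to handle surrogate pairs or other escape forms.

```python
import re

try:
    unichr  # Python 2 has a separate unichr() for Unicode code points
except NameError:
    unichr = chr  # on Python 3, chr() already returns a Unicode character


def decode_unicode_escapes(text):
    # Hypothetical helper: replace each literal \uXXXX sequence in the
    # text with the character that the four hex digits name.
    return re.sub(
        r'\\u([0-9a-fA-F]{4})',
        lambda match: unichr(int(match.group(1), 16)),
        text,
    )


# The raw-string prefix keeps the backslash literal, as in the scraped data.
raw = r'Copyright \u00a9 David'
decoded = decode_unicode_escapes(raw)
# decoded now holds the real copyright sign instead of the escape sequence.
```

The idea is to let re.sub do both the pulling out and the re-inserting in one pass, so the u' prefix never has to be glued on by hand. Python's built-in unicode_escape codec may be a shorter route for data that is known to be plain ASCII plus escapes.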
