Python reading from a file and saving to utf-8
I'm having problems reading from a file, processing its string and saving to a UTF-8 file.
Here is the code:
    try:
        filehandle = open(filename,"r")
    except:
        print("Could not open file " + filename)
        quit()
    text = filehandle.read()
    filehandle.close()
I then do some processing on the variable text.
    try:
        writer = open(output,"w")
    except:
        print("Could not open file " + output)
        quit()
    #data = text.decode("iso 8859-15")
    #writer.write(data.encode("UTF-8"))
    writer.write(text)
    writer.close()
This outputs the file perfectly, but it does so in ISO 8859-15 according to my editor. Since the same editor recognizes the input file (in the variable filename) as UTF-8, I don't know why this happened. As far as my research has shown, the commented lines should solve the problem. However, when I use those lines the resulting file has gibberish, mainly in special characters: words with tildes, as the text is in Spanish. I would really appreciate any help as I am stumped.
Process text to and from Unicode at the I/O boundaries of your program using the codecs module:
    import codecs

    with codecs.open(filename, 'r', encoding='utf8') as f:
        text = f.read()
    # process Unicode text
    with codecs.open(filename, 'w', encoding='utf8') as f:
        f.write(text)
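As for the gibberish from the commented-out lines in the question: if the input really is UTF-8, as the editor reports, then decoding its bytes as ISO 8859-15 and re-encoding them would be expected to turn each accented character into two wrong ones. A minimal sketch in Python 3 syntax, where the sample word "año" is an assumption standing in for the Spanish text:

```python
# Sample text; "año" is a hypothetical stand-in for the file's contents.
original = "año"

# The bytes as they would already be stored in a UTF-8 file.
utf8_bytes = original.encode("utf-8")

# The mistake: interpreting UTF-8 bytes as ISO 8859-15.
# The two bytes of "ñ" become two separate characters.
misdecoded = utf8_bytes.decode("iso-8859-15")

# Re-encoding the misdecoded text as UTF-8 bakes the damage into the output.
double_encoded = misdecoded.encode("utf-8")

# The fix: decode with the encoding the file actually uses.
decoded = utf8_bytes.decode("utf-8")

print(repr(misdecoded))
print(repr(decoded))
```

Here `misdecoded` comes out as "aÃ±o" while `decoded` matches the original, which is consistent with the damage being limited to accented characters.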
The io module is now recommended instead of codecs and is compatible with Python 3's built-in open():
    import io

    with io.open(filename, 'r', encoding='utf8') as f:
        text = f.read()
    # process Unicode text
    with io.open(filename, 'w', encoding='utf8') as f:
        f.write(text)
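On Python 3 the built-in open() accepts the same encoding argument, so the same pattern can be written without importing io. The sketch below is only an illustration: the helper name `transcode`, the temporary file names, and the sample sentence are assumptions, not part of the question. It also shows the case where the input genuinely is ISO 8859-15 and has to be converted to UTF-8.

```python
import os
import tempfile


def transcode(src_path, dst_path, src_encoding="utf-8", dst_encoding="utf-8"):
    """Read src_path as text, then write that text to dst_path in dst_encoding."""
    # Decode once, on the way in.
    with open(src_path, "r", encoding=src_encoding) as src:
        text = src.read()

    # ... process the Unicode text here ...

    # Encode once, on the way out.
    with open(dst_path, "w", encoding=dst_encoding) as dst:
        dst.write(text)
    return text


# Usage: create a small ISO 8859-15 file, then save it as UTF-8.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "input.txt")    # hypothetical file name
dst_path = os.path.join(workdir, "output.txt")   # hypothetical file name

with open(src_path, "w", encoding="iso-8859-15") as f:
    f.write("señor, mañana\n")

transcode(src_path, dst_path, src_encoding="iso-8859-15")

# Read the raw bytes back to check that they are valid UTF-8.
with open(dst_path, "rb") as f:
    result_bytes = f.read()
print(result_bytes.decode("utf-8"))
```

Stating the encoding explicitly on both open() calls avoids depending on the platform default, which can differ between machines.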