My Java application downloads files encoded in either UTF-8 or ISO-8859-1 from a Bitbucket repository.
I know in advance the charset used in those files.
My app is running fine on my Windows local machine (I use Eclipse JEE with a Tomcat 9 server).
I have deployed this application on a Red Hat virtual machine running the same version of Tomcat, and I ended up with the unknown character � replacing é/è/à/ù/ï.
Here is the code I wrote to get this data:
public static String getFileContentFromRepository(String url) throws IOException {
    HttpURLConnection connection = getConnection(url);
    connection.connect();
    // The following function returns the charset of the file. (Proven to work)
    Charset repoCharset = getCharset();
    InputStream connectionDataStream = connection.getInputStream();
    String connectionStreamData = IOUtils.toString(connectionDataStream, repoCharset);
    connection.disconnect();
    return connectionStreamData;
}
How can I get the same results on both platforms?
Answer
The problem came from Linux having its default Charset set to UTF-8.
Adding the argument -Dfile.encoding=ISO-8859-1 to $CATALINA_OPTS in Tomcat’s config solved my problem.
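To check which default charset each machine is actually using, and to see how a charset mismatch produces the � character, a minimal sketch along these lines may help. The class name CharsetDemo and the decode helper are hypothetical and not part of the original application; the sample bytes are only an illustration of an ISO-8859-1 file's content.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original application.
class CharsetDemo {

    // Decodes raw bytes with an explicitly chosen charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Charset the JVM falls back to whenever none is passed explicitly.
        // Running this on both machines shows whether they differ.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // The text "é/è" as stored in an ISO-8859-1 file: bytes 0xE9 0x2F 0xE8.
        byte[] raw = "\u00e9/\u00e8".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the charset the file was written in restores the text.
        String right = decode(raw, StandardCharsets.ISO_8859_1);

        // Decoding the same bytes as UTF-8 hits invalid sequences, which Java
        // replaces with U+FFFD, the character displayed as the unknown-character glyph.
        String wrong = decode(raw, StandardCharsets.UTF_8);

        System.out.println("Correct decode matches: " + right.equals("\u00e9/\u00e8"));
        System.out.println("Mismatched decode contains U+FFFD: " + wrong.contains("\uFFFD"));
    }
}
```

If the first line printed differs between the Windows and Linux machines, any code path that reads or writes text without naming a charset will behave differently on the two platforms, which is the situation the -Dfile.encoding argument works around.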