My Java application downloads files encoded in either UTF-8 or ISO-8859-1 from a Bitbucket repository.
I know in advance the charset used in those files.
My app runs fine on my local Windows machine (I use Eclipse JEE with a Tomcat 9 server).
After deploying the application to a Red Hat virtual machine running the same version of Tomcat,
the characters é/è/à/ù/ï are replaced by the unknown-character symbol �.
Here is the code I wrote to get this data:
    public static String getFileContentFromRepository(String url) throws IOException {
        HttpURLConnection connection = getConnection(url);
        connection.connect();
        // The following function returns the charset of the file. (Proven to work)
        Charset repoCharset = getCharset();
        InputStream connectionDataStream = connection.getInputStream();
        String connectionStreamData = IOUtils.toString(connectionDataStream, repoCharset);
        connection.disconnect();
        return connectionStreamData;
    }
How can I get the same results on both platforms?
Answer
The problem came from the default charset on the Linux machine being UTF-8.
Adding the argument -Dfile.encoding=ISO-8859-1 to $CATALINA_OPTS in Tomcat’s configuration solved my problem.
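
To see the mismatch for yourself, here is a minimal diagnostic sketch. It prints the JVM's default charset and shows what happens when ISO-8859-1 bytes are decoded as UTF-8. The class name CharsetCheck and the helper decode are made up for illustration and are not part of the code in the question. The sketch also assumes that some step outside the posted method falls back to the platform default, since the method itself already passes an explicit charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic class, not taken from the original application.
public class CharsetCheck {

    // Decodes raw bytes with an explicitly named charset, never the JVM default.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "é" (U+00E9) encoded as ISO-8859-1 is the single byte 0xE9.
        byte[] latin1Bytes = "\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Matching charset: the original character comes back.
        String ok = decode(latin1Bytes, StandardCharsets.ISO_8859_1);

        // Mismatched charset: a lone 0xE9 is not valid UTF-8,
        // so the decoder substitutes the replacement character U+FFFD.
        String broken = decode(latin1Bytes, StandardCharsets.UTF_8);

        // Code points are printed as hex so the console encoding cannot distort the result.
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("Decoded as ISO-8859-1: U+" + Integer.toHexString(ok.charAt(0)));
        System.out.println("Decoded as UTF-8: U+" + Integer.toHexString(broken.charAt(0)));
    }
}
```

Running this on both machines shows whether their default charsets differ. If they do, it is worth looking for any place where text is read or written without an explicit charset (for example new String(bytes), String.getBytes() or new InputStreamReader(stream)), because passing the charset there makes the result independent of -Dfile.encoding.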