-
2 Attachment(s)
UTF vs ANSI vs ISO
Hi,
I'm wondering if I can get a little help with understanding .html file formatting.
I've recently had a problem with character interpretation when browsing to my online www.brightskies.us web page. None of the "special" (i.e. non-ASCII) characters are interpreted correctly. I don't know what's happened, because this was not a problem in past months.
What's funny is that when I open a copy of my index.html file from the desktop, there's no problem at all, but when that same file is opened from the (Verizon) web server, there's a problem.
See the attached Image#1 for the view of the desktop file in three browsers.
See the attached Image#2 for the view of the website file. It's the same file in all cases.
Pay attention to the Latitude, Longitude and Copyright area...
Here's also the first few text lines of the index.html file (although you can view the source from the website URL).
Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>Bright Skies Observatory</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
</head>
<body bgcolor="#303060">
etc.
This may relate to sometimes but not always seeing funny characters in the web view of ACP's console window - those capital-A's with the caret above them, followed by the right character.
The encoding the browser reports (from its View menu) is "UTF-8" in every case, which really surprises me, since that's not what the meta tag declares. This must be the problem, because if I change the browser encoding to "Western..." or "ISO-..." then the screens are all okay. But I can't seem to get the browser to pick that encoding automatically.
Does anyone know what's going on, and can anyone give me some direction here, please?
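For what it's worth, here is a minimal Python sketch of what I suspect is happening, assuming the file on the server really is saved as ISO-8859-1 and the browser decodes it as UTF-8. The sample line is made up for illustration, not copied from the page.

```python
# Assumption: the page bytes are ISO-8859-1, but the browser decodes them as UTF-8.
text = "Latitude 41° N"             # hypothetical sample line
raw = text.encode("iso-8859-1")     # the degree sign becomes the single byte 0xB0

# 0xB0 on its own is not a valid UTF-8 sequence, so a lenient decoder
# substitutes U+FFFD (the replacement character) for it.
as_utf8 = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the file was actually saved in gives the text back.
as_latin1 = raw.decode("iso-8859-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

If that assumption holds, the desktop copy looks fine because nothing tells the browser to use UTF-8 there, so the meta tag is honored.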
-
Well, one thing I learned after some web browser research was that even the servers get into the act when it comes to character encoding: many servers simply do not send character encoding information at all, and when a server does send a charset in its HTTP Content-Type header, the browser uses that in preference to the meta tag in the page. The best solution to the problem seems to be to use the "&...;" character reference representation for special characters in HTML outside the printable ASCII range (32-126). Wikipedia has a nice article called "Character encodings in HTML" that I found particularly informative.
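To see what a server declares, one can look at the charset parameter of its Content-Type response header. A rough Python sketch of pulling that parameter out of a header value follows; the header strings are made up for illustration and were not captured from any real server, and declared_charset is just a helper name I chose.

```python
from email.message import Message

def declared_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

# Hypothetical header values, for illustration only.
with_charset = declared_charset("text/html; charset=UTF-8")
without_charset = declared_charset("text/html")

print(with_charset)     # the declared charset, lower-cased
print(without_charset)  # None when the server declares no charset
```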
-
Case Closed.
It took some effort to modify about 30 individual web pages, but changing all the special characters to their character reference representation solved the problem.
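For anyone facing the same chore, the conversion could probably be scripted instead of done by hand. A minimal Python sketch, assuming the page text has already been read in with the correct encoding (to_char_refs and the sample string are hypothetical):

```python
def to_char_refs(text):
    """Replace every non-ASCII character with a numeric character reference."""
    # The "xmlcharrefreplace" error handler emits &#NNN; for anything
    # that cannot be encoded as ASCII.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

sample = "Latitude 41° N, © Bright Skies"   # hypothetical sample text
converted = to_char_refs(sample)
print(converted)
```

The result is plain ASCII, so it reads the same whether the browser assumes ISO-8859-1 or UTF-8.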
-
Did you find any of those things in the standard ACP web content? If so, I can change them!
-
Actually, I did. Where I typically notice it is with the "degree" symbol in the web GUI in the console window. As I recall, I've seen it up front in the console log for the latitude/longitude of the observatory. I don't remember about the solved plate coordinates. Again, it depends on whether the browser sets up for Western or UTF-8 encoding.
But if you replace the character "°" with "&deg;" everywhere in ...PrintLine statements, that should at least take care of that problem. I do remember seeing prior forum posts about strange characters.
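To illustrate the kind of replacement I mean, here is a small sketch in Python (not the language the ACP scripts are written in, so treat it only as a description of the idea). It uses a named reference where HTML 4 defines one and falls back to a numeric reference otherwise; to_named_refs is a made-up helper name.

```python
from html.entities import codepoint2name

def to_named_refs(text):
    """Replace non-ASCII characters with named (or numeric) character references."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                              # ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")  # e.g. the degree sign
        else:
            out.append("&#%d;" % cp)                    # no HTML 4 name: numeric form
    return "".join(out)

print(to_named_refs("Lat 41°"))
```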
-
2 Attachment(s)
Here's something to note!
In my previous message, in the second paragraph, the sentence begins, "But if you replace the character ...." In my view of that sentence now, using Firefox, and noting the character encoding in the View menu says Western (ISO-8859-1), what's supposed to be the degree symbol inside the first quotes is actually a capital-A with a caret over it (Image1.png) followed by the degree symbol. When I change the encoding in the browser to Unicode (UTF-8), the degree symbol is correctly portrayed (Image2.png).
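The A-with-caret is what you would expect if the degree symbol was stored as UTF-8 and then read as ISO-8859-1. A short Python sketch of that mismatch, under that assumption:

```python
degree = "°"
utf8_bytes = degree.encode("utf-8")          # two bytes: 0xC2 0xB0

# Read as ISO-8859-1, each byte becomes its own character:
# 0xC2 is capital A with circumflex, 0xB0 is the degree sign.
misread = utf8_bytes.decode("iso-8859-1")

# Reversing the mistake recovers the original character.
repaired = misread.encode("iso-8859-1").decode("utf-8")

print(repr(misread))
print(repr(repaired))
```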
-
OK, Thanks. I've added an ER for this. Thanks, it was dumb to use those characters in the first place.
-
1 Attachment(s)
Not dumb. You've got to have a degree symbol in at least a couple places!!
Here's a pix of what I see sometimes. I'm sure you have too. The A-hat character is part of the following degree symbol. Whether you see it depends on three things:
1 That you've copied the degree symbol (in this case) from the character mapping table
2 That the browser interprets the Unicode character incorrectly (UTF-8 vs ISO-8859-1), AND
3 That the server actually passes a charset encoding when the data is sent to the browser.
The whole matter is moot if one uses &deg; representations throughout.
-
OK, this is related to the other one at http://forums.dc3.com/showthread.php?p=21815#post21815. There I explain what probably happened. I've now linked the two Gemini issues and closed both. I can't repro it as long as the plans are saved in Notepad in ANSI mode, not UTF-8.
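A quick way to check how a given file was saved is to look for bytes outside the 7-bit ASCII range, since those are the only places where the choice of encoding matters. A rough Python sketch follows; the sample line is invented and is not a real plan directive.

```python
def non_ascii_offsets(data):
    """Return the offsets of all bytes outside the 7-bit ASCII range."""
    return [i for i, b in enumerate(data) if b > 0x7F]

line = "RA 10° 30"                      # hypothetical line containing a degree sign
ansi_saved = line.encode("cp1252")      # one byte (0xB0) for the degree sign
utf8_saved = line.encode("utf-8")       # two bytes (0xC2 0xB0) for the degree sign

print(non_ascii_offsets(ansi_saved))
print(non_ascii_offsets(utf8_saved))
print(non_ascii_offsets(b"RA 10 30"))   # pure ASCII: nothing to report
```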