Re: wee-hooo, more revisions
This issue bugged me enough to look up
ftp://ds.internic.net/rfc/rfc1866.txt to see if my memory really *was*
failing....
> the double quote (") is one of the 4 characters (", <, >, &) deemed to have
> special meaning in HTML.
Those are the four printable ASCII characters that have named
(non-numeric) entities, yes. (9.7.1).
> I'll grant that "s appearing outside of <>s tend
> to come through okay (at least in IE3 and Mosaic, they do; dunno about
> Netscape --- we're not allowed to use that around here :-), but I don't
> believe this is ever guaranteed by the spec.
You don't need to escape it outside of attribute values. See the
last table in section 3.2.1 for satisfactory evidence---that's the
closest you're going to get without reading the DTD.
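To make that concrete, here is a rough Python sketch (my choice of language, nothing the spec prescribes) that escapes body text while leaving double quotes alone. It leans on the standard library's html.escape with quote=False; the sample string is made up.

```python
from html import escape

# Body text (outside any tag): only &, < and > need escaping.
# quote=False tells html.escape to leave '"' and "'" untouched.
text = 'Jay says, "x < y" & more'
escaped = escape(text, quote=False)

print(escaped)  # Jay says, "x &lt; y" &amp; more
```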
Inside double-quoted tag attribute values, you need to escape '"'
somehow. (See 3.2.4.) Strictly speaking, you can use '"' bare
inside a single-quoted attribute value:
<IMG SRC=foo.gif ALT='Jay says, "Marca can BITE ME" " " " "" '>
Of course, you'd then have to use the &#39; notation for single
quotes. All of this is pretty questionable from an SGML point of
view; attributes aren't supposed to contain this kind of information,
and entity processing in attributes is (to the best of my
understanding) an extra non-SGML step required of HTML
implementations.
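A minimal sketch of the attribute case, again in Python. The helper names escape_attr and img_tag are hypothetical, invented for illustration; the approach is simply to escape both quote characters so the value is safe whichever delimiter the tag uses. I use the decimal &#39; form for the single quote on the assumption that a numeric reference is the only option there.

```python
def escape_attr(value: str) -> str:
    """Escape a string for use inside a quoted attribute value (hypothetical helper)."""
    # Replace & first so the entities produced below are not escaped again.
    return (value.replace("&", "&amp;")
                 .replace("<", "&lt;")
                 .replace(">", "&gt;")
                 .replace('"', "&quot;")
                 .replace("'", "&#39;"))


def img_tag(src: str, alt: str) -> str:
    """Build an IMG tag with double-quoted, escaped attribute values (hypothetical helper)."""
    return '<IMG SRC="{}" ALT="{}">'.format(escape_attr(src), escape_attr(alt))


print(img_tag("foo.gif", 'Jay says, "Marca can BITE ME"'))
# <IMG SRC="foo.gif" ALT="Jay says, &quot;Marca can BITE ME&quot;">
```

Escaping both quote characters is more than strictly necessary for a double-quoted value, but it means the result stays correct if someone later switches the delimiter.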
This whole mess happened because this is Yet Another Way that IMG is
seriously misdesigned. If it were a container, we'd just throw the
ALT text in the PCDATA section and we'd get the right thing. Off the
top of my head, there's no other element in HTML that represents text
in an attribute or requires entity processing in an attribute---URLs
already have a perfectly good quoting scheme.
> The other thing --- what motivated me to mention it at all --- is that the
> fontification for my html mode in Emacs gets seriously confused by singleton
> "s. Maybe I just need a more sophisticated html mode, but I don't find it
> unreasonable for an html mode to expect that the special characters will
> only ever be used in the prescribed ways.
...except that this is and always has been perfectly legal. The
scapegoat here is Emacs's syntax tables, which don't quite capture
the context required to get this right.
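For what it's worth, the missing context is roughly one extra bit of state: whether you are inside a tag. Below is a small Python sketch of that idea, not anything Emacs actually does. The function name quoted_spans_in_tags is hypothetical, and the scanner deliberately ignores comments, declarations, and other SGML corners.

```python
from typing import List


def quoted_spans_in_tags(html: str) -> List[str]:
    """Return quoted strings found inside tags; quotes in body text are ignored."""
    spans = []
    in_tag = False   # True between '<' and the matching '>'
    quote = None     # the open quote character while inside a quoted value
    start = 0
    for i, ch in enumerate(html):
        if quote is not None:
            # Inside a quoted value: only the same quote character closes it.
            if ch == quote:
                spans.append(html[start:i + 1])
                quote = None
        elif in_tag:
            if ch in "\"'":
                quote, start = ch, i
            elif ch == ">":
                in_tag = False
        elif ch == "<":
            in_tag = True
    return spans


sample = '<P>A lone " here <IMG SRC="a.gif" ALT=\'x "y"\'> done'
print(quoted_spans_in_tags(sample))
```

The singleton quote in the body text never opens a string, because the scanner only treats quotes as delimiters while in_tag is set.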
Jay