Some comments on the spec...
Here is what a fresh (but wizened!) pair of eyes, untainted by
any recent examination of an implementation, or by any recent
discussion on this list :-) saw...
More seriously, feel free to ignore any or all of the
comments. My comments are motivated purely by wanting to see a
coherent (ah, that dirty word) and clean document.
My most substantive comments are about the specification of
in-band and out-of-band lines.
Editorially, I would very much prefer a structure in which
sections, subsections, and paragraphs are numbered, and in which the
text is significantly polished (words and phrases chosen carefully). This
will give it the look and feel of a spec. The overall
organization is fine, but the text can still be polished up in
many places.
The comments, in sequence:
1. [Definition of in-band lines] If you are going to use the
notion of a line being quoted, you should define it, or have a
forward reference to the place where you are defining it so
that it is clear that this is a technically precise term.
[I am going to argue that there should be no such quoting.]
2. [Definition of Messages] Awkward text. Suggested rephrasing:
An MCP {\em message} is of two kinds. It may consist of a
single message line, which specifies the name of the message,
and a set of keywords, with their values. If any of the keywords are
marked to be {\em multiline}, the message may contain zero
or more {\em message-continuation} lines (specifying the
values for the multiline keywords), followed by a single {\em
multiline-end} message line.
3. [Network Line Translation section]
Here is how I reconstruct what seems to be the desired
computational model.
It seems that the desired structure of an MCP processor is as
follows:
(incoming lines) -> [Splitter]
                     |      |
                     |      ---------------
                     |                    |
                     v                    v
   (in-band network lines)   (out of band network lines)
                     |                    |
                     v                    v
            [In-band processor]    [MCP assembler]
                                          |
                                          v
                                   (MCP messages)
                                     |   |     |
                                     v   v ... v
                                    [MCP modules]
Incoming lines are examined by the Splitter component, and
divided into in-band network lines or out of band network
lines. The former are fed into an In-band processor (the
semantics of which are not constrained by this specification).
The out of band network lines are fed into an MCP assembler
component which emits MCP messages (with a given structure,
namely, message name, auth id, a set of keywords, with
associated values); these messages are further processed by
MCP module components, whose identity is determined in some
way from the message.
Now, the question is how the Splitter should work. The basic
idea is to examine whether the network line begins with the string
"#$#". A line is an out of band network line exactly when it
does.
Now the problem here seems to be that this means that lines
beginning with #$# cannot be considered in band network
lines. This goes against the desire that it should be possible
for any line to be considered an in-band line.
It seems to me that there are three possible solutions to this
problem:
1. Set aside a special protocol:
#$#mcp-quote <authkey> line: <value>
This protocol will call the In-band processor with
<value>.
The advantage of this scheme is that the splitter is
completely trivial. Only in the specialized case of wanting
to have a #$# in-band line is some specialized processing
invoked ... and then it is invoked in the standard way, as an MCP
message.
The disadvantage is that one has to make the (modest) architectural
assumption that the In-band processor is callable from MCP
modules.
The counter to this is that an implementation could, of
course, choose to implement the parsing of this particularly
simple MCP message inline, i.e. in the splitter.
2. Set aside a special protocol that does not even use the
keyword value syntax:
#$#" <value>
This effectively provides a built-in interpretation for the
MCP message `"'; <value> is emitted as an in-band line by
the splitter.
This is halfway between 1 and 3.
3. Introduce a new classification of network lines: out-of-band
(those beginning with #$#), quoted in-band (those beginning
with #$") and in-band (the remaining ones). The quoted in-band line
#$"<value>
is processed by the splitter by emitting <value> as an
in-band line. (Of course, <value> could start with #$# or
#$" or whatever.)
The advantage of this scheme is that it is conceptually
and implementationally simple; the disadvantage is that it
requires the introduction of a new category of network line
(quoted in-band).
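To make scheme (3) concrete, here is a minimal Python sketch of such a
splitter. The function name split_line and the (kind, payload) return
shape are my own illustration and not taken from the spec; the sketch
assumes a network line arrives as a string with its line terminator
already removed.

```python
OOB_PREFIX = "#$#"    # marks an out-of-band network line
QUOTE_PREFIX = '#$"'  # marks a quoted in-band network line (scheme 3)


def split_line(line):
    """Classify one network line and return (kind, payload).

    kind is "out-of-band" or "in-band". For out-of-band lines the
    payload is the text after the prefix, to be handed to the MCP
    assembler; for in-band lines it is the text to hand to the
    in-band processor.
    """
    if line.startswith(OOB_PREFIX):
        return ("out-of-band", line[len(OOB_PREFIX):])
    if line.startswith(QUOTE_PREFIX):
        # Strip the quote prefix exactly once; whatever follows is
        # emitted verbatim, even if it starts with #$# or #$" itself.
        return ("in-band", line[len(QUOTE_PREFIX):])
    return ("in-band", line)
```

Because the quote prefix is stripped exactly once, a line such as
#$"#$#foo is delivered in-band as #$#foo, and no other line is touched.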
Any of these three schemes is acceptable to me; I would prefer
(1) and (3) over (2). Between (1) and (3), I confess to a
preference for (1): this does not require the introduction of
any new conceptual category, can considerably simplify the
Network Translation Section, and there is a performance
penalty only in unusual cases. I don't think the architectural
assumption there is onerous. And in general, elegance,
generality and simplicity, which usually translate to smaller
specifications, are greatly to be preferred.
In any case, I would prefer the section rewritten. I would
be happy to take a stab at rewriting it, if people want; it
should take an hour or less.
4. [Message Format]
The first sentence is strictly speaking not correct. It
should be:
An MCP message consists of three parts: the name of the
message, the authentication key, and a set of keywords,
with associated values.
Isn't the authkey supposed to be optional? The BNF says so.
The discussion in the next paragraph "If there is no
asterisk..." seems to be confused; it is not consistent with
the syntactic category <value>. I would recommend rewriting
this section, bringing the productions from the grammar
inline, and using the structure of the grammar (viz, the
productions for <message>) to organize the text. Then the
productions themselves do not have to be paraphrased in text ("A
keyword-value pair is sent across as an identifier, an
asterisk if the value is multiline, and a value." Huh? Sent
across? All we need to do is to specify the BNF
here. Precise, unambiguous, no commentary about structure
needed. The less said in "obvious" places, the fewer chances
to trip up, or introduce different shades of meaning.)
In the paragraph beginning "The real value is sent as a
series of lines..", it may be worth emphasizing that there is
no (MCP-induced) limit on the number of lines that may be
sent as part of a multiline message, just as there is no
limit on the length of a network message line. (Of course all
this means that the MCP assembler cannot work in bounded
space --- there may be an unbounded number of multiline
messages, each with unboundedly many lines being assembled
simultaneously --- but that is something MCP 2.1 will have to
live with...)
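The bookkeeping this implies can be sketched as follows, in Python.
This is only an illustration under assumptions: the class and method
names are hypothetical, lines are taken to be already parsed (no wire
syntax is assumed), and each message in progress is keyed by its
data-tag (the value of the _data-tag keyword). Nothing here bounds the
number of pending messages or the number of lines held for each.

```python
class Assembler:
    """Collects multiline values until the multiline-end line arrives."""

    def __init__(self):
        # data-tag -> (message name, {keyword: value or list of lines}).
        # This table can grow without limit, which is the point above.
        self.pending = {}

    def start(self, name, simple, multiline_keys, data_tag=None):
        """Handle a message line; return the message if it is complete."""
        values = dict(simple)
        if not multiline_keys:
            return (name, values)
        for key in multiline_keys:
            values[key] = []  # lines are appended as they arrive
        self.pending[data_tag] = (name, values)
        return None

    def continuation(self, data_tag, key, line):
        """Handle one message-continuation line."""
        entry = self.pending.get(data_tag)  # exact, case-sensitive match
        if entry is None or not isinstance(entry[1].get(key), list):
            return  # unknown tag or keyword: dropped in this sketch
        entry[1][key].append(line)

    def end(self, data_tag):
        """Handle the multiline-end line; return the assembled message."""
        return self.pending.pop(data_tag, None)
```

Dropping lines with an unknown data-tag is a choice made by the sketch,
not something the text discussed here settles.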
I suggest replacing the paragraph beginning with: "If there
are any multiline values..." with:
A message with a multiline value must have a keyword named
_data-tag; its associated value must be an
<unquoted-string>, called the {\em data-tag} for the
message. Data-tags are case-sensitive.
Similarly the paragraphs after that can be polished up.
The paragraph beginning "Moreover, implementations..." has
typos (first occurrence of "has been sent" should be
deleted).
5. [A note on version ranges]
This suffers from a semantic problem. It may be too late to
fix, though.
Who is to say that a particular protocol version number
exists?!? Endpoint A initiates contact with Endpoint B and
says it supports protocol foo in the range 1.0 through
2.0. Endpoint B says it supports protocol foo in the range 1.5
through 2.5. The guy who wrote B was unaware that actually
there is such a thing as an implemented 2.0. (He may be just
dumb or uninformed or may not belong to the political faction
that promulgates 2.0.) The negotiation protocol says: we
agree on 2.0 --- but of course Endpoint B has no
implementation for 2.0! It doesn't help to say that "an
implementation should advertise a range including only those
versions known to be supported". Two problems: First, the author of
B did not know about 2.0. Second, even if he did, how in the
world can he say that 1.5 and 2.5 are acceptable, but not
2.0?
Sigh. I don't think anything can be done about this at this stage.
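For concreteness, the arithmetic in the example above can be sketched in
Python as follows. I am assuming the negotiation rule is "take the
smaller of the two maxima, provided the ranges overlap", which is what
produces 2.0 in the example; the function names are hypothetical, and
versions are assumed to have the form major.minor.

```python
def parse_version(text):
    """Turn "1.5" into (1, 5) so versions compare numerically."""
    major, minor = text.split(".")
    return (int(major), int(minor))


def negotiate(a_min, a_max, b_min, b_max):
    """Return the agreed version string, or None if the ranges are disjoint."""
    lo = max(parse_version(a_min), parse_version(b_min))
    hi = min(parse_version(a_max), parse_version(b_max))
    if hi < lo:
        return None  # no version lies in both ranges
    # The smaller maximum is chosen; nothing checks that it "exists".
    return "%d.%d" % hi


print(negotiate("1.0", "2.0", "1.5", "2.5"))  # prints 2.0
```

Nothing in this computation can tell whether the chosen version actually
exists or is implemented at either end, which is exactly the problem.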
6. [Authentication Keys]
Typo at end of paragraph "In choosing an authentication
key...".
7. [mcp-negotiate package]
Probably worth saying explicitly:
An endpoint must ignore any mcp-negotiate-can or
mcp-negotiate-end messages received after the first
mcp-negotiate-end message, and any messages received for a
package for which an mcp-negotiate-can message has not yet
been received.
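A minimal sketch of that rule, in Python. The class name, the method
name, and the idea that the caller supplies the package a message
belongs to are all assumptions of the sketch, not part of the spec.

```python
class NegotiationState:
    """Decides, per incoming message, whether to process or ignore it."""

    def __init__(self):
        self.ended = False     # True once the first mcp-negotiate-end is seen
        self.packages = set()  # packages announced via mcp-negotiate-can

    def accept(self, message_name, package=None):
        """Return True if the message should be processed, False if ignored."""
        if message_name == "mcp-negotiate-can":
            if self.ended:
                return False  # arrived after the first mcp-negotiate-end
            self.packages.add(package)
            return True
        if message_name == "mcp-negotiate-end":
            if self.ended:
                return False  # only the first one counts
            self.ended = True
            return True
        # Any other message: process it only if its package was announced.
        return package in self.packages
```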
The paragraph "To avoid some of the significant.." contains a
typo ("protcol").
Consider the sentence:
Note that while the implementation may wait to send any
messages in these packages until receipt of the mcp-negotiate-end
message, it may not wait for this message before advertising
support for edit, superedit, gonzoedit, and any other package it
supports, nor may it opt not to advertise support for these
packages based on the received mcp-negotiate-can messages.
The first part (up to "..., nor") must clearly be part of the
spec. But the second part should not be, since there is no
way for any piece of software to figure out whether or not
this is the case. (If you can't enforce it, don't state it.) An
implementation can be in complete input/output compliance
with the spec while choosing to wait 10 seconds after the
connection is opened, examining the mcp-negotiate-can
messages received, and then sending mcp-negotiate-can
messages based on that. There is no way to know that it did
not do that. So there is no point in specifying the second part.
8. [Cord]
I did not look at this carefully; I would prefer the
more symmetric terminology "initiator" and "responder"
rather than "initiator" and "receiver".
9. [BNF]
Is the ":" a terminator or a separator? Might
<keyval> ::= <key> ':' <space> <value>
be better?
Also, doesn't TAB count as <space>?
Lastly, should the specification place any limit on the number of
keywords in a message? Say, no more than 64K? :-)
That's all for now.
Best, and Happy Thanksgiving,
Vijay