
Some comments on the spec...



Here is what a fresh (but wizened!) pair of eyes, untainted by
any recent examination of an implementation, or by any recent
discussion on this list :-) saw...

More seriously, feel free to ignore any or all of the
comments. My comments are motivated purely from wanting to see a
coherent (ah, that dirty word) and clean document.

My most substantive comments are about the specification of
in-band and out-of-band lines.

Editorially, I would very much prefer a structure in which
sections, subsections, and paragraphs are numbered, and in which
the words and phrases are chosen carefully. This will give it the
look and feel of a spec. The overall organization is fine, but
the text can still be polished in many places.

The comments, in sequence:

1. [Definition of in-band lines] If you are going to use the
   notion of a line being quoted, you should define it, or have a
   forward reference to the place where you are defining it so
   that it is clear that this is a technically precise term.

   [I am going to argue that there should be no such quoting.]

2. [Definition of Messages] Awkward text. Suggested rephrasing:

    An MCP {\em message} takes one of two forms. It may consist of a
    single message line, which specifies the name of the message
    and a set of keywords with their values. Alternatively, if any
    of the keywords are marked {\em multiline}, that message line
    is followed by zero or more {\em message-continuation} lines
    (specifying the values for the multiline keywords), and then
    by a single {\em multiline-end} message line.
 
3. [Network Line Translation section]
   Here is how I reconstruct what seems to be the desired
   computational model.

   It seems that the desired structure of an MCP processor is as
   follows: 

    (incoming lines) -> [Splitter] 
                          |     |
                          |     --------------- 
                          |                   |
                          v                   v
                  (in-band network lines)   (out of band network lines)
                          |                   |
                          v                   v
                  [In-band processor]       [MCP assembler]
                                              |
                                              v
                                            (MCP messages)
                                              |  |     |
                                              v  v ... v
                                             [MCP modules]
                                                                   
   Incoming lines are examined by the Splitter component, and
   divided into in-band network lines or out of band network
   lines. The former are fed into an In-band processor (the
   semantics of which are not constrained by this specification).
   The out of band network lines are fed into an MCP assembler
   component which emits MCP messages (with a given structure,
   namely, message name, auth id, a set of keywords, with
   associated values); these messages are further processed by
   MCP module components, whose identity is determined in some
   way from the message. 

   Now, the question is how the Splitter should work. The basic
   idea is to examine whether the network line begins with the string
   "#$#". A line is an out-of-band network line exactly when it
   does.
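
   A minimal sketch of that basic Splitter rule, for concreteness.
   The function name and the ("kind", text) return shape are my own
   illustrative choices, not anything taken from the spec:

```python
from typing import Tuple

OOB_PREFIX = "#$#"  # the out-of-band marker discussed above


def split_line(line: str) -> Tuple[str, str]:
    """Classify one network line.

    Returns ("out-of-band", text after the prefix) when the line starts
    with "#$#", and ("in-band", the unchanged line) otherwise.
    """
    if line.startswith(OOB_PREFIX):
        return ("out-of-band", line[len(OOB_PREFIX):])
    return ("in-band", line)
```

   Note that with this rule alone, a line starting with #$# can
   never reach the In-band processor.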

   The problem is that lines beginning with #$# can then never be
   in-band network lines. This goes against the desire that it
   should be possible for any line to be treated as an in-band line.

   It seems to me that there are three possible solutions to this
   problem:

   1. Set aside a special protocol:

     #$#mcp-quote <authkey> line: <value>
   
    This protocol will call the In-band processor with
    <value>. 

    The advantage of this scheme is that the splitter is
    completely trivial. Only in the specialized case of wanting
    to have a #$# in-band line is some specialized processing
    invoked ... and then it is invoked in the standard way, as an MCP
    message. 

    The disadvantage is that one has to make the (modest) architectural
    assumption that the In-band processor is callable from MCP
    modules. 

    The counter to this is that an implementation could, of
    course, choose to implement the parsing of this particularly
    simple MCP message inline, i.e. in the splitter. 

  2. Set aside a special protocol that does not even use the
     keyword value syntax:

     #$#" <value>

     This effectively provides a built-in interpretation for the
     MCP message `"'; <value> is emitted as an in-band line by
     the splitter. 

     This is halfway between 1 and 3. 

  3. Introduce a new classification of network lines: out-of-band
     (those beginning with #$#), quoted in-band (those beginning
     with #$") and in-band (the remaining). The quoted in-band line

       #$"<value>

     is processed by the splitter by emitting <value> as an
     in-band line.  (Of course, <value> could start with #$# or
     #$" or whatever.)   

     The advantage of this scheme is that it is conceptually
     and implementationally simple; the disadvantage is that it
     requires the introduction of a new category of network line
     (quoted in-band). 
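
   To make the comparison concrete, here is a rough sketch of the
   splitter under schemes 2 and 3. All function names are
   hypothetical, and I am assuming that in scheme 2 exactly one
   space separates #$#" from <value>. Scheme 1 is not shown, since
   it needs the full keyword-value parser rather than a prefix test.

```python
from typing import Tuple

OOB = "#$#"           # out-of-band prefix
QUOTE_3 = '#$"'       # scheme 3: quoted in-band prefix
QUOTE_2 = '#$#" '     # scheme 2: built-in `"` message plus one space (assumed)


def split_scheme3(line: str) -> Tuple[str, str]:
    """Scheme 3: three categories, decided by prefix alone."""
    if line.startswith(OOB):
        return ("out-of-band", line[len(OOB):])
    if line.startswith(QUOTE_3):
        # Quoted in-band: strip the quote prefix, emit the rest in-band.
        return ("in-band", line[len(QUOTE_3):])
    return ("in-band", line)


def split_scheme2(line: str) -> Tuple[str, str]:
    """Scheme 2: the quote form is checked before the generic #$# test."""
    if line.startswith(QUOTE_2):
        return ("in-band", line[len(QUOTE_2):])
    if line.startswith(OOB):
        return ("out-of-band", line[len(OOB):])
    return ("in-band", line)


def quote_scheme3(line: str) -> str:
    """Sender side of scheme 3: quote only lines that would be misread."""
    if line.startswith(OOB) or line.startswith(QUOTE_3):
        return QUOTE_3 + line
    return line
```

   Under scheme 3, quoting and then splitting returns the original
   line as in-band for any input, including lines that themselves
   start with #$# or #$".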

   Any of these three schemes is acceptable to me; I would prefer
   (1) and (3) over (2). Between (1) and (3), I confess to a
   preference for (1): it does not require the introduction of
   any new conceptual category, it can considerably simplify the
   Network Line Translation section, and it incurs a performance
   penalty only in unusual cases. I don't think the architectural
   assumption there is onerous. And in general, elegance,
   generality and simplicity, which usually translate to smaller
   specifications, are greatly to be preferred.

   Whichever scheme is chosen, I would prefer the section
   rewritten. I would be happy to take a stab at rewriting it, if
   people want; it should take an hour or less.

 4. [Message Format]
    The first sentence is strictly speaking not correct. It
    should be:

     An MCP message consists of three parts: the name of the
     message,  the authentication key, and a set of keywords,
     with associated values. 

    Isn't the authkey supposed to be optional? The BNF says so.

    The discussion in the next paragraph "If there is no
    asterisk..." seems to be confused; it is not consistent with
    the syntactic category <value>. I would recommend rewriting
    this section, bringing the productions from the grammar
    inline, and using the structure of the grammar (viz, the
    productions for <message>) to organize the text. Then the
    productions themselves do not have to be paraphrased in text ("A
    keyword-value pair is sent across as an identifier, an
    asterisk if the value is multiline, and a value." Huh? Sent
    across? All we need to do is to specify the BNF
    here. Precise, unambiguous, no commentary about structure
    needed.  The less said in "obvious" places, the fewer chances
    to trip up, or introduce different shades of meaning.)


    In the paragraph beginning "The real value is sent as a
    series of lines..", it may be worth emphasizing that there is
    no (MCP-induced) limit on the number of lines that may be
    sent as part of a multiline message, just as there is no
    limit on the length of a network message line. (Of course all
    this means that the MCP assembler cannot work in bounded
    space --- there may be an unbounded number of multiline
    messages, each with unboundedly many lines being assembled
    simultaneously --- but that is something MCP 2.1 will have to
    live with...) 
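
    A sketch of why the assembler's space is unbounded. It works on
    already-parsed events rather than on the wire syntax, and the
    class and method names are hypothetical; the only thing taken
    from the spec is that pending multiline messages are told apart
    by their data-tag.

```python
from typing import Dict, List, Optional, Tuple

# (message name, {multiline keyword: lines received so far})
Pending = Tuple[str, Dict[str, List[str]]]


class Assembler:
    def __init__(self) -> None:
        # One entry per multiline message still being assembled.
        # Nothing bounds the number of entries or the lines in each.
        self.pending: Dict[str, Pending] = {}

    def start(self, tag: str, name: str, multiline_keys: List[str]) -> None:
        """A message line announcing multiline keywords was received."""
        self.pending[tag] = (name, {key: [] for key in multiline_keys})

    def continuation(self, tag: str, key: str, line: str) -> None:
        """A message-continuation line: append to the tagged message."""
        entry = self.pending.get(tag)
        if entry is not None and key in entry[1]:
            entry[1][key].append(line)

    def end(self, tag: str) -> Optional[Pending]:
        """A multiline-end line: emit the message and free its state."""
        return self.pending.pop(tag, None)
```

    Continuation lines for different data-tags may interleave
    freely, so every started message must be held until its
    multiline-end line arrives.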

    I suggest replacing the paragraph beginning with: "If there
    are any multiline values..." with:

      A message with a multiline value must have a keyword named
      _data-tag; its associated value must be an
      <unquoted-string>, called the {\em data-tag} for the 
      message. Data-tags are case-sensitive. 

   Similarly the paragraphs after that can be polished up. 

   The paragraph beginning "Moreover, implementations..." has a
   typo (the first occurrence of "has been sent" should be
   deleted). 


 5. [A note on version ranges] 
    This suffers from a semantic problem. It may be too late to
    fix, though. 

    Who is to say that a particular protocol version number
    exists?!? Endpoint A initiates contact with Endpoint B and
    says it supports protocol foo in the range 1.0 through
    2.0. Endpoint B says it supports protocol foo in the range 1.5
    through 2.5. The guy who wrote B was unaware that actually
    there is such a thing as an implemented 2.0. (He may be just
    dumb or uninformed or may not belong to the political faction
    that promulgates 2.0.) The negotiation protocol says: we
    agree on 2.0 --- but of course Endpoint B has no
    implementation for 2.0! It doesn't help to say that "an
    implementation should advertise a range including only those
    versions known to be supported". Two problems: First, the
    author of B did not know about 2.0. Second, even if he did,
    how in the world can he say that 1.5 and 2.5 are acceptable,
    but not 2.0?

    Sigh. I don't think anything can be done about this at this stage.
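
    The scenario can be sketched as follows. The selection rule
    here (take the smaller of the two maxima if the ranges overlap)
    is my reconstruction from the example above, and the function
    name and the (major, minor) tuple representation are
    illustrative assumptions.

```python
from typing import Optional, Tuple

Version = Tuple[int, int]  # (major, minor), e.g. (1, 5) for "1.5"


def negotiate(a_min: Version, a_max: Version,
              b_min: Version, b_max: Version) -> Optional[Version]:
    """Pick a version from two advertised ranges, or None if disjoint."""
    highest = min(a_max, b_max)  # highest version inside both ranges
    lowest = max(a_min, b_min)   # lowest version inside both ranges
    return highest if highest >= lowest else None


# Endpoint A advertises 1.0 through 2.0; Endpoint B advertises 1.5 through 2.5.
agreed = negotiate((1, 0), (2, 0), (1, 5), (2, 5))

# But B's author only ever implemented these two versions:
b_implemented = {(1, 5), (2, 5)}
b_can_speak_agreed = agreed in b_implemented
```

    The negotiation succeeds on a version that B has no code for,
    and a contiguous range gives B no way to say so.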


  6. [Authentication Keys]
    Typo at end of paragraph "In choosing an authentication
    key...". 

  7. [mcp-negotiate package]
    Probably worth saying explicitly:

    An endpoint must ignore any mcp-negotiate-can or
    mcp-negotiate-end messages received after the first
    mcp-negotiate-end message, and any messages received for a
    package for which an mcp-negotiate-can message has not yet
    been received. 
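
    A sketch of how an endpoint might apply that rule to incoming
    messages. The class name and the split of a message into
    (package, message) are hypothetical, and I am assuming that
    mcp-negotiate itself needs no prior advertisement, since
    otherwise negotiation could never begin.

```python
from typing import Optional, Set


class NegotiationFilter:
    def __init__(self) -> None:
        self.negotiation_ended = False
        self.advertised: Set[str] = set()  # packages the peer has announced

    def accept(self, package: str, message: str,
               advertised: Optional[str] = None) -> bool:
        """Return True to process the message, False to ignore it.

        `advertised` is the package named inside an mcp-negotiate-can
        message; it is unused for every other message.
        """
        if package == "mcp-negotiate":
            if self.negotiation_ended:
                return False  # can/end after the first end: ignore
            if message == "end":
                self.negotiation_ended = True
            elif message == "can" and advertised is not None:
                self.advertised.add(advertised)
            return True
        # Any other package: only once the peer has advertised it.
        return package in self.advertised
```

    For example, an edit message arriving before the peer's
    mcp-negotiate-can for edit is dropped, and a late
    mcp-negotiate-can after mcp-negotiate-end changes nothing.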

    The paragraph "To avoid some of the significant.." contains a
    typo ("protcol").

    Consider the sentence:

     Note that while the implementation may wait to send any
     messages in these packages until receipt of the mcp-negotiate-end
     message, it may not wait for this message before advertising
     support for edit, superedit, gonzoedit, and any other package it
     supports, nor may it opt not to advertise support for these
     packages based on the received mcp-negotiate-can messages. 

    The first part (up to "..., nor") must clearly be part of the
    spec. But the second part should not be, since there is no
    way for any piece of software to figure out whether or not
    this is the case. (If you can't enforce it, don't state it.) An
    implementation can be in complete input/output compliance
    with the spec while choosing to wait 10 seconds after the
    connection is opened, examining the mcp-negotiate-can
    messages received, and then sending mcp-negotiate-can
    messages based on that. There is no way to know that it did
    not do that. So there is no point in specifying the second part. 
 
  8. [Cord]
     I did not look at this carefully; I would prefer the
     more symmetric terminology "initiator" and "responder"
     rather than "initiator" and "receiver". 

  9. [BNF]
      Is the ":" a terminator or a separator? Might 
       <keyval> ::= <key> ':' <space> <value>
      be better? 
   
      Also, doesn't TAB count as <space>? 
  
Lastly, should the specification place any limit on the number of
keywords in a message? Say, no more than 64K? :-)
 
That's all for now.

Best, and Happy Thanksgiving,

Vijay