*Proposed change* multiline (tagged) data



On Fri, 8 Aug 1997, Ken Fox wrote:
> I strongly oppose this proposal to eliminate _data-tag.  In my parser, this
> would greatly _complicate_ things.  Now, there is a single data tag and #$#*
> is dispatched directly to the relevant "pending request" to add the line to
> the appropriate data element.  When an end is received, the message can be
> dispatched.  Less complicated state needs to be kept around.   In the new
> way, the "request" can't be "assembled" until all the end-element lines have
> been received.

I guess Erik said it best:
> incidentally, i also suspect that the discussion of multiline field
> data tags will never be resolved to everyone's satisfaction, and we
> should think about how to break the deadlock.

So here is my proposal for breaking the deadlock in such a way that both 
"newstyle" and "oldstyle" multiline data may be used, but without dummy 
"" values or the _data-tag keyword.

[As an aside: I don't think the new way adds much complexity to the parser, 
though I did find it convenient to make datatags have a more easily 
recognizable form (by always starting them with some character, say 
'*').  With this modification, it required an hour of coding to switch 
from my old style parser to a new style one.  But whatever.]

First, I would note that the old way of doing things leaves out authkeys on 
continuation and end lines; perhaps this was discussed before I joined 
the list, but it seems like an important omission.  Somewhat more 
generally, there is no way in the current spec to pass additional 
data (key/value pairs or otherwise) on continuation lines.  So I propose 
that additional data be allowed between the tag and the continued data; 
that is, in the current spec's form

#$#*this-would be legal: now with: "the continued" data: at the end (to EOL).

Currently, only
#$#*this-would data: at the end (to EOL).
is allowed.

Before I say more about continuation lines, note that which method 
(oldstyle, one data-tag per line; or newstyle, one data-tag per value) 
is preferable depends on whether you view the initial line as the thing 
being continued, or the individual values as the things being 
continued.  In the first case, the entire line is lost if the end 
line is never received; in the second case, individual values may be 
resolved (and possibly even dispatched) even if one or more end lines 
are never received.

In either scheme, the dummy values can be done away with and the grammar 
kept regular (a list of {key delimiter value}) and the extraneous 
_data-tag keyword avoided by "assigning" the data tag to the values.  
That is, if you favor the first interpretation above (line continuation), 
then the tagged line would be

#$#multiline-data-follows foo* = datatag bar* = datatag
(here datatag is the tag for the entire line)

while if you favor the second (value continuation) it would be

#$#multiline-data-follows foo* = datatag1 bar* = datatag2
(here datatag1 and datatag2 refer to different values)

In this way, each philosophy may be accommodated by a single kind of start 
line.
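
A minimal sketch, in Python, of how a parser might read such a start line 
and group the multiline keys by the datatag assigned to them.  The function 
name group_by_datatag is hypothetical, and the sketch assumes unquoted 
single-token values and '=' as the delimiter; neither is fixed by the spec.

```python
import re

# Assumption: keys and values are unquoted tokens and '=' is the delimiter.
KEYVAL = re.compile(r"([^\s=]+)\s*=\s*(\S+)")


def group_by_datatag(start_line):
    """Return (message name, {datatag: [multiline keys sharing that tag]})."""
    if not start_line.startswith("#$#"):
        raise ValueError("not a protocol line")
    name, _, rest = start_line[3:].partition(" ")
    groups = {}
    for key, value in KEYVAL.findall(rest):
        if key.endswith("*"):  # 'foo*' marks a multiline value
            groups.setdefault(value, []).append(key[:-1])
    return name, groups
```

A start line that yields a single group is using line continuation; one 
that yields a group per key is using value continuation.  The parser does 
not need to know in advance which style the sender prefers.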

To continue the data, one needs to refer to the datatag and (because the 
tag alone may no longer uniquely identify the value being continued) to 
the key of that value.  One may also want additional information in the 
continuation line, such as authkeys.  So continuation lines would look like

#$#* tag= the_tag_being_continued possibly = "other key/values"
     the-last-one = The continuation of the multiline value.

Finally, the message would end with an end line indicating which datatag 
(and possibly also which value) has ended.  With line continuation, only 
a single end line is required.  With value continuation, a separate 
end line could be sent for each value, or a single end line for several 
values at once:

#$#end each tag which is ending


Here is an example using line continuation ("old style"):

#$#edit-verb authkey=1234 code*=xj103 object=#10 verb=3 verbnames*=xj103
#$#* tag=xj103 authkey=1234 code = A line of code here.
#$#* tag=xj103 authkey=1234 verbnames = A verbname here.
...
#$#end tag=xj103

And here is the same example using value continuation ("new style"):

#$#edit-verb authkey=1234 code*=xj103 object=#10 verb=3 verbnames*=mopdt
#$#* tag=xj103 authkey=1234 code = A line of code here.
#$#* tag=mopdt authkey=1234 verbnames= A verbname here.
...
#$#end xj103 mopdt

(or each value could be ended individually using #$#end xj103 and 
#$#end mopdt )
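
To make the claim concrete, here is a rough sketch of a receiver that 
handles both examples with the same code.  Everything in it (the Assembler 
class, its method names, the dict layout) is hypothetical.  It assumes 
single-token unquoted values on start lines, '=' as the delimiter, that at 
most one space after the delimiter belongs to the syntax rather than the 
data, and that an end line may name its tags either bare or as tag=...; it 
reads the authkey on continuation lines but does not check it.

```python
import re

KEYVAL = re.compile(r"([^\s=]+)\s*=\s*(\S+)")  # key=value on a start line
KEY = re.compile(r"\s*([^\s=]+)\s*=")          # next key on a continuation line
TOKEN = re.compile(r"\s*(\S+)")                # single-token value


class Assembler:
    def __init__(self):
        self.pending = []  # messages with at least one datatag still open

    def _owner(self, tag):
        for msg in self.pending:
            if tag in msg["open"]:
                return msg
        return None

    def feed(self, line):
        """Consume one line; return the (name, args) messages it completes."""
        if not line.startswith("#$#"):
            raise ValueError("not a protocol line")
        body = line[3:]
        if body.startswith("*"):
            self._continue(body[1:])
            return []
        name, _, rest = body.partition(" ")
        if name == "end":
            return self._end(rest)
        msg = {"name": name, "args": {}, "open": {}}
        for key, value in KEYVAL.findall(rest):
            if key.endswith("*"):  # multiline key: value is its datatag
                msg["open"].setdefault(value, []).append(key[:-1])
                msg["args"][key[:-1]] = []
            else:
                msg["args"][key] = value
        if not msg["open"]:
            return [(name, msg["args"])]
        self.pending.append(msg)
        return []

    def _continue(self, text):
        pos, extras = 0, {}
        while True:
            m = KEY.match(text, pos)
            if m is None:
                raise ValueError("no open multiline key on continuation line")
            key = m.group(1)
            msg = self._owner(extras.get("tag"))
            if msg is not None and key in msg["open"][extras["tag"]]:
                data = text[m.end():]
                # Assumption: one space after the delimiter is syntax, not data.
                msg["args"][key].append(data[1:] if data.startswith(" ") else data)
                return
            tok = TOKEN.match(text, m.end())
            if tok is None:
                raise ValueError("missing value for key " + key)
            extras[key] = tok.group(1)  # e.g. tag, authkey (not verified here)
            pos = tok.end()

    def _end(self, rest):
        done = []
        for word in rest.split():
            tag = word[4:] if word.startswith("tag=") else word
            msg = self._owner(tag)
            if msg is None:
                continue  # unknown or already-ended tag: ignore
            del msg["open"][tag]
            if not msg["open"]:  # every tag of this message has ended
                self.pending.remove(msg)
                done.append((msg["name"], msg["args"]))
        return done
```

Note that the final key on a continuation line can only be recognized 
because the start line already declared which keys are multiline for that 
tag; without that, "code = A line" would be indistinguishable from an 
ordinary key/value pair.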


In this way, both styles can be accommodated with a single syntax.  The 
necessary changes to the currently proposed BNF grammar are small:

<message-continue> := '*' [<space> <authkey>] <keyval>* <space> 
                                      <simple-key> <space> <line>
<message-end> := 'end' <space> <datatag>*


Btw, the current spec is missing spaces; this can be fixed by changing 
these productions to:
<message> := <message-name> [<space> <auth-key>] <keyval>*
<keyval> := <space> <key> <space> <value>

although personally I also like not including the delimiter as part of 
the key and also allowing arbitrary whitespace between keys and values, 
which would involve these changes:

<keyval> := <space> <key> <space>* <delimiter> <space>* <value>
<simple-key> := <ident>
<multiline-key> := <ident> '*'
<delimiter> := ':' (though I prefer '=')
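
As a rough illustration of the whitespace-tolerant <keyval> production, 
here is a Python sketch.  The spec's definition of <ident> and <value> is 
not reproduced in this message, so the character classes below (and the 
optional double-quoted value) are my assumptions, as is the helper name 
parse_keyvals.

```python
import re

DELIMITER = "="  # the grammar above uses ':'; '=' is the stated preference

# Assumptions: <ident> is letters, digits, '_' and '-'; <value> is either a
# double-quoted string or a run of non-space characters.
KEYVAL = re.compile(
    r"[ ]+(?P<key>[A-Za-z_][A-Za-z0-9_-]*)(?P<multi>\*?)"
    r"[ ]*" + re.escape(DELIMITER) + r"[ ]*"
    r'(?P<value>"[^"]*"|\S+)'
)


def parse_keyvals(text):
    """Return (key, is_multiline, value) triples; reject anything left over."""
    out, pos = [], 0
    while pos < len(text):
        m = KEYVAL.match(text, pos)
        if m is None:
            raise ValueError(f"bad key/value at column {pos}")
        value = m.group("value")
        if value.startswith('"'):
            value = value[1:-1]  # drop the surrounding quotes
        out.append((m.group("key"), m.group("multi") == "*", value))
        pos = m.end()
    return out
```

Keeping the '*' and the delimiter out of the key itself means the same 
pattern serves both <simple-key> and <multiline-key>; the caller only has 
to look at the is_multiline flag.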



michael
brundage@ipac.caltech.edu