[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Sip] Is body handling NP-complete?
I would note that if this does end up in the specification, it can carry
a rider that implementations should exhibit behaviour as demonstrated by
the algorithm, not that they need to implement the algorithm itself.
However I leave it to the working group to ascertain whether they prefer
this as a means of specification.
Regards
Keith
> -----Original Message-----
> From: sip-bounces at ietf.org [mailto:sip-bounces at ietf.org] On
> Behalf Of Paul Kyzivat
> Sent: Tuesday, September 30, 2008 2:53 PM
> To: Dale.Worley at comcast.net
> Cc: sip at ietf.org
> Subject: Re: [Sip] Is body handling NP-complete?
>
> Dale,
>
> I like the thorough treatment. It seems plausible to me.
>
> One implication of this is that body parts need to be
> processed twice in a content-type+content-disposition+context
> dependent way: once to extract references, and again for
> final effect. (Hmm - is extracting of references dependent on
> the disposition and context?)
>
> If you imagine that various content types are handled by
> plugins, then the plugins will each require two entry points
> - one for each of these processing modes. While that is
> certainly possible, it might mean that plugins that already
> exist for non-sip applications (e.g. email or html) may not
> be reusable for sip. (I am not familiar with the content-type
> plugins for other protocols such as email. But I presume that
> email plugins may be useful in some cases, at least for MESSAGE.)
>
> Do plugins for email have this complexity?
>
> An alternative is to do all the processing in one pass, and
> then at the end check if all body parts have been processed.
> BUT that means all the side effects of processing will have
> been realized before discovering that the message is in
> error. That is certainly not ideal.
>
> > This example also points out that the requirement that references
> > appear before the parts they reference is needed to make
> the problem
> > tractable. Otherwise, there could be circular dependencies between
> > the choices for different multipart/alternatives, which might be
> > impossible to analyze by an algorithm as simple as a tree-walk.
>
> This requirement is satisfied by lots of simple cases. But it
> means that it is impossible to describe any kind of circular
> structure. I'm not familiar enough with the use of
> multipart/related to know if it is a restriction that is
> tolerable there. Perhaps someone can comment on that.
>
> Thanks,
> Paul
>
> Dale.Worley at comcast.net wrote:
> > Referring back to last March, in
> >
> http://www.ietf.org/mail-archive/web/sip/current/msg22426.html, I was
> > contemplating whether the rules for testing a message body required
> > searching through combinations of choices regarding which
> alternatives
> > of multipart/alternatives are chosen. As I then conceptualized the
> > problem, the question was "Is there a choice of
> alternatives such that
> > all the constraints for 'successful processing' can be
> met?" It turns
> > out that if you phrase the question that way, the problem is
> > NP-complete, which means in practice that any algorithm to solve it
> > must do backtracking.
> >
> > The new formulation of the problem attempts to avoid this
> by dividing
> > the problem into three phases: (1) assemble a tree which
> represents
> > the dependencies of super-parts' being processable upon that of
> > sub-parts, (2) walk the tree, tagging parts from the bottom
> up as to
> > whether they are processable, (3) if the overall tree is
> processable,
> > selecting for processing the correct subset of parts as directed by
> > the tags in the tree. As long as the size of the tree is
> commensurate
> > with the number of body parts, the algorithm is inherently
> linear, and
> > clearly it doesn't involve backtracking.
> >
> > In regard to step (1), the tree is assembled:
> >
> > - the root node represents the message body
> > - a node for a multipart/alternative part has one child for each
> > component
> > - a node for a multipart/mixed part has one child for each
> > component
> > - a note for a multipart/related part is not automatically given
> > children for its components
> > - a header in a part that refers to another component generates a
> > child of that part that represents the referred-to component
> > - similarly, a reference within a part that refers to another
> > component generates a child of that part that represents the
> > referred-to component (This is how multipart/related nodes get
> > children.)
> >
> > Of course, if a component is referred to multiple times,
> the component
> > is represented by multiple nodes in the tree, one node for
> each of its
> > roles.
> >
> > However, there are subtleties I might be overlooking, so I
> am going to
> > revisit the embedding of the NP-complete problem "3-satisfiability"
> > into MIME to see what it reveals about my sketch of an algorithm.
> >
> > Referring to the previous message:
> >
> > The reduction is as follows:
> >
> > http://en.wikipedia.org/wiki/NP-complete
> > http://en.wikipedia.org/wiki/Reduction_%28complexity%29
> > http://en.wikipedia.org/wiki/3-satisfiability#3-satisfiability
> >
> > Choose an instance of the 3-satisfiability problem,
> that is: a set of
> > boolean variables A, B, C, etc. and a formula composed of:
> > a series of "clauses" joined by AND, consisting of
> > a series of variables and their negations
> joined by OR.
> > The problem is to determine if there is any assignment
> of truth values
> > to the variables which makes the formula true, that is,
> which makes
> > true one of the variables or negations in each of the clauses.
> >
> > For example, choose 4 variables A, B, C, and D, and the formula
> >
> > (A OR B OR C) AND (B OR NOT-C OR D) AND (NOT-A OR NOT-B
> OR NOT-D)
> >
> > This can be satisfied by setting A=TRUE, B=FALSE,
> C=TRUE, and D=FALSE.
> >
> > Construct a multipart/mixed with each component labeled
> > handling=required. Within that, for each variable,
> have a part which
> > is multipart/alternate with 2 parts. If processing
> chooses the first
> > part of the part corresponding to a variable, that will
> model the
> > variable being TRUE. If processing chooses the second
> part of the
> > part corresponding to a variable, that will model the
> variable being
> > FALSE.
> >
> > For example, our body starts:
> >
> > multipart/mixed
> > multipart/alternative;handling=required
> > first part corresponding to A (not
> yet specified)
> > second part corresponding to NOT-A
> > multipart/alternative;handling=required
> > first part corresponding to B
> > second part corresponding to NOT-B
> > multipart/alternative;handling=required
> > first part corresponding to C
> > second part corresponding to NOT-C
> > multipart/alternative;handling=required
> > first part corresponding to D
> > second part corresponding to NOT-D
> >
> > For each clause of the formula, add an additional body
> part, marked
> > "Content-Disposition: by-reference; handling=required".
> Give the part
> > a 'cid' URL. The ability of the recipient to
> successfully process
> > this part will model making the corresponding clause TRUE.
> >
> > For example, our body is now:
> >
> > multipart/mixed
> > multipart/alternative;handling=required
> > first part corresponding to A (not
> yet specified)
> > second part corresponding to NOT-A
> > multipart/alternative;handling=required
> > first part corresponding to B
> > second part corresponding to NOT-B
> > multipart/alternative;handling=required
> > first part corresponding to C
> > second part corresponding to NOT-C
> > multipart/alternative;handling=required
> > first part corresponding to D
> > second part corresponding to NOT-D
> > part (not otherwise
> specified);by-reference;handling=required;cid=clause1
> > part (not otherwise
> specified);by-reference;handling=required;cid=clause2
> > part (not otherwise
> > specified);by-reference;handling=required;cid=clause3
> >
> > We now want to arrange things so that one of the last 3
> parts can be
> > successfully processed only if we choose for processing
> one of the
> > earlier parts that corresponds to a variable value that
> will make the
> > corresponding clause true. This is done by embedding forward
> > references in the variable-corresponding parts that refer to the
> > clause-corresponding parts.
> >
> > For example:
> >
> > 1 multipart/mixed
> > 2 multipart/alternative;handling=required
> > 3 first part corresponding to A
> > references cid:clause1
> > 4 second part corresponding to NOT-A
> > references cid:clause3
> > 5 multipart/alternative;handling=required
> > 6 first part corresponding to B
> > references cid:clause1
> > references cid:clause2
> > 7 second part corresponding to NOT-B
> > references cid:clause3
> > 8 multipart/alternative;handling=required
> > 9 first part corresponding to C
> > references cid:clause1
> > 10 second part corresponding to NOT-C
> > references cid:clause2
> > 11 multipart/alternative;handling=required
> > 12 first part corresponding to D
> > references cid:clause2
> > 13 second part corresponding to NOT-D
> > references cid:clause3
> > 14 part;by-reference;handling=required;cid=clause1
> > 15 part;by-reference;handling=required;cid=clause2
> > 16 part;by-reference;handling=required;cid=clause3
> >
> > In order to process this body, the recipient must
> choose exactly one
> > of each pair of bodies in the multipart/alternative
> body parts. This
> > choice is equivalent to choosing a set of values for
> the variables.
> > The references embedded in the chosen parts refer to
> the body parts
> > that correspond to the clauses which are made true by the
> > corresponding variable's value. In order to
> successfully process the
> > final body parts, the recipient must have chosen a set
> of variable
> > values that makes all of the clauses true.
> >
> > The resulting tree is:
> >
> > 1 -+- 2 --+- 3 ---- 14
> > | |
> > | +- 4 ---- 16
> > |
> > +- 5 --+- 6 --+- 14
> > | | |
> > | | +- 15
> > | |
> > | +- 7 ---- 16
> > |
> > +- 8 --+- 9 ---- 14
> > | |
> > | +- 10 --- 15
> > |
> > +- 11 -+- 12 --- 15
> > | |
> > | +- 13 --- 16
> > |
> > +- 14
> > |
> > +- 15
> > |
> > +- 16
> >
> > Though the number of nodes is greater than the number of
> body parts,
> > it's limited approximately by the number of appearances of
> variables
> > in the original boolean formula, so it's linear in the
> number of body
> > parts. So in building the tree, we don't have exponential effort.
> >
> > In the new approach, we walk the tree, tagging each node as
> to whether
> > it is processable. In the above model, we assume that all
> the simple
> > parts are processable, so all nodes labeled 2 through 16 are marked
> > processable. And as we tag the nodes for processability, we choose
> > for each multipart/alternative node which child will be processed
> > (namely, the last child that is processable). In this tree, the
> > choices are 2 -> 4, 5 -> 7, 8 -> 10, and 11 -> 13.
> >
> > Where things get interesting is deciding if node 1 is processable.
> > All of its children are processable, of course. But the
> crux of the
> > construction is how the "by-reference;handling=required"
> constraints
> > on nodes 14, 15, and 16 are handled. In this algorithm, we check
> > whether each part is referenced by a part that will be processed.
> > Fortunately, all references to these nodes must have been
> tree-walked
> > before node 1 is examined, so can determine unambiguously
> whether used
> > references point to those nodes. In this case, we have chosen to
> > process 2, 4, 5, 7, 8, 10, 11, and 13, which means that
> there are used
> > references to 16, 16, 15, and 16. This leaves 17
> unreferenced, and so
> > node 1 is determined to be unprocessable.
> >
> > Note that the new algorithm avoids being NP-complete essentially by
> > making choices as soon as they are seen. In this model,
> values NOT-A,
> > NOT-B, NOT-C, and NOT-D are chosen outright. Since the example
> > formula is not satisfied, the body that encodes it is not
> processable.
> >
> > This example also points out that the requirement that references
> > appear before the parts they reference is needed to make
> the problem
> > tractable. Otherwise, there could be circular dependencies between
> > the choices for different multipart/alternatives, which might be
> > impossible to analyze by an algorithm as simple as a tree-walk.
> >
> > Dale
> > _______________________________________________
> > Sip mailing list https://www.ietf.org/mailman/listinfo/sip
> > This list is for NEW development of the core SIP Protocol Use
> > sip-implementors at cs.columbia.edu for questions on current sip Use
> > sipping at ietf.org for new developments on the application of sip
> >
> _______________________________________________
> Sip mailing list https://www.ietf.org/mailman/listinfo/sip
> This list is for NEW development of the core SIP Protocol Use
> sip-implementors at cs.columbia.edu for questions on current sip
> Use sipping at ietf.org for new developments on the application of sip
>
_______________________________________________
Sip mailing list https://www.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol
Use sip-implementors at cs.columbia.edu for questions on current sip
Use sipping at ietf.org for new developments on the application of sip