RE: [Nea] RE: Detecting Compromised Endpoints



One of the most common objections to NEA (and similar
systems) is "If the endpoint is compromised, you can't
trust anything it says." This issue comes up again and
again. Several people on the NEA list have asked that the
charter address it by stating that it is out of scope
for NEA (except that it will be addressed in the overall
security analysis).

The reason I started a separate thread on the topic
of Detecting Compromised Endpoints was so that we could
have a reasoned discussion of the problem and then
refer future people who raise the issue to this thread.
I said this in the first email on this thread. Thanks
to all who have contributed to the thread. I think it
has largely run its course.

I will now draft some language on this for the charter
and submit this for the mailing list to consider. I'll
use a separate thread so we can keep this thread just
for discussion of the technical problem of how and
whether compromised endpoints can be handled.

Thanks,

Steve 

> -----Original Message-----
> From: Thomas Hardjono [mailto:thardjono at signacert.com] 
> Sent: Friday, May 26, 2006 12:14 AM
> To: nea at ietf.org; Frank Yeh Jr
> Subject: RE: [Nea] RE: Detecting Compromised Endpoints
> 
> Thanks Frank - yes, you are spot on.
>  
> My understanding as to how the discussion drifted to trusted 
> hardware arose from some questions in the IETF.
>  
> Basically, some folks at the IETF were suggesting that we 
> don't bother with NEA because (a) any posture agent on the 
> client can be modified by malware (to then forge posture 
> reports); and (b) They don't believe in trusted hardware or 
> even the possibility of secure systems.
>  
> The funny thing is that virtually all protocols invented in 
> the IETF are subject to the malware modification described 
> in (a).  This includes IPsec VPN clients, routing code, NAT 
> boxes, and even the entire IP stack.
>  
> Perhaps we should just give up on the Internet :)
>  
> cheers,
>  
> /thomas/
>  
> 
>  
> ________________________________
> 
> From: Frank Yeh Jr [mailto:fyeh at us.ibm.com] 
> Sent: Thursday, May 25, 2006 5:57 PM
> To: Paul Sangster
> Cc: nea at ietf.org
> Subject: RE: [Nea] RE: Detecting Compromised Endpoints
> 
> 
> 
> Greetings,
> 
> I am a little confused by this thread as it seems to me to be 
> out of scope for the NEA work. Perhaps I added to the 
> confusion by mentioning in a previous post that NEA could 
> leverage trusted hardware. For that, I am sorry (well, sort of).
> 
> So my impression of the NEA charter is that the protocol will 
> be about format, aggregation, and transport of Posture 
> information. There does not seem to be anything related to 
> establishing trust of the various components that use the 
> protocol. However, it would be valid to use the NEA protocols 
> to enable a single collector/validator pair that would attest 
> to the integrity of all other components. I do not see why 
> this should be part of the NEA protocols themselves.
> 
> This thread probably has to do with the many TCG-aware people 
> participating in NEA, but I think that is confusing the 
> issue. However, since I said TCG, I would point out that the 
> TNC specs and the PTS spec are separate. Why then are we 
> talking about combining these functions in NEA?
> 
> Regards,
> Frank Yeh
> Corporate Security Strategy Team
> IBM
> Tivoli Software
> 
> 
> "Paul Sangster" <Paul_Sangster at symantec.com> wrote on 
> 05/25/2006 03:39:02 PM:
> 
> > 
> > Discussions like these make me wonder what the automotive safety
> > engineers must have discussed when the idea of the seat belt, air bag,
> > crumple zones ... were being considered :).  How many lives do we need
> > to save to make it useful ...
> > 
> > Hopefully we wouldn't pass on doing NEA because it's not a panacea for
> > security (it's not).  NEA itself doesn't try to solve the problems of
> > this thread; we're merely discussing whether there might be orthogonal
> > mechanisms to help prevent the collector from mis-reporting the
> > situation.  Putting a spotlight on the desire to report configuration
> > should help bring focus to protecting the collection/reporting
> > mechanisms.
> > 
> > So with that said, let's go into the interesting dialog.  Clearly you've
> > raised very interesting and challenging issues that most (all?) of our
> > other standards couldn't live up to (fortunately, there are threat
> > models to save the day by scoping them out :))
> > 
> > > -----Original Message-----
> > > From: Marcus Leech [mailto:mleech at nortel.com] 
> > > Sent: Thursday, May 25, 2006 11:20 AM
> > > To: Stephen Hanna; nea at ietf.org
> > > Subject: RE: [Nea] RE: Detecting Compromised Endpoints
> > > 
> > > 
> > > 
> > > 1) What needs to be measured?
> > > 
> > >    All the trusted code on the endpoint needs to be measured.
> > >    That includes all the software and firmware in the Trusted
> > >    Computing Base, any code whose compromise would potentially
> > >    cause the system to violate its security policies. This
> > >    includes the BIOS, the boot loader, the OS kernel, the
> > >    kernel drivers, etc. If the operating system is well written,
> > >    it should not include application code since that code
> > >    cannot cause the system to violate its security policies.
> > > 
> > > The essential problem is that for any *practical* operating system for
> > >   purposes of general-purpose computing, there *is* application code,
> > >   the compromise of which can cause the system to violate its security
> > >   policies.
> > 
> > Sure, privileged code also runs in applications so the kernel doesn't
> > need to contain everything.  Ideally OSes would limit privileges
> > (ability to bypass some security check) more granularly (e.g. Solaris
> > 10's privilege model) but many do not or the granularity isn't used so
> > point taken.
> > 
> > I think one of the areas of discussion around the use of TCG
> > verification is how to enable product vendors (or 3rd party "integrity"
> > oriented vendors) to define trust dependency trees so that a verifier can
> > determine what set of component measurements it needs to verify to feel
> > safe trusting the requesting system for a particular purpose.  For many
> > use cases, only portions of the machine need to be checked (e.g. when
> > you're just trusting a browser session, you need to check the kernel,
> > privileged apps and things relied upon by the browser).
> > 
> > I don't think this is an unsolvable problem, just a hard one.  We need
> > an ongoing dialog to understand what is running on the requesting system
> > that needs to be trusted and traverse its trust dependency tree.  For
> > NEA this might be the kernel plus the set of privileged apps and some
> > security configuration information or it might involve checking a
> > special security partition + hypervisor (depending on the system
> > architecture.)
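
The idea of traversing a trust dependency tree for a particular purpose
could be sketched roughly as follows. This is only an illustration: the
component names, the DEPENDS_ON table and the function name are all
hypothetical and do not come from any TCG or NEA specification.

```python
# Hypothetical dependency table: each component maps to the components
# it relies on. A real table would come from vendors or policy authors.
DEPENDS_ON = {
    "browser session": ["browser", "privileged apps"],
    "browser": ["kernel", "tls library"],
    "privileged apps": ["kernel"],
    "kernel": ["boot loader"],
    "boot loader": ["BIOS"],
    "mail client": ["kernel"],
}


def components_to_verify(depends_on, purpose):
    """Return every component that `purpose` depends on, directly or not."""
    seen = set()
    stack = [purpose]
    while stack:
        node = stack.pop()
        for dep in depends_on.get(node, ()):
            if dep not in seen:  # also protects against cycles
                seen.add(dep)
                stack.append(dep)
    return seen


if __name__ == "__main__":
    for name in sorted(components_to_verify(DEPENDS_ON, "browser session")):
        print(name)
```

Under these assumptions a verifier that only cares about a browser
session would ask for measurements of the browser, what it relies on,
and the boot chain beneath it, and could ignore the mail client.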
> > 
> > > 
> > > The only thing the hash does is to assert that the code is free from any
> > >   *known* vulnerabilities.  Once the system is compromised, it can continue
> > >   to provide an idyllic view to the TPM/PCR machinery.
> > 
> > The hash just records that a particular image of code was run on the
> > system (so it could later or immediately be verified [*]) in a way that
> > can't be hidden.  The TPM only allows "extends" to the PCR so trusted
> > apps can effectively add to the values but not remove prior entries.
> > The meta-data about each extend is stored in a measurement log and is
> > used for verification.  A rootkit might be able to alter the measurement
> > log to hide its existence but it wouldn't be able to change the PCRs
> > to erase its measurement.
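
The extend-only behavior described above can be sketched as a hash
chain. This is a minimal model, assuming the TPM 1.2 style of extend
(new PCR = SHA-1 of old PCR concatenated with the measurement); the
function names and the boot() helper are illustrative and are not a
TPM API.

```python
import hashlib

PCR_SIZE = hashlib.sha1().digest_size  # 20 bytes for SHA-1


def measure(image: bytes) -> bytes:
    """Hash a code image before it is run."""
    return hashlib.sha1(image).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold a measurement into the PCR; there is no inverse operation."""
    return hashlib.sha1(pcr + measurement).digest()


def boot(images):
    """Measure each (name, image) in load order; return (pcr, log)."""
    pcr = bytes(PCR_SIZE)  # PCR starts at all zeroes
    log = []
    for name, image in images:
        digest = measure(image)
        pcr = extend(pcr, digest)
        log.append((name, digest))  # meta-data kept outside the TPM
    return pcr, log


if __name__ == "__main__":
    clean, _ = boot([("bios", b"B"), ("kernel", b"K")])
    infected, _ = boot([("bios", b"B"), ("rootkit", b"R"), ("kernel", b"K")])
    print(clean.hex())
    print(infected.hex())
```

Because each new value depends on every earlier one, a measured rootkit
changes the final PCR, and getting back to the clean value would mean
finding a preimage for the hash.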
> > 
> > The way the TPM is used, it has the capability of signing the PCRs with a
> > private key for a particular attestation to a verifier.  The verifier
> > needs an equivalent measurement log (history of the PCR) to verify that
> > it correctly recomputes the aggregate current value stored in the PCR(s)
> > and can verify the meta-data makes sense and is considered valid by
> > policy.  I'm not sure how malware that was measured into a PCR could
> > hide its presence without resetting the PCR back to a desired value.
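
A rough sketch of the verifier side of this: replay the measurement log,
check that it recomputes the PCR value the TPM reported, then check each
entry against policy. It assumes a SHA-1 extend and a policy that is
simply a set of known-good digests; replay(), verify() and the sample
data are hypothetical names, not part of any specification.

```python
import hashlib


def replay(log):
    """Recompute the PCR that a log of (name, digest) entries implies."""
    pcr = bytes(20)
    for _name, digest in log:
        pcr = hashlib.sha1(pcr + digest).digest()
    return pcr


def verify(log, quoted_pcr, known_good):
    """Return (ok, reason) for a log and the PCR value the TPM reported."""
    if replay(log) != quoted_pcr:
        return False, "log does not match quoted PCR"
    for name, digest in log:
        if digest not in known_good:
            return False, "unrecognized measurement: " + name
    return True, "ok"


if __name__ == "__main__":
    bios = hashlib.sha1(b"bios").digest()
    kernel = hashlib.sha1(b"kernel").digest()
    rootkit = hashlib.sha1(b"rootkit").digest()
    policy = {bios, kernel}

    real_log = [("bios", bios), ("rootkit", rootkit), ("kernel", kernel)]
    quoted = replay(real_log)                      # what the TPM holds
    doctored = [("bios", bios), ("kernel", kernel)]  # rootkit edits the log

    print(verify(doctored, quoted, policy))
    print(verify(real_log, quoted, policy))
```

In this model the rootkit has two bad options: present a doctored log
that no longer matches the PCR, or present the true log and show up as
an unrecognized measurement.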
> > 
> > [*] - Background: the model allows for a local verifier that checks code
> > measurements against a set of known valid measurements before allowing
> > it to run. This effectively prevents code with known vulnerabilities from
> > running and establishes a trustworthy base to build upon as long as you
> > can trust the verifier (done via chain of trust.)  As Marcus correctly
> > points out this doesn't catch unknown vulnerabilities.  The TCG
> > technology that I know of doesn't do behavioral checking although such
> > technologies do exist and watch for zero-day attacks as we speak.  You
> > could imagine that these could be integrated and will evolve over time.
> > 
> > > 
> > > 
> > > 3) How does the Posture Validator get that list of PCR values
> > >    for known good configurations?
> > > 
> > >    Typically from a subscription service that maintains a
> > >    list of PCR values for various versions of software from
> > >    all the major vendors. The decision about which software
> > >    versions are "good" is a policy decision generally left
> > >    to the network owner.
> > > 
> > > You have to assume that a compromised system would be able to maintain
> > >   a consistently-idyllic view using some similar type of "subscription service".
> > >   Essentially, a compromised system can continue to update the view of the
> > >   world it wishes to present to the TCG subsystems.  The *reality*, however,
> > >   will be quite different.
> > 
> > Same comment as above.  How does the malware reset the PCRs back to the
> > idyllic values so future attestations come up clean?  Most of the PCRs
> > do not allow "set" type of operations, so this means the malware
> > couldn't use the TPM signed set of PCR values to spoof the verifier (nor
> > could it use the TPM's cert since it lacks access to the private key to
> > quote (sign) the incorrect pseudo-PCRs.)
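
Why malware without the key cannot present its own "pseudo-PCRs", and
why an old clean quote cannot be replayed, can be sketched as below.
Loud assumption: a real TPM signs the quote with an asymmetric key whose
private half never leaves the chip. To stay within the Python standard
library this sketch substitutes an HMAC, so the verifier here shares the
key, which is not how a real deployment works. All names are illustrative.

```python
import hashlib
import hmac
import secrets


def new_nonce() -> bytes:
    """Fresh verifier challenge, 20 random bytes."""
    return secrets.token_bytes(20)


def quote(key: bytes, pcr: bytes, nonce: bytes) -> bytes:
    """Stand-in for the TPM quote: a keyed tag over nonce || PCR."""
    return hmac.new(key, nonce + pcr, hashlib.sha256).digest()


def check_quote(key: bytes, pcr: bytes, nonce: bytes, tag: bytes) -> bool:
    """Verifier check: the tag must cover this nonce and this PCR value."""
    return hmac.compare_digest(quote(key, pcr, nonce), tag)


if __name__ == "__main__":
    tpm_key = secrets.token_bytes(32)   # held inside the (modelled) TPM
    clean_pcr = bytes(20)

    n1 = new_nonce()
    old_tag = quote(tpm_key, clean_pcr, n1)
    print(check_quote(tpm_key, clean_pcr, n1, old_tag))   # fresh quote

    n2 = new_nonce()                     # later challenge
    print(check_quote(tpm_key, clean_pcr, n2, old_tag))   # replayed quote

    forged = quote(b"malware has no TPM key", clean_pcr, n2)
    print(check_quote(tpm_key, clean_pcr, n2, forged))    # forged quote
```

The nonce ties each quote to one challenge, and the key ties it to the
TPM, so the malware's remaining option is to get the real TPM to quote
the real PCRs.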
> > 
> > >  
> > > 
> > > 4) Won't this prevent me from using my favorite OS?
> > > 
> > >    That depends on your network's policy. If you don't like
> > >    their policy, you're free to shop around. Remember that
> > >    the TPM is an optional component and comes disabled. You
> > >    can just leave it disabled. Or you can run your favorite
> > >    OS within an untrusted application. Or you can have an
> > >    untrusted application forward packets from your favorite OS.
> > >    Anyway, those subscription services aren't cheap. Trusted
> > >    boot will probably be mainly deployed in high-security
> > >    environments.
> > > 
> > > There's also the scenario of a NAT-like device in a TCG/TPM/NAP/NAC/whatever
> > >   world.  You have a "policy-compliant" system act as a proxy for a pool
> > >   of non-compliant nodes.  "Fixing" that kind of scenario tends to produce
> > >   a system that is unusable for ordinary use.  Perfectly secure, and utterly
> > >   unusable.
> > 
> > Not sure I follow.  Are you referring to the tunneling attack discussed
> > in the TNC's IF-T specification or a true proxy?  In the true proxy
> > case, the proxies should show up in the list of measurements required by
> > the verifier and the verifier's policy DB shouldn't allow this
> > configuration.  For the nested tunnel attack, there are already
> > suggested countermeasures and work is underway for more (not
> > unsolvable.)
> > 
> > Was your comment about "fixing" the proxy case meaning that systems need
> > to be able to run such software or else they aren't useful?  Maybe you
> > could explain more about your thinking?
> > 
> > > 
> > > 5) How does this let the server detect endpoint compromise?
> > > 
> > >    If a trusted software component on the endpoint has been
> > >    compromised, that will show up as a PCR with an unrecognized
> > >    value. Because the TPM hashes the software before it's run
> > >    and won't let the software zero the PCR (and because the
> > >    hash is preimage resistant), the compromised software can't
> > >    get the TPM to report a valid PCR value. And it can't
> > >    replay an old PCR value because of the nonce.
> > > 
> > > That only applies to "compromised on disk".  The state of said software
> > >   at some time T after it has begun executing *cannot* be attested-to
> > >   by anything.  If that were possible, then it would imply having a
> > >   general solution to the halting problem.  In fact, even the known-good
> > >   PCR values are only attesting to the fact that the software is known to
> > >   be resistant to all the vulnerabilities we *know about* in the software.
> > >   With the emergence of zero-day exploits of vulnerabilities, I wonder
> > >   how useful this will actually be.
> > 
> > Without trying to solve the halting problem you can introduce mechanisms
> > to help raise the security level of the system.  Is this useless to try
> > unless they are foolproof?  There is an endless list of security
> > mechanisms which can't address this issue (bug in an IKE implementation
> > leaks phase 2 keys, buffer overflow of a privileged daemon and now we
> > can change TLS trust anchors, timing attack against processor caches and
> > you can learn OpenSSL private keys ...)  I agree it's a hard problem but
> > hope we don't let it stop progress.  People are working on (and shipping)
> > technologies to try to catch zero-day exploits and these could be
> > integrated with other system integrity protection features to raise the
> > bar as best we can without waking Mr Turing :).
> > 
> > If a system measured all software as it's loaded and checked it was clean
> > against known vulnerabilities, recorded the measurement in a TPM (or
> > tamperproof environment) and created a measurement log, used AV and
> > behavioral techniques to watch for viruses in running code, ... and was
> > able to produce an attestation of all of this and the
> > configurations/databases it used in a non-spoofable way, this seems to
> > be better than we have today.  Add in support for separate protection
> > domains to isolate security mechanisms (and logs) and we seem to be
> > raising the bar.
> > 
> > Of course this isn't TNC, but it's something orthogonal that people in
> > the industry are working on and seems to fit in with what is being
> > planned.  Such integrity mechanisms could offer a Posture
> > Collector/Verifier pair to fit into the NEA model.
> > 
> > >    
> > > 6) Aren't there ways to attack this system?
> > > 
> > >    Yes. You can break the system if you can successfully attack
> > >    any one of these: the TPM (with physical attacks or by finding
> > >    flaws in it), the CAs that signed the TPM's certificates (or
> > >    their operators), the trusted software that's listed in the
> > >    known good configs, the cryptographic algorithms and protocols
> > >    involved, or the Network Enforcer or the NEA server or their
> > >    operators. There's also a clever attack described in Appendix A
> > >    of the TNC IF-T document. Countermeasures for that attack are
> > >    given in that document.
> > > 
> > >    I have omitted some attacks for which countermeasures are
> > >    available. If you have other attacks you're interested in,
> > >    please raise them and I'd be glad to address them.
> > > 
> > > 7) Won't there inevitably be vulnerabilities in the trusted software?
> > > 
> > >    Yes. That's true today. It will probably always be true. The best way
> > >    to handle it is to reduce the number of vulnerabilities by making
> > >    the trusted software as small and secure as possible. Beyond that,
> > >    identify vulnerabilities as soon as possible (preferably before
> > >    exploits are available), prepare fixes, and then remove the old buggy
> > >    software from the list of known good configurations (or move it to a
> > >    list of configurations that require quarantine and mandatory
> > >    remediation).
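
The two-list policy described in the answer above (known good
configurations, plus configurations that require quarantine and
remediation) might look roughly like this on the validator side. The
Decision names, decide() and retire() are hypothetical, and the PCR
values are made-up placeholders rather than real measurements.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    QUARANTINE = "quarantine"
    DENY = "deny"


def decide(pcr_hex, known_good, needs_remediation):
    """Map a reported PCR value to an access decision."""
    if pcr_hex in known_good:
        return Decision.ALLOW
    if pcr_hex in needs_remediation:
        return Decision.QUARANTINE
    return Decision.DENY  # unrecognized values are not trusted


def retire(pcr_hex, known_good, needs_remediation):
    """Move a configuration found to be vulnerable to the quarantine list."""
    known_good.discard(pcr_hex)
    needs_remediation.add(pcr_hex)


if __name__ == "__main__":
    known_good = {"aa11", "bb22"}        # placeholder PCR values
    needs_remediation = set()

    print(decide("aa11", known_good, needs_remediation))
    retire("aa11", known_good, needs_remediation)   # vulnerability announced
    print(decide("aa11", known_good, needs_remediation))
    print(decide("ffff", known_good, needs_remediation))
```

Which configurations belong on which list remains a policy decision for
the network owner; the sketch only shows the lookup.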
> > > 
> > > The problem is that *real* generally-useful computing environments are still
> > >   going to have very large components that have to be trusted to behave
> > >   correctly at all times.   If you have an operating system that has all those
> > >   desirable properties, then you don't need any of this TPM/TNC/NEA/TCG goop.
> > 
> > Ah, but this means that we do need TPM/TNC/NEA/TCG goop :).  Clearly
> > big, complex software will have security bugs that will get fixed and
> > patches released.  Hopefully NEA will help get the lack of those patches
> > recognized and the patches distributed more quickly (being more centrally
> > controlled) but will need new measurement policies provisioned quickly.
> > 
> > > 
> > > The raison d'etre for this stuff is precisely because operating systems today
> > >   are vulnerable due to a large number of factors, including broken architectures,
> > >   programming language design, human frailty, etc, etc.  And it is precisely because
> > >   of that that this stuff can't be made to work, for any sufficiently-rigorous
> > >   definition of the word "work".
> > > 
> > > The bad guys are getting cleverer, and a large fraction of the "good guys" haven't
> > >   actually done any detailed analysis of all this TCG/TNC/TPM/NEA stuff yet.
> > 
> > Having an NEA WG just helps bring exposure to this area so more detailed
> > analysis can be performed and we can improve the situation.  The
> > TCG/TNC/TPM specs are public so can be reviewed.  There have been a
> > number of good cryptographers and security people who have reviewed the
> > TCG stuff but it's a work in progress so hopefully this momentum will
> > continue.
> > 
> > >   I fully expect that there will be gaping holes found within two years.
> > 
> > That's why we have version 2 :).  History is certainly on your side:
> > SSHv1, SSLv1, Kerberos v4, ... have all needed updates to fix
> > holes, and I'm sure this will be the same.
> > 
> > > 
> > > I'd be perfectly delighted to be proven wrong.
> > 
> > Sorry but you'll still have a job for a few more years :).
> > 
> > > 
> > > 
> > > _______________________________________________
> > > Nea mailing list
> > > Nea at ietf.org
> > > https://www1.ietf.org/mailman/listinfo/nea
> > > 
> > 
> 
> 
> 





Note: Messages sent to this list are the opinions of the senders and do not imply endorsement by the IETF.