This summer, former NSA contractor Edward Snowden became famous for revealing PRISM, a confidential mass surveillance program run by US agencies to eavesdrop on phone conversations.
On July 31st, the Guardian revealed more documents leaked by Snowden, including what appear to be top secret PowerPoint presentations from 2008 and 2010, pitching a global Internet surveillance tool called Xkeyscore.
These documents are meant to back up a statement that Snowden had made earlier:
"I, sitting at my desk, certainly had the authorities to wiretap anyone, from you, or your accountant, to a federal judge, to even the President if I had a personal email."
Since then, the NSA has issued several official answers (including the Black Hat keynote), and much has been said and speculated in the media about it, making the whole thing somewhat blurry. More or less successful attempts to popularize the technical aspects, and the fact that political opinions tainted the debate, did not help.
Based on an analysis of the leaked material, and on deduction, this FAQ aims to shed light on it all and to provide concrete, clear-cut answers.
The Guardian provided all slides from the 2008 presentation, but only barely readable excerpts from the 2010 presentation. We feel there are slight differences between the slides themselves and the Guardian's analysis of the situation; overall, we can say with a high degree of certainty that Xkeyscore:
Looks for certain items in those sessions (so-called "metadata"), and indexes them in a distributed database. Indexed items picked from TCP sessions include:
The TCP sessions eavesdropped on are fully logged (the presentation explicitly says "full-take data is stored"), indexed by the metadata above, and, based on a configurable set of filters, kept for a certain amount of time on various databases, in a pyramid fashion (i.e., the less data there is, the longer it is kept; most data is kept for 3 days).
NSA analysts can query the database in various ways. Examples appearing on slides in the two presentations include:
Beyond these "selector-based queries" (selectors are criteria used to search the database, as in the queries listed above), analysts seem to be able to mine for anomalies, to uncover previously unseen intelligence.
One has to keep in mind that the first presentation dates back to 2008; back then, HTTPS was not widely deployed, and services like Facebook, Twitter, and webmail providers did not all default to HTTPS.
In 2013, however, it would seem that eavesdropping on HTTP-only traffic is useless: a typical user will resort to Gmail/Yahoo/Hotmail for email and Facebook for social/chat, which all default to HTTPS, and to Google for searching, which defaults to HTTP when not logged in, but can easily be forced to HTTPS.
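To make the indexing, retention and selector-query ideas above more concrete, here is a minimal Python sketch. It is purely illustrative and is not based on any leaked implementation detail: the class and field names are hypothetical, and the 30-day metadata retention is an assumption (only the 3-day full-take figure comes from the slides).

```python
from collections import defaultdict
from dataclasses import dataclass

# Retention tiers in days. Only the 3-day full-take figure appears in the
# slides; the metadata figure is an assumption made for illustration.
FULL_TAKE_DAYS = 3
METADATA_DAYS = 30


@dataclass
class Session:
    session_id: int
    age_days: int
    metadata: dict      # e.g. {"email": "a@example.com", "country": "PK"}
    payload: bytes = b""


class SessionIndex:
    """Toy index: maps (field, value) selectors to captured sessions."""

    def __init__(self):
        self._sessions = {}
        self._index = defaultdict(set)  # (field, value) -> set of session ids

    def add(self, session):
        self._sessions[session.session_id] = session
        for key, value in session.metadata.items():
            self._index[(key, value)].add(session.session_id)

    def query(self, **selectors):
        """Return the sessions that match every given selector."""
        ids = None
        for key, value in selectors.items():
            matches = self._index.get((key, value), set())
            ids = set(matches) if ids is None else ids & matches
        return [self._sessions[i] for i in sorted(ids or ())]

    def expire(self):
        """Pyramid retention: drop payloads first, metadata much later."""
        for sid, session in list(self._sessions.items()):
            if session.age_days > METADATA_DAYS:
                for key, value in session.metadata.items():
                    self._index[(key, value)].discard(sid)
                del self._sessions[sid]
            elif session.age_days > FULL_TAKE_DAYS:
                session.payload = b""  # keep metadata, discard full take
```

A query such as `index.query(email="a@example.com", country="PK")` intersects the session sets of both selectors, which is the basic mechanism any selector-based search over indexed metadata would rely on.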
At this point, we see only 3 ways to effectively eavesdrop on HTTPS traffic:
Obtaining the private keys of the targeted domains (facebook.com, google.com...). It would imply that companies like Facebook and Google gave their keys away to the NSA, either willingly, or under legal (and other) pressure. Or that the keys were stolen.
Performing a Man-in-the-middle attack on sessions to targeted domains. For the attack to be successful, the NSA would need to possess valid certificates for domains such as google.com and facebook.com. This implies that certificate authorities (Symantec/VeriSign, GoDaddy...) either willingly signed NSA keys for that purpose, or got hacked. Or that NSA analysts found collisions in the signing algorithm.
Breaking the RSA or AES encryption itself. As of this writing, no published academic work on cryptography proves or suggests that this is possible in 2013.
None of these hypotheses can be confirmed, nor can we confirm whether HTTPS and other encrypted traffic is actually decrypted.
If HTTPS (and SSL in general) traffic is indeed not decrypted, you should enforce the use of HTTPS-enabled services whenever possible on the corporate network, and set HTTPS up for your own web services, especially services granting remote access to tools in your various business units. Proper certificate usage (algorithm and key length) should be enforced. On FortiGates, you can configure application control to accept only HTTPS requests.
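On the client side, the same policy can be approximated in scripts and internal tools. The following Python sketch, using only the standard library, refuses plain HTTP URLs and builds a TLS context that verifies certificates and hostnames. The helper names `require_https` and `strict_client_context`, and the choice of TLS 1.2 as the minimum version, are our own assumptions for illustration, not a FortiGate feature.

```python
import ssl
from urllib.parse import urlparse


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses HTTPS, otherwise raise ValueError."""
    if urlparse(url).scheme.lower() != "https":
        raise ValueError(f"refusing non-HTTPS URL: {url}")
    return url


def strict_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    ctx = ssl.create_default_context()  # CERT_REQUIRED + hostname checks
    # Assumption: reject protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The context can then be passed to standard calls, for example `urllib.request.urlopen(require_https(url), context=strict_client_context())`.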
Now, if HTTPS/SSL traffic is being eavesdropped on by the NSA, it means one of the 3 hypotheses above is valid.
In case hypothesis 1 is valid, there is nothing you can do to prevent communication with external services (Facebook, Google...) from being spied on. But to prevent communication with your own services from being spied on, you would have to make sure your domains' private keys are stored in a very safe place, and are not handed over to your Certificate Authority in order to outsource management of the certificates.
In case hypothesis 2 is valid, it means the NSA has a tendency to operate Man-in-the-middle attacks, armed with valid certificates. To avoid falling victim to such an attack, use Pre-Shared Keys (PSK) rather than certificates for your VPNs. If you run websites, connect to them as a normal client, and check that the certificate's fingerprint is what it should be. Avoid MD5 at all costs for your certificate fingerprints (risk of collisions).
In case hypothesis 3 is valid, for your VPNs and websites, you may use longer keys to slow down cryptanalysis, and use newer cryptographic standards like Elliptic Curve (EC) and SHA-3 (Keccak) whenever available. For reference, we think that in terms of confidence in algorithms:
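The fingerprint check described above can be scripted. The sketch below, in Python with the standard library only, fetches a server's certificate as a normal client would and compares its SHA-256 fingerprint against a value you recorded out of band. The function names are hypothetical, and the expected fingerprint must come from a trusted source (for example, computed directly from the certificate file on your own server).

```python
import hashlib
import hmac
import socket
import ssl


def fingerprint_sha256(der_cert: bytes) -> str:
    """Hex SHA-256 fingerprint of a DER-encoded certificate (not MD5)."""
    return hashlib.sha256(der_cert).hexdigest()


def fetch_der_certificate(host: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Connect as a regular TLS client and return the server's leaf certificate."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert(binary_form=True)


def matches_pin(der_cert: bytes, expected_hex: str) -> bool:
    """Compare against a pinned fingerprint; accepts 'AA:BB:..' or 'aabb..'."""
    normalized = expected_hex.replace(":", "").strip().lower()
    return hmac.compare_digest(fingerprint_sha256(der_cert), normalized)
```

Typical use would be `matches_pin(fetch_der_certificate("www.example.com"), PINNED)`, where `PINNED` is the fingerprint you expect. A mismatch does not prove an attack (the certificate may simply have been renewed), but it is a signal worth investigating.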