Encryption and Decryption
After the Paris attacks a lot is being said about encryption (and decryption) on mobile devices. A lot of it borders on utterly fantastic nonsense. People are taking the hype written into marketing material seriously. That is not a good place for this debate to be.
Let's start at the beginning. Sometime after Sept 11, 2001, a massive electronic surveillance machine was set up. The machine is able to download large quantities of data in real time from all manner of devices. How exactly it does this has been discussed by Snowden et al., but frankly I don't care about that. As mobile devices are the most common means of communication, a lot of the data collected is from these devices. What happens next should be quite familiar to those of you who work with "big data".
The data collected is archived and sorted into two bins, "relevant" and "irrelevant". Relevance is defined from the perspective of the national security mechanism; it has nothing to do with local security mechanisms (such as police agencies).
In the interest of conserving resources, "irrelevant" data is largely discarded after some level of pattern analysis. The "relevant" data is stored and processed until as much predictive information as possible is squeezed out of it.
Within the "relevant" data is a small subset of encrypted communications. As only the communication (and not its metadata) is encrypted, it is possible to perform a certain level of analysis on it. For example, Abu Jihad sends an encrypted message to his fedayeen Abu-soon-to-be-dead. The message itself is encrypted, but the fact that both Abu-whatevers use unregistered phones is in the metadata. A simple filter can catch conversations between unregistered numbers.
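To make the metadata filter concrete, here is a minimal sketch in Python. The record layout (`CallRecord`), the registry set and the phone numbers are all made up for illustration; real metadata is far messier than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical metadata record: who contacted whom, and when.
    caller: str
    callee: str
    timestamp: int  # seconds since some epoch


# Hypothetical registry of numbers tied to a known subscriber.
REGISTERED = {"555-0100", "555-0101"}


def unregistered_pairs(records, registered):
    """Keep only records where neither endpoint is a registered number."""
    return [
        r for r in records
        if r.caller not in registered and r.callee not in registered
    ]


records = [
    CallRecord("555-0100", "555-0101", 1000),  # both registered
    CallRecord("555-0199", "555-0100", 1100),  # one registered
    CallRecord("555-0198", "555-0199", 1200),  # neither registered
]

flagged = unregistered_pairs(records, REGISTERED)
for r in flagged:
    print(r.caller, "->", r.callee)
```

Note that the filter never looks at message content. Everything it needs is in the unencrypted envelope.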
Once unregistered numbers that talk to each other are identified, it is possible to look for patterns in the correlated metadata sets and build up a relational database of such numbers. A clustering algorithm can tell you if there is an anomalously high frequency of communication between two or more points in this database.
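As a rough illustration of the "anomalously high frequency" idea, the sketch below counts contacts per pair of numbers and flags pairs that talk far more often than the typical pair. This is a plain frequency threshold standing in for a real clustering algorithm; the function name, the `factor` parameter and the sample data are my own assumptions.

```python
from collections import Counter
from statistics import median


def anomalous_pairs(contacts, factor=3.0):
    """Flag pairs whose contact count exceeds `factor` times the median count.

    Direction is ignored: (A, B) and (B, A) count as the same pair.
    """
    counts = Counter(tuple(sorted(pair)) for pair in contacts)
    typical = median(counts.values())
    return {pair: n for pair, n in counts.items() if n > factor * typical}


# Made-up contact log between four unregistered numbers A, B, C and D.
contacts = (
    [("A", "B")] * 6 + [("B", "A")] * 6   # A and B talk constantly
    + [("B", "C")] * 2
    + [("A", "D")] * 2
    + [("C", "D")]
    + [("D", "B")]
)

hot = anomalous_pairs(contacts)
print(hot)
```

A real system would also look at timing, location and the shape of the wider contact graph, but even a count like this narrows down where to spend decryption effort.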
Once you have identified the target(s), you can start the decryption. It doesn't make sense to start decrypting before you have done all the necessary filtering, because decryption is a resource-intensive process.
So now - we talk about encryption and decryption.
Decryption is a lot easier when you know what kind of encryption is used. As indicated before, there are two main ways of encrypting: a Vernam cipher (the one-time pad), or a public-private key pair. The ways of dealing with a Vernam cipher are well discussed in books; it is rarely used except by major intelligence and military services. This is because the Vernam cipher needs a unique secret key, as long as the message itself, for every message. These keys are expensive to generate, and one has to maintain the security of the entire secret key distribution chain. Only a major intelligence or military service has the budget to do things like that.
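For those who have not seen one, a Vernam cipher is just an XOR of the message with a random key of the same length. The sketch below is illustrative only (the function name and the message are mine), but it shows why key distribution is the expensive part: every message consumes as many key bytes as it has message bytes, and both ends must already hold them.

```python
import secrets


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length. The same call encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at dawn"
key = secrets.token_bytes(len(message))  # fresh random key, never to be reused

ciphertext = vernam(message, key)
recovered = vernam(ciphertext, key)

print(ciphertext.hex())
print(recovered)
```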
I want to focus on the issue of decrypting the more commonly used public-private key system (RSA being the standard example; the Data Encryption Standard, DES, is a symmetric cipher and works differently).
Assume for a moment that the Abu-whatevers are talking to each other almost every 12 hours. If a message between them is intercepted, then you have 12 hours to decrypt it. Without getting into too much detail, breaking RSA involves prime factorization (yes, that thing you learn in middle school and later forget). In RSA, the receiver picks a unique pair of large primes and keeps them secret. The product of these two primes, together with a public exponent, forms the public key, which is given to the transmitter and used to encrypt the message. The private/secret key is derived from the two primes, stays with the receiver, and is used to decrypt. The encryption step itself is computationally inexpensive (modular exponentiation, i.e. repeated multiplication with a remainder taken at each step).
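A toy version of the RSA arithmetic may help here. The primes below are the usual textbook-sized ones and are absurdly small; real keys use primes hundreds of digits long plus padding schemes that this sketch leaves out. The variable names are mine, and `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Receiver's secret: two primes (toy-sized, for illustration only).
p, q = 61, 53

n = p * q                  # public modulus, the product of the two primes
phi = (p - 1) * (q - 1)    # used only to derive the private exponent
e = 17                     # public exponent, chosen coprime to phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e mod phi

public_key = (n, e)        # handed to the transmitter
private_key = (n, d)       # never leaves the receiver

message = 65                        # a message encoded as a number below n
ciphertext = pow(message, e, n)     # encryption: modular exponentiation
decrypted = pow(ciphertext, d, n)   # decryption with the private exponent

print(public_key, ciphertext, decrypted)
```

Anyone who can split `n` back into `p` and `q` can recompute `d`, which is why factoring is the attack everyone talks about.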
(Leaving out the pesky details.) If you have a large repository of prime numbers, you can iterate through it to see which primes divide the public product; once you have the two primes, you can rebuild the private key and turn the message into plain text (there was a good brute-force example of this at one of the talks at a PyCon). Given how labor intensive this is, you have machines do it, and for keys of realistic size you need a great many of them. However, this is where the exact details of the encryption software come into play. Some software has a maximum number of tries it allows before it locks/destroys the encrypted information.
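Here is roughly what the "repository of primes" attack looks like against a toy key. The modulus, exponent and ciphertext are textbook-sized values chosen for illustration, and the helper names are mine. The loop finishes instantly here only because the modulus is tiny; trial division like this is hopeless against keys of realistic size.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out every multiple of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def factor_modulus(n):
    """Try each prime up to sqrt(n); return (p, q) if one divides n, else None."""
    for p in primes_up_to(math.isqrt(n)):
        if n % p == 0:
            return p, n // p
    return None


# Intercepted toy public key and ciphertext (illustrative values).
n, e, ciphertext = 3233, 17, 2790

p, q = factor_modulus(n)
d = pow(e, -1, (p - 1) * (q - 1))   # rebuild the private exponent
plaintext = pow(ciphertext, d, n)

print(p, q, plaintext)
```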
So to decrypt a message between two potential targets, you are actually limited by the number of tries the software allows. These limits can be defeated by a backdoor, essentially something that suspends the attempt counter in the software, but there is no guarantee that a hostile agency will not get access to that door and do something to the message. So such backdoors are usually avoided (see all the back and forth about encryption on the latest iOS, Android, etc.).
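The attempt counter can be pictured with a sketch like the one below. The class, its names and the limit of three tries are hypothetical and do not describe how any particular phone or app implements it; the point is only that brute force stops working once the software destroys the secret after a few bad guesses.

```python
import hmac


class AttemptLimitedVault:
    """Hypothetical container that destroys its secret after too many bad guesses."""

    def __init__(self, passcode: str, secret: bytes, max_attempts: int = 10):
        self._passcode = passcode.encode()
        self._secret = secret
        self._max_attempts = max_attempts
        self._failures = 0

    @property
    def wiped(self) -> bool:
        return self._secret is None

    def unlock(self, guess: str):
        """Return the secret on a correct guess, None on a wrong one."""
        if self.wiped:
            raise RuntimeError("secret has been destroyed")
        if hmac.compare_digest(guess.encode(), self._passcode):
            self._failures = 0
            return self._secret
        self._failures += 1
        if self._failures >= self._max_attempts:
            self._secret = None  # wipe: further guessing is pointless
        return None


vault = AttemptLimitedVault("7391", b"the message", max_attempts=3)
for guess in ("0000", "0001", "0002"):   # a brute-force run that gets nowhere
    vault.unlock(guess)

print(vault.wiped)
```

A backdoor in this picture is anything that lets an outsider skip the `_failures` bookkeeping, and whoever finds it gets unlimited guesses.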
Hope this helps you all in framing a more meaningful debate on the issues at hand.
P.S. Cracking the communication after the event can help in understanding the critical dynamics of the event and in identifying organizational structures. Without this information you won't be able to distinguish between Abu Jihad and Abu-soon-to-be-dead.