Monday, November 23, 2015

Encryption and Decryption

After the Paris attacks, a lot is being said about encryption (and decryption) on mobile devices. A lot of it borders on utterly fantastic nonsense. People are taking the hype written into marketing material seriously. That is not a good place for this debate to be.

Let's start at the beginning. Sometime after Sept 11, 2001, a massive electronic surveillance machine was set up. The machine is able to download large quantities of data in real-time from all manner of devices. How exactly it does this has been discussed by Snowden et al... but frankly I don't care about that. As mobile devices are the most common means of communication, a lot of the data collected is from these devices. What happens next should be quite familiar to those of you who work with "big data".

The data collected is archived and sorted into two bins, "relevant" and "irrelevant". The terms are defined from the perspective of the national security mechanism; they have nothing to do with local security mechanisms (such as police agencies).

In the interest of conserving resources, "irrelevant" data is largely discarded after some level of pattern analysis. The "relevant" data is obviously stored and processed until as much predictive information as possible is squeezed out of it.

Within the "relevant" data is a small subset of encrypted communications. As only the communication (and not its metadata) is encrypted, it is possible to perform a certain level of analysis on it. For example, Abu Jihad sends an encrypted message to his fedayeen Abu-soon-to-be-dead. The message itself is encrypted, but the fact that both the Abu-whatevers use an unregistered phone is in the metadata. A simple sort can catch conversations between unregistered numbers.
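To make the "simple sort" concrete, here is a minimal sketch of that filter. The record layout (`CallRecord`) and the set of registered numbers are my assumptions for illustration; real metadata would carry far more fields.

```python
from typing import Iterable, List, NamedTuple, Set


class CallRecord(NamedTuple):
    """Hypothetical metadata record: who contacted whom, and when."""
    caller: str
    callee: str
    timestamp: int  # seconds since some reference time


def between_unregistered(records: Iterable[CallRecord],
                         registered: Set[str]) -> List[CallRecord]:
    """Keep records where NEITHER end is a registered number, oldest first."""
    hits = [r for r in records
            if r.caller not in registered and r.callee not in registered]
    return sorted(hits, key=lambda r: r.timestamp)


if __name__ == "__main__":
    registered_numbers = {"100", "101"}
    log = [
        CallRecord("900", "901", 50),   # both ends unregistered
        CallRecord("100", "901", 10),   # one registered end, dropped
        CallRecord("901", "900", 20),   # both ends unregistered
    ]
    for rec in between_unregistered(log, registered_numbers):
        print(rec)
```

Note that the content of the message never enters the picture; only the metadata is touched.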

Once unregistered numbers that talk to each other are identified, it is possible to look for patterns in the correlated metadata sets and build up a relational database of such numbers. A clustering algorithm can tell you if there is an anomalously high frequency of communication between two or more points in this database.
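A rough sketch of the frequency idea follows. To be clear, this is not a real clustering algorithm; it is a plain outlier check that flags pairs of numbers whose contact count is well above the typical (median) pair. The function names and the `factor` threshold are assumptions of mine.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List, Tuple


def pair_counts(contacts: Iterable[Tuple[str, str]]) -> Counter:
    """Count contacts per unordered pair, so A->B and B->A land together."""
    return Counter(frozenset((a, b)) for a, b in contacts if a != b)


def anomalous_pairs(contacts: Iterable[Tuple[str, str]],
                    factor: float = 3.0) -> List[Tuple[str, ...]]:
    """Return pairs whose contact count exceeds `factor` times the median."""
    counts = pair_counts(contacts)
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(tuple(sorted(pair))
                  for pair, n in counts.items()
                  if n > factor * typical)


if __name__ == "__main__":
    contacts = [("A", "B")] * 6 + [("B", "A")] * 4
    contacts += [("C", "D"), ("E", "F"), ("A", "C")]
    print(anomalous_pairs(contacts))
```

A real system would presumably also look at timing and at chains of more than two numbers, but the counting step is the heart of it.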

Once you have identified the target(s) - you can start the decryption. It doesn't make sense to start the decryption before you do all the necessary filtering because decryption is a resource intensive process.

So now - we talk about encryption and decryption.

Decryption is a lot easier when you know what kind of encryption is used. As indicated before - there are two main ways of encryption - a Vernam cipher, or a public-private key pair. The ways of dealing with a Vernam cipher are well discussed in books. It is rarely used except by major intelligence and military services. This is because the Vernam cipher uses unique secret keys: every message needs its own key, as long as the message itself, and the key can never be reused. These keys are expensive to generate and one has to maintain the security of the entire secret key distribution chain. Only a major intelligence or military service has the budget to do things like that.
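For those who have not seen one, a Vernam cipher is tiny in code. The sketch below assumes the key is random, exactly as long as the message, and used once; the function name `vernam` is mine. The hard part, getting that key to the other end safely, is precisely what the code does not show.

```python
import secrets


def vernam(data: bytes, key: bytes) -> bytes:
    """XOR every byte with the matching key byte.

    The same call encrypts and decrypts, because XOR undoes itself.
    """
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


if __name__ == "__main__":
    message = b"meet at the usual place"
    key = secrets.token_bytes(len(message))  # fresh key, never reused
    ciphertext = vernam(message, key)
    print(ciphertext.hex())
    print(vernam(ciphertext, key))
```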

I want to focus on the issue of decrypting the more commonly used public-private key system. RSA is the standard example. (The Data Encryption Standard, DES, often gets lumped in here, but it is a symmetric cipher with a single shared key and has nothing to do with primes.)

Assume for a moment that Abu-whatevers are talking to each other almost every 12 hours. If the message between them is intercepted then you have 12 hours to decrypt it. Without getting into too much detail, breaking RSA involves prime factorization (yes - that thing you learn in middle school and later forget). In RSA - the receiver picks a unique pair of large primes and keeps them secret. The product of these two primes goes into the public key, which is given to the transmitter and used to encrypt the message. The private/secret key, which is derived from the primes themselves, stays with the receiver and is used to decrypt. The exact encryption process is something computationally inexpensive (modular exponentiation, i.e. repeated multiplication with remainders); recovering the two primes from their product is the expensive part.
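Here is a toy sketch of the public-private key idea, using textbook RSA with deliberately tiny primes. The helper names (`make_keypair`, `encrypt`, `decrypt`) and the choice of 61, 53 and e = 17 are mine, and real software uses primes hundreds of digits long plus padding, so treat this as an illustration only. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
from typing import Tuple

Key = Tuple[int, int]


def make_keypair(p: int, q: int, e: int = 17) -> Tuple[Key, Key]:
    """Build (public, private) keys from two secret primes p and q."""
    n = p * q                    # the product is public
    phi = (p - 1) * (q - 1)      # computable only if you know p and q
    d = pow(e, -1, phi)          # private exponent: inverse of e mod phi
    return (n, e), (n, d)


def encrypt(message: int, public: Key) -> int:
    n, e = public
    if not 0 <= message < n:
        raise ValueError("message must be a number smaller than n")
    return pow(message, e, n)    # cheap modular exponentiation


def decrypt(ciphertext: int, private: Key) -> int:
    n, d = private
    return pow(ciphertext, d, n)


if __name__ == "__main__":
    public, private = make_keypair(61, 53)
    secret = encrypt(65, public)
    print(public, secret, decrypt(secret, private))
```

The transmitter only ever needs `public`; anyone who can split `n` back into its two primes can rebuild `private`.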

(leaving out the pesky details) If you have a large repository of prime numbers, you can generate all manner of products and iterate until one matches the product in the public key; with the two primes in hand you can rebuild the private key and turn the message into plain text (there was a good brute force example of this at one of the talks at a PyCon). Given how labor intensive this is - you automate it and throw as much computing at it as you can. However this is where the exact details of the encryption software come into play. Some software has a maximum number of tries it allows before it locks/destroys the encrypted information.
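The brute-force idea can be sketched on a toy key. This is plain trial division, which I am using as a stand-in for whatever serious factoring method a real agency would run; the function names and the toy key (n = 3233, e = 17) are assumptions for illustration. It works instantly here only because the number is tiny.

```python
from typing import Tuple


def factor_semiprime(n: int) -> Tuple[int, int]:
    """Find the two factors of n by trying divisors one after another."""
    if n % 2 == 0:
        return 2, n // 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 2
    raise ValueError("no factors found; n is prime or too small")


def recover_private_exponent(n: int, e: int) -> int:
    """Rebuild the private exponent once the primes are known."""
    p, q = factor_semiprime(n)
    return pow(e, -1, (p - 1) * (q - 1))  # Python 3.8+


if __name__ == "__main__":
    n, e = 3233, 17                  # intercepted toy public key
    intercepted = pow(65, e, n)      # toy ciphertext for the number 65
    d = recover_private_exponent(n, e)
    print(factor_semiprime(n), d, pow(intercepted, d, n))
```

The loop grows with the square root of n, which is why the filtering has to come first: you can only afford to do this against a handful of targets.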

So to decrypt a message between two potential targets, you are actually limited by the number of tries the software allows. These limits can be defeated by a backdoor - essentially something that suspends the attempt counter on the software, but there is no guarantee that a hostile agency will not get access to that door and do something to the message. So such backdoors are usually avoided (see all the back and forth about encryption on the latest iOS, Android etc...).
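To show what an attempt counter does to a brute-force search, here is a hypothetical sketch. `LockedBox` is an invented class, not how any particular phone implements it; it simply destroys the secret after a fixed number of wrong guesses.

```python
import hmac
from typing import Optional


class LockedBox:
    """Hypothetical container that wipes its secret after too many misses."""

    def __init__(self, passcode: str, secret: str, max_tries: int = 10):
        self._passcode = passcode.encode()
        self._secret: Optional[str] = secret
        self._max_tries = max_tries
        self._failed = 0

    def try_unlock(self, guess: str) -> Optional[str]:
        if self._secret is None:
            raise RuntimeError("contents were destroyed")
        # Constant-time comparison, so timing does not leak the passcode.
        if hmac.compare_digest(guess.encode(), self._passcode):
            self._failed = 0
            return self._secret
        self._failed += 1
        if self._failed >= self._max_tries:
            self._secret = None      # wipe: no further guesses can succeed
        return None


if __name__ == "__main__":
    box = LockedBox("4821", "the message", max_tries=3)
    for guess in ("0000", "0001", "0002"):
        print(guess, box.try_unlock(guess))
```

A backdoor of the kind discussed above would amount to a way of skipping the `self._failed += 1` line, for whoever holds it.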

Hope this helps you all in framing a more meaningful debate on the issues at hand.

ps. cracking the communication after the event can help in understanding the critical dynamics in the event and identifying organizational structures. Without this information you won't be able to distinguish between Abu Jihad and Abu-soon-to-be-dead.

Monday, November 16, 2015

How did the Paris attackers communicate without detection?

There are some questions about the manner in which the Paris attackers communicated with the ISIS operational HQ during and prior to the attacks. During the Bombay 2008 attacks, the terrorists were using sat-phones - talking directly with an ABHQ in Pakistan.

There are two basic protocols for secret communication - a dead drop or a live drop.

In a dead drop, the information can be with or without encryption. However, it is a cumbersome process, poorly suited for real-time evolving needs.

In a live drop, the information is encrypted with either RSA (public+private key pair) or a Vernam cipher (with associated key distribution). The possibility of detection is high, so you have to ensure that neither end is compromised.

If the communication is compromised, then the element of deniability is lost - as the Pakistani ABHQ learned during the Bombay 2008 attacks.

As the NSA downloads a vast variety of electronic communication, the only ways to really escape it are to find a non-downloaded stream, hide in the noise, or up the level of encryption.

Examples of non-downloaded streams apparently are console game based channels (PlayStation, Xbox and the like). In games such as Gears of War or Call of Duty etc... the players can communicate with each other. The players can exchange messages with words like "target", "kill", etc... without raising suspicion. There would be little point in downloading these streams as the amount of nonsense in them would simply create false positives and chew up valuable processor time of the NSA's AIs.

An example of hiding in the noise is to stick to non-IMEI numbered phones or one-time use phones. These are typically bought by poorer users who can't afford expensive data plans. The use is infrequent. Among immigrants (like Syrian, Iraqi and Afghan) the phones get limited use, and numbers in these states are called relatively infrequently as the refugees attempt to get in touch with relatives. Given the high cost of the call, the calls are brief and screening them is resource intensive. This is a place where a communication could be hidden (theoretically speaking).

An example of upping the encryption is to use a multi-layered RSA - encrypt communications within communications. Again this is feasible for anyone to do: public key and private key generators are available, and these can be downloaded onto any smart phone and used. There is a drawback - people can still hear you transmitting and even if they can't understand what you are saying - they will pick up changes in the occurrence of the transmissions. If you wanted to use a Vernam cipher, you would have to distribute the keys first. That is a separate but important challenge.
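The "communications within communications" idea can be sketched with two toy textbook-RSA key pairs, one wrapped around the other. The primes (61, 53 for the inner layer and 89, 97 for the outer), the exponent 17 and all function names are my assumptions, and the numbers are far too small for real use. The outer modulus has to be larger than the inner one so the inner ciphertext fits. Python 3.8 or later is needed for the modular inverse.

```python
from typing import Tuple

Key = Tuple[int, int]


def make_keypair(p: int, q: int, e: int = 17) -> Tuple[Key, Key]:
    """Textbook RSA keys from two primes: returns (public, private)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)


def encrypt_layered(message: int, inner_pub: Key, outer_pub: Key) -> int:
    """Encrypt with the inner key first, then wrap with the outer key."""
    if inner_pub[0] >= outer_pub[0]:
        raise ValueError("outer modulus must be larger than inner modulus")
    if not 0 <= message < inner_pub[0]:
        raise ValueError("message must be smaller than the inner modulus")
    inner = pow(message, inner_pub[1], inner_pub[0])
    return pow(inner, outer_pub[1], outer_pub[0])


def decrypt_layered(ciphertext: int, outer_priv: Key, inner_priv: Key) -> int:
    """Peel the layers in reverse order: outer first, then inner."""
    inner = pow(ciphertext, outer_priv[1], outer_priv[0])
    return pow(inner, inner_priv[1], inner_priv[0])


if __name__ == "__main__":
    inner_pub, inner_priv = make_keypair(61, 53)
    outer_pub, outer_priv = make_keypair(89, 97)
    wrapped = encrypt_layered(65, inner_pub, outer_pub)
    print(wrapped, decrypt_layered(wrapped, outer_priv, inner_priv))
```

Anyone breaking this has two products to factor instead of one, but the transmission itself is just as visible as before.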

Whoever handled the communications for the Paris attack knew what they were doing as they appear to have defeated the electronic surveillance regime in Europe and the Middle East.

Given the ethnic tensions in Europe, and the poor nature of border surveillance, there is a very high likelihood that the attacks will keep happening until the core communication network is broken. 

The possibility of failure in that regard is real. India for example has consistently failed to break the communication networks of the ISI inside India. The ISI can typically piggyback any terror strike on its networks and dramatically improve the chances of its success. This is why Indian security planners constantly have to work on ways to keep pressure on the ISI and its channels in India.

I don't know if there will be a victory in what is now a very Eurocentric war on terror. The Belgians have identified possible nodes in Molenbeek. There are other places where that can happen in Europe (Amsterdam, London, certain parts of Manchester etc... come to mind) but hopefully if the communication network in Molenbeek can be disrupted - a temporary reprieve will be obtained until whoever put that network in place regenerates it (or if you believe in this CryptoParty stuff - the network self-assembles - again I don't know what the limiting timescale is for something like that).