Almost every day we check our bank account from our smartphones, buy products on Amazon, send private messages to our family and colleagues, and so on. We do this because we trust that it’s safe, that no one can steal our credit card number or read our messages.
Information security is the discipline that keeps this data safe using many complex techniques. All of these techniques, however, are built on a few basic principles known as the CIA triad. It’s not a government agency; it’s an acronym for Confidentiality, Integrity and Availability, the three basic principles on which information security is built. Once we understand these principles and why they are important, the more complex techniques become easier to understand.
We’re going to explain security “from the upside down”, focusing on what each of these principles is and how it can help us secure communications in order to keep Eleven safe from the evil Hawkins National Lab.
Integrity means assuring that data remains the same as when it was created throughout its whole lifecycle, by detecting any unexpected change to its contents. If an unauthorized entity deliberately changes part of the data, or if its contents are modified by, for instance, a network failure, we need to detect the issue in order to act accordingly.
Let’s see why we need to ensure this with an example. Imagine that Dustin, who has a new clue about Will’s whereabouts, sends a message to Mike which says: “Meet me in the AV room this afternoon”. The bad people of the Hawkins National Lab are intercepting all the messages and see a chance to set a trap and catch Eleven, so they intercept Dustin’s message and change its contents to: “Meet me in Mirkwood this afternoon”.
Given that Mike can’t check whether the message differs from Dustin’s original message, he will trust it and go to Mirkwood, where an army will be waiting to catch Eleven. But we don’t want them to catch her, do we?
To solve this problem, we need a way to check whether the message has been changed in transit (on its route from the sender, Dustin, to the receiver, Mike). This is where hash functions can help us. A cryptographic hash function has the following properties:
- Given the same input, it always returns the same output.
- It is practically impossible to find two different inputs that produce the same output (collision resistance).
- It is practically impossible to recover the original input from the output of the function (non-invertible).
In plain English, hash functions produce a kind of “mathematical abstract” (usually called a digest) of an input message. The power behind it is that Dustin (the sender) can send this “abstract” along with the message, and Mike (the receiver) can check the message contents by applying the same hash function as Dustin and comparing both “abstracts”. Remember that a hash function always returns the same output for the same input message, so if the “abstracts” match, the message hasn’t been changed.
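The check described above can be sketched in a few lines of Python. This is only an illustration: the choice of SHA-256 and the helper names `digest` and `is_unchanged` are assumptions made for the example, not something the story prescribes.

```python
import hashlib


def digest(message: str) -> str:
    """Return the SHA-256 "abstract" of a message as hexadecimal text."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def is_unchanged(message: str, received_digest: str) -> bool:
    """Receiver side: recompute the digest and compare it with the one sent."""
    return digest(message) == received_digest


# Dustin computes the digest and sends it along with the message.
original = "Meet me in the AV room this afternoon"
sent_digest = digest(original)

# Mike recomputes the digest of whatever message arrives.
print(is_unchanged(original, sent_digest))                              # True
print(is_unchanged("Meet me in Mirkwood this afternoon", sent_digest))  # False
```

Note that this only detects a change when the digest itself arrives untouched.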
Does this really solve the problem? Sadly, the answer is no. If the evil guys know the hash function used by Dustin, they can modify the message and generate a new “abstract” for the modified message, replacing Dustin’s. When Mike checks the message, the “abstracts” match, so, again, he will go straight into the trap. Nobody said that solving the problem was easy.
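A minimal sketch of this attack, assuming SHA-256 as the hash function and a hypothetical `packet` dictionary to represent what travels over the network:

```python
import hashlib


def digest(message: str) -> str:
    """Public hash function: anyone, including the lab, can call it."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# What Dustin actually sends: the message and its digest.
packet = {"message": "Meet me in the AV room this afternoon"}
packet["digest"] = digest(packet["message"])

# The lab intercepts the packet, rewrites the message and simply
# recomputes the digest with the same hash function.
forged = {"message": "Meet me in Mirkwood this afternoon"}
forged["digest"] = digest(forged["message"])

# Mike's check passes even though the message is not Dustin's.
mike_accepts = digest(forged["message"]) == forged["digest"]
print(mike_accepts)  # True
```

Nothing in the forged packet depends on a secret, which is why the check cannot tell the two senders apart.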
The evil guys are able to change the message because they can use the same hash function as Dustin, so they can generate a valid “abstract” of their modified message. We need to prevent them from doing so, for instance by using a hash function which takes some parameter that the Hawkins Lab can’t know. How do we do this?
First, we need to know some basics about public key cryptography. Juan wrote an awesome post about it in case you want to know more. For the sake of this post, we only need to know the following properties:
- An entity (like Dustin) has a private key, which is a secret that only he knows.
- Associated with his private key, there is a public key which is distributed to everyone.
- A message encrypted with the private key can only be decrypted with the associated public key and vice versa, a message encrypted with a public key can only be decrypted with the associated private key.
With this in mind, Dustin can create the “abstract” of the message using a hash function which takes his private key. We say that Dustin digitally signs the message, and from now on we will call this “abstract” the message signature. When Mike receives the message, he checks the message signature with Dustin’s public key (remember, only the public key can decrypt something encrypted with the private key).
Our enemies at the Hawkins Lab can’t do anything now. They don’t know Dustin’s private key (it’s a secret), so they have no way of modifying the message and generating a valid signature. With this approach, we’re assuring that the message hasn’t been modified and that only Dustin could have sent it.
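To make the idea concrete, here is a toy sketch of signing and verifying with textbook RSA. Everything specific in it is an assumption made for illustration: the key is built from two well-known Mersenne primes so that no extra library is needed, there is no padding, and the names `sign` and `verify` are hypothetical. Real systems use a vetted cryptographic library and randomly generated keys.

```python
import hashlib

# Toy key pair for Dustin (assumption: fixed, publicly known primes;
# this is insecure and only keeps the example self-contained).
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def digest_as_int(message: str) -> int:
    """Hash the message and reduce the digest to a number smaller than N."""
    h = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % N


def sign(message: str) -> int:
    """Dustin: transform the digest with the PRIVATE exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Mike: undo the transform with the PUBLIC exponent and compare digests."""
    return pow(signature, E, N) == digest_as_int(message)


message = "Meet me in the AV room this afternoon"
signature = sign(message)

print(verify(message, signature))                               # True
print(verify("Meet me in Mirkwood this afternoon", signature))  # False
```

The lab can still run `verify`, because it only needs public values, but producing a signature that verifies requires `D`.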
Bonus track: certificates
Public-key techniques work because we can associate a public key with an identity like Dustin. Certificates are the way these public keys are distributed to the world (a public key, as its name says, is public for everyone). A certificate contains the public key and the identity, and is digitally signed by a trusted authority which verifies the identity. We call these authorities certificate authorities (CAs). We do this because someone could send a certificate which claims to contain Dustin’s public key but really contains the Hawkins National Lab’s public key (identity theft), so we build a chain of trust around the identities in the certificates.
Hold on: a certificate is digitally signed by a CA, which needs a certificate of its own so that the signature can be checked, and that certificate is in turn signed by another CA (this is called a certificate chain). Does it ever end? Yes, at some point there is a CA whose certificate is self-signed (signed with its own private key). These CAs are blindly trusted by everyone (you can imagine what could happen if their private keys were stolen).
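The shape of a certificate chain can be sketched as follows. This models only the structure (who issued what, and where the chain ends); the `Certificate` class, the CA names and the `chain_is_trusted` function are hypothetical, and a real validator would also verify every signature, validity period and revocation status.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    subject: str  # identity the certificate vouches for
    issuer: str   # CA that signed it (equal to subject when self-signed)


def chain_is_trusted(leaf, certs_by_subject, trusted_roots):
    """Follow issuer links from the leaf up to a self-signed root."""
    current, seen = leaf, set()
    while current.subject not in seen:
        seen.add(current.subject)
        if current.issuer == current.subject:      # self-signed root reached
            return current.subject in trusted_roots
        parent = certs_by_subject.get(current.issuer)
        if parent is None:                         # issuer unknown: broken chain
            return False
        current = parent
    return False                                   # issuer links form a loop


root = Certificate("Example Root CA", "Example Root CA")
intermediate = Certificate("Example Intermediate CA", "Example Root CA")
lab_root = Certificate("Lab CA", "Lab CA")
known = {c.subject: c for c in (root, intermediate, lab_root)}
trusted = {"Example Root CA"}

dustin = Certificate("Dustin", "Example Intermediate CA")
fake_dustin = Certificate("Dustin", "Lab CA")

print(chain_is_trusted(dustin, known, trusted))       # True
print(chain_is_trusted(fake_dustin, known, trusted))  # False
```

The fake certificate fails because its chain ends in a self-signed root that is not in the set of roots we chose to trust.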
Confidentiality means keeping the data inaccessible to unauthorized entities. It’s strongly related to privacy: someone who shouldn’t read some data must not be able to read it without permission.
The problem (again)
While we were discussing the concept of integrity, we ended up with a solution in which Dustin digitally signs his message, so that the Hawkins National Lab can’t modify its contents without Mike noticing that the message no longer comes from Dustin.
But even if they can’t change its contents, Dustin’s original message is still readable to them. So they know that Mike and Eleven are going to the AV room in the Hawkins Middle School. Why not set the trap in the school, then? Holy s***, after all our efforts, Mike and Eleven are walking into a trap again. We have to do something to prevent it.
This time the problem is easier to solve. We need to “change” the message in such a way that only the parties involved (Dustin and Mike) can read its original contents, while for the evil guys it’s nothing more than a meaningless hodgepodge. In other words, we need to encrypt the message, something people have been doing since ancient times with cryptographic algorithms.
Symmetric-key encryption is a way to encrypt messages based on a shared secret (a key) which is used both to encrypt and to decrypt messages. The most representative example is the Caesar cipher, which consists of choosing a number as the key and then replacing each letter in a message with the letter that number of positions further down the alphabet.
| Plain | Meet me in the AV room this afternoon |
| Encrypted | Tlla tl pu aol HC yvvt aopz hmalyuvvu |
With this, Dustin and Mike can agree on a shared secret (for instance ‘7’) that only they know, and Dustin encrypts the message using this key before sending it to Mike. When Mike receives the message, he uses the key to decrypt the contents and get the original message, while the evil guys aren’t able to read the contents of the message because they don’t know the key.
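As a small illustration, the Caesar cipher with the shared key 7 could be written like this in Python. The function name `caesar` is just a choice for the example; decrypting is the same operation with the negated key.

```python
import string


def caesar(text: str, key: int) -> str:
    """Shift every ASCII letter `key` positions down the alphabet, keeping case."""
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + key) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + key) % 26 + ord("A")))
        else:
            result.append(ch)  # spaces and punctuation are left untouched
    return "".join(result)


KEY = 7  # the secret shared by Dustin and Mike
encrypted = caesar("Meet me in the AV room this afternoon", KEY)
print(encrypted)                # Tlla tl pu aol HC yvvt aopz hmalyuvvu
print(caesar(encrypted, -KEY))  # Meet me in the AV room this afternoon
```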
Obviously, in real life, more complex algorithms are used whose keys are far more difficult to guess, but the idea is the same.
The downside of this approach is that anyone else who knows the shared key can decrypt the message. We’re not ensuring that only Mike reads the message: if Lucas knows the shared key, he can also decrypt it.
Again, as with the integrity problem, public-key encryption is going to do the trick. Instead of having a shared secret between sender and receiver, we use the public-key properties to encrypt the message in a way that only the receiver can decrypt it.
We know that Mike has a private key that only he knows. Dustin knows Mike’s public key, which is available to everyone. If Dustin encrypts a message with Mike’s public key, then, based on the public-key properties, that message can only be decrypted with the associated private key, which is Mike’s private key, something that only Mike has. Voilà! Using this approach we can be sure that only Mike can read the message.
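The following toy sketch shows the idea with textbook RSA. The key is an assumption made purely for illustration: it is built from two well-known Mersenne primes and uses no padding, so it is not secure, and the names `encrypt` and `decrypt` are hypothetical. It only demonstrates that what the public exponent locks, the private exponent unlocks.

```python
# Toy key pair for Mike (assumption: fixed, publicly known primes;
# this is insecure and only keeps the example self-contained).
P, Q = 2**521 - 1, 2**607 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def encrypt(message: str, e: int, n: int) -> int:
    """Dustin: encrypt with Mike's PUBLIC key (e, n)."""
    m = int.from_bytes(message.encode("utf-8"), "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)


def decrypt(ciphertext: int, d: int, n: int) -> str:
    """Mike: decrypt with his PRIVATE exponent d."""
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


message = "Meet me in the AV room this afternoon"
ciphertext = encrypt(message, E, N)

print(decrypt(ciphertext, D, N))  # Meet me in the AV room this afternoon
```

Dustin never needs `D` to send the message, and nobody without `D` can run the second step.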
Putting it all together
To solve the integrity and confidentiality problems together, we mix both solutions. Dustin signs the message with his private key and then encrypts the message and the signature with Mike’s public key, ensuring that only Mike will be able to read the message. When Mike receives the message, he decrypts its contents with his private key and checks the signature with Dustin’s public key, ensuring that the message comes from Dustin and hasn’t been modified.
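Both steps can be combined in one toy sketch. As before, this is textbook RSA with keys built from well-known Mersenne primes, chosen only so the example runs without extra libraries; the key values and the names `make_key`, `send` and `receive` are assumptions for illustration, not a secure design.

```python
import hashlib

E = 65537  # public exponent shared by both toy keys


def make_key(p: int, q: int):
    """Return (public modulus, private exponent) for a toy RSA key pair."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


DUSTIN_N, DUSTIN_D = make_key(2**127 - 1, 2**89 - 1)
MIKE_N, MIKE_D = make_key(2**521 - 1, 2**607 - 1)


def digest_as_int(data: bytes, n: int) -> int:
    """Hash the data and reduce the digest to a number smaller than n."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def send(message: str):
    """Dustin: sign with his private key, then encrypt both parts for Mike."""
    data = message.encode("utf-8")
    signature = pow(digest_as_int(data, DUSTIN_N), DUSTIN_D, DUSTIN_N)
    enc_message = pow(int.from_bytes(data, "big"), E, MIKE_N)
    enc_signature = pow(signature, E, MIKE_N)
    return enc_message, enc_signature


def receive(enc_message: int, enc_signature: int):
    """Mike: decrypt with his private key, then check Dustin's signature."""
    m = pow(enc_message, MIKE_D, MIKE_N)
    data = m.to_bytes((m.bit_length() + 7) // 8, "big")
    signature = pow(enc_signature, MIKE_D, MIKE_N)
    valid = pow(signature, E, DUSTIN_N) == digest_as_int(data, DUSTIN_N)
    return data.decode("utf-8"), valid


text, valid = receive(*send("Meet me in the AV room this afternoon"))
print(text, valid)  # Meet me in the AV room this afternoon True
```

The sketch signs first and encrypts second, in the order described above, so the signature travels inside the encrypted payload.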
Finally we’ve defeated the evil guys of the Hawkins National Lab, and now El is safe. Hurray!
Availability is quite obvious and relates more to a system perspective. In order to accomplish its purpose, the information must be available when needed. This means that when Mike wants to read the message, he has to be able to access it. For example, in order to keep the message safe, we could store it in a safe-deposit box, fill the box with concrete and throw it into the upside down, where the ‘Demogorgon’ will protect it. The message is safe, but no one, including Mike, can read it, so this is not a valid solution.
We have seen how the CIA triad is the basis for solving security problems, using a communication scenario as an example. We have built a safe communication environment by ensuring these principles. In the real world security is more complex, but these are the basics needed to understand everything else.
Before I go, here is a special message signed with my private key, so you can be sure that no one has modified it. Greetings!
gpg --keyserver pgp.mit.edu --recv-keys C64E7115D0E27F68
echo "-----BEGIN PGP MESSAGE-----

owEBSAG3/pANAwAIAcZOcRXQ4n9oAcsYYgBZ/J57RnJpZW5kcyBkb24ndCBsaWUK
iQEcBAABCAAGBQJZ/J57AAoJEMZOcRXQ4n9o8c0IAK6g5x35vBjlDukUCP9fywJ5
AzgC7skLQ/4EBg3oqQXe4OHqGN/mmFXggrO4CuS5MGdbAizeouESSL/wjc5p7kwH
l+3n1D20qhhEJYNZaESeD6UK5SmyaqjAaBhQ8D34ghtNAOjXTunjJj0ETXw4xP/i
UOePVP3uJzqngZ5+S7Alf0+mcWRqfYaT/KI3h2rC6IlPPOJlQysBpPa8UEFVWEgQ
ZAqm9t6btAOcRSMl9HIA3MLs7OH5XWplnV8h1hEAmclJO3VXEY5jeEwkGUfegbu8
3ZEZ/rsVUw54cAbtFCF3NZhE7uvkb7KVf6I5zxzu0WL73a0huGaBCwQM0QLYHfY=
=okM2
-----END PGP MESSAGE-----" | gpg
Diagram icons by Icons8 (ic8.link/50302, ic8.link/17932)