The Diffie-Hellman key exchange algorithm
The Diffie-Hellman key exchange was the first widely used method of safely exchanging keys over an insecure channel. This post discusses about the DH algorithm mathematically, it's security and vulnerabilities.
29 October 2020 • 8 mins read
Introduction
The Diffie-Hellman key exchange was the first widely used method of safely exchanging keys over an insecure channel. The DH is used for securely calculating the shared keys for safe communication. It is widely used in various security protocols like TLS, SSH etc.
Before understanding the actual algorithm, let's understand the need of such an algorithm.
Need of a key exchange method
Let's say a person A wants to communicate with a person B whom he/she has never met before, over an insecure channel. How would they prevent the intruders on the channel from getting hold of their messages ?
The most obvious and common way is to encrypt the message and then send it to person B. So let's say person A uses a key for encrypting the message. But person B does not know the key which is used for encryption by A.
If A encrypts the message without telling B the key , then the intruder won’t be able to read it, but neither will the B! So the important question is:
How can person A securely exchange information with B if A hasn't had the opportunity to share the key ahead of time?
This is where the Diffie-Hellman key exchange algorithm comes into the picture. The DH key exchange was the first publicly-used method for solving this problem. It was published in 1976 by Whitfield Diffie and Martin Hellman.
How does the DH exchange work ?
The DH exchange algorithm uses modular arithmetic as it's core to calculate the shared key from the public and private keys.
The mathematics may seem a bit intimidating at first sight but it's easy when you understand the concept behind it.
Algorithm
So let's take our previous example of person A and B wanting to securely communicate with eachother.
Initially A and B both agree upon two numbers called as the generator and . Generally is a very large prime number that is atleast bits long and is a smaller number which is a primitive root of .
A number is a primitive root of if the powers of modulo i.e.upto
are unique and in the range to .
Where '' is the modulo operation.Formally, A number is a primitive root of if the positive powers generate all nonzero congruence classes modulo .
For example, let . Then is not a primitive root, since,
As the value 2 repeats for both and , thus is not a primitive root of . Let's take another example where and
As all the values are unique and in range to , is a primitive root of . Now that a bit of math is clear let's proceed to the next step.
Once and and decided, A and B choose their respective private keys(secret numbers). Let's say that person A chooses a number and B chooses .
and are the private keys and are never sent independently across the insecure channel.
Now A and B calculate their public keys which can be sent across the insecure channel.
A calculates it's public key as
And similarly B calculates it's public key as
These keys are sent through the channel i.e. A sends it's public key to B and B sends it's public key to A.
4) After receiving public keys of eachother, A and B simultaneously calculate the shared key which will be used for communication. A calculates the shared key as from B's public key as
B calculates the shared key as from A's public key as
Hang on a minute. How are these keys one and the same? These keys don't seem to be equal though.
Here's where the magic of modular arithmetic comes in. By basic modular arithmetic, we can say that
We can extend this to multiplication as it is nothing but repititive addition!
Can we extend this to exponentiation? Yes! Exponentiaion is nothing but repititive multiplication! So,
Using this result we can simplify the and values as follows :
and
And there we go!! Without ever sharing their private keys, both A and B were able to generate a shared key which is completely unknown to the intruders on the channel!
Example
What's better than learning an algorithm through an example? Let's continue our previous example of person A and person B. Feel free to work through the example on your own to get a better grasp on the algorithm.
I'll be using smaller values of , and just for the sake of easy calculation. Practically, , and have very large values to prevent brute force calculation of and .
Let's assume A and B choose and the primitive root (we proved that is a primitive root of above).
Let's also assume A chooses and B chooses .
Now A calculates it's public key as
so
B calculates it's public key as
so
A sends to B and B sends to A via the channel.Now A and B calculate the shared key simultaneously as follows :
And voilà! A and B generated the same shared key without ever sharing their private keys. This is the Diffie-Hellman key exchange algorithm.
With the generated shared key, A can encrypt the message and send it to B. Only B can decrypt the message as only B knows the shared key apart from A.
Practical considerations
For Diffie-Hellman to be a practical key exchange algorithm, we need an efficient
way to find primitive roots and perform fast exponentiation.
You can read more about finding primitive roots here.
We can use binary exponentiation to calculate the value of for very high values of in time.
Security of DH method
Why is the DH algorithm secure? Suppose a third party intruder, C, intercepts
, , , and . If C can compute and , he/she easily compute just as A and
B did. Then C must compute satisfying
.
This computation
is known as taking a discrete logarithm of and there are no fast algorithms for
doing it. When p is large, say 100s of digits, calculating a discrete logarithm is
effectively impossible.
The Logjam attack
The most powerful known mechanism for finding the solution of the discrete logarithm problem is the number field sieve algorithm.
In 2015, a research team calculated the most commonly used 512 bit DH exchange numbers in TLS.
The attack affects any server that supports DHE_EXPORT ciphers, and affects all modern web browsers. 8.4% of the Top 1 Million domains were initially vulnerable.
They also suggested that the NSA may already have such capabilities. They quoted that :
“A close reading of published NSA leaks shows that the agency’s attacks on VPNs are consistent with having achieved such a break.”
Is it safe then?
Yes. Despite the Logjam attack the DH method is quite safe when the value is atleast bits long and is correctly and effectively implemented.
The Logjam attack is not about the weakness of the DH method but it is about the weak implementation of the algorithm.
This article explains the best practices to implement the Diffie-Hellman key exchange method very well.
Applications of DH key exchange
The main purpose of the Diffie-Hellman key exchange is to securely generate shared secrets. These secrets can then be used with symmetric-key algorithms to transmit information in a protected manner. Symmetric algorithms are used to encrypt the data as they are more efficient than public key algorithms.
The Diffie-Hellman key exchange is frequently implemented in security protocols such as TLS, IPsec, SSH, PGP, and many others.
This is used in SSH mainly to establish a connection between the remote server and the client. The communication is often carried out in sessions so that even if an intruder brute forces the private keys, the client and server generate new keys for new sessions.
Conclusion
The Diffie-Hellman exchange is one of the most widely used and safest algorithms for key sharing / generating keys.
While it may be vulerable to smaller values, but it is almost impossible to break it for higher values of .
And thus it will continue to be a major part of security protocols used for communication over insecure channels.
Thank you for reading this post. Feel free to comment below if you liked this post or have any suggestions or improvements.