The project I am currently working on for the Advanced Free Software Tools course, Vox Anonymus, strives to offer a web-based (through Django) anonymous voting platform that gives users the ability to change their vote after having cast it.
In order to ensure anonymity, while retaining the ability to modify the vote, and also keep voters from cheating (the mantra “trust, but verify” comes to mind), an individual “vote key” is generated for each voter.
This vote key is then used as the password in a Password-Based Key Derivation Function (PBKDF for short) to generate a fairly strong key, which is hard to reverse-engineer back to the vote key. This generated key is then used to link a voter with that voter's votes in various topics. It is a bit technical, and it is not the purpose of this post, so, moving along quickly:
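To make the derivation step a bit more concrete, here is a minimal sketch using PBKDF2 from the Python 3 standard library. The function name `derive_link_key`, the salt, the iteration count and the choice of SHA-256 are all assumptions made for illustration; they are not necessarily what Vox Anonymus uses.

```python
import hashlib

def derive_link_key(vote_key, salt, iterations=100000):
    """Derive a 32-byte key from a vote key with PBKDF2-HMAC-SHA256.

    Every parameter here is an illustrative assumption, not the
    project's actual configuration.
    """
    password = vote_key.encode('utf-8')
    return hashlib.pbkdf2_hmac('sha256', password, salt, iterations)

salt = b'hypothetical-salt'  # assumed value, for the example only
key = derive_link_key('some generated vote key', salt)
print(key.hex())
```

The same vote key and salt always give the same derived key, which is what makes it usable as a link, while going from the derived key back to the vote key requires guessing.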
The vote key must be generated by Vox Anonymus, in order to prevent a dishonest user from cheating the system. There are some pretty serious restrictions on the key as well, most of them regarding entropy. So, in order to create a good vote key, we need some pretty good random numbers.
Random numbers are tricky. You can’t generate them, since that would imply that there was some form of structured process to “generate” them. And how could a structured process create randomness? In particular, how could a structured process guarantee that, given the same situation, state and input, it won’t output the same “randomness” again?
There are some rather cool Pseudo-Random Number Generators (PRNGs) available, but if an adversary got hold of the exact state of the PRNG at the time it generated a random sequence, the adversary could generate the same sequence, and thus get hold of the key.
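The state-capture problem is easy to demonstrate. The sketch below uses Python's built-in `random` module purely as an example PRNG (an assumption for illustration, not the generator considered for the project): once the state is copied, the "random" sequence can be replayed exactly.

```python
import random

prng = random.Random()
state = prng.getstate()  # what an adversary would need to capture
original = [prng.randrange(256) for _ in range(8)]

clone = random.Random()
clone.setstate(state)    # the adversary restores the captured state
replayed = [clone.randrange(256) for _ in range(8)]

print(original == replayed)  # prints True
```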
On GNU/Linux systems there is usually the /dev/random pool, and you can install the Entropy Gathering Daemon to improve it further, but using that in Vox Anonymus is not an option, as I intend to keep it as system independent as possible.
The solution, then, is not to look within, but to look outwards. Enter http://www.random.org/. Sure, there is no proof that these numbers are random, so should they be used? If the site is for real, then gathering randomness from the static in the airwaves is just as good as, if not better than, any PRNG implementation I could include in the project.
So the procedure for fetching random numbers from random.org is as follows:
- Determine the external IP address of the server on which Vox Anonymus runs
- Check the quota for “my” IP address on random.org
- If there are enough numbers left, fetch the needed numbers, otherwise wait a couple of minutes
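The last two steps can be sketched roughly as follows, in Python 3. The `/quota/` and `/integers/` paths and their query parameters reflect my understanding of random.org's plain-text HTTP interface and should be treated as assumptions to check against its documentation. The helper names are hypothetical, and `http_get` stands for any function that takes a URL and returns the response body as text, so the logic can be tried without touching the network. Determining the IP address (step one) is left to the caller.

```python
from urllib.parse import urlencode

BASE = 'http://www.random.org'  # paths and parameters below are assumptions

def quota_url(ip):
    return BASE + '/quota/?' + urlencode({'ip': ip, 'format': 'plain'})

def integers_url(count, low, high):
    params = {'num': count, 'min': low, 'max': high, 'col': 1,
              'base': 10, 'format': 'plain', 'rnd': 'new'}
    return BASE + '/integers/?' + urlencode(params)

def fetch_numbers(ip, count, low, high, needed_bits, http_get):
    """Return a list of ints, or None if the quota is too low right now."""
    remaining_bits = int(http_get(quota_url(ip)).strip())
    if remaining_bits < needed_bits:
        return None  # caller should wait a couple of minutes and retry
    body = http_get(integers_url(count, low, high))
    return [int(token) for token in body.split()]
```

Passing the fetching function in as an argument also makes it easy to swap in a version that sets the headers random.org asks for.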
Step number one is rather silly. I determine “my” external IP address by querying http://www.whatismyip.org/ for it… And I have found this to be the only (?) robust way of actually determining the external IP address through Python.
Please correct me if I’m wrong (and I don’t see how I couldn’t be).
We have finally arrived at the reason for this post: urllib2. I found a rather good resource, http://www.voidspace.org.uk/python/articles/urllib2.shtml, which showed me most of what I needed to know.
There were, however, some problems: random.org offers an HTTP interface based on query strings (GET requests). At the same time, the creator of random.org asks that people writing scripts to automate the process of fetching random numbers please include their email address in the User-Agent header, so that they can be sent a notification in the event that their scripts are behaving badly.
I spent quite some time figuring out how to create a request going through GET, but adding the User-Agent header. The final outcome was rather easy, so I guess I am somewhat of an idiot for not figuring it out earlier, but hopefully this will help someone else:
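    import urllib
    import urllib2

    def fetch(values):
        # Encode the parameters into the query string of the URL
        data = urllib.urlencode(values)
        url = 'http://www.example.org/?' + data
        header = {'User-Agent': 'wildcard@example.org'}  # don't do this, only for random.org
        # Pass None as the data argument: any other value turns the request into a POST
        req = urllib2.Request(url, None, header)
        try:
            response = urllib2.urlopen(req)
            return response.read()
        except urllib2.HTTPError as e:
            # the server answered with an error status (see e.code); do something
            return None
        except urllib2.URLError as e:
            # the server could not be reached (see e.reason); do something else
            return None

    print fetch({'foo': 1, 'bar': 2})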
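One detail worth checking is which HTTP method a request will actually use, since it is decided by the data argument of `Request`: with no data it is a GET, with data it becomes a POST. The sketch below shows this in Python 3, where urllib2 was folded into `urllib.request`; nothing is sent over the network, and the URL and email address are the same placeholders as above.

```python
from urllib.request import Request

url = 'http://www.example.org/?foo=1&bar=2'
header = {'User-Agent': 'wildcard@example.org'}

get_req = Request(url, None, header)              # no body: sent as GET
post_req = Request(url, b'foo=1&bar=2', header)   # body present: sent as POST

print(get_req.get_method())
print(post_req.get_method())
print(get_req.get_header('User-agent'))  # header names are stored capitalized
```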
Thus end the adventures in Python-land, for now anyway.