It would be beneficial to have your eMail-Address on your homepage when you want to give people the opportunity to contact you. In my case, I actually want to give people that opportunity1. However, we all know that there are villains out there who harvest emails from websites in order to sell them to spammers. Hence, you would want to obfuscate your eMail-Address in such a way that it cannot be read from the HTML source code of the website easily. It goes without saying that naive obfuscation techniques like
Unfortunately, I realized that while you can copy and paste the above, you will actually copy the mangled version with all the unicode direction-changing characters in it. In the end, I think a user would now actually have to copy it letter by letter as well.
I think this is at least marginally harder for a spam crawler to process. The method also has its problems, at least in its current form. You better make sure that this eMail address does not have anything following it in the same line. However, I just mean this to be a proof of concept, someone can probably use CSS instead of those crazy unicode characters in order to make it work as an inline element. Maybe, some time, I will even do that myself. In fact, expect an update to this post.
What does this do? It uses the unicode characters for right-to-left override and the one for left-to-right override to switch between directions. Imagine we have the string
import random, time, re random.seed(time.time()) RTL = '&#x202e;' LTR = '&#x202d;' def distort(e_mail): e_mail = [ '&#x%04X;' % ord(letter) if not re.match("[A-Za-z0-9]",letter) else letter for letter in reversed(e_mail) ] mode = random.getrandbits(len(e_mail)-1) default, opposite = LTR, RTL r = "" while e_mail: r += e_mail.pop() if mode & 1 == 1: r += opposite default, opposite = opposite, default e_mail.reverse() mode >>= 1 if default == RTL: r += LTR return r
email@example.com. We first write the string
usrfrom the left. Then, we write
moc.from the right, meaning we write the letter
mfrom the right, then the letter
ofrom the right, etc. The current string is
usr.com. Next, we write the string
@dofrom the left, which will then be
firstname.lastname@example.org. Finally, we write the letter
mfrom the right and we get
email@example.com. Here is what the code did, it has a couple more changes in direction, but essentially it's the same concept:
Oh yeah, and I replaced non-alphanumeric characters such as
>>> distort("firstname.lastname@example.org") 'u&#x202e;mo&#x202d;sr&#x202e;c&#x002E;m&#x202d;&#x0040;&#x202e;od&#x202d;'
@by the unicode HTML entity as well, just for kicks. Go ahead, paste that into an HTML document. You can mark the address with the mouse (although it's a bit awkward). For example, this is my email address:
- That does not go without saying: In some cases it is very important to discourage people from contacting you directly. For instance, it could be that you are usually not the right person to ask, but you get flooded with requests that you have to manually redirect elsewhere: It would be better if people got discouraged from contacting you while finding it very easy to contact the person that can actually help them. [↩]
- Famous enodings are ROT13, Base64, etcetera. [↩]