Update: Sometimes I DO overthink a problem and a solution, which is odd because SSL is also one of my (supposedly!) strong points. Skip to the comments below for something that Andrew Nacin pointed out.
-----------
Part of my professional life is to think about topics like data leakage. That's when you do something and, without realizing it, you transmit information that you hadn't intended to.
For example, my company may have an internal web page with this URL:
And on that page is a link to a NY Times DealBook blog posting as a reference. One of the readers in my company clicks on that link without hesitation. Why wouldn't they click? That's what the link is there for.
When DealBook processes their web access logs, they'll see a URL as the HTTP referer (I'm spelling it correctly after this) that the company or person who clicked that link may not want them to see.
How to prevent sensitive referrers from being sent from your WordPress blog?
- Install and configure YOURLS (svn revision 703). Get that working with a short domain; it's easy to do.
- Install and activate my short Force Javascript Redirection YOURLS plugin. The useful bit is only one line.
- Install my WordPress Convert Links to YOURLS plugin but don't activate it yet.
- Modify two lines in that WordPress plugin for your configuration. Sorry, I'm not up to making an options page (yet).
- Activate that WordPress plugin.
And poof! The Tin Foil Hat is in place. Any links in your post content or comment text will be routed through your very own link shortener, and the remote site will only see the short link as the referrer.
Read on to see how it works.
How does the YOURLS part work again?
When you follow a link, your web browser transmits the page you came from as the HTTP referrer. The browser is programmed so that if you visit a site that redirects you via a 301 or 302 HTTP status code, it transmits that same referrer again to the new web server.
That's why when you use YOURLS and have a short link on your web page, the ultimate destination still sees the original HTTP referrer URL.
The link shortener uses a 301 status code to forward your browser to the correct long link. It only redirects via JavaScript if the headers were not successfully sent. That 301 code means the target web server sees where you came from.
But if the JavaScript method is used, then the status code is 200. The web browser goes to the short URL, the JavaScript sends you to the target URL, and the short URL is now the referrer that the destination sees.
That's what my YOURLS plugin does. It uses a filter and tells YOURLS "Don't figure it out, just use the JavaScript method for redirection".
That will prevent the destination web server from seeing the original referrer link. All they'll see is the short link, and who cares if they see that?
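To make the difference between the two redirect styles concrete, here is a minimal Python sketch of the two kinds of response a shortener could send. This is not YOURLS's actual PHP; the function names, the dictionary layout and the exact HTML are assumptions made up for illustration.

```python
import json


def header_redirect(target_url):
    """Build a 301 response: the browser follows the Location header itself,
    so the destination still receives the original page as the referrer."""
    return {
        "status": 301,
        "headers": {"Location": target_url},
        "body": "",
    }


def javascript_redirect(target_url):
    """Build a 200 response whose page navigates to the target with JavaScript,
    so the short URL's own page is what does the navigating."""
    # json.dumps yields a quoted, escaped string literal for the script;
    # "</" is escaped as well so the URL cannot close the script tag early.
    js_url = json.dumps(target_url).replace("</", "<\\/")
    body = (
        "<html><head><script>"
        "window.location.replace(" + js_url + ");"
        "</script></head><body></body></html>"
    )
    return {
        "status": 200,
        "headers": {"Content-Type": "text/html; charset=utf-8"},
        "body": body,
    }
```

The only point of the sketch is the contrast: the first response has a Location header and no page, the second has a page and no Location header.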
To install my YOURLS plugin, extract the zip file to the YOURLS user/plugins directory and activate it in the YOURLS admin page.
Okay, so what about that WordPress plugin?
I'm thinking of two areas in WordPress that could contain links: the post content and the comment text. Both of those are easy to filter in WordPress and I've created a plugin that does the following:
- Finds all the links in the post and comments and pops them into an array.
- Sends each one of those links to YOURLS to get the short link.
- Substitutes each of the original links with the corresponding short link.
- Returns the modified output as the post or comment.
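The four steps above can be sketched roughly like this in Python. The real plugin is PHP and uses its own regex; the pattern below, the `convert_links` name and the injected `shorten` callable are all assumptions for illustration only.

```python
import re

# Matches an anchor tag's href="..." or href='...' and captures the URL.
# A deliberately simple pattern for illustration, not the plugin's regex.
LINK_PATTERN = re.compile(
    r'<a\s[^>]*?href=(["\'])(https?://[^"\']+)\1', re.IGNORECASE
)


def convert_links(html, shorten):
    """Return html with each linked URL replaced by shorten(url).

    shorten is any callable that takes a long URL and returns a short URL,
    or a falsy value when shortening failed.
    """
    # Step 1: collect every distinct link into a list.
    links = []
    for match in LINK_PATTERN.finditer(html):
        url = match.group(2)
        if url not in links:
            links.append(url)

    # Step 2: ask the shortener for each link's short form.
    short_for = {url: shorten(url) for url in links}

    # Step 3: substitute each original link, keeping it when shortening failed.
    def swap(match):
        original = match.group(2)
        replacement = short_for.get(original) or original
        whole = match.group(0)
        start = match.start(2) - match.start(0)
        end = match.end(2) - match.start(0)
        return whole[:start] + replacement + whole[end:]

    # Step 4: return the modified output.
    return LINK_PATTERN.sub(swap, html)
```

Passing the shortener in as a function keeps the sketch independent of how the short link is actually fetched.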
If you don't have it already, you will need to add the PHP cURL extension to your web server.
This plugin doesn't modify the links in the database; it just processes them via a filter. That way the HTML sent to the reader's web browser contains the short links.
When someone clicks any of the short links they are sent to the YOURLS web server, which then uses JavaScript to redirect them. Again, the target web server sees that short link as the referrer and not the original post.
How do you configure the WordPress plugin?
Install the Convert Links to YOURLS plugin and modify the two variables around line 51 of convert-links-yourls.php:
$mh_api_url = 'http://yyy.yy/yourls-api.php';
$mh_signature = 'XXXXXXXXXX';
Update those two with the information from your own YOURLS configuration.
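For a feel of what those two values are used for, here is a small Python sketch that builds a shorten request from an API URL and a signature. The parameter names (`signature`, `action=shorturl`, `format`, `url`) are what I understand the YOURLS API to accept, but treat them as an assumption and check your own install; the function name is made up, and the plugin itself does this in PHP with cURL.

```python
from urllib.parse import urlencode


def build_shorten_request(api_url, signature, long_url):
    """Return the full request URL asking the shortener for a short link."""
    query = urlencode({
        "signature": signature,   # the secret signature token
        "action": "shorturl",     # ask for a short URL to be created or looked up
        "format": "json",         # request a JSON response
        "url": long_url,          # the long link to shorten (gets URL-encoded)
    })
    return api_url + "?" + query
```

For example, `build_shorten_request('http://yyy.yy/yourls-api.php', 'XXXXXXXXXX', 'http://example.com/')` produces a single URL that can be fetched with any HTTP client.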
If you don't update them, or something is wrong with your YOURLS setup, then the URLs won't be substituted and the original URLs will remain intact. You won't get an error message either, so make sure your settings are correct.
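That "fail silently and keep the original link" behavior could look something like the following Python sketch. The `shorturl` field name is an assumption about the JSON response, and the function name is hypothetical; the plugin's real PHP may do this differently.

```python
import json


def short_url_or_original(response_text, original_url):
    """Return the short URL from a JSON response, or the original URL on any problem."""
    try:
        data = json.loads(response_text)
    except (TypeError, ValueError):
        # Not JSON at all (error page, empty body, None): keep the original link.
        return original_url
    if not isinstance(data, dict):
        return original_url
    short = data.get("shorturl")  # assumed field name
    if isinstance(short, str) and short:
        return short
    # Valid JSON but no usable short link: keep the original link.
    return original_url
```

The trade-off is the one described above: nothing breaks for the reader, but nothing tells you that the links are not being shortened.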
So whatâs the catch?
I don't store the result of the URL shortening, so each time the post or comment is displayed, each URL gets sent to the shortener again. That's not very efficient and results in your link shortener getting a request every time a link is filtered.
For a small WordPress site, that's not a problem. Small installations don't generate enough traffic to make your web server break a sweat.
But for web sites that create their own Slashdot effect (and you know who you are), it would compound the number of hits on your server. That would be bad during a self-made DoS. Using a caching plugin will probably reduce those requests, but I haven't really checked to see.
The first time links are shortened this adds a small but noticeable lag while the URLs are being processed. Once the links are in the YOURLS database then the web pages zip like before.
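One way to avoid asking the shortener over and over would be to remember the answers. The plugin does not do this; the Python sketch below is only an idea, with a plain in-memory dictionary standing in for whatever storage a real plugin would use, and with made-up names throughout.

```python
def make_cached_shortener(shorten):
    """Wrap a shorten(url) callable so each URL is only looked up once."""
    cache = {}

    def cached(url):
        if url in cache:
            return cache[url]
        result = shorten(url)
        if not result:
            # Do not remember failures, so a later call can try again.
            return None
        cache[url] = result
        return result

    return cached
```

Wrapping the shortener this way means only the first display of a link costs a request; every later display is answered from the dictionary.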
On the link shortener side, if the URL is not in the database then it gets added and the short URL is returned.
If the URL already exists in the database then the existing short link is sent back without creating a duplicate.
Or more accurately, that's what happened up to and including YOURLS 1.5.1-gamma svn revision 703. With 704 and up, when the link has already been shortened, the YOURLS server's response comes back without the short link URL.
Between revision 703 and the current (as of this writing) revision 712, not that many files have been updated, so I'll report the issue on the wiki.
You can get the 703 revision using these commands:
$ cd yourls-root
$ svn co -r 703 http://yourls.googlecode.com/svn/trunk .
DUDE. Seriously, you worry about this stuff?
No, not particularly.
Data leakage via HTTP referrers really isn't a problem for me, and if I were really concerned I would run all manner of privacy plugins and use Tor. This is really just an exercise and it was an interesting problem.
In figuring out this one solution (there are others) I was able to learn about a cool regex, how to populate that regex result into an array (using the same command), and more about the link shortener software that I use.
I not only got to write a small WordPress plugin but also see how Ozh's sample plugins work. That's some serious Cool Beans there and I'm having a great time.
Download the code and take a look; it's all GPL 2. Later on I'll add the license and readme.txt files. It's not complicated and I'm always looking to make improvements and get feedback. It's a great way to learn new things.
But now if you'll excuse me, I need to go visit the supermarket. They have a sale on Reynolds Wrap and I want to re-line one of my baseball caps.
You can never really know when a good Tin Foil Hat can come in handy.
