Renewing my SSL certificates was on my to-do list for months, and today I’m at home recuperating from a fever that kept me up all night. Since my web server is now patched, it is a good time to get new SSL certificates. So I contacted StartSSL and did the deed.
WordPress and SSL have always irked me, because just putting a certificate on the web server and using the https URL would still leave you with elements that are loaded via http (not SSL), and your browser’s address bar would look like this.
See that yellow warning triangle over the lock? It irks me. It does. It’s a personality flaw, a blemish, an imperfection. It loudly announces to the world that I’m Doing It All Wrong™. I see that on my site and I hang my head in shame.
OK it’s not really that big a deal. I could play with WordPress SSL plugins but part of my background is configuring applications on servers and Apache2 has a useful module called mod_substitute.
I have two configuration files for my site. One is for the http version and the other is for SSL. It’s like two separate virtual hosts with the same directories.
After I enabled mod_substitute I added these lines to my SSL config.
<Location />
    AddOutputFilterByType SUBSTITUTE text/html
    Substitute "s|href="http://blog.dembowski.net/|href="https://blog.dembowski.net/|"
    Substitute "s|href='http://blog.dembowski.net/|href='https://blog.dembowski.net/|"
    Substitute "s|src=' http:|src='|"
    Substitute "s|src="http:|src="|"
</Location>
# NOTE: Remove the space before the http above
I’m using the alternate delimiter “|” because I don’t want to escape the slashes in the URLs.
That’s probably too many lines. The first two Substitute lines replace any URLs of mine from http:// to https://. The next two are for any references that load elements using plain “http:”. I don’t substitute those with “https:” but instead strip the protocol, so those URLs start with “//” and the browser reuses whatever protocol the page was loaded with.
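To make the effect of the four rules concrete, here is a minimal Python sketch that mimics them on a few lines of sample HTML. It is only an illustration, not how Apache runs the filter: the function name `apply_rules` and the sample markup (including `cdn.example.com` and `other.example.org`) are made up for this example, and the third rule is written without the extra space that the note in the config says to remove.

```python
import re

# The four rules from the SSL config as (pattern, replacement) pairs.
# mod_substitute treats patterns as regular expressions by default, so the
# dots in the host name are left unescaped here to mirror the config.
RULES = [
    (r'href="http://blog.dembowski.net/', 'href="https://blog.dembowski.net/'),
    (r"href='http://blog.dembowski.net/", "href='https://blog.dembowski.net/"),
    (r"src='http:", "src='"),
    (r'src="http:', 'src="'),
]


def apply_rules(html):
    """Apply every rule, in order, to each line of the HTML (hypothetical helper)."""
    out = []
    for line in html.splitlines():
        for pattern, replacement in RULES:
            line = re.sub(pattern, replacement, line)
        out.append(line)
    return "\n".join(out)


# Made-up sample page: one link to this site, one image, one external link.
sample = (
    '<a href="http://blog.dembowski.net/about/">About</a>\n'
    '<img src="http://cdn.example.com/logo.png">\n'
    '<a href="http://other.example.org/">Elsewhere</a>'
)
result = apply_rules(sample)
print(result)
```

In this sketch the link to the blog comes out as https://, the image URL becomes protocol-relative (starting with “//”), and the external link is left alone because none of the rules mention its host.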
Doing that gets this image in my browser’s address bar.
Green is good. Order is restored.
Why didn’t I use a WordPress HTTPS plugin?
Because I’m lazy and not feeling well. Also using mod_substitute lets me filter the HTML output after WordPress has generated it but before it is sent to the web browser. That gives me more confidence that I’ll get all of the URLs that I want to change.
I’m only using this trick on the SSL version of my site. It’s not a perfect solution and I’m curious to find out what this breaks. I had to disable Jetpack’s Photon option because some of my images were not being sent to that CDN properly, and there may be other things as well.
This is not something for everyone (if you’re on a shared host for example) but if you can load Apache2 modules and restart your web server then this may work for you too.
Update: Using (.*) instead of “blog” works for my other vhosts as well. Nope, that breaks LOTS. Reverting back.
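The post does not show the exact wildcard rule or say what it broke, so the following is only a guess at one likely culprit: “(.*)” is greedy and will happily swallow everything up to the last match on a line. The Python sketch below uses a hypothetical version of the rule and a made-up line of HTML to show the effect, next to a tighter pattern that stays inside the host name. Python writes the captured group as `\1` in the replacement.

```python
import re

# Hypothetical wildcard rule: "(.*)" stands in for "blog" so any subdomain matches.
greedy = r'href="http://(.*)\.dembowski\.net/'
# A tighter alternative (an assumption, not from the post) that cannot
# run past the end of a host name label.
bounded = r'href="http://([a-z0-9-]+)\.dembowski\.net/'
replacement = r'href="https://\1.dembowski.net/'

# Made-up line with an external link followed by a link to the blog.
line = ('<a href="http://example.com/">out</a> '
        '<a href="http://blog.dembowski.net/">home</a>')

greedy_result = re.sub(greedy, replacement, line)
bounded_result = re.sub(bounded, replacement, line)

print(greedy_result)
print(bounded_result)
```

With the greedy pattern the match starts at the first link and runs through to the blog's host name, so the external link is switched to https while the blog link stays on http. The bounded pattern only touches the blog link.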
Toby says:
The only thing that is missing, is a rule that will not also change the canonical tags!
What’s the best way to solve this?
December 11, 2015 — 4:46 pm
Jan Dembowski says:
@Toby: What’s the HTML for the canonical tags? The mod_substitute code can be modified to replace anything, but in order to do that I’ll need a sample of what you mean. Currently I’m using nginx, which has a different method for that.
December 13, 2015 — 9:23 pm
Tobias says:
So for http the canonical would point to itself – it should only rewrite anything apart from canonical tags.
I assume it could be achieved by a pretty weird regex within the rule?
I was also thinking about doing a dummy replace first so that it will skip the replacement if it runs through all rules sequentially.
December 14, 2015 — 3:07 pm
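The “dummy replace” idea from the comment above can be sketched in a few lines of Python. This is an illustration under assumptions, not a tested mod_substitute configuration: it assumes the canonical tag looks like `<link rel="canonical" href="http://...">`, that rules are applied one after another to the same text, and that the placeholder string (made up here) never appears in real pages.

```python
import re

# Hypothetical marker used to hide the scheme of the canonical URL.
PLACEHOLDER = "CANONICAL-PLACEHOLDER://"


def rewrite_except_canonical(html):
    """Rewrite this site's links to https but leave the canonical tag on http."""
    # Step 1 (dummy replace): swap the scheme inside the canonical tag for a
    # placeholder so the next rule no longer matches it.
    html = re.sub(r'(<link rel="canonical" href=")http://',
                  r'\g<1>' + PLACEHOLDER, html)
    # Step 2: the normal rule from the post, applied to everything else.
    html = html.replace('href="http://blog.dembowski.net/',
                        'href="https://blog.dembowski.net/')
    # Step 3: put the original scheme back into the canonical tag.
    return html.replace(PLACEHOLDER, "http://")


# Made-up sample page with a canonical tag and an ordinary link.
page = (
    '<link rel="canonical" href="http://blog.dembowski.net/post/" />\n'
    '<a href="http://blog.dembowski.net/about/">About</a>'
)
rewritten = rewrite_except_canonical(page)
print(rewritten)
```

The canonical tag keeps its http:// URL while the ordinary link is switched to https://.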
Jan Dembowski says:
@Tobias: Perhaps, but if you set the Site URL and WordPress Address to an https:// URL then I think it would sort itself out.
December 16, 2015 — 7:41 pm
Toby says:
The question is – in general – how to rewrite all urls apart from the canonical tag.
Example 1: The canonical should point to HTTP and be indexed by Google instead of HTTPS, while a user should only navigate within HTTPS once an HTTPS URL is opened.
Example 2: No protocols should be contained in the HTML. All absolute URLs with a protocol should be rewritten to // instead of http:// or https://, but not the canonical tag, as a // canonical reference is pretty useless.
December 17, 2015 — 5:53 am
Jan Dembowski says:
I don’t agree with your examples, and using mod_substitute is probably not the best answer for you. 😉
The canonical URL for this post is this:
You could add mod_substitute rules after the ones above to replace
with
And that should work for you. But again, I disagree with your examples.
Google has already said that https pages will get a small boost. All web browsers should be able to handle the 301 redirect to a valid https and there is no search engine penalty for that 301 redirect. All of my http requests to this site get sent to the https version.
Yes, with enough traffic an encrypted site becomes noticeably slower than a clear text version. But by enough traffic I mean thousands of page hits per second. That’s a scalability problem and with a CDN you can solve that one.
December 17, 2015 — 8:02 am