This weekend I found some strange files floating around my Dreamhost directories. I was hacked! My web page is pretty simple and non-dynamic, but I have a bunch of other sites that I host for friends and other side projects, and the current guess is that the hackers got in through an outdated version of Wordpress or the super old installation of Mediawiki that was hanging around.
(Worth noting that the folks at Dreamhost gave some top-notch customer service, helping me figure out where the hack came from, what it had done, and cleaning up the damage. Yeah, they’re a budget host, but I’ve always been pretty impressed by what they let you do and how they help you out when you need it.)
On top of updating all of the old web apps and setting them to automatically update in the future, I started looking into making the whole shebang less dynamic so it would be less of a target for hackers.
It was a fun little project to do the conversion, and I’ve detailed it below the cut for anyone who’s interested. Either way, the upside is that this whole thing has me more excited about writing and blogging in general, so hopefully I can ride this enthusiasm to up my posting schedule beyond the meager 2-3 posts/year that it’s become.
So, yeah. The blog has moved away from Wordpress and is now using Octopress. I would in no way recommend Octopress for most people, but it’s fantastic if you’re comfortable with code and version control. The site you’re seeing right now is not dynamic; it’s just a bunch of static HTML files that get regenerated every time I make a new post. No database, no formatting engine, and no comments. But it should load faster, be easily portable to other hosts, etc.
The first step was getting my posts out of my Wordpress database and into the flat-file system that Octopress uses. The built-in conversion scripts that Octopress inherits from its underlying Jekyll engine (lord, talking about software projects quickly makes you sound ridiculous) worked pretty well as a first pass at this. When it builds the site, Octopress just parses the files in a certain directory and turns them into posts. Something that’s not very well documented is that it uses the file’s extension to determine formatting. I haven’t pushed very far, but it looks like it recognizes .textile. The Wordpress importer made a bunch of HTML files (since Wordpress is authored in HTML by default), but I had been using Textile for my blog, so it was just a matter of changing all their file extensions with a quick bit of shell scripting.
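For the curious, the extension swap is about this much work. This is a Python sketch rather than the shell one-liner I actually ran, and the function name and the default extensions are just assumptions for illustration:

```python
from pathlib import Path


def rename_extensions(directory, old_ext=".html", new_ext=".textile"):
    """Rename every file ending in old_ext inside directory to use new_ext.

    Returns the new file names. Existing targets are never overwritten.
    """
    renamed = []
    for path in sorted(Path(directory).glob(f"*{old_ext}")):
        target = path.with_suffix(new_ext)
        if target.exists():
            continue  # skip rather than clobber a file that is already there
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Pointing it at the posts directory once is enough; files with any other extension are left alone.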
(I actually wasted a TON of time playing with automatic Textile->Markdown tools before I discovered that Jekyll handled Textile natively. My preferences in lightweight markup languages have shifted over the years, so I’ll probably be using Markdown from here on out. Thankfully, another advantage of the flat-file is that it’s easy to mix-and-match formatting styles as I go. Heck, maybe I’ll use raw HTML just for kicks once.)
Right, so the new blog wasn’t going to have comments, but I wanted to preserve the old comments people had made. I looked into a lot of options for this, but in the end just ended up writing a little Python script that pulled them out of the database and applied some formatting to them. I then had to manually append them to their appropriate posts; I probably could have figured out some clever way of cross-referencing the Wordpress database with the Octopress files, but my blog is pretty damned small and it wasn’t too hard to figure out which comments went where.
The script is super ugly (just meant as a quick one-off hack to get the job done) but if anyone else might find it useful, here it is:
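What follows is not the original script, just a minimal sketch of the same idea: take comment rows, group them by post, and turn each one into a chunk of HTML that can be pasted under the post. It assumes the rows have already been fetched as dictionaries keyed by the standard wp_comments column names (comment_post_ID, comment_author, comment_date, comment_content, comment_approved). The database query itself is left out, and the markup and function names are made up for illustration.

```python
from collections import defaultdict
from html import escape


def group_comments(rows):
    """Group approved comment rows by post ID, oldest comment first."""
    by_post = defaultdict(list)
    for row in rows:
        # WordPress stores approval as a string: '1', '0', 'spam', ...
        if str(row["comment_approved"]) == "1":
            by_post[row["comment_post_ID"]].append(row)
    for comments in by_post.values():
        comments.sort(key=lambda r: r["comment_date"])
    return dict(by_post)


def format_comment(row):
    """Render one comment as a small block of static HTML."""
    author = escape(row["comment_author"])
    date = escape(str(row["comment_date"]))
    body = escape(row["comment_content"]).replace("\n", "<br/>\n")
    return (
        '<div class="archived-comment">\n'
        f"<p><strong>{author}</strong> on {date}</p>\n"
        f"<p>{body}</p>\n"
        "</div>\n"
    )
```

Escaping the comment body is the cautious choice for text strangers typed into a form; it does mean any HTML that commenters used on purpose shows up as literal tags.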
Oddly enough, Octopress doesn’t have great support for posting a bunch of links in the sidebar. I took Balaji Sivaraman’s blogroll plugin and modified it. Previously, it looked at a directory where each YAML file represented a different link. I really didn’t want to make a new file every time I wanted to add a link to the sidebar, so in my version, there’s a _sidebar directory in which each file represents a section of the sidebar.
The number at the front of the file name is parsed out to determine the ordering in the sidebar, and the rest of the filename gets titlecased to become the section label. Inside each file are multiple YAML documents, one per link. Now adding a new link just means modifying a file instead of creating a new one.
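The filename trick is easy to show in isolation. The real plugin is an Octopress (Ruby) plugin; this is only a Python sketch of the parsing step, and the example file names like 2_other_blogs.yml are hypothetical, as is the choice of underscore or hyphen as the separator:

```python
import re
from pathlib import Path

# Leading digits give the sort order; whatever follows becomes the label.
SECTION_RE = re.compile(r"^(\d+)[_\-\s]*(.+)$")


def parse_section_filename(filename):
    """Return (order, label) for a sidebar file, or None if it has no number."""
    stem = Path(filename).stem
    match = SECTION_RE.match(stem)
    if not match:
        return None
    order = int(match.group(1))
    label = re.sub(r"[_\-]+", " ", match.group(2)).strip().title()
    return order, label


def ordered_sections(filenames):
    """Return section labels sorted by their numeric prefix."""
    parsed = [p for p in map(parse_section_filename, filenames) if p]
    return [label for _, label in sorted(parsed)]
```

Converting the prefix to an integer matters once there are ten or more sections, since a plain string sort would put "10" ahead of "2".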
I had to finally sit down and get comfortable with Git. Octopress is pretty much entirely based around using Git to manage the blog, to deploy, etc. Git’s main competition (though I don’t think that’s the right word in this context… the other software playing in their space?) is Mercurial, and I’ve been using that a lot recently in developing Angel. It’s weird – the concepts are very similar, but Git comes at it from such a different angle. It’s been slow going so far as my brain adjusts to the Git way of doing things.
Mercurial still makes more intuitive sense to me, and I imagine it will be that way for a while, but GitHub’s popularity makes it hard to not at least try to get comfortable using Git. So this proved a good motivator to buckle in and get to it.
Finally, I don’t do it often, but I wanted to be able to write a blog post on the go when I had to. Usually when I travel these days, I don’t bring my laptop with me. For most of what I need on the road, my iPad serves me just fine, has way better battery life, is lighter, is harder to break, and doesn’t have to be taken out of my bag at airport security.
But, as is much lamented by the technoscenti, it’s hard to do development things like source control and arbitrary scripting on the iPad, and thus far nobody’s coded up an app that streamlines Octopress deployment from iOS.
Some would give up. I saw it as a challenge.
Luckily for me, so did Dennis Wegner.
He had a pretty clever setup which included monitoring a Dropbox folder that mapped into his blog’s repository, with different directories for Drafts, Queued Posts, and the normal Published ones. He also had to do some trickery because it had to run through his home Mac mini. Rather than trusting some home machine to always be on, I figured I’d try and leverage an existing server.
I didn’t want to tempt fate at Dreamhost, though, by setting up the Linux Dropbox daemon to run forever on my account there. (Shared hosting; I don’t want to be a bad neighbor.) Thankfully, a friend of mine from college has a server set up that pretty much just serves as the technical playground for our old nerd crew. (This server is called “the bear.”) I’d do all my hosting there if I didn’t love Dreamhost’s panel and One-Click installs so much, but for something that doesn’t need a whole lot of space, it’s great.
SO – my Dropbox account is now syncing on the bear, which also has a copy of the blog’s Git repository, symlinking its relevant directories into the Dropbox. These files could change one of two ways:
1. They get updated from Dropbox, meaning I edited them on my iPad (or some other machine where it was easier to get to Dropbox than syncing Git). In this case, the watcher script (more on that in a second) notices the change, automatically commits the change to Git, regenerates the site, and deploys the static files to hosting.
2. I push to the Git repository on the server from my laptop, which I’ll likely only do after running a deployment from it. In this case, the server repository automatically does an update on its working directory, thus updating the Dropbox files.
This does require a slight bit of maintenance on my part to remember that I either need to pull from the server (if I’ve updated the site via Dropbox magic) or to push to the server (if I’ve updated locally). That’s pretty similar to my usual workflow of pecking away at a project from multiple computers with source control as my go-between, so it’s not too bad.
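The Dropbox-side path boils down to "if the working copy is dirty, commit, regenerate, deploy." Here is a rough sketch of that sequence, assuming the stock Octopress rake tasks (rake generate, rake deploy) and a made-up commit message. It is not the watcher script itself, and the function names are hypothetical.

```python
import subprocess


def sync_commands(porcelain_output, message="Automatic commit from Dropbox sync"):
    """Given `git status --porcelain` output, return the commands to run.

    An empty status means nothing changed, so nothing needs to happen.
    """
    if not porcelain_output.strip():
        return []
    return [
        ["git", "add", "-A"],
        ["git", "commit", "-m", message],
        ["rake", "generate"],  # rebuild the static site
        ["rake", "deploy"],    # push the generated files to hosting
    ]


def run_sync(repo_dir):
    """Check the repository for changes and run the sync steps if needed."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    commands = sync_commands(status.stdout)
    for command in commands:
        # check=True stops the chain if a step fails, so a broken
        # build never gets deployed.
        subprocess.run(command, cwd=repo_dir, check=True)
    return commands
```

Keeping the "what should run" decision separate from actually running it makes the logic easy to poke at without touching a real repository.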
Making the Dropbox watcher script ended up being kind of fun. Once again, I modified someone else’s work, in this case I think adding a tiny bit more magic than was there previously. The original version of the script checked the _drafts folder for any file with a published attribute of true, then renamed it appropriately (so its filename matched the date) and moved it into the _queue folder. Then it would check the _queue folder for any posts with a published date set in the past, and move them into the _posts folder. If anything was set to be published, it would rebuild and deploy the site.
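That drafts-to-queue-to-posts flow is really just a tiny state machine. A sketch of the decision logic, assuming the front matter has already been parsed into a dictionary of strings and that dates look like Octopress’s usual "2012-07-04 21:30" format; the function names and the slug argument are made up for illustration:

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M"  # assumed front matter date format


def next_folder(folder, fields, now):
    """Decide where a post should live given its current folder and metadata."""
    if folder == "_drafts" and fields.get("published") == "true":
        return "_queue"
    if folder == "_queue":
        stamp = fields.get("date")
        if stamp and datetime.strptime(stamp, DATE_FORMAT) <= now:
            return "_posts"
    return folder  # nothing to do yet


def dated_filename(fields, slug, ext=".markdown"):
    """Build a YYYY-MM-DD-slug file name from the post's date field."""
    day = fields["date"].split()[0]
    return f"{day}-{slug}{ext}"
```

A draft marked published moves one step per pass, so with a future date it sits in the queue until a later run finds that its date has gone by.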
Pretty snazzy, but I added some more folds to it.
1. If it comes across a .txt file in the _drafts folder, it renames it to the Octopress scheme using today’s date and the first few words of the file as a temporary title. It also sticks some boilerplate YAML frontmatter (including published: false) in so that its attributes can be parsed by the rest of the script, in addition to giving it a […]
2. It also checks the _posts folder for anything with a published: false attribute and moves it back to the […]
3. If it changes or moves any files, it does so through Git and automatically commits to the server repository.
The first point lets me use Drafts or Nebulous Notes to make a quick start for a post and then just drop it into the _drafts directory on Dropbox without having to worry about the YAML frontmatter, the correct filename scheme, etc. Later, once I revise it and give the post a proper title, the system will be renaming it anyway.
The second point lets me effectively withdraw a post after it’s been published, in case I made some screwup or just didn’t mean it to go out yet.
The third addition just ensures that everything stays in sync across Dropbox, the server, and my home machine (through push/pulls).
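To make the first addition concrete, here is roughly what "adopting" a bare text file involves. This is a sketch under assumptions, not the script itself: the five-word title, the layout: post line, the .markdown extension, and the function names are all choices made for this example.

```python
import re
from datetime import date


def slugify(words):
    """Lowercase the words and join them with hyphens, dropping punctuation."""
    slug = re.sub(r"[^a-z0-9]+", "-", " ".join(words).lower())
    return slug.strip("-")


def adopt_text_draft(text, today, title_words=5, ext=".markdown"):
    """Return (filename, content) for a plain-text draft.

    The temporary title is the first few words of the text, and the
    front matter marks the post as unpublished.
    """
    words = text.split()[:title_words]
    title = (" ".join(words) or "untitled").replace('"', "'")
    slug = slugify(words) or "untitled"
    filename = f"{today.isoformat()}-{slug}{ext}"
    front_matter = (
        "---\n"
        "layout: post\n"
        f'title: "{title}"\n'
        "published: false\n"
        "---\n"
    )
    return filename, front_matter + text
```

Swapping double quotes for single ones in the title keeps a stray quotation mark in the first line of a note from breaking the YAML.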
(I also made sure that it would only parse the metadata (for example date: *) in the designated area at the front of the file, as I found when trying to deal with this very post that matching that regex multiple times could spell strange doom for the script.)
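The fix for that is to cut the front matter off first and only then go looking for fields. A small sketch of the idea, assuming the usual Jekyll convention of a block fenced by --- lines at the very top of the file; the names here are illustrative:

```python
import re

# Front matter must start at the very beginning of the file (\A).
FRONT_MATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\Z)", re.DOTALL)
DATE_RE = re.compile(r"^date:[ \t]*(.+)$", re.MULTILINE)


def split_front_matter(text):
    """Return (header, body); header is empty if there is no front matter."""
    match = FRONT_MATTER_RE.match(text)
    if not match:
        return "", text
    return match.group(1), text[match.end():]


def find_date(text):
    """Look for a date: field in the front matter only, never in the body."""
    header, _body = split_front_matter(text)
    match = DATE_RE.search(header)
    return match.group(1).strip() if match else None
```

With this, a post whose body happens to contain a line starting with date: (like this one) no longer confuses anything, because the body is never searched.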
The watcher script runs as a cron job every 5 minutes.
Now I’ve got a fancy new blog, and I had some fun doing it. (Yes, this is my idea of fun. I’m a professional nerd, what do you want? No, I won’t fix your computer. Did you reboot it yet?)