
How I Moved My WordPress Blog from Dreamhost to a Free EC2 Instance

Posted in techie on November 30th, 2011 by Aviv Ben-Yosef – 6 Comments

Just recently my Dreamhost plan, the one this blog is hosted on, expired and I had to renew it. Seeing the amount of money they charged made me realize that surely I could find something cheaper than $10+/month. After some snooping around I settled on moving my WordPress blog to EC2. This is my story.

Disclaimer: this worked for me. If you lose your blog, too bad. Backup is your friend, my friend. You need some devops-chops to follow along.

After being tipped off by a couple of friends, I decided to look into setting up my site on EC2. Basically, my blog is really small (the WordPress export file is about 2MB), and it’s not like I get tons of traffic. That alone means that for a year now I can use the EC2 free tier, making this blog cost pretty much nothing.

Initial setup

First step was to register an AWS account. I created a micro instance, which is enough for most blogs and free for a year. The AMI (image) of the instance I used was Bitnami’s prebundled WordPress image, which you can read more about here. Do make sure to create your instance on EBS and not on instance store, so that the data will be persistent. Change the instance’s security group to allow connections to port 80 (HTTP) and port 22 (SSH) from any IP.

You can assign a static IP to your instance for free! Just allocate a new Elastic IP on the EC2 console and attach it to your instance. Note that Elastic IPs cost nothing while attached, but if they aren’t attached your bill will start growing (after all, fresh IPs are a rare resource).

So you’ve got your instance up, eh? You can point your browser to your instance’s public DNS name, like so http://ec2-something-something.com/wordpress and see the default WordPress hello page. Go to /wordpress/wp-admin to log in as admin (the default bitnami user/password are user/bitnami). Here you can start setting up your blog again.

Importing

On your original blog, you can use the export utility and then import all your posts and comments to the new machine. Easy as pie. If you have a lot of plugins and configurations, you might want to search for one of the many plugins that do that for you. If you’re like me and only have a couple of plugins and one theme, installing them manually takes about 3 minutes.

Moving WordPress to the root of the site

If you’d like to move WordPress to the root of your site (/ instead of /wordpress), remove the path from the General settings page and then SSH to the machine. Replace the first two lines of the file /opt/bitnami/apps/wordpress/conf/wordpress.conf with:

Alias / "/opt/bitnami/apps/wordpress/htdocs/"

Then, go to /opt/bitnami/apache2 and do:

sudo ./bin/apachectl restart

Gotchas

Got permalinks? You’ll need to:

chmod g+rw /opt/bitnami/apps/wordpress/htdocs/.htaccess

Want to receive email notification for new comments etc.? You’ll need to do:

sudo apt-get install sendmail && sudo ln -s /usr/sbin/sendmail /usr/bin/sendmail

If you have attachments in any of your posts, you might need to fix the URL after the final move.

Testing

Make sure all your links, widgets etc. are working before making the final move. Post something, add a comment.

Going live

In the General settings panel, change the URL of the blog to your domain. Go to wherever you were hosted before, find the DNS panel and change the DNS entries for your blog. Danger: This is an “expert” step, and if you don’t know what it means, I recommend grabbing someone with more knowledge. Change the A records for your domain to point at the elastic IP you gave your instance. That’s it! Wait a bit for DNS propagation and everything should be working!

Backups

I found out about the backup to Dropbox plugin, which simply uploads your whole blog to Dropbox! Sweet, awesome and easy for small blogs!

Costs

So, using 1 micro instance with an EBS store of 10GB is free for the first year. Given nothing crazy in terms of network, you should be paying next to nothing for the first year, maybe about $1 a month. After that year passes, the charges for the instance and the EBS store kick in. If you’re in it for the long run, like me, paying for the instance a year in advance costs about $9/month (3 years is like $7), and the EBS store costs $0.10 per GB per month, meaning $1. That’s about $8-$10/month, depending on how you pay for the instance. A great save compared to Dreamhost!
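
For what it’s worth, here is the arithmetic behind those numbers as a tiny Python sketch. The prices are just the ballpark figures quoted in this post (not current AWS pricing), and the function name is made up for illustration:

```python
# Rough monthly cost once the free tier ends, using the ballpark
# figures from this post (assumptions, not current AWS pricing).
EBS_PRICE_PER_GB = 0.10   # dollars per GB per month
EBS_SIZE_GB = 10

def monthly_cost(instance_per_month):
    """Instance price plus EBS storage, in dollars per month."""
    return instance_per_month + EBS_SIZE_GB * EBS_PRICE_PER_GB

one_year_plan = monthly_cost(9)    # instance paid a year in advance
three_year_plan = monthly_cost(7)  # instance paid three years in advance

print(f"1-year plan: ${one_year_plan:.2f}/month")
print(f"3-year plan: ${three_year_plan:.2f}/month")
```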

You should subscribe to my feed and follow me on twitter!

Shell Hackery: The Use of “cd .”

Posted in Programming, techie on August 4th, 2011 by Aviv Ben-Yosef – 1 Comment

I have a nasty habit of going over my bash history every once in a while. Usually I sort commands by frequency to find stuff I can automate/alias. Last time I came across “cd .” and thought I’d write up a little explanation of why I find this seemingly useless command useful.

So what does it do? “cd .” literally means “change directory to the current directory”, which sounds like a no-op. The point is that sometimes the current directory is no longer the current directory! Let’s start with an example.

Say I have a git repository at my_repo/ and on its master branch there’s a my_repo/folder directory, while on its bugfix branch that directory doesn’t exist. Now imagine I have a terminal window open after performing the following command:

cd my_repo/folder # now on branch master

And now, while that terminal is open I need to switch to the bugfix branch for a few minutes, do my thing and return to it. If I switch branches using a different terminal or some GUI tool, what becomes of my terminal’s shell? When I switched to the bugfix branch, git essentially removed that directory the shell was in, and when I returned to the master branch, the directory was put back into place.

So, one might expect that after switching back and forth between branches and returning to my original terminal, simply executing “ls -l” will show that everything is ok. But it won’t. What I would actually see when running “ls -l” is that the current directory is empty!

Oh no! Are all our files lost? Nope. They’re right there in my_repo/folder, but our shell doesn’t know that. To understand why, we need to dig a bit deeper. When a unix process opens a file or directory, it obtains a file descriptor to it. Something similar goes for a shell’s current directory: throughout its lifetime, the shell holds an open reference to its current dir. You can see that by running lsof -p [your shell pid], where it is listed as cwd.

When process A holds an open fd to a file/directory and process B removes that file/directory, what should happen? Unix doesn’t have the file locking mechanism Windows does. What it does instead is remove the name from the directory tree while keeping the actual contents around until process A finishes working with them. What this means is that if, for example, you’ve got a file open in some software and accidentally “rm”ed the file, you can still recover the file because it’s held open by that program. You can see an example for restoring files this way on linux here.
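
Here’s a minimal sketch of that behavior in Python, assuming a Unix-like system. One script plays both “process A” (holding the file open) and “process B” (removing it), and the file is just a throwaway temp file:

```python
import os
import tempfile

# "Process A": create a scratch file and keep it open.
fd, path = tempfile.mkstemp()
os.write(fd, b"still here\n")

# "Process B": remove the file. The name disappears from the directory...
os.unlink(path)
name_exists = os.path.exists(path)

# ...but the open descriptor still reaches the data.
os.lseek(fd, 0, os.SEEK_SET)
recovered = os.read(fd, 100)
os.close(fd)

print(name_exists)  # False
print(recovered)    # b'still here\n'
```

Only once the last open descriptor is closed does the system actually free the data.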

Back to our problem! Our shell process is now sitting with its current directory actually being some phantom directory that is no more. That means that even after we checked out the master branch again and a directory with the same name was put back, no one updated our shell about it. It does know it’s in “my_repo/folder”, though.

That means that in order to quickly get our terminal back to being usable (say, we want “ls” to actually show stuff) we can, of course, be all lame, close the shell and open a new one. Or, we can “refresh” the shell’s handle on the current directory. How?

cd .
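
The same story can be sketched in Python, assuming a Unix-like system. This is an analogy rather than what the shell does internally: an open directory descriptor stands in for the shell’s current directory, and removing and recreating the folder stands in for switching branches back and forth. The names are made up for the demo:

```python
import os
import shutil
import tempfile

repo = tempfile.mkdtemp()               # stand-in for my_repo/
folder = os.path.join(repo, "folder")

# "master": the folder exists and has a file in it.
os.mkdir(folder)
open(os.path.join(folder, "a.txt"), "w").close()

# Stand-in for the shell sitting in my_repo/folder.
held = os.open(folder, os.O_RDONLY)

# Switch to "bugfix" and back: the folder is removed, then recreated.
shutil.rmtree(folder)
os.mkdir(folder)
open(os.path.join(folder, "a.txt"), "w").close()

# The held handle still points at the old, removed directory.
stale = os.fstat(held).st_ino != os.stat(folder).st_ino
os.close(held)

# "cd .": look the path up again and land in the new directory.
held = os.open(folder, os.O_RDONLY)
fresh = os.fstat(held).st_ino == os.stat(folder).st_ino
os.close(held)

shutil.rmtree(repo)
print(stale, fresh)
```

Both values should come out True: the path is the same, but the old handle and the new directory are two different things until the path is resolved again.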

Hope you learned something new!


Using Chef to Automatically Configure New EC2 Instances

Posted in Programming, techie on March 7th, 2011 by Aviv Ben-Yosef – 4 Comments

This is a follow-up post to my post about using Puppet to get the same result. In the comments to that post I was told by a few people that chef can make my life easier and I decided to give it a try. Here’s what I came up with.

In this post, as in the previous one, our goal is to be able to start a new EC2 instance with one command and have it come up with Apache running.

First of all, instead of having to set up our own server to tell the newly created instances what to do, we are going to use a hosted chef server on Opscode’s server. The hosting is free for 5 nodes, and so you can try this out without having to pay them. Go to Opscode’s site and register a new user, then also add a new organization.

On our system, we need to start by installing chef. You will also want to install the dependencies needed to make chef talk with EC2 (these are not installed automatically when installing the gem because they’re optional):

gem install chef net-ssh net-ssh-multi fog highline

Now, we need to set up a chef repository. This repository will contain our cookbooks (libraries that contain recipes, which are scripts for doing stuff, like installing apache) and roles (which map recipes to nodes), among other stuff. To get it, run:
git clone git://github.com/opscode/chef-repo.git

In the repository create a .chef directory. Now back on Opscode’s site, you need to download 3 files: your organization’s validator key, your user’s key and a generated knife.rb. Once downloaded, copy them all to the .chef directory:
cp USERNAME.pem ORGANIZATION-validator.pem knife.rb .chef

These will be used by the new instances to connect to Opscode and identify themselves as truly being created by you (this saves us from having to hack an awkward solution for this to work on Puppet). Add to your knife.rb file your AWS credentials:
knife[:aws_access_key_id] = "Your AWS Access Key"
knife[:aws_secret_access_key] = "Your AWS Secret Access Key"

We will now fetch the apache2 cookbook, which will allow us to install apache on our instances by adding a single configuration line. To download an existing cookbook, do the following:
knife cookbook site vendor apache2

You can see what other cookbooks are made available by looking around here. Now, we’ll create a role for our instances. Create the file roles/appserver.rb with this data:
name "appserver"
description "An application server"
run_list(%w{
  recipe[apache2]
})

And to update our Opscode server with the new cookbook and role:
knife cookbook upload apache2
knife role from file roles/appserver.rb

We’re getting really close now! You should have a security group defined in AWS that has port 22 (SSH) open, for knife to be able to connect to the instance and configure it, and port 80 (HTTP) open for our Apache to be available. I called mine “chef”. You will also need to decide which AMI (image) to use; you can find a list of AMIs supplied by Opscode here. And now, to create an instance with one command line, as promised:
knife ec2 server create "role[appserver]" --image ami-f0e20899 \
   --groups chef --ssh-user ubuntu --ssh-key my-key

This will take a while, as knife will create the instance, connect to it, install ruby, chef itself, apache etc. Once it says it has finished, simply copy the public DNS of the newly created instance (it should be printed once knife finishes) and open it in your browser. My, what a sense of accomplishment one gets from seeing the string “It works!”

I find this a lot easier, cleaner, more streamlined and more fun. I’m still learning the ropes with chef, but it has already surprised me by being easy to change, being completely git-integrated and by Opscode’s fast support (even for non-paying customers). You can dig further in these links.


Using Puppet to Automatically Configure New EC2 Instances

Posted in Programming, techie on December 19th, 2010 by Aviv Ben-Yosef – 12 Comments

Note: I posted an update about doing the same with chef here.

This is a quickie techie post that summarizes a few hours of learning that I wish someone else had put up on the web before me. I assume some knowledge about Puppet. I recommend the Pro Puppet book, and have heard good stuff about the Puppet 2.7 Cookbook.

So, I wanted to be able to configure via Puppet the way our new instances should be configured, and then be able to easily spawn new instances that will get configured by said puppet. The first part is installing puppetmaster. I decided to manually setup an EC2 instance that will act as the puppet master:

aptitude install puppetmaster
echo "127.0.0.1 puppet" >> /etc/hosts

Under /etc/puppet/manifests/site.pp we place the “main” entry point for the configuration. This is the file that is responsible for including the rest of the files. I copied a structure I saw somewhere, in which the actual classes are put under /etc/puppet/manifests/classes and imported in site.pp. Do note that currently this setup only supports a single type of node, but supporting more should be doable using external nodes to classify the node types.

import "classes/*"

node default {
        include default_node
}

And the class itself, in a file under /etc/puppet/manifests/classes:

class default_node {
  package { 'apache2':
    ensure => installed
  }
  service { 'apache2':
    ensure => true,
    enable => true,
    require => Package['apache2'],
  }
}

Auto-signing new instances

A common problem with puppet setups is that whenever a new puppet connects to the puppet master it hands it a certificate, which you then have to manually sign before the puppetmaster will agree to configure it. This is problematic in setups like mine, where I want to be able to spawn new instances with a script and not hassle with jumping between the machines right after the certificate was sent in order to approve it. I found two ways to circumvent this:

1. Simply auto-signing everything and relying on firewalls

In case you can allow yourself to firewall the puppetmaster port (tcp/8140) to be only accessible to trusted instances, you do not actually need to sign the certificates, you can tell puppet to trust whatever it gets and leave the security in the hands of your trusty firewall. With EC2 this is extremely easy:

  • Set up a security group, I’ll call mine “puppets”
  • Add a security exception to the puppetmaster that allows access to all instances in the “puppets” group
  • Create all puppet instances in the “puppets” security group
  • Configure puppet to automatically sign all requests: echo "*" > /etc/puppet/autosign.conf

I decided to go with this solution since it’s simpler and less likely to get broken. I didn’t see it documented anywhere else. The downside is that you’ve got to have your puppetmaster on EC2 too.

2. Automatically identifying new instances and adding them

This is a solution I saw mentioned a few times online. Using the EC2 API tools, write a script that gets the DNS names of all the trusted instances you’ve got and writes them to autosign.conf. Once you have this, getting it to run with a cron job every minute will do the trick. This can be done with sophisticated scripts, but for my (very initial) testing, this seemed to work:

* * * * * ec2-describe-instances | grep ^INSTANCE | awk '{print $4}' > /etc/puppet/autosign.conf
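
If you’d rather grow this into one of those more sophisticated scripts, the grep/awk part of the pipeline is easy to mirror in Python. This is only a sketch: the function name is made up, and the sample text is a made-up, shortened imitation of ec2-describe-instances output, not real data:

```python
def autosign_names(describe_output):
    """Return the 4th whitespace-separated field of every INSTANCE line,
    like grep ^INSTANCE | awk '{print $4}' (shorter lines are skipped)."""
    names = []
    for line in describe_output.splitlines():
        if line.startswith("INSTANCE"):
            fields = line.split()
            if len(fields) >= 4:
                names.append(fields[3])
    return names

# Hypothetical sample in the rough shape of the tool's output.
sample = (
    "RESERVATION\tr-00000000\t000000000000\tpuppets\n"
    "INSTANCE\ti-00000001\tami-a403f7cd\tec2-1-2-3-4.compute-1.amazonaws.com\n"
    "INSTANCE\ti-00000002\tami-a403f7cd\tec2-5-6-7-8.compute-1.amazonaws.com\n"
)

print("\n".join(autosign_names(sample)))
```

From here it’s a small step to filter by security group or instance state before writing the names to autosign.conf.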

Getting new instances to connect to the master

The last piece of the puzzle. Since we use Ubuntu, we could simply use the Canonical-supplied AMIs. These support user-data scripts that are executed as root once the system boots. Below is a simple script that does this:

  1. Update the instance
  2. Add the “puppet” entry to DNS – puppet expects the master to be accessible via “puppet” DNS resolution. This little snippet gets the current IP of the master via our DNS name and writes it to /etc/hosts
  3. Install & enable puppet and voila!
#!/bin/bash

set -e -x

# Needed so that the aptitude/apt-get operations will not be interactive
export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get -y upgrade

# Find the current IP of the puppet master and make "puppet" point to it
puppet_master_ip=$(host my_puppet_master.company.com | grep "has address" | head -1 | awk '{print $NF}')
echo $puppet_master_ip puppet >> /etc/hosts

aptitude -y install puppet

# Enable the puppet client
sed -i /etc/default/puppet -e 's/START=no/START=yes/'

service puppet restart

Once all of this is up and running, creating a new instance is as easy as:

ec2-run-instances -g puppets --user-data-file start_puppet.sh -t m1.small -k key-pair ami-a403f7cd

Happy puppeting!


Taking DRY Further

Posted in pragprowrimo2010, Programming, techie on November 3rd, 2010 by Aviv Ben-Yosef – 2 Comments

After learning to spot basic DRY violations, such as code you’ve just copied from somewhere, it’s time to learn how to use DRY to drive a lot more in your system.

DRY can be used extensively in your code base to alert you of problems waiting to happen. For example, similar code structures in different parts of the code are usually a DRY violation. This violation causes coupling which in turn will make it harder to change the code. The good scenario is that you have to tediously go through all the repetitions and change them to accommodate the change. The worse case is you forget to change one and introduce inconsistency in your code.

Train yourself to note these feelings of “yeah, I’ll have to do that again here”. Lucky for me, whenever I do something too many times I automatically start referring to it as “the dance”. So, once I hear myself saying to a teammate that “I’m doing the add-view-dance” I know it’s time to do something about it.

But DRY need not apply only to your code. Actually, it certainly shouldn’t! Copying is bad, even if you do it in a configuration file, or “just in that script”. It’s our tendency to disregard these “minor” parts of our system as not worthy of our attention, but these are places that are at least as likely to cause problems as our code. Taking the time to make the deployment scripts tidier will pay off once you decide to leave that server at home and deploy your app on EC2.

DRY used correctly will make driving changes in your app a breeze, be it a configuration change, a DB change or a real feature you need to add. The cycles saved by not having to dig around the code looking for other places you forgot to update, or hunt the bugs caused by a change, pay off immensely for the time spent on doing things the right way.

Let’s take this one step further. Do you document your app? For example, do you supply wiki pages with explanations for installing the system, executing it or making configuration changes? There isn’t a coder out there that hasn’t been bitten by an outdated guide that wasn’t updated to reflect the way things should be done now. This is a subtle DRY violation: there is repetition between the description of the process in the wiki and the real process, as is usually saved in people’s heads or scripts. If you’ve got a script, why not simply use it as documentation?

Most people will probably understand your shell script better, and it is sure to be up-to-date. I tend to document the script itself to make it clearer (and generally treat it as part of the code base) which pays off when I don’t have to hear people get angry with me for not updating the wiki.

The even more important example is sample code. Almost every app will, at some point, require you to supply sample code for working with it. The knee-jerk solution is to hack some code and put it on a wiki page. You might even go the extra mile and run the code to make sure it works. 24 hours later, you change the code and the samples are now obsolete. Again, duplication between real code and sample code has created a problem for us. The solution, once more, is to fight this duplication. Putting the samples in the app’s version control, where you can make sure they compile and run with every change, and pointing the wiki to these files (or even embedding them in the wiki) will save you a great hassle. Sometimes you can even make these examples part of the code, as with Python’s doctests, which allow you to put executable usage examples right next to the code.
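
To make the doctest idea concrete, here’s a minimal sketch. The function itself (slugify) is made up for illustration; the point is that the usage examples in the docstring are both documentation and tests:

```python
import re

def slugify(title):
    """Turn a post title into a URL-friendly slug.

    >>> slugify("Taking DRY Further")
    'taking-dry-further'
    >>> slugify("  Shell Hackery: cd .  ")
    'shell-hackery-cd'
    """
    # Keep runs of letters and digits, join them with dashes.
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

Save it in a file (say, samples.py, a hypothetical name) and run python -m doctest -v samples.py. Every example in the docstring gets executed, so the moment the code and its sample drift apart, you hear about it.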

Once more, training your senses to notice these violations and finding creative solutions for them takes practice and time, but gets a lot easier once you get the trick. This is a crucial tool in the pragmatic programmer’s belt.

DRY, actually, is a measure of technical debt. The more violations you allow to slip into the codebase, the harder you’ll have to work later to change the code. A DRY codebase is usually more cohesive and loosely coupled, which leads to a more responsive design. Effort you put into keeping your code DRY pays off quickly, and many times over.


Logging with a Context: Users in Logback and Spring Security

Posted in Programming, techie on August 27th, 2010 by Aviv Ben-Yosef – Be the first to comment

During this hectic time of starting an amazing adventure we find that, alongside the many big and important challenges we have, there is an endless stream of small technical problems that, if solved poorly, will waste a lot of time.

One of these is logging properly, in a way that lets you see what your users are doing. Pretty quickly we came to the conclusion that we want each log message to contain information about the user’s context, so we could easily understand what went wrong when a user tells us about it, and be able to track usage patterns.

The simplest thing we thought of was creating our own logger wrapper to insert our special values, but we didn’t like the idea of having to write our own logging interface all over again. We’re using Logback and Spring Security, and after some googling and stackoverflowing I’ve found this solution:

We create simple Converters to insert the username and session ID into the logs, if present:

To make logback know where to find these converters, we add this little guy:

And now all that’s left are configuration changes. Make your pattern contain our cool new keys (“%user” and “%session”):

And this last part is needed for the session context thingie to work (as I was instructed on Stack Overflow). We need to add a certain listener to our web.xml:

And that’s about it. Happy logging!

    nose doesn’t discover tests on Solaris

    Posted in Programming, techie, testing on March 3rd, 2010 by Aviv Ben-Yosef – 1 Comment

    Note: this is a technical post, to help poor souls that google this :)

    When using nose on Solaris machines, simply running nosetests without specifying the file names will not work if you are the root user. To fix this, you must either not be root, or pass nose the argument --exe. That’s it.

    Gory details: by default, nose ignores executable files. Each file it encounters it checks with os.access(test_file, os.X_OK) to see if it’s executable. Problem is that Solaris’ access function always returns success for root, regardless of actual file permissions. This is discouraged by POSIX, but known behavior.
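
To see the difference for yourself, here’s a small sketch that puts the two questions side by side: “may the current user execute this?” (the os.access check described above) versus “is any execute bit set?” (what the file mode says). The has_exec_bit helper is my own illustration, not something nose provides:

```python
import os
import stat
import tempfile

def has_exec_bit(path):
    """True if any execute bit (user, group or other) is set in the mode."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))

# A plain, non-executable stand-in for a test file.
fd, test_file = tempfile.mkstemp(suffix=".py")
os.close(fd)
os.chmod(test_file, 0o644)

# Depends on who is asking (and, as described above, on the platform).
print("os.access says:", os.access(test_file, os.X_OK))
# Depends only on the permission bits of the file.
print("mode bits say: ", has_exec_bit(test_file))

os.remove(test_file)
```

If the two lines disagree when you run this as root on your machine, you’ve reproduced the problem.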


    I hope this saves someone the 3 hours it wasted for me :)