MySQL Replication to EC2

Servers die.  People make mistakes.  Solar flares, um, flare.  There are many things that can cause you to lose your data.  Fortunately, there is a pretty easy way to protect yourself from data loss if you use MySQL.

My preferred solution is to store a copy on EC2 through replication.  One big reason I like to replicate to EC2 is that it becomes a pretty easy warm-failover site.  All of your database data will be there; to switch over, you’ll just need to start up webservers or other systems required by your architecture and make a DNS change.  If your datacenter became a smoking hole in the ground, you could be back up and running on EC2 in 15 minutes or less with proper planning.

No matter where your MySQL master server is hosted, you can replicate to an EC2 instance over the internet.  Network latency generally isn’t an issue when compared to the lag that may be introduced by the replication process itself.  I typically see a max of 5-10 seconds of replication lag during general use.  That lag is due to the replication process being single-threaded (only one modification is made to the database at a time).
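
One way to keep an eye on that lag is to poll SHOW SLAVE STATUS on the slave and look at its Seconds_Behind_Master column.  Here is a minimal Python sketch, assuming the status row has already been fetched into a dict.  The function name and the 10-second threshold are illustrative assumptions, not part of MySQL.

```python
from typing import Mapping, Optional


def classify_lag(status: Mapping[str, Optional[int]], warn_after: int = 10) -> str:
    """Classify one SHOW SLAVE STATUS row by its Seconds_Behind_Master value."""
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # MySQL reports NULL here when replication is not running
        # or the slave is not connected to its master.
        return "stopped"
    if lag > warn_after:
        return "lagging"
    return "ok"
```

A cron job could call this every minute and send an alert on anything other than "ok".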

Here are a few things to keep in mind when setting up replication:

  • Use a separate EBS volume for your MySQL data directory
  • There is good replication documentation for MySQL
  • Use SSL
  • Set expire_logs_days to an acceptable value on the slave and the master.  The right value depends on the volume of data you send to the slave each day.  Don’t make it so small that recovery with the binlogs will be difficult or impossible.
  • Store your binlogs on the same partition as the mysql data directory.  This simplifies the snapshot and recovery process.

A sample EBS snapshot Perl script for MySQL can be modified and used to create snapshots of the MySQL data on the slave server.

Since this is a mysql slave server, you can create volume snapshots whenever you want without any impact on your master database. By default, AWS imposes a 500 volume snapshot limit.  If you have that many snapshots, you’ll have to delete some before you will be able to create more.
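
The general shape of such a snapshot job can be sketched in Python as follows.  This is not the Perl script mentioned above, just a rough outline under stated assumptions: the cursor is any DB-API cursor connected to the slave, and ec2_client is an object exposing boto3's EC2 create_snapshot call.  The function name and default description are made up for illustration.

```python
def snapshot_mysql_volume(cursor, ec2_client, volume_id, description="mysql slave data"):
    """Snapshot the EBS volume holding the MySQL data directory and binlogs.

    cursor     -- DB-API cursor on the slave (the lock lives in its session)
    ec2_client -- object with create_snapshot(VolumeId=..., Description=...)
    """
    # Block writes and flush tables so the files on disk are in a known state.
    cursor.execute("FLUSH TABLES WITH READ LOCK")
    try:
        # The snapshot reflects the volume as of this call; the copy itself
        # completes in the background, so the lock can be released right away.
        response = ec2_client.create_snapshot(VolumeId=volume_id, Description=description)
    finally:
        # Always release the lock, even if the AWS call fails.
        cursor.execute("UNLOCK TABLES")
    return response["SnapshotId"]
```

Depending on your filesystem, you may also want to freeze it for the moment the snapshot is started; that step is left out here.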

With the periodic snapshots and binlogs, you can recover to any point in time.  I’ve been able to recover from a “bad” query that unintentionally modified all rows in a table, as well as from accidentally dropped tables.
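
To make the recovery idea concrete, here is a small Python sketch.  It picks the newest snapshot taken before the moment you want to recover to, then builds a mysqlbinlog command that replays everything from that snapshot up to that moment.  It assumes you recorded the binlog file and position alongside each snapshot; the Snapshot record and plan_recovery function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: str
    taken_at: datetime
    binlog_file: str  # binlog being written when the snapshot was taken
    binlog_pos: int   # position within that binlog


def plan_recovery(
    snapshots: Sequence[Snapshot], binlogs: Sequence[str], stop_at: datetime
) -> Tuple[str, List[str]]:
    """Return the snapshot to restore and the mysqlbinlog command to replay."""
    usable = [s for s in snapshots if s.taken_at < stop_at]
    if not usable:
        raise ValueError("no snapshot predates the requested recovery point")
    base = max(usable, key=lambda s: s.taken_at)

    # binlogs is ordered oldest to newest; replay starts with the file
    # that was active when the chosen snapshot was taken.
    ordered = list(binlogs)
    to_replay = ordered[ordered.index(base.binlog_file):]

    command = [
        "mysqlbinlog",
        f"--start-position={base.binlog_pos}",  # applies to the first file listed
        "--stop-datetime=" + stop_at.strftime("%Y-%m-%d %H:%M:%S"),
        *to_replay,
    ]
    return base.snapshot_id, command
```

After restoring a volume from the chosen snapshot, the output of that command would be piped into the mysql client, with the stop time set just before the bad query ran.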

Can you replicate from multiple database servers to a single server?  Yes, but a rule of replication is that a slave can only have one master.  To make it possible for one server to be a slave to multiple master servers you need to run multiple mysql daemons.  Each daemon runs with its own configuration and separate data directory.  I’ve used this method to run 20 mysql slaves on a single host and I’m sure you could run many more than that.
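
One possible layout gives each daemon its own numbered option group, with its own port, socket, data directory and server ID.  The Python sketch below renders such a group in the [mysqldN] style that mysqld_multi reads.  The base directory, port numbering and server-id scheme are assumptions for illustration; adjust them to your own setup.

```python
def render_instance(n: int, base_port: int = 3306, base_dir: str = "/var/lib/mysql-multi") -> str:
    """Render the option group for slave daemon number n."""
    instance_dir = f"{base_dir}/{n}"
    lines = [
        f"[mysqld{n}]",
        f"port = {base_port + n}",             # each daemon listens on its own port
        f"socket = {instance_dir}/mysql.sock",
        f"pid-file = {instance_dir}/mysql.pid",
        f"datadir = {instance_dir}/data",      # separate data directory per daemon
        f"server-id = {1000 + n}",             # must differ from that daemon's master
    ]
    return "\n".join(lines)


def render_config(count: int) -> str:
    """Render option groups for daemons 1..count, separated by blank lines."""
    return "\n\n".join(render_instance(n) for n in range(1, count + 1))
```

Calling render_config(20) would produce the groups for twenty slave daemons on one host.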

Millcreek Systems is available to help you set up and maintain MySQL replication.  Please contact us if you’d like to discuss our services further.

Tips for programming a website from scratch

I’ve seen the backend of quite a few websites written from scratch and quite a few of the challenges that arise as a result.  Here are a few suggestions based on the issues that I’ve seen come up.

  • Use a config file – A central location for any configuration options is essential.  I’ve seen some sites created where the database connection was defined at the beginning of each PHP file.  It was quite an ordeal to make all of the code changes necessary when they needed to move their database to its own server instead of ‘localhost’.  Also, you can include options in there for specific environments (e.g. dev, qa, production).
  • Don’t ever trust any data sent to your server from web browsers – It doesn’t matter if you validated it with JavaScript; if you don’t validate your data on the server side and sanitize it before storing it, someone will figure out how to break it.  Those ‘someones’ usually don’t have the best of intentions. (See http://xkcd.com/327/ for a humorous example.)
  • Don’t store plaintext passwords in your database – This goes for any sensitive information.  Passwords can be stored and validated with a salted one-way hash.  Other sensitive data should be encrypted prior to storage in the database.
  • Don’t store user uploaded files on the webserver – For example, say you allow your users to upload photos.  If you store those photos on the webserver, what will happen when you scale and add another webserver, or you add 10 webservers?  You’ll have to somehow synchronize the files across all of the webservers, which is time consuming and error prone.  An easier solution is to use a service like Amazon’s S3 to store and serve up files.  One additional bonus of using an external service is you don’t have to worry about users filling up the disks on your webservers.
  • Isolate your static content – As usage of your site increases, you’ll be looking for ways to make it faster for your users.  One of the best things to do is move your static content to a CDN.  You don’t need to use a CDN from the beginning, but you can set your site up to make it an easy transition when the time comes.  You can initially just use a vhost on your webserver as a pseudo CDN.
  • Be very nice to your database – I can’t emphasize this one enough.  As you grow, your database will be the most difficult thing to scale.  Do your JOINs in your code, don’t make the DB do them.  Make sure you’re using indexes properly.  Avoid table scans like the plague and eliminate any slow queries that come up during development.
  • Plan to scale – I’ve said this before.  Have a long term scaling plan drawn out but only execute on it as needed.  You don’t want to waste time and resources solving issues that may not exist down the road for you anyway.  Without a long-term plan, you can easily paint yourself into a non-scalable corner which will result in poor performance and downtime as you fix things.
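
Two of the tips above, not trusting browser input and not storing plaintext passwords, fit in one short example.  The Python sketch below uses the standard library’s sqlite3 module as a stand-in for your real database, with PBKDF2 as the one-way hash.  The table layout, function names and iteration count are assumptions for illustration, and placeholder syntax varies between database drivers.

```python
import hashlib
import hmac
import os
import sqlite3

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a one-way hash from the password and a per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def create_user(conn: sqlite3.Connection, username: str, password: str) -> None:
    salt = os.urandom(16)
    # Placeholders keep user input out of the SQL text, so it is stored as data.
    conn.execute(
        "INSERT INTO users (username, salt, pw_hash) VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )


def check_password(conn: sqlite3.Connection, username: str, password: str) -> bool:
    row = conn.execute(
        "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Constant-time comparison of the stored and recomputed hashes.
    return hmac.compare_digest(stored, hash_password(password, salt))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")
create_user(conn, "Robert'); DROP TABLE users;--", "correct horse")
```

The odd username is stored as an ordinary string and the users table survives, which is the whole point of using placeholders instead of building SQL by hand.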

Most of these things are already addressed if you use a framework.  The ones that aren’t can usually be easily taken care of.

Another advantage of using a framework is that it will force your developers to write code a certain way (which hopefully adheres to the framework’s standard).  This will allow you to enlist the help of other developers in the future without requiring them to spend time sifting through what otherwise could be messy code.  Code written from the ground up by a single developer is more difficult to learn.  And don’t forget, there will be a certain level of support with a framework.  If a vulnerability is found in a framework, it will usually get fixed quickly and released so you can use the update.

Be sure to have more than one person review your code and your plan for scaling.  If you need help, contact us and we can do both for you.

Are You Protecting Your Users From Session Hijacking?

This is nothing new, but a recently released tool demonstrates how easy it is for someone to hijack your sessions and pose as you on certain websites.  This problem has been around as long as there have been websites where you log in.  It is not a complex attack and has been made easier to execute with the proliferation of wireless networks.

Imagine that you’re out of the office, on the road or at the coffee shop, and decide to read up on the people you follow on Twitter.  You post an update or two, replying to your followers, and continue working.  Later that day, you realize that someone has posted updates as you on Twitter.  Has someone hacked into your computer?  Nope, but someone hijacked your session after you logged in to Twitter.  The same thing could happen to you on Facebook, your blog (WordPress included) and many other websites that don’t properly protect your data.  Even if you use WPA/WPA2 encryption on your wireless network, you can still be vulnerable to this type of attack.

What can you do to protect yourself?

  • Don’t use wireless networks (yeah, right…)
  • Use a VPN to send your Internet traffic to a “trusted” network, then out to the Internet.  This will protect you from attackers on the wireless network, but it only pushes the problem out to your “trusted” network.  On top of that, some VPN clients sometimes just disconnect without warning and send all of your network traffic unprotected over the network without notifying you.
  • Only use websites that send everything over SSL (check for the https:// in the URL) – Check out the HTTPS Everywhere extension for Firefox.

You run a website, what can you do to protect your users?

  • Serve every part of your site over SSL when a user is logged in.  It only costs $30/year for an SSL certificate to protect your users.  There really is no excuse.  Some certificate issuers charge over 50x that per year (yes, $1,500/year).  The level of encryption is the same on those very expensive certificates as it is on the $30 certificates.
  • Don’t serve mixed content.  When you serve a page over HTTPS but some of its images over HTTP, a warning pops up in the browser about some items on the page being insecure.  Don’t desensitize your users to security warnings like this.
  • When setting cookies for your users, set the Secure attribute on the cookie.  This will make the browser only send that cookie with HTTPS requests.
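
As a small illustration of that last point, here is how a session cookie with the Secure attribute could be built using Python’s standard http.cookies module.  The cookie name and the helper function are made up for the example.  HttpOnly is an extra I’ve added on top of the tip above; it keeps page scripts from reading the cookie.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str) -> str:
    """Build a Set-Cookie header for a session cookie that is only sent over HTTPS."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["secure"] = True    # browser sends it with HTTPS requests only
    cookie["session"]["httponly"] = True  # not readable from page scripts
    cookie["session"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")
```

Most web frameworks expose the same attributes through their own cookie or session settings, so in practice this is usually a one-line configuration change.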

Be cautious about what you do when accessing the Internet over wireless networks.  And remember, the work involved in using SSL/HTTPS is minimal compared to the protection it offers your users.

Contact us if you would like assistance configuring your web site & servers for SSL/HTTPS.  If we are managing your servers, we’ll configure SSL/HTTPS at no additional cost.

How to choose a Domain Name: 8 tips for choosing a Domain Name

Here are a few quick tips to keep in mind when you register a domain:

  1. Use “.com”
  2. Don’t use hyphens
  3. Use an easy to spell domain
  4. Make sure your domain passes the phone test – say it to someone and have them tell you what they would type in (e.g. “my email address is joe@donutbar.com”). Remember, you might say Donuts4You.com, but some people will hear DonutsForYou.com. Trust me, you don’t want to have to say, “… ‘Donuts’, the number 4, ‘you’ dot com…” every time you give out your email address.
  5. Keep it short
  6. Use proper spelling (verify with spellchecker)
  7. Don’t spend a lot. Registering a new domain shouldn’t cost you more than $10/year, and neither should renewing one. Don’t fall for the $30 renewal garbage that you’ll get in the mail.
  8. Consider registering other similar domains in addition to your primary domain name, such as other TLDs (.net, .org, .us, .biz) and singular/plural versions (if you only registered GoldenDonuts.com, you can bet someone will register GoldenDonut.com and get some web traffic from your work).

Need help getting started?
Brainstorm a little: write down all the words that come to mind when you think of your business (from the customer’s perspective). Use a thesaurus to expand that list of words. Start searching for available domains using different word combinations from your list.

If the domain you really want is already taken, you may have to pay hundreds (or thousands) to get it, even if the person that currently owns it isn’t using it.

When using your domain in print, consider capitalizing the beginning of words to make it more readable, for example, thebestdonutshopintheworld.com vs. TheBestDonutShopInTheWorld.com.