Amazon S3 Outages
Monday, July 21, 2008 - 9:54 AM
I, like a lot of others, make use of the Amazon S3 service for data storage "in the cloud". The service is a huge success and is normally extremely stable.
Yesterday there was a very large S3 outage which caused many online companies to suffer outages as well. There was an interesting piece over at ReadWriteWeb asking "How Much Is To Much?".
I've been using S3 for nearly 2 years and have only experienced trouble twice. I wrote about the previous experience; that outage lasted about 3 hours. Yesterday, however, the outage lasted more than 6 hours!
Now I use S3 for the static media on many websites, including this site. Since it's my personal blog it's no big deal, but I also use it for many sites that are for profit, including my latest service: SmartJabber.
Yesterday's outage affected paying customers of mine. They have been understanding, but I have to ask myself if a change may be needed to prevent future outages that last this long. The idea of using Amazon S3 is that, in theory, the service can't go down. It's a large cluster of servers whose purpose is to host their customers' data securely and with very high availability.
In the grand scheme of things, my data is probably safer and more stable remaining with Amazon, though I do hate the feeling of having my hands tied when the service is down.
Some others out there seem to share my opinion of sticking with Amazon, and some don't.
Technorati Tags: amazon s3, business, outage, service
What Powers SmartJabber
Tuesday, April 22, 2008 - 3:31 PM
I've had some inquiries about what powers SmartJabber, so I decided to write up a quick outline of the technology that is being used to power the new service.
In case you hadn't heard yet, I recently launched a new company called SmartJabber. The general idea behind the service is to offer an automated support agent via a "live chat" window. It's a cheap alternative to hiring someone to do basic live chat support.
The tool can be used to convert visitors into customers, save sales or offer a basic level of customer service (say on an FAQ page or something). There is a lot of work going on in the back end that makes this service fast and reliable. In this geeky post, I will go over some of the technology that I used to create this tool.
Python was used for the programming on the back end. This includes the web site and back end processing. I went with Python because I love the language for a number of reasons and I have a lot of experience writing web applications using it.
Django is the ultimate web framework, written in pure Python. I began learning Django over a year ago and have worked on numerous projects using the framework since. I've created web apps using a variety of different languages and none of them can compare to the Django framework (at least in my opinion).
PostgreSQL is the most powerful and feature packed open source database system available. PostgreSQL is used to store all the records and statistics that are part of the SmartJabber service. Each chat instance is logged, with full records and tracking of user action. Because of the amount of data we store, track and process every day the database needs to be powerful enough to handle the service load. PostgreSQL should handle this just fine.
Memcached is an ultra light caching daemon that stores its entire cache table in the server's RAM. Because of this, and the access method, results are stored, fetched and removed at lightning fast speed. Memcached is used to store the results of some of the heavy processing that occurs often. To keep server load down, we store the results of various computations for an extended period of time.
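To make the caching idea concrete, here is a minimal sketch of the "compute once, serve from cache" pattern. It is not the SmartJabber code: the InMemoryCache class is a hypothetical stand-in for a Memcached client so the example runs on its own, and the key name, timeout and heavy_stats() function are made up for illustration.

```python
import time


class InMemoryCache:
    """Tiny stand-in for a Memcached client (hypothetical, for illustration)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.time() >= expires_at:
            # Expired entries are dropped and treated as a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value, timeout):
        self._store[key] = (value, time.time() + timeout)


def get_or_compute(cache, key, compute, timeout=3600):
    """Return the cached value for key, computing and storing it on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, timeout)
    return value


calls = []


def heavy_stats():
    # Placeholder for an expensive computation (hypothetical).
    calls.append(1)
    return {"chats_today": 42}


cache = InMemoryCache()
first = get_or_compute(cache, "stats:today", heavy_stats)
second = get_or_compute(cache, "stats:today", heavy_stats)  # served from cache
```

The second call never reaches heavy_stats(), which is the whole point: the expensive work runs once per timeout window instead of once per request.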
Amazon AWS S3 is a "storage in the cloud" solution that provides super cheap data storage. The service cost is related to your direct usage. In other words, you only pay for what you actually use. Because of this policy, a lot of startups (and big companies too) use this service. We use S3 for the serving of all static media.
The actual chat window was created using Javascript combined with basic CSS usage. In other words, it's AJAX. Being that I am a Javascript weenie I hired out this aspect of the project, which was definitely a smart decision.
The entire service is run from 4 servers that handle an array of functions. While everything is designed to be highly scalable from the ground up, a lot of the performance gains come from the use of S3 and Memcached. We also made sure to tune Django as much as possible to squeeze out every last drop of performance. Because Django is already pretty fast, and because of our layout, the service is very fast, and if the cluster becomes a little loaded we can simply throw servers at the problem.
That's roughly the gist of it. Feel free to ask any specific questions you may have.
Technorati Tags: amazon s3, business, django, memcached, postgresql, python, smartjabber
Dealing With Business Emergencies
Friday, February 15, 2008 - 1:04 PM
Fun morning for me today. I use the Amazon S3 service for storage of various files, including the static media for most of my websites. This morning there was a major outage across the AWS farm. After noticing the errors I began to scramble to figure out what was going on. Obviously the issue was on the remote side (Amazon) so I had to sit and wait it out.
The outage seemed to last about 3 hours (for me; others are still reporting issues). While S3 is an awesome service and very affordable, this shows that nothing is fail proof. Even with this outage, their service still falls within their 99.9% SLA.
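For a rough sense of what 99.9% allows, here is a small back-of-the-envelope sketch. The measurement window is an assumption on my part (a 30-day month and a 365-day year are both shown); the actual SLA terms decide which one applies, so treat the result as arithmetic, not as a statement about Amazon's agreement.

```python
def downtime_budget_hours(uptime_percent, window_hours):
    """Hours of downtime allowed in a window at a given uptime percentage."""
    return window_hours * (100.0 - uptime_percent) / 100.0


HOURS_PER_30_DAY_MONTH = 30 * 24   # assumed window: one 30-day month
HOURS_PER_YEAR = 365 * 24          # assumed window: one non-leap year

monthly_budget = downtime_budget_hours(99.9, HOURS_PER_30_DAY_MONTH)
yearly_budget = downtime_budget_hours(99.9, HOURS_PER_YEAR)

outage_hours = 3.0  # the outage length reported above
within_monthly = outage_hours <= monthly_budget
within_yearly = outage_hours <= yearly_budget

print("Monthly budget: %.2f hours" % monthly_budget)
print("Yearly budget:  %.2f hours" % yearly_budget)
```

At 99.9% the budget works out to about 0.72 hours (roughly 43 minutes) over a 30-day month and about 8.76 hours over a year, so a 3-hour outage fits a yearly window but not a monthly one.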
These things happen
I've spent a lot of time on the other side of this fence: trying to figure out what is causing a major outage, dealing with pissed users and keeping the bosses away long enough to get my job done. I understand that "these things happen," but how do you convey that to a customer whose service is affected as a result?
It's also difficult to accept when you are losing money as a result of an outage such as this. Luckily I keep backups of all my sites so if I needed to I could upload the static content and change the templates to reflect that. But that is a lot of work and money will still be lost during the down time.
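One way to make that fallback less painful is to build every static URL through a single helper, so switching away from S3 is one flag instead of a template hunt. This is only a sketch of the idea: both base URLs and the media_url() helper are hypothetical, not something from my actual setup.

```python
# Hypothetical base URLs; substitute your own bucket and local media path.
S3_MEDIA_URL = "http://s3.amazonaws.com/your-bucket"
LOCAL_MEDIA_URL = "/media"


def media_url(path, use_fallback=False):
    """Build a static media URL, pointing at the local copy during an outage."""
    base = LOCAL_MEDIA_URL if use_fallback else S3_MEDIA_URL
    return base.rstrip("/") + "/" + path.lstrip("/")


normal = media_url("css/site.css")
during_outage = media_url("css/site.css", use_fallback=True)
```

The backups still have to be uploaded to the local path first, so this only removes the template editing, not the downtime.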
How to deal?
There really is no need for huffing and puffing in the beginning, especially since the service is still operating under their 99.9% uptime SLA. Once it begins operating outside of that SLA, the yelling may start. Obviously the situation will be different for everyone. There are a lot of startups who depend on Amazon AWS services right now.
You have to make a rational decision about when it is time to quit the service. Do you have a backup plan? If not, you should. If your history with the service in question is a very positive one, then forgive them for their "bad" and continue on. If they have a bad or unstable history, then look into an alternative and take your business elsewhere.
A lot of times switching companies/services is easier said than done. It will probably come down to which scenario loses more money: putting up with the bad service, or the man hours required to switch. In the long run, it's almost always smarter to dump the bad service.
In the case of Amazon, I've been very happy with their service and this outage today wasn't that big of an issue, though I'm sure many would disagree.
Technorati Tags: amazon s3, business, outage, service, startups, technology
Django, metaWeblog and Amazon S3
Wednesday, June 6, 2007 - 11:11 AM
This is a quick write up about adding support for Amazon S3 to your Django weblog. I added support for the metaWeblog API by using this write up from All Your Pixel. Most, if not all, blogging clients support the metaWeblog API so I think it was a good choice.
Why use Amazon S3?
- It's super cheap (it will probably cost you less than $1/mo)
- Performance is improved by moving static media away from your Django/Apache instance.
- Amazon worries about managing and scaling the storage back end.
First off, let's edit the settings.py that is in your Django project's directory. We will want to add your Amazon access information. Note: You can get your access information by signing up for S3 at the Amazon site. Add the following to your settings.py file:
AWS_ACCESS_KEY = 'Your Key'
AWS_SECRET_ACCESS_KEY = 'Your Secret Key'
BUCKET_NAME = 'Your Bucket Name'
AWS_S3_URL = 'http://s3.amazonaws.com'
So, I am going to assume you are using the xmlrpc.py and metaweblog.py that were provided in the All Your Pixel posting (linked above). Be sure you are importing the Django project settings file by using:
from django.conf import settings
This will also require the Python S3 module that is provided by Amazon. Get this file and add it to your project, or your Python path. Import that module, and the mimetypes module which is used to guess the type of the file being uploaded.
from yourproject import S3
import mimetypes
Now we will create, or edit, the metaWeblog_newMediaObject method. This is what is called when your blogging client attaches, or uploads, a media file to your blog post.
    import os  # needed for os.path.splitext below

    def metaWeblog_newMediaObject(user, blogid, struct):
        ret = {}
        # Keep the original extension, but store the file under a generated name.
        fext = os.path.splitext(struct['name'])[1].lower()
        fname = generate_fname() + fext
        try:
            conn = S3.AWSAuthConnection(settings.AWS_ACCESS_KEY,
                                        settings.AWS_SECRET_ACCESS_KEY)
            buckets = conn.list_all_my_buckets()
            if settings.BUCKET_NAME not in [b.name for b in buckets.entries]:
                # BUCKET_NAME doesn't exist, create it!
                res = conn.create_bucket(settings.BUCKET_NAME)
                if res.http_response.status != 200:
                    raise RuntimeError('Could not create bucket')
            filename = 'uploads/' + fname
            res = conn.get(settings.BUCKET_NAME, filename)
            while res.http_response.status == 200:
                # File exists, generate new filename
                fname = generate_fname() + fext
                filename = 'uploads/' + fname
                res = conn.get(settings.BUCKET_NAME, filename)
            content_type = mimetypes.guess_type(filename)[0]
            if not content_type:
                content_type = 'text/plain'
            res = conn.put(settings.BUCKET_NAME,
                           filename,
                           S3.S3Object(str(struct['bits'])),
                           {'x-amz-acl': 'public-read',
                            'Content-Type': content_type})
            if res.http_response.status == 200:
                ret['url'] = '%s/%s/%s' % (settings.AWS_S3_URL,
                                           settings.BUCKET_NAME,
                                           filename)
        except Exception:
            # On any failure, fall through and return an empty result.
            pass
        return ret
I should mention that generate_fname(), which is used above, is just a function that generates an MD5 hash to be used as the file name of the new file being added. It is not required, but you may want to use something similar. A quick rundown of what this code does:
1 - Generates a file name to use.
2 - Creates an S3 instance.
3 - Gets a list of all your S3 buckets.
4 - Checks to see that the bucket you want to use exists. If not, it creates it.
5 - Checks to see if the file already exists. If so, it generates a new file name.
6 - Guesses the file type.
7 - Uploads the file to S3.
8 - Returns the file URL to your blogging client.
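Since generate_fname() isn't shown above, here is a minimal sketch of what such a helper could look like, along with the file name and content type steps pulled out into small functions. The random-UUID input to the hash is my assumption for illustration (any unique input would do), and upload_key() and content_type_for() are hypothetical names, not part of the Amazon S3 module.

```python
import hashlib
import mimetypes
import os
import uuid


def generate_fname():
    """Return a 32-character hex MD5 digest of a random UUID."""
    return hashlib.md5(uuid.uuid4().bytes).hexdigest()


def upload_key(original_name):
    """Build an 'uploads/<hash><ext>' key, keeping the original extension."""
    ext = os.path.splitext(original_name)[1].lower()
    return 'uploads/' + generate_fname() + ext


def content_type_for(filename, default='text/plain'):
    """Guess the MIME type from the file name, with a plain-text fallback."""
    return mimetypes.guess_type(filename)[0] or default


key = upload_key('Photo.JPG')
ctype = content_type_for(key)
```

Because the name comes from a random value, collisions are very unlikely, but the "does this file already exist" check in step 5 is still a cheap safety net.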
That's it! There are a few other articles written about this. Check them out as well!
Technorati Tags: amazon s3, blog, development, django, geek, python
