A newer addition to the services I use and recommend is Rackspace Cloud Servers and Rackspace Cloud Files.
We were evaluating cloud services to host client websites, and I ended up choosing Rackspace’s cloud offerings. I really like the services they provide.
With Cloud Files, I can upload files that can be accessed from anywhere. I decided to put our common scripts there, so that when we provision a new server, whether behind a firewall or in the cloud, we can pull from the same place. All I have to do is keep them up to date in one location.
Before I knew about Chef (a future project I can’t wait to have time for), I created simple scripts to install a common set of packages on every server – our SOE (Standard Operating Environment). Once a server is provisioned, from any other server, we can update the new server to have the same core set of packages and configurations. The most important part of this is that we install Git and pull down python-cloudfiles:
yum install git -y
git clone git://github.com/rackspace/python-cloudfiles.git
Once python-cloudfiles is installed, we use the following script to pull down the common set of scripts:
conn = cloudfiles.get_connection('username', 'keynumberthatisreallylong')
cont = conn.get_container(container)
obj = cont.get_objects(path=sourcepath)
for filename in obj:
    print "Downloading " + os.path.join("/", container, sourcepath, os.path.basename(filename.name)) + " to " + destpath
    destfile = os.path.join(destpath, os.path.basename(filename.name))
    filename.save_to_filename(destfile)
    # Trim last_modified (e.g. "2011-03-14T13:20:05.123456") down to
    # the CCYYMMDDhhmm format that `touch -t` expects
    timestamp = filename.last_modified[:filename.last_modified.find(".")-3].replace('-', '').replace(':', '').replace('T', '')
    cmd = "touch -m -t " + timestamp + " " + destfile
    os.system(cmd)
This pulls down each file in a directory in the Cloud Files infrastructure and saves it locally. I then added the extra step of setting the local modified date to the Cloud Files last_modified date, so we can tell which downloaded files have been changed recently (i.e., uploaded to Rackspace Cloud Files).
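As an aside, the shell-out to touch can be avoided entirely by setting the modification time with Python’s os.utime. Here is a small sketch of that step only; the function name is mine, it assumes last_modified is a UTC string like "2011-03-14T13:20:05.123456" (as the Cloud Files API returns), and it is written for modern Python 3 rather than the Python 2 of the scripts above:

```python
import calendar
import os
import time


def set_mtime_from_cloudfiles(destfile, last_modified):
    # Parse the date/time portion of last_modified (drop the fractional
    # seconds) and interpret it as UTC.
    parsed = time.strptime(last_modified.split(".")[0], "%Y-%m-%dT%H:%M:%S")
    epoch = calendar.timegm(parsed)  # timegm treats the struct_time as UTC
    # Set both access and modification times on the downloaded file,
    # replacing the `touch -m -t` shell-out.
    os.utime(destfile, (epoch, epoch))
```

Besides saving a process per file, this avoids any quoting problems when a filename contains spaces or shell metacharacters.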
I hope to replace this with Chef one day, but right now it works really well for us.
Thanks for the code. It appears that the Rackspace Cloud API has a limitation of 10,000 objects returned per call. I’ve made some modifications to overcome this limit and just want to share them back:
#!/usr/bin/python -u
import cloudfiles
import os
import sys

container = 'mediaContainer'
sourcepath = ''
last_file = None

print "Opening connection"
conn = cloudfiles.get_connection('X', 'Y')
print "Getting container: " + container
cont = conn.get_container(container)

objects = None
get_objects_call = 0
# The API returns at most 10,000 objects per call, so page through
# the listing using the marker parameter.
while 1:
    get_objects_call += 1
    print "Calling get objects. Callnum: " + str(get_objects_call) + ", marker: " + str(last_file)
    objects = cont.get_objects(path=sourcepath, marker=last_file)
    found_objects = len(objects)
    print "Found objects: " + str(found_objects)
    for filename in objects:
        last_file = filename.name
        if os.path.exists(filename.name):
            # print "Skipping: " + filename.name
            continue
        print "Downloading " + filename.name
        try:
            filename.save_to_filename(filename.name)
            timestamp = filename.last_modified[:filename.last_modified.find(".")-3].replace('-', '').replace(':', '').replace('T', '')
            cmd = "touch -m -t " + timestamp + " " + filename.name
            os.system(cmd)
        except:
            # On any failure (including Ctrl-C), remove the partial file and bail out
            print "Removing " + filename.name
            os.remove(filename.name)
            sys.exit(1)
    # A short page means we have reached the end of the listing
    if found_objects < 10000:
        break
Thanks! I did not know that. I haven’t had to deal with this code in a while or with that many objects. Thanks for taking the time to comment.