******* Caching ******* lazr.restfulclient automatically caches the responses to its requests in a temporary directory. >>> import httplib2 >>> httplib2.debuglevel = 1 >>> from lazr.restfulclient.tests.example import CookbookWebServiceClient >>> service_with_cache = CookbookWebServiceClient() send: 'GET /1.0/ ... reply: ...200... ... header: Content-Type: application/vnd.sun.wadl+xml ... send: 'GET /1.0/ ... reply: ...200... ... header: Content-Type: application/json ... >>> print service_with_cache.recipes[4].instructions send: 'GET /1.0/recipes/4 ... reply: ...200... ... Preheat oven to... The second and subsequent times you request some object, it's likely that lazr.restfulclient will make a conditional HTTP GET request instead of a normal request. The HTTP response code will be 304 instead of 200, and lazr.restfulclient will use the cached representation of the object. >>> print service_with_cache.recipes[4].instructions send: 'GET /1.0/recipes/4 ... reply: ...304... ... Preheat oven to... This is true even if you initially got the object as part of a collection. >>> recipes = service_with_cache.recipes[:10] send: ... reply: ...200... >>> first_recipe = recipes[0] >>> first_recipe.lp_refresh() send: ... reply: ...304... Note that if you get an object as part of a collection and then get it some other way, a conditional GET request will *not* be made. This is a shortcoming of the library. >>> service_with_cache.recipes[first_recipe.id] send: ... reply: ...200... The default lazr.restfulclient cache directory is a temporary directory that's deleted when the Python process ends. (If the process is killed, the directory will stick around in /tmp.) It's much more efficient to keep a cache directory across multiple uses of lazr.restfulclient. You can provide a cache directory name as argument when creating a Service object. This directory will fill up with cached HTTP responses, and since it's a directory you control it will persist across lazr.restfulclient sessions. >>> import tempfile >>> tempdir = tempfile.mkdtemp() >>> first_service = CookbookWebServiceClient(cache=tempdir) send: 'GET /1.0/ ... reply: ...200... ... send: 'GET /1.0/ ... reply: ...200... ... >>> print first_service.recipes[4].instructions send: 'GET /1.0/recipes/4 ... reply: ...200... ... Preheat oven to... This will save you a *lot* of time in subsequent sessions, because you'll be able to use cached versions of the initial (very expensive) documents. A new client will not re-request the service root at all. >>> second_service = CookbookWebServiceClient(cache=unicode(tempdir)) You'll also be able to make conditional requests for many resources and avoid transferring their full representations. >>> print second_service.recipes[4].instructions send: 'GET /1.0/recipes/4 ... reply: ...304... ... Preheat oven to... Of course, if you ever need to clear the cache directory, you'll have to do it yourself. Cleanup. >>> import shutil >>> shutil.rmtree(tempdir) Cache expiration ---------------- The '1.0' version of the example web service, which we've been using up til now, sets a long cache expiry time for the service root. That's why we were able to create a second client that didn't request the service root at all--just fetched the representations from its cache. The 'devel' version of the example web service sets a cache expiry time of two seconds. Let's see what that looks like on the client side. >>> tempdir = tempfile.mkdtemp() >>> first_service = CookbookWebServiceClient( ... cache=tempdir, version='devel') send: 'GET /devel/ ... reply: ...200... ... send: 'GET /devel/ ... reply: ...200... ... Now let's wait for three seconds to make sure the representations become stale. >>> from time import sleep >>> sleep(3) When the representations are stale, a new client makes *conditional* requests for the representations. If the conditions fail (as they do here), the cached representations are considered to have been refreshed, just as if the server had sent them again. >>> second_service = CookbookWebServiceClient( ... cache=tempdir, version='devel') send: 'GET /devel/ ... reply: ...304... ... send: 'GET /devel/ ... reply: ...304... ... Let's quickly create another client before the representation grows stale again. >>> second_service = CookbookWebServiceClient( ... cache=tempdir, version='devel') When the representations are not stale, a new client does not make any HTTP requests at all--it fetches representations direct from the cache. Cleanup. >>> httplib2.debuglevel = 0 >>> shutil.rmtree(tempdir) Cache filenames --------------- lazr.restfulclient caches HTTP repsonses in individual files named after the URL accessed. This is behavior derived from httplib2, but lazr.restfulclient does two things differently from httplib2. To see these two things, let's set up a client that uses a temporary directory as a cache file. The directory starts out empty. >>> from os import listdir >>> tempdir = tempfile.mkdtemp() >>> len(listdir(tempdir)) 0 As soon as we create a client object, though, lazr.restfulclient fetches a JSON and a WADL representation of the service root, and caches them individually. >>> service = CookbookWebServiceClient(cache=tempdir) >>> cache_contents = listdir(tempdir) >>> for file in sorted(cache_contents): ... print file cookbooks.dev...application,json... cookbooks.dev...vnd.sun.wadl+xml... This is the first difference between lazr.restfulclient's caching and httplib2's. httplib2 would store all requests for the service root in a filename based solely on the URL. This effectively limits httplib2 to a single representation of a given resource: the WADL representation would be overwritten with the JSON representation. lazr.restfulclient incorporates the media type in the cache filename, so that WADL and JSON representations are stored separately. The second difference has to do with filename length limits. httplib2 caps filenames at about 240 characters so that cache files can be stored on filesystems with 255-character filename length limits. For compatibility with eCryptfs filesystems, lazr.restfulclient goes further, and caps filenames at 143 characters. To test out the limit, let's create a cookbook with an incredibly long name. >>> long_name = ( ... "This cookbook name is amazingly long; so long that it will " ... "surely be truncated when it is incorporated into a file " ... "name for the cache. The cache file will contain a cached " ... "HTTP respone containing a JSON representation of of this " ... "cookbook, whose name, I repeat, is very long indeed.") >>> len(long_name) 281 >>> import datetime >>> date = datetime.datetime(1994, 1, 1) >>> book = service.cookbooks.create( ... name=long_name, cuisine="General", copyright_date=date, ... price=10.22, last_printing=date) lazr.restfulclient automatically fetched a JSON representation of the new cookbook, so it's already present in the cache. Because a cookbook's URL incorporates its name, and this cookbook's name is incredibly long, it must have been truncated to fit on disk. >>> [cookbook_cache_filename] = [file for file in listdir(tempdir) ... if 'amazingly' in file] Indeed, the filename has been truncated to fit in the rough 143-character safety limit for eCryptfs filesystems. >>> len(cookbook_cache_filename) 143 Despite the truncation, some of the useful information from the cookbook's name makes it into the filename, making it easy to find when manually crawling through the cache directory. >>> print cookbook_cache_filename cookbooks.dev...This%20cookbook%20name%20is%20amazingly%20long... To avoid conflicts caused by truncation, the filename always ends with an MD5 sum derived from the untruncated URL. Let's create a second cookbook whose name differs from the first cookbook only at the end. >>> longer_name = long_name + ": The Sequel" >>> book = service.cookbooks.create( ... name=longer_name, cuisine="General", copyright_date=date, ... price=10.22, last_printing=date) This cookbook's URL is identical to the first cookbook's URL for far longer than 143 characters. But since the truncated filename incorporates an MD5 sum based on the full URL, the two cookbooks are cached in separate files. >>> [file1, file2] = [file for file in listdir(tempdir) ... if 'amazingly' in file] The filenames are identical up to the last 32 characters, which is where the MD5 sum begins. But because the MD5 sums are different, they are not completely identical. >>> file1[:-32] == file2[:-32] True >>> file1 == file2 False Cleanup. >>> import shutil >>> shutil.rmtree(tempdir)