Amazon SimpleDB

This page explains a simple way to load data from Amazon SimpleDB into Resolver One.

Before you get started, you should know that SimpleDB is currently in Limited Beta - before you put any time and effort into trying to work with it, you should sign up with Amazon (if you don't already have an AWS account), join the beta, and wait until they give you access.

Contents

About Amazon SimpleDB

Amazon SimpleDB is a Web Service provided by Amazon which, just as you would expect, lets you store data on their servers like you would with a database. For full details about the service and how to use it, you should look at their website. However, here's the short version.

Amazon SimpleDB is a cut-down database system, and as such it structures data quite differently to a normal relational database - in a manner that is actually quite similar to a spreadsheet. Each database is structured into "domains", which can be regarded as similar to worksheets. A query to the database can only access a single domain (unlike relational database queries, which can "join" multiple tables). However, the domain is quite freely structured. It is made up of a number of items (which you can see as being similar to rows in the worksheet), each item having a number of attributes (which you can see as being similar to columns). There is no requirement for each row to have the same set of attributes - in relational database terms, it is highly denormalised.

In Amazon's view, this kind of simpler and less structured kind of data store "is easy to use and provides the core functionality of a database - real-time lookup and simple querying of structured data - without the operational complexity."

Getting the client-side libraries

While you could, in theory, write your own web service access code to talk to the Amazon AWS systems directly, it is much easier to use something someone else has already written. Amazon have provided a C# library for accessing Amazon SimpleDB, so this page describes how to use it as a .NET library from within Resolver One. (An independent group are working on a Python library, which we will be looking at later.)

Here's a step-by-step guide to creating the DLL that you need:

  • If you don't have a version of Visual C# installed, you should get the free "Express Edition" from Microsoft's website.
  • Next, you can download the Amazon SimpleDB C# library from Amazon's website.
  • Uncompress the C# library, and from Visual C# open the sample project amazon-simpledb-2007-11-07-cs-library\src\Amazon.SimpleDB.sln.
  • Compile it (hit F6 or select "Build solution" from the "Build" menu).
  • Take the file Amazon.SimpleDB.dll from amazon-simpledb-2007-11-07-cs-library\src\Amazon.SimpleDB\bin\Release and put it somewhere safe where you can use it later. This DLL file contains the client access code for SimpleDB.

Putting data into your SimpleDB account

Before you try to load data from Amazon SimpleDB into your worksheet, you will probably want to add some data :-) We recommend you go through Amazon's Getting Started Guide, but skip the step "Deleting a Domain" at the end - this will leave you with a domain called MyStore that holds some basic sample data. Alternatively, you can use IronPython - in the IronPython Cookbook there is a recipe for using Amazon SimpleDB that was adapted from this tutorial but also tells you how to load data.

Reading the data from within a Resolver One spreadsheet

Here is a sample RSL file that allows you to load up the contents of an AWS domain. To use it:

  • Save it on your system.
  • Copy the .NET library to the same location.
  • Load it up, and enter your AWS access key and secret keys into the appropriate fields in the "Settings" worksheet. If you want to access a domain other than the "MyStore" one created by the AWS getting started guide, you can change that there too.
  • Once the recalculation has completed, switch to the "Data" worksheet. You should see all of the data from your domain. The Header Rows and Columns will have been set up appropriately so that you can use them to query the data from your pre- or post-formula user code, or from your formulae.

Here's a step-by-step explanation of the pre-formulae user code that does the work.

Firstly, we load up the .NET library we created earlier:

import clr
clr.AddReference("Amazon.SimpleDB")

from Amazon.SimpleDB import AmazonSimpleDBClient
from Amazon.SimpleDB.Model import GetAttributes, Query

Next, we connect to the web service using the keys that were entered in the "Settings" worksheet (using references to named ranges):

accessKeyId = accessKey.Value
secretAccessKey = secretKey.Value
service = AmazonSimpleDBClient(accessKeyId, secretAccessKey)

Next, we set up the "Data" worksheet, giving it an appropriate header row and column

sheet = workbook["Data"]
itemHeader = "<Item>"
rowNum = 1
colNum = 1
sheet.HeaderRow = sheet.Rows[rowNum]
sheet.HeaderCol = sheet.Rows[colNum]
sheet.HeaderRow[colNum] = itemHeader
sheet.HeaderRow.Bold = True

Next, we make a query to Amazon SimpleDB, asking for all items in the specified domain:

queryAction = Query().WithDomainName(domain.Value)
response = service.Query(queryAction)

If the request returns any items, then we iterate over them:

if response.IsSetQueryResult():
    for itemName in response.QueryResult.ItemName:

For each item, we add it to a new row in our sheet, put its name into the special "<Item>" column, then get all of its attributes:

        rowNum += 1
        row = sheet.Rows[rowNum]
        row[itemHeader] = itemName
        getAttributesAction = GetAttributes().WithDomainName("MyStore").WithItemName(itemName)
        response = service.GetAttributes(getAttributesAction)     

If the item has any attributes, we iterate over the attribute/value pairs that have been returned:

        if response.IsSetGetAttributesResult():
            for attribute in response.GetAttributesResult.Attribute:

The next step is to put the attributes into appropriate cells in the item's row. this is a little more complicated than the code so far. Now, ideally we would be able to put each attribute in a column of its own, each column using the attribute name as its header. However, items in Amazon SimpleDB domains can actually have several values for the same attribute. For example, in the "Getting started" tutorial, one item in the MyStore domain is a sweater, which has three size attributes: Large, Medium and Small.

The solution we use here is that if an item has just one attribute with a particular name, then we simply put the value into the cell with the header. If it has several attributes with the same name, then the cell contains a list of values.

Our next lines of code do this by checking whether the current item already has an attribute with the name of the current attribute. If it does not, then we just putsthe current value into the appropriate cell. If it does have an attribute with the current name, then we change the cell to use a list (if that has not already been done) and then add the current value to the list.

                try:
                    if row[attribute.Name] is Empty:
                        row[attribute.Name] = attribute.Value
                    else:
                        if type(row[attribute.Name]) != list:
                            row[attribute.Name] = [row[attribute.Name]]
                        row[attribute.Name].append(attribute.Value)
                except IndexError:
                    colNum += 1
                    sheet.HeaderRow[colNum] = attribute.Name
                    row[attribute.Name] = attribute.Value

The try...except that wraps around this is simply to handle the case where we encounter an attribute whose name we have not seen before - we create a new column for it, and then put the value into the appropriate cell.

Summary

In this tutorial we've show how you can get data from Amazon SimpleDB into Resolver One. If there's anything else we could usefully cover, please do let us know!


Comments

If you have comments, questions or suggestions about any of the Resolver One documentation, please post them to the Documentation Suggestions Forum.

Retrieved from "http://www.resolversystems.com/documentation/index.php/Amazon_SimpleDB"

This page has been accessed 621 times. This page was last modified 15:30, 25 June 2008.

This page: Printable version | View source | Discuss this page | Page history | What links here | Related changes


This Wiki: Special pages | Disclaimers | Privacy policy | Recent changes | Log in / create account

Powered by MediaWiki
Theme designed by SisleyStudio