DynamoDB and K.E.V.

I recently had the opportunity to implement a DynamoDB backend for K.E.V. I had been wanting to get more hands-on with AWS, and this project was the perfect chance. This post covers the main things I learned over the course of a week working with the two tools.

1 Background

A quick primer for those unfamiliar with DynamoDB: it is a fast, flexible, and scalable managed NoSQL service from Amazon. Unlike S3, an object storage service suited to storing large objects, DynamoDB is great for storing small documents and other metadata in a key/value design. For the even fewer individuals unfamiliar with the GitHub project K.E.V., it is an Object-Relational Mapping (ORM) tool designed to provide a single interface for storing key/value data and documents across several storage backends (currently S3, Redis, S3/Redis, and now DynamoDB). The user does not need to know or care how or where the data is stored. Aside from backend-specific attributes, the interface for storing, retrieving, and deleting data is the same, which makes switching backends almost effortless.

2 Development

2.1 Tools

Developing with KEV, I was able to use a number of old and new tools.

Existing Tools:

  1. Sublime Text 3 – used as my primary IDE
  2. vagrant – Quick and easy setup of a development environment
  3. nosetests – Python unit testing!

New Tools:

  1. boto3 – AWS-recommended high-level API for interacting with Amazon services. When working with boto3, you’ll need to keep the documentation close at hand; I spent many hours poring over examples and samples.

 

2.2 Setup

Install required libraries:
boto3, coverage, envs, nose, and redis

Set environment variables:
export AWS_DEFAULT_REGION=us-east-1
export REDIS_HOST_TEST=localhost
export REDIS_HOST_PORT=6379

Store AWS credentials:
aws configure

Log in to your AWS account and create a table.
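
If you would rather script this step than click through the console, a table can also be created with boto3. This is only a minimal sketch under assumptions: the table name `kev-test`, the string hash key `_id`, and the throughput values are hypothetical placeholders, not values K.E.V. requires.

```python
def table_definition(table_name, key_name="_id"):
    """Build the keyword arguments for DynamoDB's create_table call.

    The key name and throughput numbers are illustrative assumptions.
    """
    return {
        "TableName": table_name,
        # A single string attribute used as the partition (hash) key.
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        "AttributeDefinitions": [
            {"AttributeName": key_name, "AttributeType": "S"}
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    }


def create_table(dynamodb, table_name, key_name="_id"):
    """Create the table and block until DynamoDB reports that it exists.

    `dynamodb` is expected to be a boto3 service resource,
    e.g. boto3.resource("dynamodb").
    """
    table = dynamodb.create_table(**table_definition(table_name, key_name))
    table.wait_until_exists()
    return table
```

Calling `create_table(boto3.resource("dynamodb"), "kev-test")` would pick up the region and credentials configured in the earlier steps.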

Start redis server:
redis-server

That’s it! You should be ready to start running unit tests within KEV.

2.3 Design

I first needed to decide which interfaces to expose for DynamoDB, and then work out how to implement the necessary functionality behind that simple interface. Since other backends were already implemented, I was able to base my design on those.

  • save – store an item in the table
  • get – retrieve a single item from the table
  • flush_db – delete all items within the table
  • delete – delete a single item from the table
  • all – retrieve all items in the table
  • evaluate – internal method providing filtering on results before they are returned to the user

2.4 Lessons & Limitations

Over the course of the week, I learned numerous lessons on setting up and managing a DynamoDB table:

  • Primary Keys Matter!

When creating a new table, the first thing you need to decide on is a primary key. After a bit of searching, I realized that a fair amount of effort needs to go into choosing a good one. You can look over the guidelines intro. You need a good understanding of your expected workload before settling on a primary key. Just starting out, I was quickly out of my depth reading through the documentation. Thankfully, I only needed to be concerned with small sample datasets, so the primary key wasn’t vitally important.

  • Don’t trust the returned results of scan() or query()

The results returned by a scan or query might not contain all of the data you expect. One issue comes from DynamoDB's default read behavior: reads are eventually consistent, so the data will converge, but at any given moment a recent put or update might not yet be reflected in a scan. You can set the ConsistentRead option to request a strongly consistent read; however, it consumes twice as many read capacity units, which makes it a costly option.

Another issue is a limitation of DynamoDB itself. A single Scan (or Query) call returns at most 1 MB of data. After receiving the first response, check it for a LastEvaluatedKey; if one is present, items remain, and you pass that key as ExclusiveStartKey in the next request to continue.
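
Both caveats can be handled in one small helper. The following is a sketch, assuming `table` is a boto3 Table resource; the function name `scan_all` is my own and not part of boto3 or K.E.V.

```python
def scan_all(table, consistent_read=False):
    """Yield every item in the table, one Scan page at a time.

    Set consistent_read=True to request strongly consistent reads,
    at the cost of extra read capacity.
    """
    kwargs = {"ConsistentRead": consistent_read}
    while True:
        response = table.scan(**kwargs)
        for item in response.get("Items", []):
            yield item
        # LastEvaluatedKey is only present when more pages remain.
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            break
        # Resume the next Scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Because it is a generator, `for item in scan_all(table): ...` only fetches the next page once the current one has been consumed.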

  • Advanced filtering not available out of the box

DynamoDB has a number of built-in conditions that can be used when querying or scanning to reduce the number of results returned. When comparing strings, however, I found I was unable to perform more advanced filtering, in this case case-insensitive matching: string filtering is always case-sensitive.

There are two solutions. One is to duplicate each text field, effectively having a displayField and a searchField. The search field could be stored in all lower case, enabling normal filtering at the cost of potentially duplicating a large amount of data. The other is to set up Elasticsearch. I’ve heard Elasticsearch come up often, and I hope to get a chance to play around with it.
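
The duplicated-field idea could look something like the sketch below. The `_search` suffix and both function names are assumptions made for illustration; `Attr(...).contains(...)` is boto3's condition builder, imported lazily so the first helper works without boto3 installed.

```python
def add_search_fields(item, fields):
    """Return a copy of `item` with a lower-cased twin of each text field.

    For a field "name", the twin is stored as "name_search"
    (a hypothetical naming convention).
    """
    indexed = dict(item)
    for field in fields:
        value = item.get(field)
        # Only strings get a search twin; other types are left alone.
        if isinstance(value, str):
            indexed[field + "_search"] = value.lower()
    return indexed


def search_filter(field, term):
    """Build a case-insensitive 'contains' filter against the twin field."""
    from boto3.dynamodb.conditions import Attr

    # Lower-casing the term matches the lower-cased stored copy.
    return Attr(field + "_search").contains(term.lower())
```

An item would be passed through `add_search_fields` before saving, and a scan could then use `FilterExpression=search_filter("name", "SMITH")` to match regardless of case.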

  •  Not everything Floats

Storing a float in a table using boto3 v1.4.4 results in the error below:

TypeError: Float types are not supported. Use Decimal types instead.

There are a number of posts and issues about workarounds for storing float types. I tried converting the floats to Decimal; however, I hit an issue where Decimal isn’t serializable to JSON. The fix for that was to cast it back to a float, which returned me to my original problem.

I ultimately solved the problem by converting the float into a string before saving.
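
For comparison, here is an alternative I did not use: keep floats everywhere in the application and convert only at the two boundaries, to Decimal just before writing to DynamoDB and back to float only when serializing to JSON. This is a sketch with hypothetical helper names, and it may not fit how K.E.V. handles serialization internally.

```python
import json
from decimal import Decimal


def floats_to_decimal(value):
    """Recursively replace floats with Decimals before a DynamoDB write."""
    if isinstance(value, float):
        # Going through str() avoids binary floating-point noise,
        # e.g. Decimal(str(1.1)) is Decimal("1.1").
        return Decimal(str(value))
    if isinstance(value, dict):
        return {k: floats_to_decimal(v) for k, v in value.items()}
    if isinstance(value, list):
        return [floats_to_decimal(v) for v in value]
    return value


def decimal_default(obj):
    """json.dumps hook that serializes Decimals as floats."""
    if isinstance(obj, Decimal):
        return float(obj)
    raise TypeError("Object of type %s is not JSON serializable"
                    % type(obj).__name__)


item = floats_to_decimal({"_id": "a", "price": 1.1, "tags": [0.5]})
as_json = json.dumps(item, default=decimal_default)
```

The JSON hook is needed on the read side as well, since boto3's resource interface returns DynamoDB numbers as Decimal.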

 

3 Summary

DynamoDB has advanced features for building scalable solutions, while providing an interface where even a novice can quickly get a table set up and populated with data. There are numerous applications for an easy-to-set-up, near-zero-maintenance, scalable, highly available NoSQL database. New applications or mobile apps could leverage these benefits for their data.
