I recently had the opportunity to implement a DynamoDB backend for K.E.V. I had been wanting to get more hands-on with AWS, and this project was the perfect chance. This post covers the main things I learned over the course of a week playing around with the two tools.
A quick primer for those unfamiliar with DynamoDB: it is a fast, highly flexible, and scalable NoSQL service from Amazon. Unlike S3, an object storage service used for storing large objects, DynamoDB is great for storing small documents and other metadata in a key/value design. For the even fewer individuals unfamiliar with the GitHub project K.E.V., it is an Object-Relational Mapping (ORM) tool designed to provide a single interface for storing key/value data and documents in a number of backends (currently supported backends include S3, Redis, S3/Redis, and now DynamoDB). It eliminates the need for the user to know or care how or where the data is stored. Aside from backend-specific attributes, the interface to store, retrieve, and delete data is the same, which makes switching backends almost effortless.
Developing with KEV, I was able to use a number of old and new tools.
- Sublime Text 3 – used as my primary IDE
- vagrant – Quick and easy setup of a development environment
- nosetests – python unit testing!
- boto3 – the AWS-recommended high-level API for interacting with Amazon services. When working with boto3 you’ll need to visit the documentation; I spent many hours poring over examples and samples.
Install required libraries:
boto3, coverage, envs, nose, and redis
Set environment variables:
Store AWS credentials:
Log in to your AWS account and create a table:
Start the Redis server:
That’s it! You should be ready to start running unit tests within KEV.
I needed to figure out which interfaces to expose for DynamoDB and then, given that simple interface, how to implement the necessary functionality. Since other backends were already implemented, I was able to base my design on those.
- save – store an item in the table
- get – retrieve a single item from the table
- flush_db – delete all items within the table
- delete – delete a single item from the table
- all – retrieve all items in the table
- evaluate – internal method providing filtering on results before they are returned to the user
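To make the list above concrete, here is a minimal sketch of how those methods might map onto a DynamoDB table. It is not KEV's actual code: the class name, the method signatures, and the `_id` key attribute are all assumptions for illustration. The `table` argument is expected to behave like a boto3 DynamoDB Table resource (`put_item`, `get_item`, `delete_item`, `scan`).

```python
class DynamoBackendSketch:
    """Hypothetical backend; method names follow the list above, signatures are assumed."""

    def __init__(self, table, key_name="_id"):
        # `table` should behave like a boto3 DynamoDB Table resource.
        # `key_name` is an assumed primary key attribute name.
        self.table = table
        self.key_name = key_name

    def save(self, item):
        # Store (or overwrite) a single item.
        self.table.put_item(Item=item)

    def get(self, key):
        # get_item omits "Item" from the response when nothing matches.
        response = self.table.get_item(Key={self.key_name: key})
        return response.get("Item")

    def delete(self, key):
        self.table.delete_item(Key={self.key_name: key})

    def all(self):
        # Simplified: a single scan call only returns one page of results,
        # so a real implementation has to keep scanning until it is done.
        return self.table.scan().get("Items", [])

    def flush_db(self):
        # Delete every item that all() returns, one at a time.
        for item in self.all():
            self.delete(item[self.key_name])

    def evaluate(self, items, predicate):
        # Filter results in Python before handing them back to the user.
        return [item for item in items if predicate(item)]
```

Because the class only depends on the four table methods, it can be exercised in unit tests with a small in-memory stand-in instead of a real table.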
Lessons & Limitations
Over the course of the week, I learned numerous lessons on setting up and managing a DynamoDB table:
- Primary Keys Matter!
When creating a new table, the first thing you need to decide on is a proper primary key. After a bit of searching, I came to realize that a fair amount of effort needs to go into choosing one. You can look over the guidelines intro in the documentation. You need a good understanding of your expected workload before settling on a primary key. Just starting out, I was quickly out of my depth reading through the documentation. Thankfully, I only needed to be concerned with small sample datasets, so the primary key wasn’t vitally important.
- Don’t trust the returned results of scan() or query()
The results returned by a scan or query might not contain all of the data you expect. One reason is that DynamoDB reads are eventually consistent by default: the table will become consistent, but at any given moment a read may return stale data, so a recent put or update might not be reflected in a scan. You can use the ConsistentRead option to request a strongly consistent read; however, it uses twice as many read units, a costly option.
Another issue is a limitation of DynamoDB itself. A single Scan call returns at most 1 MB of data. After receiving each response, you should check for LastEvaluatedKey; if it is present, more items remain and the scan has to be continued from that key.
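Both points can be handled in one small helper. The sketch below is an illustration rather than KEV's implementation: `scan_all` is a hypothetical name, and `table` is assumed to behave like a boto3 DynamoDB Table resource. It keeps calling `scan`, passing the previous `LastEvaluatedKey` back as `ExclusiveStartKey`, until no key is returned, and it can optionally request consistent reads.

```python
def scan_all(table, consistent=False):
    """Yield every item in the table, following LastEvaluatedKey across pages."""
    # ConsistentRead=True asks for strongly consistent reads (at a higher read cost).
    kwargs = {"ConsistentRead": consistent}
    while True:
        response = table.scan(**kwargs)
        for item in response.get("Items", []):
            yield item
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            # No key means this was the final page.
            break
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Written as a generator, the helper lets a caller stop early without fetching the remaining pages.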
- Advanced filtering not available out of the box
DynamoDB has a number of built-in conditionals that can be used when querying or scanning, reducing the number of results that need to be returned. When comparing strings, however, I found I was unable to perform more advanced filtering, in this case case-insensitive filtering: string comparisons are always case-sensitive.
One solution is to duplicate each text field, effectively keeping a displayField and a searchField. The search field could be all lower case, enabling normal filtering but potentially duplicating a large amount of data. The other solution would be to set up Elasticsearch. I’ve heard Elasticsearch come up often, and I hope to get a chance to play around with it.
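The duplicated-field idea might look roughly like this. The helper names and the `_search` suffix are made up for illustration. The first function would run before saving an item; the second shows the comparison in plain Python, which stands in for a DynamoDB filter on the lowercased attribute.

```python
def add_search_field(item, field):
    """Return a copy of item with a lowercased duplicate of `field`."""
    enriched = dict(item)
    # The original value stays untouched for display purposes.
    enriched[field + "_search"] = item[field].lower()
    return enriched


def matches(item, field, term):
    """Case-insensitive substring match against the duplicated field."""
    # Lowercase the search term too, so both sides use the same casing.
    return term.lower() in item[field + "_search"]
```

The cost is the one noted above: every searchable text attribute is stored twice.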
- Not everything Floats
Storing a float in a table using boto3 v1.4.4 results in the error below:
‘Float types are not supported. Use Decimal types instead.’)
TypeError: Float types are not supported. Use Decimal types instead.
There are a number of posts and issues about workarounds for storing float types. I tried converting the floats to Decimal; however, I hit an issue where Decimal isn’t serializable to JSON. The solution there was to cast it back to a float, which returned me to my original problem.
I ultimately solved the problem by converting the float into a string before saving.
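A rough sketch of that conversion, assuming items are plain dicts and lists (the function name is hypothetical, and the actual KEV code may differ): walk the item and replace every float with its string form before handing it to boto3.

```python
def floats_to_strings(value):
    """Recursively replace floats with strings so the item can be saved."""
    if isinstance(value, float):
        return str(value)
    if isinstance(value, dict):
        return {key: floats_to_strings(val) for key, val in value.items()}
    if isinstance(value, list):
        return [floats_to_strings(val) for val in value]
    # Strings, ints, booleans, and None pass through unchanged.
    return value
```

The trade-off is that the values come back as strings, so anything that needs the number again has to call `float()` on read.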
DynamoDB has advanced features for building scalable solutions, while offering an interface where even a novice can quickly get a table set up and populated with data. There are numerous applications for an easy-to-set-up, near-zero-maintenance, scalable, highly available NoSQL database. New applications or mobile apps could leverage these benefits of DynamoDB for their data.