Wednesday, January 25, 2017

AWS Athena to read Dynamodb backup data

Our data scientists wanted to read some data from Dynamodb. In order to do this, we had to increase the Read Capacity of the table to a high value and it was cost consuming. So we architected a system using Athena where data scientist could run their query whenever they want.

We used AWS datapipeline template to backup the dynamodb table. So, we had our data archived into S3 in multiple json formatted files. Here Athena came into picture. Athena is a new tool created by AWS which runs presto underneath. So we created a table in Athena using the Dynamodb backup location and started exploring data:

CREATE EXTERNAL TABLE IF NOT EXISTS default.test ( key string, sortkey string, data string)
WITH SERDEPROPERTIES ('ignore.malformed.json' = 'true')
LOCATION 's3://bucket-name/DynamodbDBBackup/2016-12-23-20-28-20/'

It created a table in Athena. It's time to run query.

select * from default.test limit 3;
Sample Data:


Now we had to extract these data in such a way that we would get only values for key and sortkey and from details attribute we had to extract only usage.

As I said, Athena is a service for presto, we used presto function to extract data from json. There is a function is presto described at presto functions:

json_path expression can use the dot-notation and it takes $ as root path. More details are available at JsonPath

Using the above formula, we wrote our query:

SELECT cast(json_extract(key,'$.s') as varchar) as key ,
cast(json_extract(sortkey,'$.s') as varchar) as sortkey
, json_extract(replace(cast(json_extract(details,'$.s') as varchar),'""','"'),'$.usage') as usage
FROM test
limit 10;​

And that gave us the data as:


It was really fun to use Athena.

8 Responses so far

  1. Navyasri says:

    hello sir,
    thanks for giving that type of information. I am really happy to visit your blog.Leading Solar company in Andhra Pradesh

  2. PMP Certification with placement was never so easy and adaptable to everyone but here at ExcelR We teach you numerous ways of doing PMP Courses, which are way easy and interesting. Our experienced and expert faculty will help you reach your goal. 100% result oriented strategies are being performed; we offer PMP Certification Course in pune

    PMP Certification

  3. I Want to leave a little comment to support and wish you the best of luck.we wish you the best of luck in all your blogging endeavors.
    Business Analytics Course in Bangalore

  4. I want to leave a little comment to support and wish you the best of luck.we wish you the best of luck in all your blogging enedevors.
    Data Analytics Courses in Bangalore

  5. wonderful article. Very interesting to read this article. I would like to thank you for the efforts you had made for writing this awesome article. This article resolved my all queries. Mason Jacob

  6. This is really very nice post you shared, i like the post, thanks for sharing..
    data science training in malaysia

Leave a Reply