Athena is used to run read-only SQL queries on files stored in S3.
It is very useful for analysing logs: AWS services write their logs to S3, so Athena really fits the purpose.
There are 3 steps:
- create a database or use one already created
- create a table that reads its data from files in S3
- run the query
The syntax to create a table differs from service to service; the statements for each kind of AWS service log are here: http://docs.aws.amazon.com/athena/latest/ug/querying-AWS-service-logs.html
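A rough sketch of the three steps, assuming a hypothetical tab-separated file with exactly two columns stored under s3://my-example-bucket/logs/. The database, table, column and bucket names are made up for illustration; real service logs need the full column list from the link above. As far as I know Athena accepts one statement per query, so run each one separately.

```sql
-- Step 1: create a database (hypothetical name), or reuse an existing one
CREATE DATABASE IF NOT EXISTS mylogs;

-- Step 2: create a table over the files in S3.
-- Nothing is copied: the table only describes how to read the files.
-- Assumption: each line is "requestip<TAB>status" and nothing else.
CREATE EXTERNAL TABLE IF NOT EXISTS mylogs.example_logs (
  requestip STRING,
  status INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://my-example-bucket/logs/';

-- Step 3: run a query, here the number of requests per IP
SELECT requestip, count(*) AS hits
FROM mylogs.example_logs
GROUP BY requestip;
```

With delimited files the columns are matched by position, so the column list has to follow the order of the fields in the file.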
Every query produces a result table that is saved in an S3 bucket in your account. The bucket is generated automatically, but you can change the default one.
There is a very nice tutorial on Linux Academy nuggets: https://linuxacademy.com/linux/training/nugget/name/aws-athena
Some queries I have used on CloudFront logs:
list the ips
SELECT requestip FROM "cloudfronts3bucket"."cloudfront_logs" group by requestip;
list all the available SSL protocols
SELECT sslprotocol FROM "cloudfronts3bucket"."cloudfront_logs" group by sslprotocol;
extract all the ips that use a certain protocol
SELECT requestip FROM "cloudfronts3bucket"."cloudfront_logs" WHERE sslprotocol='TLSv1' group by requestip;
a query that I copied: it returns the list of operating systems and how many requests came from each one
SELECT os, count(*) FROM cloudfront_logs WHERE date BETWEEN date '2015-07-05' AND date '2016-07-05' GROUP BY os;