How to create table in AWS Athena
If you are used to classical SQL, you might be confused with AWS Athena ways of adding data.
The classical INSERT INTO is not obvious in the cloud platform, but today you will learn to do it step by step using a CSV file!
First, let’s understand what this service provides.
What is AWS Athena?
According to the documentation, the AWS Athenas is an interactive query service that makes it easy to analyze data directly in Amazon Simple Storage Service (Amazon S3) using standard SQL.
Wait, Amazon Simple Storage Service (S3)?
Yes, the Amazon S3 is just like a Google drive to Amazon! It will store the data we want to manipulate later in Athena!
Step-by-step
I assume you are registered on Amazon. If not, please, click here to create your account.
1.Go to https://s3.console.aws.amazon.com/s3/
2. In our example, our database’s information will be uploaded as a .csv file in a new bucket. Thus, let’s create a new bucket in:
3. Choose the name of your folder (bucker), scroll down the page and click on ‘create bucket’
Ps.: there are several settings you can manage on the page, but we will keep it simple.
4. Open the folder you’ve just created, and you should see a page like:
5. As you’ve noticed, we have an empty folder to upload a CSV file. In our example, we will first add data from link.
Upload fraudTest.csv to your recently created bucked by clicking in Upload > Add Files. Select your file and finally click on Upload again.
6. If it is your first time in using AWS Athena, you have to create a new bucket to store the results of your queries. Therefore, create a new bucket named query-results-<yourname:
7. Now, go to https://sa-east-1.console.aws.amazon.com/athena/ to set up the AWS Athena (if it’s your first time) and to create your table.
If it is your first time, go to Settings > Manage:
In Manage, link in Browse S3 and locate the folder you’ve just created to store your queries:
8. Go back to the main Editor in the top menu to create the tabel. To make it easier, we can create the table by AWS wizard by clicking on S3 BUCKET DATA
9. Choose the name for our table at Table name
10. In Dataset, choose the folder you’ve create in the S3 bucket
11. If it is your first time, you can create a new database in:
12. In data format, choose CSV
13. In Column details, choose the name of the column and type
14. click on
15. The website will return to the query’ page. Then, you can check on the left side that table transactions was created.
16. Check if data is corrected by typing:
SELECT * FROM transactions limit 10
The result:
Notice that for some rows, the data is not corrected. This is due to the “ ” presented in the CSV file in strings with ‘,’. A preprocessing task can quickly fix it before the CSV upload or by creating the data ignoring the characters.