Importing data to DynamoDB from S3 (using AWS Data Pipeline)

You will first need an S3 location, say a directory 'X'. The directory 'X' from which the import will happen should contain the files below:

a. manifest
b. your-file-here.txt (the one containing the actual data)

your-file-here.txt will contain the data in JSON format, one object per line.

Go to DynamoDB and select your table by clicking on it. Under 'Actions', hit 'Import data'. Create a pipeline and activate it, but before activating, consider the learnings below about your data.

Learnings when importing data to DynamoDB (from an S3 file, using Data Pipeline):

1. Replace \ with \\
2. No field value should be empty.
3. Each line should independently be a valid JSON object. No line should end in a comma (see the per-line validation sketch after this list).
4. The file should be JSON-verified using the bash command: cat <file-name> | python -m json.tool. Note that the file may need to be converted to a full JSON object first, by appending the lines together into a single comma-separated JSON array wrapped in square brackets (a conversion sketch follows the validation sketch below).
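As a rough illustration of points 1 to 3, here is a small Python sketch that checks such a file line by line before the import. It is not from the original post; the file name and the exact checks are assumptions based on the learnings above (an invalid backslash escape will simply surface as a JSON parse error).

import json
import sys

def validate_lines(path):
    # Check that every line is, on its own, a valid non-empty JSON object.
    ok = True
    with open(path, "r", encoding="utf-8") as f:
        for lineno, raw in enumerate(f, start=1):
            line = raw.strip()
            if not line:
                continue  # ignore blank lines
            if line.endswith(","):
                print("line %d: should not end in a comma" % lineno)
                ok = False
                continue
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as err:
                print("line %d: not valid JSON (%s)" % (lineno, err))
                ok = False
                continue
            if not isinstance(obj, dict):
                print("line %d: not a JSON object" % lineno)
                ok = False
                continue
            for key, value in obj.items():
                if value == "" or value is None:
                    print("line %d: field '%s' is empty" % (lineno, key))
                    ok = False
    return ok

if __name__ == "__main__":
    # usage: python validate_import_file.py your-file-here.txt
    sys.exit(0 if validate_lines(sys.argv[1]) else 1)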
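For the whole-file check in point 4, a line-delimited file has to be turned into a single JSON document before python -m json.tool will accept it. A minimal sketch of that conversion, with assumed file names:

import json

# Read one JSON object per line and re-emit them as a single JSON array,
# which `python -m json.tool` (or any JSON parser) can validate in one pass.
with open("your-file-here.txt", "r", encoding="utf-8") as src:
    items = [json.loads(line) for line in src if line.strip()]

with open("your-file-combined.json", "w", encoding="utf-8") as dst:
    json.dump(items, dst, indent=2)

After that, cat your-file-combined.json | python -m json.tool should run cleanly if the data is well formed.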