I have been running my Forensic Artifact API on Ubuntu with an Nginx, Flask (Python), and MariaDB stack. I wanted to get out of the infrastructure administration business by moving to the AWS Cloud, so I decided to start by migrating my SHA256 hash library. My goals were to improve availability, allow collaboration, and keep costs down. I wound up having an expensive learning experience while importing the data into DynamoDB!

I decided to use the Amazon Web Services (AWS) Boto3 SDK for Python so that an EC2 instance could read from an S3 bucket and insert into a DynamoDB table. I was able to read the line-delimited text file of SHA256 hashes as a stream, which minimized the memory Python needed on the EC2 instance. A batch write into a DynamoDB table can contain at most twenty-five items. I set the batch size with ‘range’ in the for loop, and it must match the minimum provisioned capacity for auto-scaling startup. Global tables used to replicate DynamoDB across regions also need to match ‘range’ until the first auto-scale completes.
import boto3

def import_hash(hashlist, hashtype, hashsrc, hashdesc):
    # Open the S3 object as a stream so the whole file is never held in memory.
    client = boto3.client('s3')
    obj = client.get_object(Bucket='bucketname', Key=hashlist)
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('sha256')
    while True:
        # Each pass writes at most 25 items, the batch write maximum.
        with table.batch_writer() as batch:
            for i in range(25):
                # readline() returns b'' at end of file; rstrip drops the CRLF or LF ending.
                item = obj['Body']._raw_stream.readline().decode('utf-8').rstrip('\r\n')
                if not item:
                    break
                batch.put_item(Item={'sha256': item.upper(), 'type': hashtype, 'source': hashsrc, 'desc': hashdesc})
        # Stop once the stream is exhausted (or a blank line is reached).
        if not item:
            break

import_hash('Folder/File.txt', 'Known', 'HashSets.com', 'Windows')
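The batching step on its own can be sketched without any AWS calls, which makes it easy to try locally. This is a minimal sketch, not the code I ran: the name `batched_hashes` and its `batch_size` parameter are hypothetical, and unlike the loop above it skips blank lines instead of stopping at the first one.

```python
from itertools import islice

def batched_hashes(lines, batch_size=25):
    """Yield lists of at most batch_size cleaned, upper-cased hashes.

    `lines` is any iterable of str or bytes lines (hypothetical helper,
    not part of boto3).
    """
    # Decode bytes if needed, strip CR/LF and whitespace, normalise case.
    cleaned = (
        (raw.decode('utf-8') if isinstance(raw, bytes) else raw).strip().upper()
        for raw in lines
    )
    # Assumption: blank lines are skipped rather than treated as end of file.
    non_empty = (h for h in cleaned if h)
    while True:
        batch = list(islice(non_empty, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded list could then be written inside one `table.batch_writer()` block. I believe the S3 response body also offers `iter_lines()`, which would be a natural input here, but treat that pairing as an assumption to verify rather than something tested in this post.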
DynamoDB auto-scaling has an issue: if reads and writes drop to ‘zero’, it will not scale back down to the minimum provisioned capacity. I needed a time-based CloudWatch event that executes a Lambda function to generate regular database activity.
import boto3

# Created once per container so warm invocations reuse the connection.
dynamodb = boto3.resource('dynamodb')

def lambda_handler(event, context):
    table = dynamodb.Table('sha256')
    # One read and one write keep consumed capacity above zero.
    table.get_item(Key={'sha256': '0000000000000000000000000000000000000000000000000000000000000000', 'type': 'TEST'})
    table.put_item(Item={'sha256': 'FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF', 'type': 'TEST', 'source': 'JOHN', 'desc': 'PING'})
    return
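The time-based trigger can be set up in the console, but it can also be wired up with Boto3. The following is a rough sketch under assumptions, not my actual setup: the function names, the rule name, the statement ID, and the five-minute rate are all hypothetical placeholders. The request builder is kept separate from the AWS calls so it can be checked without credentials.

```python
def schedule_requests(rule_name, function_arn, rate='rate(5 minutes)'):
    """Build keyword arguments for put_rule and put_targets (hypothetical helper)."""
    # Assumption: a five-minute rate; pick whatever interval suits the table.
    rule = {'Name': rule_name, 'ScheduleExpression': rate, 'State': 'ENABLED'}
    targets = {'Rule': rule_name, 'Targets': [{'Id': '1', 'Arn': function_arn}]}
    return rule, targets

def apply_schedule(rule_name, function_arn, rate='rate(5 minutes)'):
    """Create the scheduled rule and point it at the Lambda function."""
    import boto3  # imported here so schedule_requests works without AWS access
    events = boto3.client('events')
    lambda_client = boto3.client('lambda')
    rule, targets = schedule_requests(rule_name, function_arn, rate)
    rule_arn = events.put_rule(**rule)['RuleArn']
    # Allow the rule to invoke the function; the StatementId is a placeholder.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=rule_name + '-invoke',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule_arn,
    )
    events.put_targets(**targets)
```

Calling `apply_schedule` with a rule name and the ARN of the function above would create the schedule; both values depend on your own account.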
Happy Coding!
John Lukach
@jblukach