Company Overview:

Gavin La Rowe at ChalkLabs explains how the company benefits from its use of AWS to meet its large-scale data needs:

CHALKLABS Case Study

Hi Gavin, briefly tell us about your business.

ChalkLabs is a specialized software development company located in Bloomington, IN, that focuses on novel high-performance computing, data mining, and visualization solutions for large-scale data problems.

How have you incorporated Amazon Web Services as part of your architecture? What services are you using and how?
AWS is a primary target for our large-scale data needs when we need to scale up rapidly or meet memory, CPU, or GPU requirements. We use Amazon EC2 to scale out our high-performance computing capabilities, and we sometimes use Amazon EBS to store data in the cloud during large-scale data mining operations. In particular, we regularly use the 4xlarge memory instances, high-performance computing instances, and cluster compute instances.

Why did you decide to use AWS? 
We did not have the internal firepower to do the type of processing we wanted to do for Elsevier’s ScienceDirect data set, and it was cost-prohibitive to purchase the needed resources. The high-performance computing and high-memory instance types, the AWS user documentation, and ease of use were the primary drivers for using AWS.

How has AWS helped your business?
AWS provides a cost-effective, time-sensitive, large-scale data processing solution of a kind that is typically available only to enterprise-class businesses. If you have the technical know-how, AWS is a great equalizer.

Can you share any metrics on your usage of AWS to date?
Most recently, we used AWS for calculating similarity scores and for large-scale data modeling. By running a large-scale data mining algorithm on multiple AWS HPC instances in parallel, and by tuning a modeling application for a high-memory instance, we cut run time by roughly 95% in both processes:

  • The first job involved calculating similarity scores for a total of 8.6 trillion data pairs.
  • Running it on just a single core of a single HPC instance would take a whopping 9,856 days! Running on all the cores of a single HPC instance, our algorithm would typically take approximately 22 days to finish.
  • By using multiple large HPC instances, we reduced the processing time for this data from 22 days to 33 hours, a run-time reduction of roughly 95%.
  • The second job was optimizing a data modeling application to work on EC2’s High-Memory instance (m2.4xlarge), which has a healthy 68.4 GB of RAM and 8 cores.
  • We optimized the data modeling application to make the best use of the memory and cores available on the AWS high-memory instance, and reduced the run time for a job of 3.8 million ScienceDirect articles from 100 days on our own infrastructure to just 5 days on AWS.
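The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, not ChalkLabs code: the function and variable names are illustrative assumptions, and only the input numbers (9,856 days, 22 days, 33 hours, 100 days, 5 days) come from the interview.

```python
HOURS_PER_DAY = 24


def reduction_pct(before_hours, after_hours):
    """Return the percentage of run time eliminated."""
    return (1 - after_hours / before_hours) * 100


def speedup(before_hours, after_hours):
    """Return how many times faster the new run is."""
    return before_hours / after_hours


# Job 1: similarity scores, one HPC instance vs. multiple HPC instances.
job1_before = 22 * HOURS_PER_DAY   # 22 days expressed in hours
job1_after = 33                    # 33 hours

# Job 2: data modeling, in-house infrastructure vs. one m2.4xlarge.
job2_before = 100 * HOURS_PER_DAY  # 100 days expressed in hours
job2_after = 5 * HOURS_PER_DAY     # 5 days expressed in hours

job1_reduction = reduction_pct(job1_before, job1_after)
job2_reduction = reduction_pct(job2_before, job2_after)
job1_speedup = speedup(job1_before, job1_after)
job2_speedup = speedup(job2_before, job2_after)

# Ratio between the single-core estimate and the single-instance estimate.
core_to_instance_speedup = 9856 / 22

print(f"Job 1: {job1_reduction:.2f}% less run time, {job1_speedup:.0f}x faster")
print(f"Job 2: {job2_reduction:.2f}% less run time, {job2_speedup:.0f}x faster")
print(f"Single core vs. single instance: {core_to_instance_speedup:.0f}x")
```

By this calculation the first job's reduction comes to 93.75% (a 16x speedup) and the second job's to 95% (a 20x speedup), so the 95% quoted for both is a rounded figure.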

Do you have any future plans to incorporate other AWS solutions?
Absolutely! Part of our data processing regime hinges on sustainment and evolution of the high performance instance types offered by AWS. Please keep iterating and evolving those services.


Amazon Case Study: ChalkLabs

Date: 28-08-2013
