Bootstrapped a startup for under $1,000
We created a real product, not just an MVP prototype. It was robust, secure and scalable, and we spent less than $1,000 on AWS.
We wanted to get to market quickly, but we also wanted to be secure and ready to scale. We did not want to create an MVP that was really just a prototype, nor did we want to compromise on testing and security.
Compounding the problem, we were on a strict budget and wanted to spend as little as possible on AWS to get there.
The magic of the cloud is that you can create and destroy infrastructure almost effortlessly. This is wonderful for startups, as launching experiments costs virtually nothing. The problem is that cloud resources are typically billed by the second and costs can quickly mount as you test and grow. So it is essential you run only what you need and no more.
Our goal was to minimize AWS costs, but at the same time, build out a production-ready infrastructure.
This challenge was exacerbated by the need to replicate our entire production environment for test, staging and developer needs. That meant that on any given work day, we could have four or more copies of our infrastructure running. These additional environments were complete implementations of our application with databases, servers, containers, lambdas, load balancers, caching, networks and security infrastructure. Creating and running them all 24x7 was cost-prohibitive for us.
Automating and Replicating Infrastructure
Once the production environment has been defined in Terraform, it can be replicated trivially for "staging" or "test" by running Terraform with a few parameters tweaked.
Terraform provides the ability to:
- reliably create infrastructure in an automated manner.
- scale infrastructure via Terraform parameters.
- audit configurations to spot security vulnerabilities.
- turn on and off complete environments.
This last item was key to staying under budget. We could turn off all resources we did not need at any point in time.
Each day we would fire up the required environments as they were needed and stop them as soon as they were not. Regardless, at the end of the day we would terminate everything. In the early days, that included production, and we started anew each morning. This was implemented via a simple script that invoked Terraform with the right environment definition.
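As an illustration of that kind of wrapper, here is a minimal Python sketch. It is not the original script. It assumes a hypothetical layout with one variable file per environment (environments/NAME.tfvars) and one Terraform workspace per environment so that each keeps separate state. The function names are invented for this example; the Terraform subcommands and flags are standard ones.

```python
import subprocess
from typing import List


def terraform_commands(action: str, environment: str) -> List[List[str]]:
    """Build the Terraform CLI calls to create or tear down one environment.

    Assumption: each environment has its own workspace and its own
    variable file under environments/ (a hypothetical layout).
    """
    if action not in ("apply", "destroy"):
        raise ValueError(f"unsupported action: {action}")
    return [
        # Switch to the state that belongs to this environment.
        ["terraform", "workspace", "select", environment],
        # Create or destroy everything without an interactive prompt.
        ["terraform", action,
         f"-var-file=environments/{environment}.tfvars",
         "-auto-approve"],
    ]


def run(action: str, environment: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order."""
    for cmd in terraform_commands(action, environment):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run("apply", "staging")    # morning: bring staging up
    run("destroy", "staging")  # evening: tear it down again
```

The dry-run default is deliberate: printing the commands first makes it harder to destroy the wrong environment by accident.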
While Terraform is ideal for automating the creation and replication of infrastructure, it does take considerable time to create an entire environment. Creating our production or test environments from scratch takes about 20 minutes, largely due to the database creation overhead.
That time delay was frustrating for eager developers who had to wait for cloud resources before getting down to work.
Fast Power Down
To keep costs low, we needed to be able to quickly start and stop entire environments. We wanted to reduce the time to power up and down to under a minute for the entire environment. However, you cannot create and terminate resources that quickly. Fortunately, most AWS resources can be stopped instead of terminated and then quickly restarted. And while stopped, instances incur no compute charges, although attached storage is still billed.
We initially wrote a simple bash script that invoked the AWS CLI to power down each cloud resource in sequence for an entire environment. This would scale down Auto Scaling groups and stop server instances. This script immediately slashed our AWS bill by 70% and permitted our AWS free tier credits to last much longer. Combined, this kept our total AWS bills below $1,000 before launch.
Although this strategy worked, it was somewhat fragile and had to be invoked manually. In the rush to go home at night or for the weekend, developers would sometimes forget to power down environments, which curtailed the savings.
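The original bash script is not reproduced here. The following Python sketch shows the same idea under stated assumptions: the Auto Scaling group names and instance IDs are placeholders, and the helper name is invented for this example. It only builds the AWS CLI commands, so it can be inspected safely before anything is run.

```python
from typing import List, Sequence


def power_down_commands(asg_names: Sequence[str],
                        instance_ids: Sequence[str]) -> List[List[str]]:
    """Return AWS CLI calls that power down one environment.

    Auto Scaling groups go first; otherwise a group would replace
    the instances we stop afterwards.
    """
    commands: List[List[str]] = []
    for name in asg_names:
        commands.append([
            "aws", "autoscaling", "update-auto-scaling-group",
            "--auto-scaling-group-name", name,
            "--min-size", "0",
            "--desired-capacity", "0",
        ])
    if instance_ids:
        # Stop (not terminate) standalone instances so they restart quickly.
        commands.append(
            ["aws", "ec2", "stop-instances", "--instance-ids", *instance_ids]
        )
    return commands


if __name__ == "__main__":
    # Placeholder resource names, for illustration only.
    for cmd in power_down_commands(["staging-web"], ["i-0123456789abcdef0"]):
        print(" ".join(cmd))
```

Each command list can be passed to subprocess.run(cmd, check=True) once the output looks right. Powering back up is the mirror image: restore the group sizes and call aws ec2 start-instances.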
Knowing when to invoke the script was also problematic for remote developers who worked a different schedule, often late at night. These developers were sometimes rudely surprised when the test and development environments were powered down at the end of the day!
To reliably schedule powering up and down the required environments for all developers, we needed a better tool.
We took that next step with an internal cloud scheduling tool that has since been released as PowerDown.
After defining our environments as PowerDown Resource Groups, we could schedule them to be warmed up just before work started and stopped at the end of the day.
This strategy was set and forget. PowerDown would automatically start and stop environments like clockwork.
Each developer defined their own personal schedule for cloud resources and environments. PowerDown blended the personal schedules into a single master schedule. If a developer had unexpected work, they could temporarily override the schedule using the PowerDown CLI or web interface. Thereafter, the schedule resumed.
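How PowerDown blends schedules internally is not described here. One plausible reading is that an environment stays up whenever any developer needs it, which amounts to taking the union of the personal time windows. The sketch below illustrates that interpretation only. The data model (hour pairs on a 24-hour clock in a single time zone) and the function name are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# (start_hour, end_hour) on a 24-hour clock, all in one time zone.
Window = Tuple[int, int]


def blend_schedules(personal: Dict[str, List[Window]]) -> List[Window]:
    """Merge every developer's windows into one master schedule.

    The environment is up whenever at least one developer wants it,
    so overlapping or touching windows are fused into one.
    """
    windows = sorted(w for ws in personal.values() for w in ws)
    merged: List[Window] = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous window: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    schedule = blend_schedules({
        "alice": [(9, 17)],
        "bob": [(8, 12)],
        "carol": [(20, 24)],  # remote developer working late
    })
    print(schedule)  # [(8, 17), (20, 24)]
```

In this example the environment would be powered down between 17:00 and 20:00 and again overnight, while the late-night developer still finds it running.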
With this technique, we have been able to reliably reduce our AWS cloud costs by more than 70%.
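A quick back-of-the-envelope check shows why a figure around 70% is plausible for non-production environments. The 10 hours per weekday used below is an assumption for illustration, not a measured number from our bills, and the calculation ignores charges that continue while resources are stopped, such as storage.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours if left running 24x7


def weekly_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of compute hours saved versus running around the clock."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # Assumed schedule: 10 hours a day, Monday to Friday.
    print(f"{weekly_savings(10, 5):.0%}")  # 70%
```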
Lower Production Costs
Our production service and its infrastructure needed to be available 24x7, so powering down was not an option for production. We could use Reserved Instances, but these would lock us into a one- or three-year term and we wanted the flexibility to scale quickly. We could use Spot Instances, which are up to 90% cheaper, but AWS can terminate these with as little as 2 minutes' warning and Spot Fleet is not able to maintain capacity when Spot Instances are in short supply.
Fortunately, PowerDown can transparently migrate production workloads from On-Demand servers to more cost-effective Spot Instances without impacting availability. With PowerDown, you specify the ratio of Spot to On-Demand servers and PowerDown automatically maintains that balance. If AWS reclaims the Spot Instances, PowerDown will automatically launch On-Demand instances to maintain capacity until the Spot supply is restored.
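The balancing rule can be sketched in a few lines of Python. This is an illustration of the behavior described above, not PowerDown's implementation. The function name and its parameters (total capacity, desired Spot percentage, Spot instances currently obtainable) are assumptions made for this example.

```python
from typing import Tuple


def target_mix(capacity: int, spot_percent: int,
               spot_available: int) -> Tuple[int, int]:
    """Return (spot, on_demand) instance counts for a desired capacity.

    spot_percent is the share of capacity we would like on Spot.
    Whatever Spot cannot supply is covered by On-Demand instances,
    so total capacity never drops.
    """
    if not 0 <= spot_percent <= 100:
        raise ValueError("spot_percent must be between 0 and 100")
    wanted_spot = capacity * spot_percent // 100
    spot = min(wanted_spot, spot_available)
    on_demand = capacity - spot
    return spot, on_demand


if __name__ == "__main__":
    print(target_mix(10, 70, 10))  # (7, 3)  Spot supply is healthy
    print(target_mix(10, 70, 2))   # (2, 8)  most Spot capacity reclaimed
    print(target_mix(10, 70, 0))   # (0, 10) fall back fully to On-Demand
```

Re-running the calculation whenever Spot availability changes moves the fleet back toward the requested ratio once supply is restored.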
Other Cost Saving Tips
There were a few other items that helped us stay under budget.
AWS regions can vary in cost for resources by more than 35%. We picked the us-east-1 region in Virginia, USA, as it is among the cheapest regions and is well placed to serve both US and EU markets.
AWS instances also vary greatly in price. Early on, we made extensive use of T2 instances as surrogates for our main instance types. Terraform makes switching instance types for testing quite easy. We'd test on larger instances and then revert to T2 for cost savings. Be careful when selecting your instance type: AWS often prices newer instance types lower on a CPU-per-dollar basis.
We made extensive use of AWS Spot Instances, which are up to 90% cheaper than their On-Demand counterparts. (Note: T2 Spot Instances are now available in most regions.)
In the past, software companies were often forced to raise large amounts of venture capital to fund their development up to product release. It is truly amazing that now, we can create a product for less than $1,000 of cloud spend, and be fully ready for production and scale without compromising testing or security.