Best practices for securing your AWS architecture

Monitor, Restrict, and Prepare through automation

Derek Hutson
7 min read · Sep 5, 2022
Photo by FLY:D on Unsplash

With numerous high-profile security incidents coming to light lately, as well as countless others that never make headlines, securing your architecture and workloads is more important now than ever. After all, why bother creating a beautifully designed cloud solution only to have it compromised and used by malicious actors?

There are numerous ways your architecture can be taken advantage of, including but not limited to SQL injections, DDoS attacks, ransomware attacks, and even harmful events caused unintentionally by unaware users of applications running on your architecture. One of the mantras of securing workloads is to be prepared for when an event happens, not if it happens. You need to assume that at some point there is going to be a security incident, so it is best to be proactive rather than reactive. By being proactive you can minimize incident response time, minimize data and revenue loss, and, best of all, have your recovery protocols in place before you are even aware an incident is happening.

In the cloud, there are a few different principles to consider that can help you strengthen your workload security:

  • Implement a strong identity foundation
  • Enable traceability
  • Apply security at all layers
  • Automate security best practices
  • Protect data both in transit and at rest
  • Keep people away from data
  • Prepare for security events

Before we talk a little more about each of these areas, let's do a quick refresher on the shared responsibility model. This applies to cloud computing providers in general, but here we will reference AWS. In one sentence: you are responsible for security in the cloud, and AWS is responsible for security of the cloud.

AWS shared responsibility model

A nice feature of cloud computing that we have mentioned many times before is that you have to do little to no maintenance, or in this case securing, of the physical hardware that your applications use. You are only responsible for how you use that hardware and who is allowed access to it. Having said that, let's get a little more detailed on some of the best practices for securing your workloads in the cloud.

1. Implement a strong identity foundation

This practice revolves around one concept: the principle of least privilege. In other words, a user accessing your workloads should only have enough permissions to do what they need to do, and nothing else. As a bit of an extreme example to drive this point home: if you have 100 users who all have admin access, that could be a problem. They could accidentally make changes to your architecture or data, and even if they don't, if one of their accounts gets compromised by a malicious actor you could be in real trouble.

A better solution is to grant permissions by role. For example, a database administrator only needs enough permissions to access and maintain your database services and nothing more. If someone on your accounting team needs access to certain data, they get read-only access to that storage or database service, and within it only the bucket, table, etc. that contains the data they need to see.
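If you manage permissions with code, this can be as small as attaching one narrowly scoped policy to a group. Here is a minimal boto3 sketch of the accounting example above; the group name, policy name, and bucket name are hypothetical, so treat it as an illustration rather than a drop-in policy.

```python
# Least-privilege sketch: read-only access to a single, hypothetical S3 bucket.
import json
import boto3

iam = boto3.client("iam")

# Group and bucket names below are placeholders for this example.
iam.create_group(GroupName="accounting-readers")

accounting_read_only = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-accounting-reports",
                "arn:aws:s3:::example-accounting-reports/*",
            ],
        }
    ],
}

# Attach the inline policy to the group; members get only this access.
iam.put_group_policy(
    GroupName="accounting-readers",
    PolicyName="accounting-bucket-read-only",
    PolicyDocument=json.dumps(accounting_read_only),
)
```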

2. Enable traceability

There are quite a few services within AWS that allow you to see who did what, and when. On top of that there are multiple 3rd party applications and services you can use in conjunction with AWS that will perform similar functions, but perhaps with more appropriate granularity as required by your organization.

The most common AWS service for this type of visibility is AWS CloudTrail. It tracks user activity and API usage across accounts and regions, and you can set it up to work together with EventBridge to act on workflow rules you define.
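As a rough sketch of what that setup looks like in code, the boto3 snippet below creates a multi-region trail that delivers logs to S3. The trail and bucket names are hypothetical, and the bucket must already exist with a policy that lets CloudTrail write to it.

```python
# Minimal CloudTrail setup: one multi-region trail delivering to S3.
import boto3

cloudtrail = boto3.client("cloudtrail")

cloudtrail.create_trail(
    Name="org-activity-trail",            # hypothetical trail name
    S3BucketName="example-cloudtrail-logs",  # bucket must allow CloudTrail writes
    IsMultiRegionTrail=True,              # capture activity in every region
    EnableLogFileValidation=True,         # detect tampering with delivered logs
)

# Trails do not deliver logs until logging is started explicitly.
cloudtrail.start_logging(Name="org-activity-trail")
```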

You can then configure event logs to be delivered to S3 and analyzed later, so you know the source of any given event. Or, for the sake of ease and automation, you can use Amazon Athena to run queries over your log files to search for specific events.
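For instance, once you have created an Athena table over your CloudTrail prefix (AWS documents a CREATE TABLE statement for this), a query like the sketch below can answer "who deleted that bucket?". The database, table, and results-bucket names are hypothetical.

```python
# Query CloudTrail logs with Athena to find a specific kind of event.
import boto3

athena = boto3.client("athena")

query = """
SELECT eventtime, eventname, useridentity.arn AS actor, sourceipaddress
FROM cloudtrail_logs
WHERE eventname = 'DeleteBucket'
ORDER BY eventtime DESC
LIMIT 20
"""

athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "security_audit"},   # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```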

With these services there is no reason any event should occur within your accounts without you knowing when it happened and who did it.

3. Apply security at all layers

You want to have a seamless, multi-layered approach that encapsulates all of your AWS users, services, code, and everything in between. There should be no areas that are not secured on some level.

If we take a typical three-tier architecture, for example (one public subnet and two private subnets), then you want to secure your network edge, your VPC, your load balancers, every compute, storage, and database instance, your operating system, your codebase, and the application itself. Almost every service and action has a corresponding service that can be used to monitor and enforce security protocols within it.
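One small, concrete example of layering at the network level: the application tier should only accept traffic from the load balancer's security group, never from the internet directly. The sketch below assumes hypothetical security group IDs and an application port of 8080.

```python
# Layered network security: app tier only reachable from the ALB's security group.
import boto3

ec2 = boto3.client("ec2")

ec2.authorize_security_group_ingress(
    GroupId="sg-app-tier-id",          # hypothetical app-tier security group
    IpPermissions=[
        {
            "IpProtocol": "tcp",
            "FromPort": 8080,
            "ToPort": 8080,
            # Reference the load balancer's security group instead of a CIDR,
            # so only the ALB can reach the application instances.
            "UserIdGroupPairs": [{"GroupId": "sg-alb-id"}],   # hypothetical ALB SG
        }
    ],
)
```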

Yes, it sounds tedious, but this is exactly why we want to automate as much as we can, so that we can focus on building better apps and spend less time thinking about how to secure them and prevent incidents.

4. Automate security best practices

This is perhaps the most useful of all these security concepts, as it allows you to drastically reduce human error and decrease response time. Automation is commonly done at either a service level or an admin-defined rule level.

A service-level example would be applying AWS Shield to your web application components to protect against DDoS attacks, or applying AWS WAF (Web Application Firewall) to those same resources to protect against common risks like cross-site scripting, SQL injection, or server-side request forgery. These are AWS managed services where you just need to set rules and the rest is handled by the service itself.
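To make that a bit more concrete, the sketch below creates a regional AWS WAF web ACL that attaches an AWS managed rule group covering many of those common risks. The ACL name and metric names are hypothetical, and associating the ACL with a load balancer or API is a separate call not shown here.

```python
# Attach an AWS managed rule group to a regional web ACL with AWS WAF.
import boto3

wafv2 = boto3.client("wafv2")

wafv2.create_web_acl(
    Name="app-web-acl",                      # hypothetical ACL name
    Scope="REGIONAL",                        # use "CLOUDFRONT" for CloudFront distributions
    DefaultAction={"Allow": {}},             # allow anything the rules don't block
    Rules=[
        {
            "Name": "aws-common-rules",
            "Priority": 0,
            "Statement": {
                "ManagedRuleGroupStatement": {
                    "VendorName": "AWS",
                    "Name": "AWSManagedRulesCommonRuleSet",
                }
            },
            "OverrideAction": {"None": {}},  # keep the rule group's own actions
            "VisibilityConfig": {
                "SampledRequestsEnabled": True,
                "CloudWatchMetricsEnabled": True,
                "MetricName": "aws-common-rules",
            },
        }
    ],
    VisibilityConfig={
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": "app-web-acl",
    },
)
```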

An admin-defined rule level, on the other hand, would include proper management of IAM roles and permissions, or configuring CloudTrail and CloudWatch across your services to monitor for specific events. You can also set specific actions to occur in response to specific events so that you do not have to respond manually. For example, you could have an IAM group with defined permissions that a new user is automatically placed into when their account is set up.
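One way to wire up that last example, sketched below under a few assumptions: an EventBridge rule matching the CloudTrail CreateUser event invokes a Lambda function, whose handler adds the new user to a pre-built group. The group name is hypothetical.

```python
# Lambda handler sketch: automatically place newly created IAM users
# into a baseline group with pre-approved permissions.
import boto3

iam = boto3.client("iam")

def handler(event, context):
    # CloudTrail API-call events arrive under the "detail" key of the
    # EventBridge event; CreateUser carries the new user's name there.
    user_name = event["detail"]["requestParameters"]["userName"]
    iam.add_user_to_group(
        GroupName="baseline-permissions",   # hypothetical pre-built group
        UserName=user_name,
    )
    return {"added_user": user_name}
```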

Automation saves you time, reduces human error, and is something that you should be taking advantage of at every opportunity.

5. Protect data both in transit and at rest

As mentioned above, numerous data breach incidents have occurred in recent years, so it is important to protect data at all times.

There are common industry-standard practices such as using SSL/TLS for data in transit, or AES-256 encryption for data sitting in a database. On top of these, you can use services such as AWS KMS (Key Management Service) to manage your data keys and master keys and their rotation, and AWS Certificate Manager to make provisioning and deploying SSL/TLS certificates quick and easy to integrate with your services.
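A minimal sketch of the at-rest side of this: create a customer managed KMS key, turn on automatic rotation, and make it the default encryption key for a bucket. The bucket name is hypothetical.

```python
# Encryption at rest: a KMS key with rotation, used as a bucket's default key.
import boto3

kms = boto3.client("kms")
s3 = boto3.client("s3")

key = kms.create_key(Description="Default encryption key for application data")
key_id = key["KeyMetadata"]["KeyId"]

kms.enable_key_rotation(KeyId=key_id)   # automatic yearly rotation

s3.put_bucket_encryption(
    Bucket="example-app-data",          # hypothetical bucket name
    ServerSideEncryptionConfiguration={
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": key_id,
                }
            }
        ]
    },
)
```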

6. Keep people away from data

This one is somewhat self-explanatory. If people have access to data, then not only can they see potentially sensitive information, they could also gain the ability to alter or accidentally delete it.

Again, if they have an account that is able to access data, that account could at some point be exposed to a malicious actor who can then cause serious problems for you and your organization.

So keep it simple and tie things back to the principle of least privilege. Most, if not all, of your users should not be able to see your data, so ensure their accounts and roles have adequate restrictions to keep it that way.
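At the storage layer, a simple complement to those account-level restrictions is to block every form of public access on the bucket itself, so only explicitly granted IAM principals can reach the objects. A quick sketch, with a hypothetical bucket name:

```python
# Keep people away from data: block all public access to a bucket.
import boto3

s3 = boto3.client("s3")

s3.put_public_access_block(
    Bucket="example-app-data",   # hypothetical bucket name
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)
```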

7. Prepare for security events

This concept is a little less straightforward and is likely the one that will require the most manual effort.

This involves having defined protocols for responding to security events and incidents. Until you experience such events, however, it is difficult to know from what angles you could be exposed. The point of documenting responses to security events is to have a defined action plan that minimizes your recovery time objective (RTO) and recovery point objective (RPO) in the event of exposure.

Since we do not want to be reactive and wait until an event occurs, a best practice is to run simulated game day events, where you intentionally try to find exposures in your architecture and do harm to your services in a controlled manner. By running simulated events you can find holes in your security and patch them, as well as document how to respond properly to that type of event in the future.

One thing to keep in mind, since you are running your applications on AWS hardware, is that some simulations require permission from AWS. This is because some of your architecture will likely be on shared tenancy, meaning the hardware your applications sit on is also used by other customers. AWS wants to make sure that running a simulation does not cause harm to other customers using the same hardware as you.

Hopefully this overview of how to better secure your architecture is helpful. Although you want to automate as much as you can, there is always more to learn and implement, especially as AWS continues to come out with new features and ways to use its services.

You always need to assume that there is a malicious actor out there looking to do harm to your services and your organization. So automate everything that you can, have documented responses to particular events, and continue to follow the best practices in the AWS security whitepaper.

As always, best of luck on your continued journey in cloud computing.
