When you do things that only create an illusion of security, without actually improving it in any way, you are participating in security theater. In the best case, this is just a big waste of time. In the worst case, it backfires and actually decreases your security.
As you can guess, in this post I want to give examples of these traps and help you avoid them. As usual, your options to "fight back" depend greatly on the type of company. In a large enterprise, which is heavily risk-averse by default, you will have to pick your battles carefully.
An Example of Security Theater in Software Engineering
A long time ago my team had a third-party contractor perform a security review and penetration test of an internal application. This app was available only via the company’s VPN and required 2FA to access. It was quite secure by default, but a review was mandated by company policy.
There were some good findings, but overall nothing major, except for one issue that really stuck with me as a seriously annoying waste of time. In our app, we had one creation modal that allowed you to kick-start an asynchronous data processing job. In this modal window, you had a couple of free-text entries to provide certain parameters.
Our backend didn’t do any special validation of the incoming POST request, beyond checking that it was valid JSON and followed the expected schema. So, when the security team saw that they were able to input all sorts of junk and get a 200 OK response, they marked the issue as "Critical." This meant the team had to stop whatever they were doing and fix it within 5 business days.
This was, to put it mildly, a waste of time. An SLA of 30 days would have been fair and acceptable, but the 5-day deadline meant we had to delay progress on an important feature in order to address a threat that was not serious.
And here’s how I know the threat really wasn’t serious – the async job was performing the validation anyway. The API was just scheduling it, but the job was doing various validations and checks.
There was a good reason why the team decided to build it like this. The job could be invoked through multiple channels, so it made sense to have the job validate its own input instead of trying to maintain that validation in several different places.
Here’s a simple diagram of the architecture that I remember:
The security team did not care about this, of course, and their word was final. Ignoring the 5-day deadline meant fighting against the company’s policy. In our setting, that fight was not a worthwhile investment.
We extracted the validation into a shared library and reused it in our API. The level of effort was not huge, but slowing down on an important feature with a tight deadline created churn problems for us.
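To make that fix concrete, here is a minimal Python sketch of what a shared validation module could look like. This is not the code we actually shipped. The parameter names (dataset, max_rows), the limits, and the function names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


class ValidationError(ValueError):
    """Raised when job parameters fail validation."""


@dataclass(frozen=True)
class JobParams:
    dataset: str   # hypothetical free-text parameter
    max_rows: int  # hypothetical numeric parameter


def validate_job_params(raw: dict) -> JobParams:
    """Shared validation, importable by both the API and the job."""
    dataset = raw.get("dataset")
    if not isinstance(dataset, str) or not dataset.strip():
        raise ValidationError("dataset must be a non-empty string")
    if len(dataset) > 128:  # arbitrary limit chosen for this sketch
        raise ValidationError("dataset must be at most 128 characters")
    max_rows = raw.get("max_rows")
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(max_rows, bool) or not isinstance(max_rows, int) or max_rows <= 0:
        raise ValidationError("max_rows must be a positive integer")
    return JobParams(dataset=dataset.strip(), max_rows=max_rows)


def handle_create_job(raw: dict) -> tuple[int, dict]:
    """API layer: reject junk early instead of answering 200 OK."""
    try:
        params = validate_job_params(raw)
    except ValidationError as exc:
        return 400, {"error": str(exc)}
    # Scheduling is stubbed out; a real API would enqueue the job here.
    return 200, {"scheduled": params.dataset}


def run_job(raw: dict) -> str:
    """Job layer: still validates, because other channels can invoke it."""
    params = validate_job_params(raw)
    return f"processing {params.dataset} (up to {params.max_rows} rows)"
```

The point of the sketch is that both layers call the same function, so the rules live in one place even though they are enforced twice.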
As I look back on this today, there are two aspects of security theater at play. First, this was not a critical issue. This was an internal app on VPN with two-factor authentication. The input validation was not missing; it was just performed at a different layer. I’m sure there’s a brilliant security engineer who could find some sort of argument that our thin in-between layer could somehow be compromised, but realistically everyone should be able to agree that this could wait.
Second, there are other things on this diagram that should be more concerning. The design of the system exposes too many attack vectors, and the part that deserves more scrutiny is on the right, labeled as "other systems." The security team made a lot of noise about low-hanging fruit that was easy to find, but they didn’t do anything to dive deep into the "black box behind the API."
And this may be quite dangerous. We can argue that the engineering team now (wrongly) thinks that everything else in their system is fine. They just need to fix this one action item and their platform is untouchable. When security theater leads to this kind of overconfidence, it produces a counter-effect to the original intent.
I think that if this review had been done correctly, it would have yielded the following action items:
- Standardize all channels to one access pattern. Centralize validation here.
- Eliminate legacy CRON schedules.
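As a rough idea of what the first action item means in practice, here is a small Python sketch of a single entry point that every channel has to go through. The JobGateway class, the channel names, and the stand-in validator are all assumptions made up for this example. They do not describe the real system.

```python
from collections import deque
from typing import Callable

# Hypothetical channel names. Legacy CRON is deliberately absent: in this
# sketch it would be retired or migrated onto one of the allowed channels.
KNOWN_CHANNELS = {"api", "other_systems"}


class JobGateway:
    """Single access pattern: every channel submits here, validation lives here."""

    def __init__(self, validate: Callable[[dict], dict]) -> None:
        self._validate = validate
        self._queue: deque = deque()

    def submit(self, channel: str, raw: dict) -> None:
        if channel not in KNOWN_CHANNELS:
            raise PermissionError(f"unknown channel: {channel}")
        # Validation runs before anything reaches the queue.
        self._queue.append((channel, self._validate(raw)))

    def pending(self) -> int:
        return len(self._queue)


def validate(raw: dict) -> dict:
    """Stand-in validator: requires a non-empty string 'dataset' field."""
    dataset = raw.get("dataset")
    if not isinstance(dataset, str) or not dataset.strip():
        raise ValueError("dataset must be a non-empty string")
    return {"dataset": dataset.strip()}
```

With one gateway, a reviewer has exactly one place to inspect instead of a validation rule per caller.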
And it would have come with a suggestion similar to this diagram:
Don’t Let Security Theater Degrade Customer Experience
In the previous example, the unpopular "right" choice is to follow along. The fix we had to do was acceptable, but annoying given the timelines and other circumstances. It’s not worth fighting battles that will take more time than the solution – especially when the solution changes nothing in the customer experience.
Battles worth fighting are those that directly involve your customer’s experience and your core business flows.
A common example in this space is access control management. Security reviews often end up recommending the creation of hundreds of different roles and permissions along with all sorts of usage flows that are not good for the end-user experience. This can be a hard battle and high control granularity is a must in most enterprises. (And there’s a good reason for that – both from a business and from a security perspective.)
However, this doesn’t mean the user experience has to be overly complex. Security mandates on the internal workings of the system should not be allowed to spill over and degrade your customer experience. This is where you will find battles worth fighting, and a customer-centric approach will give you enough ammunition to back up your case.
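One way to picture this is to keep the fine-grained permissions internal and only show users a handful of roles. The Python sketch below is purely illustrative. The permission and role names are hypothetical and are not taken from any real review.

```python
from enum import Enum, auto


class Permission(Enum):
    """Fine-grained permissions, used only inside the system."""
    JOB_VIEW = auto()
    JOB_CREATE = auto()
    JOB_CANCEL = auto()
    REPORT_EXPORT = auto()


# Users only ever see these few role names; the granular
# permissions behind them never surface in the UI.
ROLES = {
    "viewer": frozenset({Permission.JOB_VIEW}),
    "operator": frozenset(
        {Permission.JOB_VIEW, Permission.JOB_CREATE, Permission.JOB_CANCEL}
    ),
    "admin": frozenset(Permission),  # every permission
}


def is_allowed(role: str, permission: Permission) -> bool:
    """Check a granular permission through the user's coarse role."""
    return permission in ROLES.get(role, frozenset())
```

The security side still gets its granular checks, while the customer picks from three roles instead of hundreds.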