Designing for exploitation

Juan Berner
14 min read · Jan 14, 2020

Note: Unlike the rest of the posts in my blog, this is a highly speculative one, based on my expectations of the impact of AI on the cyber security space and how we might need to adapt to it.

Time as a defensive barrier

While building different systems to protect networks, whether against database, web application or system-level attacks, a pattern kept emerging: a limit on how effective our detection and response could be if we were faced with an automated system as an adversary. What was clear in all these cases was how much we rely on two things: the human intervention of those defending the networks, and the time an attacker is expected to spend understanding their environment and finding vulnerabilities or attack vectors to move through the systems until they achieve their objective.

But what if we no longer had time? One of the most interesting prospects I foresee in security is the augmentation and automation of the vulnerability finding and exploitation process, aided by machine learning, which could lead to what I will call, for lack of a better term, a Pentest AI (PAI). This is far from a new concern: in 2016 DARPA created a challenge for teams to build such a system under their Cyber Grand Challenge (CGC). More recently, the HITB AI Challenge let members compete on developing an automated penetration testing model based on the DeepExploit framework, and we should expect interest in the field to keep growing over time. A key difference is that while such an AI might perform at a human level, it would be able to think and act several orders of magnitude faster than its human counterpart.

As an example, there are currently few cases where being as noisy as needed while attacking a network is a better strategy than trying to avoid triggering alerts, since noise gives defenders more time to find attackers and stop them from achieving their objective. With the rise of ubiquitous machine learning, what caught my attention was how time, one of the defenders' main advantages over attackers, could be reduced to the point where it is no longer a factor in their defense strategies. This does not only mean that systems need to respond quickly, but that human interaction would no longer be an option. Any security system built around the expectation of human intervention, or depending on the time a human attacker takes to understand the network and find vulnerabilities to exploit, would lose its effectiveness.

In this post I will go over why time as a key defensive barrier might significantly erode due to the use of machine learning during attacks, how this could change the defense paradigm for the security of our systems, and what we could do to create systems that not only withstand this but thrive on it: by shifting to a mindset where we design systems that can be exploited and are able to defend themselves without any human interaction.

On the automation of exploitation

Automating the exploitation process is older than the World Wide Web, and not much has changed from the first publicly shared instance of a worm, the Morris Worm in 1988¹, to NotPetya in 2017². The automation of exploitation still heavily relies on a preset of attack vectors baked into the code, and while NotPetya caused an estimated 10 billion dollars in damage, there are almost no features in the latter that were not already present in the former, despite the almost three decades between them.

Floppy disk containing the source code for the Morris Worm (Also known as The Worm) held at the Computer History Museum

There is, however, a bright future for automating the exploitation process using machine learning. We can already glimpse progress in different areas, such as automating the bypass of antivirus software³, improving fuzzing for vulnerabilities⁴, or the challenges mentioned above. We are slowly building all the pieces needed to create software that can replicate the behaviour of the Morris Worm or NotPetya, but with the added bonus that it would be able to find vulnerabilities in its environment to exploit and devise ways to counteract static measures taken to stop it.

The danger of PAI worms

Pictured: An optimistic future where AI agents are burdened with the same constraints as humans

What is the potential of a PAI? One can imagine a system that behaves very similarly to NotPetya or WannaCry, but with the added bonus that it would not rely on a set of pre-baked vulnerabilities to move around its environment and would actively discover new ones. A system able to automate the usual capabilities of an attacker targeting a network, by discovering what software exists on the network in which it has established its foothold. Such a system could either fetch public code or binaries for the software it believes it has encountered and look for shared exploits, or interact with the software directly on the network, attempting attacks to discover possibly unknown vulnerabilities. This would dramatically change the scenario for those defending a network. The time barrier described before would no longer be effective as a means to stop or significantly delay an attack until human intervention can push an update, reverse the software to build patches, or find a possible kill switch⁵. Where an attacker might currently take weeks or months to work through a network, in particular while trying to fly under the radar, such a system could accomplish the same in minutes or hours.

Even if we discard the possibility of PAIs finding unknown vulnerabilities (since whoever builds such a system would have a financial incentive to apply it for their own commercial gain rather than real-time sabotage), we have seen in cases such as NotPetya or WannaCry that zero-days are not necessary for such an attack. A mixture of known vulnerabilities and ways to bypass defenses (such as how NotPetya and the Morris Worm leveraged credentials on exploited systems to access other systems that might not be vulnerable) might already be enough to completely take over most existing networks.

Our current security tools that focus on prevention techniques (such as isolating workloads in VMs or containers, or configuring the system with SELinux or AppArmor) have a fatal flaw against this threat: they cannot automatically adapt to attempts to bypass them, and they rely on not being vulnerable themselves. Many others, such as honeypots, would simply be pointless. A PAI would be able to quickly explore the system and find ways past them, either through configuration flaws or through vulnerabilities that are present. Not only would the myriad of security solutions placed to protect environments become ineffective against such an attacker, they might even increase the attack surface due to the amount of code they are built on⁶. To defend against PAIs we would need to change the paradigm of how we build and protect our systems.

Designing for exploitation

Just like the internet was built around the question "Can we create a reliable network with unreliable parts"... Can we create a secure system with insecure parts? — Bruce Schneier, DEF CON 2019⁷

The motto "Design for failure" is widely accepted nowadays as the only way to build systems in today's highly distributed and connected environments. When dealing with a system that might reside on thousands of physical servers, it would sound ludicrous to state that the response to a failure on any of those machines should be a human reviewing an alert and performing a manual action to let the system continue functioning. Yet we place ourselves in a similar situation every day to ensure our systems' security. While there is some agreement that we should expect systems to be compromised, our strategy of piling up layers to make attacks harder, under the assumption that it will take too long for someone to bypass all of them, would not work against PAIs. And in many cases it is the very systems that should provide our security that care the least about how we would be protected if they were exploited, relying heavily on human interaction to protect them from such situations. If you rely on static defence in depth to keep you secure while acknowledging that every layer has a (hopefully hidden) vulnerability, then you have to realise that once time and human interaction are removed, no amount of depth can protect you.

Of course, many systems advertise themselves as providing automated security, yet then present hundreds of dashboards and views for analysts to use, and in my experience none of them can protect themselves from vulnerabilities in their own products. We need to start building systems (in particular security systems) with the expectation not just that they have vulnerabilities, but that any preventive measures we place will be bypassed and that they will be actively and constantly exploited, just as we already expect systems not only to have possible points of failure despite our best efforts, but to actively and constantly break (even the systems that control reliability).

While for me this is clearly the only way forward in a world where finding and exploiting vulnerabilities in an automated fashion might become commonplace, there won't be a single way to achieve it. In the rest of this post I will give a high-level overview of the properties I found necessary to protect an online forum, assuming every component has a vulnerability an adversary could discover in less time than any possible human response.

Zero Trust, beyond just the network

One of the most important mindset shifts I have witnessed in security has been the move to zero trust, most popularly expressed in Google's BeyondCorp paper⁸. In a nutshell, this has helped remove the assumption that networks are secure by themselves inside a particular perimeter, treating every environment as insecure. Among other things, it has also established the requirement that each client authenticate when accessing a resource, which greatly reduces an attacker's reach once they are inside a network. Yet once a system that has access to a particular resource is compromised, all its data and actions can be manipulated by an attacker, since there is an inherent trust that any action performed by the system is correct. Other zero trust approaches, such as data anonymization, do little more than add layers of complexity for an attacker, and all of them can be broken by an attacker with sufficient time.

Moving to a model where we have zero trust in our own systems would require that any action that creates a permanent record is cross-validated, and that the system can learn from interference, ignore compromised systems and ensure the data's integrity is maintained. To achieve this, we might need to rely on byzantine fault tolerant systems, leveraging the work done for blockchain technologies implementing smart contracts. While this is still a relatively new field with several problems, such as scaling, there are very promising solutions such as Plasma⁹ that might allow us to build large-scale distributed systems that operate as fast as our current systems yet do not demand complete trust of their members.
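As a toy illustration of cross-validated writes (a minimal sketch, not any particular blockchain or consensus protocol), a record is only committed when a supermajority of independent replicas agrees on the result, so a minority of compromised replicas cannot forge it:

```python
from collections import Counter

class Replica:
    """Toy replica: applies the same pure function to a record.
    A compromised replica may return a tampered result."""
    def __init__(self, fn):
        self.fn = fn
    def compute(self, record):
        return self.fn(record)

def cross_validated_write(replicas, record):
    """Commit only the value a >2/3 supermajority agrees on, so up to
    f compromised replicas out of n >= 3f + 1 cannot forge a record."""
    votes = Counter(r.compute(record) for r in replicas)
    value, count = votes.most_common(1)[0]
    quorum = (2 * len(replicas)) // 3 + 1
    if count < quorum:
        raise RuntimeError("no supermajority: refusing write")
    return value

honest = lambda rec: rec.upper()
evil = lambda rec: "tampered"          # a compromised replica
replicas = [Replica(honest)] * 3 + [Replica(evil)]  # 1 faulty of 4
print(cross_validated_write(replicas, "post"))  # prints "POST"
```

The tampered vote is simply outvoted; if too many replicas disagree, the write is rejected instead of silently trusting any single system.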

Some of these changes might also require moving logic to purpose built hardware, for example ensuring that devices share signals with other devices in a network despite any compromise that could happen to their systems, so that other systems in the network can learn of the compromise and adapt quickly enough despite the initial compromise.

On the weakness of homogeneous systems

When considering the security of systems, even those tasked with protecting the environment, one of the current problems is how homogeneous they are. While the lack of diversity in an application deployment means we can rely on the same behaviour happening consistently for each instance, from a security perspective it also means that learning how to exploit one of those systems means you can exploit all of them. If we want to design systems that can survive having some of their components exploited, this homogeneity is a fatal design flaw we need to remediate. Just like a species whose members are all too similar is in danger of being quickly eradicated by a small environmental change, since none of its members have mutated enough to adapt, we need systems with some level of mutation to ensure they can survive attacks from PAIs.

This is a known problem that systems such as Bitcoin or Ethereum try to solve by having heterogeneous implementations, both in the programming languages of their clients and in the operating systems they run on. There are several ways we could accomplish this for all our systems in a scalable way: either by moving towards something like what is called Software 2.0¹⁰, or by ensuring systems are implemented in several different programming languages and logics, each exposing different vulnerabilities but none widespread enough for all the systems to be taken over faster than they can learn to protect themselves, while all performing the same tasks in different ways.
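A minimal sketch of this idea (illustrative only, not drawn from any implementation mentioned above) is N-version style execution: the same untrusted input is handled by deliberately different implementations, and any disagreement is surfaced as a possible exploit rather than trusted:

```python
import re

# Two deliberately different implementations of the same task:
# parsing a numeric user id from untrusted input.
def parse_id_strict(s: str) -> int:
    if not s.isdigit():          # str.isdigit accepts Unicode digits
        raise ValueError("non-numeric id")
    return int(s)

def parse_id_regex(s: str) -> int:
    if re.fullmatch(r"[0-9]+", s) is None:  # ASCII digits only
        raise ValueError("non-numeric id")
    return int(s)

def diversified(variants, payload):
    """Run every variant; an exploit that subverts one implementation
    will disagree with its siblings and be flagged, not trusted."""
    results = []
    for variant in variants:
        try:
            results.append(variant(payload))
        except ValueError:
            results.append(None)
    if len(set(results)) != 1:
        raise RuntimeError("variant disagreement: possible exploit")
    return results[0]

print(diversified([parse_id_strict, parse_id_regex], "42"))  # prints 42
```

Because the variants only overlap on well-formed input, an input that slips past one parser but not the other (for example non-ASCII digits) produces a disagreement instead of a silent compromise.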

From Resilience to Antifragility

I was introduced to the topic by a paper from Kennie H. Jones titled Engineering Antifragile Systems: A Change In Design Philosophy, while trying to think about security systems that might adapt to attacks. Later, it was Nassim Taleb's book on antifragility that really settled for me the importance of making antifragility a founding concept for systems that need to survive in our current hostile environments. If we are trying to create systems that can not just survive but thrive while actively attacked and exploited, making them antifragile is a key concept to master. A key difference from current security systems is that future security solutions would need to adapt on the fly to the actions of the environment, learning and improving faster than PAIs can find new ways to exploit them.

Limited impact surface

If we work with the assumption that every component has vulnerabilities that PAIs will quickly discover and exploit, we need to both assume components will be compromised and make it as hard as possible for the attacker to continue through the network, finding and exploiting other vulnerabilities or exfiltrating data. This lets our systems respond faster and adapt to attacks in the environment, but even if we can respond and adapt immediately (for example, after a single SQL injection is executed), it might already be too late, since data could have been stolen and sent back to an attacker.

We have already seen attempts to limit the impact surface in large-scale systems prone to attacks, such as Amazon's DNS architecture, which uses cell structures where resources are divided into small groups to reduce the impact of attacks¹¹. If we care not only about availability but also about confidentiality and integrity, we could go even further by ensuring that every user has their own environment with its own data. In such a setup, which I will refer to as "Atoms", an attack would not compromise any data, since the attacker has either compromised that user's credentials or is accessing their own user's data. Given that we would be working in a zero trust environment, an attacker that gains a foothold inside one of these environments would have no more impact than they had before, allowing the system enough time to adapt to whatever exploits allowed them to establish the foothold in the first place.
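A toy sketch of the Atom idea (purely illustrative; the class and table names are mine, not the forum's): each user's Atom owns a database that contains only that user's rows, so even a fully injectable endpoint can only dump data the attacker already controls:

```python
import sqlite3

class Atom:
    """One isolated environment per user. Its database holds only that
    user's rows, so a successful SQL injection against this Atom can
    only reveal data the attacking user already owns."""
    def __init__(self, user: str):
        self.user = user
        self.db = sqlite3.connect(":memory:")  # per-user datastore
        self.db.execute("CREATE TABLE posts (author TEXT, body TEXT)")

    def add_post(self, body: str) -> None:
        self.db.execute("INSERT INTO posts VALUES (?, ?)", (self.user, body))

    def raw_query(self, sql: str):
        # Stands in for a deliberately injectable endpoint inside this Atom.
        return self.db.execute(sql).fetchall()

atoms = {u: Atom(u) for u in ("alice", "bob")}
atoms["alice"].add_post("hello")
atoms["bob"].add_post("secret")

# An injection against alice's Atom reaches only alice's rows:
print(atoms["alice"].raw_query("SELECT * FROM posts"))  # [('alice', 'hello')]
```

Bob's "secret" lives in a different database entirely, so there is no query alice's Atom can run that exposes it; the blast radius of the injection is the attacker's own data.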

Using Atom-like structures does bring an incredible amount of complexity, such as how to deal with aggregation or searchable data, how to allow Atom-to-Atom communication, and how to deal with systems that cannot be partitioned, such as identity services (or how to perform wide-scale search when every user's data is stored behind an Atom's controls). On the other hand, a system that places Atoms as the first entry point for users can expect those Atoms to be externally compromised without impacting the data of other users, allowing the system to learn from the compromise without exposing any data.

Moving forward

To develop this topic further, I started working on a system that I would expect to survive a PAI. Similarly to the SELinux playground built by Russell Coker, the idea is a web forum that can function normally while possessing several vulnerabilities people might be aware of and actively able to exploit. This meant implementing many of the concepts mentioned in this post, and while it is still a work in progress (so I won't be writing details on it any time soon, given the constant changes), it has allowed me to test many of these concepts. At the time of this post it follows an Atom architecture, where each user (anonymous or logged in) has their own environment that is both detecting and learning from attacks, with many baked-in vulnerabilities that can be exploited (such as RCE, SQLi, SSRF and IDOR) while maintaining its security despite them. For example, a user exploiting an SQL injection would only be able to dump their own data, since the database they can access only has information for that user.

Example of an Atom architecture where each user has a Kubernetes pod in which their data is stored, and other users need to interact with it to manage data. Directory services provide public data to all pods (such as usernames or threads), and anyone can request from each user any data they want.

In this architecture, every Atom communicates with the others to access any data it needs (such as user names, thread posts or titles), and their compromise should not allow a PAI to move through the environment before the system can destroy the exploited environment, learn from the attack and become more resilient through it. It would be similar to bringing the level of trust that blockchain distributed applications place in each other to our current networks.

One of the usual concerns I hear about the future of such an architecture is how we could scale it to millions of active users. In my implementation, since any active user environment (either because the user is using it or because someone is actively querying it) demands its own Kubernetes pod to be up and running, I focus on small memory footprints (each API consuming 2 MB of RAM), and environments can start or die based on user demand. Other small changes would allow heavily requested users to have more replicas available, to accommodate growth and spikes of traffic to a particular user. Still, Kubernetes was not designed for such usage, so there is a lot of room for improvement before such an architecture can function efficiently and remain robust against denial of service attacks.
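The lifecycle described above can be sketched as an on-demand scheduler (the names and API are hypothetical; a real deployment would drive the Kubernetes API instead of an in-memory dict): a pod exists only while its user is active, hot users get extra replicas, and idle environments are reaped.

```python
class AtomScheduler:
    """Illustrative sketch only: tracks one 'pod' per active user,
    scales replicas for hot users, and reaps pods after an idle TTL."""

    def __init__(self, idle_ttl: float = 300.0):
        self.idle_ttl = idle_ttl
        self.pods = {}  # user -> {"replicas": int, "last_seen": float}

    def route(self, user: str, now: float) -> str:
        """Ensure the user's pod exists, refresh its activity timestamp,
        and return the address a request would be routed to."""
        pod = self.pods.setdefault(user, {"replicas": 1, "last_seen": now})
        pod["last_seen"] = now
        return f"atom-{user}-0"

    def scale(self, user: str, replicas: int) -> None:
        """Give a heavily requested user more replicas for traffic spikes."""
        if user in self.pods:
            pod = self.pods[user]
            pod["replicas"] = max(pod["replicas"], replicas)

    def reap_idle(self, now: float) -> list:
        """Destroy environments nobody has touched within the TTL."""
        idle = [u for u, p in self.pods.items()
                if now - p["last_seen"] > self.idle_ttl]
        for u in idle:
            del self.pods[u]
        return idle

sched = AtomScheduler(idle_ttl=300)
sched.route("alice", now=0)
sched.route("bob", now=100)
print(sched.reap_idle(now=350))  # alice idle 350s > TTL, bob stays: ['alice']
```

The per-user small-footprint design makes this viable: pods are cheap enough to create and destroy at the rate users appear and go idle.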

You can find the forum mentioned here: https://drascalon.com and try to exploit it; I hope you can send me feedback @89berner!

(Note: the site above is no longer running)
