Tuesday, December 13, 2011

Trusted Execution In Untrusted Cloud


Wouldn't it be nice if we could actually own our data and programs in the cloud? By “owning” here I mean having control over their confidentiality and integrity. When it comes to confidentiality and integrity of the data, it's not rocket science, as classic crypto (and secure client systems) is all that we need. I have already written about it in an earlier post.
But it would also be nice if we could somehow get the same confidentiality and integrity assurance for the programs that we upload for execution in the cloud...

For example, a company might want to take their database application, which deals with all sorts of critical and sensitive corporate data, and then upload and safely run this application on e.g. Amazon's EC2, or maybe even on some China-based EC2 clone. Currently there is really nothing that could stop the provider, who has full control over the kernel or the hypervisor under which our application (or our VM) executes, from reading the contents of our process's memory and stealing the secrets from there. This is all easy to do from the technical point of view, and this is also not just my own paranoia...


Plus, there are the usual concerns, such as: is the infrastructure of the cloud provider really as safe and secure as advertised? How do we know that nobody has found an exploitable bug in the hypervisor and used it to compromise other customers' VMs from within an attacker-hired VM? Perhaps the same question would apply if we hadn't decided to outsource the apps to a 3rd party cloud, but in the case of 3rd party clouds we really don't know what measures have been applied. E.g. is the physical server on which my VMs are hosted also used to host some foreign customers? From China maybe? You get the point.

Sometimes all we really need is just integrity, e.g. if we wanted to host an open source code revision system, such as a git repository, or a file server. Remember the kernel.org incident? On a side note, I find Jonathan Corbet's self-comforting remarks on how there was really nothing to worry about to be strikingly naive... I could easily think of a few examples of how the attacker(s) could have exploited this incident so that Linus & co. would never (or at least not soon) find out. But that's another story...

But, how can one protect a running process, or a VM, from a potentially compromised OS, or a hypervisor/VMM?

To some extent, at least theoretically, Intel Trusted Execution Technology (TXT) could be used to implement such protection. Intel TXT can attest to a remote entity, which in this case would be the cloud customer, the hash of the hypervisor (or kernel) that has been loaded on the platform. This means it should be possible for the user to know that the cloud provider uses the unmodified Xen 4.1.1 binary as the hypervisor and not some modified version with a built-in FBI backdoor for memory inspection. Ok, it's a poor example, because the Xen architecture (like that of any other commercially used VMM) allows the administrator who controls Dom0 (or equivalent) to inspect and modify essentially all the memory in the system, including that belonging to other VMs, and no special backdoors in the hypervisor are needed for this.

But let's assume hypothetically that Xen 5.0 changed that architecture, so that Dom0 would no longer be able to access any other VM's memory. If we additionally assumed that the Xen hypervisor was secure, so that it was not possible to exploit any flaw in the hypervisor, then we should be fine. Of course, this also assumes that there are no flaws in the TXT implementation, and that the SMM is properly sandboxed, or that we trust (some parts of) the BIOS (these are really complex problems to solve in practice, but I know there is some work going on in this area, so there is some hope).

Such a TXT-based solution, although a step forward, still requires us to trust the cloud provider a bit... First, TXT doesn't protect against bus-level physical attacks – think of an attacker who replaces the DRAM dies with some kind of DRAM emulator – a device that looks like DRAM to the host, but on the other end allows full inspection/modification of its contents (well, ok, this is still a bit tricky, because of the lack of synchronization, but doable).

Additionally, for Remote Attestation to make any sense, we must somehow know that we “talk to” a real TPM, and not to some software-emulated TPM. The idea here is that only a “real” TPM would have access to a private key, called the Endorsement Key, used for signing during the Remote Attestation procedure (or used during the generation of the AIK key, which can be used alternatively for Remote Attestation). But then again, who generates (and so: owns) the private endorsement keys? Well, the TPM manufacturer, which can be... some Asian company that we do not necessarily want to trust that much...

Now we see it would really be advantageous for customers if Intel decided to return to the practice of implementing the TPM internally inside the chipset, as they did in the past for their Series 4 chipsets (e.g. Q45). This would also protect against LPC bus-level attacks against the TPM (although somebody told me recently that the TPM in current systems cannot be so easily attacked from the LPC bus, because of some authentication protocol being used there – I really don't know, as physical attacks have not been an area we ever looked at extensively; any comments on that?).

But then again, the problem of DRAM content sniffing always remains, although I would consider this to be a complex and expensive attack. So, it seems to me that most governments would be able to bypass such TXT-ensured guarantees in order to “tap” the user's programs executing at cloud providers that operate within their jurisdictions. But at least this could stop malicious companies from starting up fake cloud services with the intent to easily harvest some sensitive data from unsuspecting users.

It seems that the only way to solve the above problem of DRAM sniffing attacks is to add some protection at the processor level. We can imagine two solutions that processor vendors could implement:

First, they could opt for adding an in-processor hardware mechanism for encrypting all the data that leaves the processor, to ensure that everything that is kept in the DRAM is encrypted (and, of course, also integrity-protected) with some private key that never leaves the processor. This could be seen as an extension to Intel TXT.

This would mean, however, that we would still need to rely on: 1) the hypervisor to not contain bugs, 2) the whole VMM architecture to properly protect VM's memory, specifically against the Dom0, 3) Intel TXT to not be buggy either, 4) SMM being properly sandboxed, or alternatively to trust (some parts of) the BIOS and SMI handler, 5) TPM's EK key to be non-compromised and verifiable as genuine, and 6) TPM bus attacks made impossible (those two could be achieved by moving the TPM back onto the chipset, as mentioned above), and finally, 7) on the encryption key used by the processor for data encryption to be safely kept in the processor.

That's still quite a lot of things to trust, and it requires quite a lot of work to make it practically really secure...

The other option is a bit more crazy, but also more powerful. The idea is that the processor might allow the creation of untrusted supervisors (or hypervisors). Bringing this down to x86 nomenclature, it would mean that kernel mode (or VT-x root) code could not sniff or inject code into the (crypto-protected) memory of usermode processes (or VT-x guests). This idea is not as crazy as you might think, and there has even been some academic work done in this area. Of course, there are many catches here, as this would require specifically written and designed applications. And if we ever considered using this technology also for client systems (how nice it would be if we could just get rid of some 200-300 kLOC of the Xen hypervisor from the TCB in Qubes OS!), the challenges would be even bigger, mostly relating to safe and secure trusted output (screen) and, especially, input (keyboard, mouse).

If this worked out, then we would need to trust just one element: the processor. But we need to trust it anyway. Of course, we also need to trust some software stack, e.g. the compilers we use at home to build our application, and the libraries it uses, but that's a somewhat unrelated issue. What is important is that we would now be able to choose that (important) software stack ourselves, and not care about all the other software used by the cloud provider.

As I wrote above, the processor is the final element we always need to trust. In practice this comes down to also trusting the US government :) But we might imagine users consciously choosing e.g. China-based or Russia-based cloud providers and requiring (cryptographically) that their hosted programs run on US-made processors. I guess this could provide reasonable politically-based safety. And there is also ARM, with its licensable processor cores, where, I can imagine, the licensee (e.g. an EU state) would be able to put in their own private key, not known to any other government (here I assume the licensee also audits the processor RTL for any signs of backdoors). I'm not sure if it would be possible to hide such a private key from a foundry in Hong Kong, or somewhere, but luckily there are also some foundries within the EU.

In any case, it seems like we could make our cloud computing orders of magnitude safer and more secure than it is now. Let's see whether the industry will follow this path...

Tuesday, December 06, 2011

Exploring new lands on Intel CPUs (SINIT code execution hijacking)

Today we're releasing a new paper where we describe exploiting a bug in Intel SINIT authenticated code module that allows for arbitrary code execution in what we call an “SINIT mode”. So, to the already pretty-well explored “lands” on Intel processors, that include ring 3 (usermode), ring 0 (kernelmode), ring “-1” (VT-x root), and ring “-2” (SMM), we're now adding a new “island”, the SINIT mode, a previously unexplored territory inhabited so far only by the Intel-blessed opcodes.

What is really interesting about the attack are the consequences of SINIT mode hijacking, which include the ability to bypass Intel TXT and LCP, and also to compromise system SMRAM.

It's also interesting how difficult this vulnerability was for Intel to patch, as they had to release not only updated SINIT modules, but also updated microcode for all the affected processors, and also work with the BIOS vendors so that they release updated BIOSes that would unconditionally load this updated microcode (plus provide anti-rollback mechanisms for both the BIOS and microcode). Quite an undertaking...

You can get the paper here.

Intel also published an advisory yesterday, which can be downloaded from their website here. The advisory is peculiar in a few ways, however...

First, the advisory (I'm referring to revision 1.0) never explicitly mentions that the attack allows bypassing the TXT launch itself, only that the attack “may compromise certain SINIT ACM functionality, including launch control policy and additionally lead to compromise of System Management Mode (SMM)”. Intel also recommends disabling TXT altogether in the BIOS, as a preventive measure, in case the user is not “actively running Intel® TXT”... This reminds me of how various vendors started actively disabling Intel VT-x after certain virtualization rootkits were demonstrated some 5 years ago, and how many laptops still ship with this technology disabled today (or VT-d at least), to the questionable delight of many users.

Second, the advisory assigns only an “Important” rating to this vulnerability, even though another Intel advisory, published some two years ago for a problem also reported by us, and which was strictly a subset of the current vulnerability in terms of the powers that it gave to the attacker (in other words, the current vulnerability provides the attacker with everything that the previous one did, plus much more), was given a “Critical” rating... This is called evolution, I guess, and I wonder what would be considered critical by Intel these days?

UPDATE (Dec 7th, 2011): Intel has just released an updated advisory (release 1.1) that now explicitly states that the vulnerability also bypasses Intel TXT.

This is the last paper co-authored with Rafal Wojtczuk, who recently decided to try some new things and to leave ITL. Rafal has been the most talented exploit writer I have worked with, and I will surely miss his ingenious insights, such as e.g. how to practically win an absolutely hopeless race condition with ICMP-delivered MSI! But then again, how many times can one break Intel technologies, before getting bored? At the same time ITL is really transforming now into a development company, with all our efforts around Qubes and architecting, rather than on breaking. I wish Rafal all the best with his new endeavors, and thank him for all the excellent contributions he made while working for ITL over the past 3+ years.

Wednesday, September 28, 2011

Playing with Qubes Networking for Fun and Profit

Today, I would like to showcase some of the cool things that one can do with the Qubes networking infrastructure, specifically with all the new features that have been brought by the just released Qubes Beta 2. This will cover the use of multiple Net VMs for creating isolated networks, the use of a Proxy VM for creating a transparent Tor Proxy VM, as well as a demonstration of how to use a Standalone VM with manually assigned devices to create a “WiFi pen-testing” VM, which surely represents the “for fun” aspect of this post.

Qubes Networking Intro

From the networking point of view there are three types of VMs in Qubes:

  • Net VMs, that have networking devices assigned to them, such as e.g. a WiFi or Ethernet card. Each Net VM contains a Xen network backend that is used to provide networking to all VMs that are connected to this Net VM.
  • Regular VMs (AppVMs) that use the networking provided by Net VMs (so they have Xen network frontends that provide virtual interfaces that are backed by the backend in the corresponding Net VM).
  • Proxy VMs that combine both of the above: to Net VMs they look like regular AppVMs, because they are consumers of the networking they provide, but to other AppVMs they act as if they were Net VMs themselves, allowing other VMs to connect to them. Of course the Proxy VMs do not have directly assigned networking devices – they use the networking provided by the Net VM that they connect to. One can chain many Proxy VMs, as we will see below.

The virtual interfaces in client VMs are called ethX, and are provided by the xen_netfront kernel module, and the corresponding interfaces in the Net/Proxy VM are called vifX.Y and are created by the xen_netback module.
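As a small illustration of the naming scheme, the sketch below splits a backend interface name into its two numbers. It assumes the usual Xen convention, which the text above does not spell out, that X in vifX.Y is the domain ID of the connected VM and Y is the index of its virtual NIC. The function name parse_vif is made up for this example.

```shell
#!/bin/sh
# Split a Xen backend interface name of the form vifX.Y into its two numbers.
# Assumption (usual Xen convention, not stated in the post): X is the domain
# ID of the connected VM and Y is the index of that VM's virtual NIC.
parse_vif() {
    name=$1
    case "$name" in
        vif[0-9]*.[0-9]*)
            rest=${name#vif}      # strip the "vif" prefix -> "X.Y"
            domid=${rest%%.*}     # everything before the first dot
            devid=${rest#*.}      # everything after the first dot
            echo "domid=$domid devid=$devid"
            ;;
        *)
            echo "not a vif interface: $name" >&2
            return 1
            ;;
    esac
}

parse_vif vif12.0    # prints: domid=12 devid=0
```

The pattern check is deliberately loose; it is only meant to tell vifX.Y names apart from names such as eth0.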

Each Net and Proxy VM implements NAT, specifically masquerading, for all the connected VMs. In addition to this SNAT, each Net or Proxy VM also provides DNAT redirection for DNS resolution, so that each VM behind a Proxy or Net VM thinks that it uses a DNS server in the Net/Proxy VM, but in fact all the DNS requests are DNAT-ed by all the Proxy and Net VMs down to the original DNS server that is provided to the final Net VM. This smart trick allows us to avoid running a DNS caching server in Proxy/Net VMs.
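To make the SNAT plus DNAT combination more concrete, here is a rough sketch of what such rules could look like. These are not the rules Qubes actually installs; the function name, the uplink interface name and the DNS address are all assumptions for illustration. The function only prints the iptables commands, so it is safe to run anywhere.

```shell
#!/bin/sh
# Print (not apply) iptables rules that approximate the behaviour described
# above. This is an illustrative sketch, NOT the actual Qubes rule set.
# Arguments: uplink interface, address of the upstream DNS server
# (both values below are hypothetical).
print_nat_rules() {
    uplink=$1
    upstream_dns=$2
    # SNAT: hide all connected VMs behind this VM's own address
    echo "iptables -t nat -A POSTROUTING -o $uplink -j MASQUERADE"
    # DNAT: DNS queries arriving on any vif interface are redirected
    # to the upstream DNS server
    for proto in udp tcp; do
        echo "iptables -t nat -A PREROUTING -i vif+ -p $proto --dport 53 -j DNAT --to-destination $upstream_dns"
    done
}

print_nat_rules eth0 10.137.1.1
```

Because every Net or Proxy VM in a chain applies the same kind of redirection, a query is handed down hop by hop until it reaches the real resolver.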

Also, any VM-to-VM traffic, among the VMs connected to the same Net/Proxy VM is blocked by default.

Additionally, each Proxy VM enforces system-wide firewalling rules, specifically the rules for all the directly connected VMs. Those firewalling rules are centrally managed in Dom0 and exposed to each Proxy VM through Xen store. One useful application of this firewalling mechanism is to limit certain VMs to only specific types of white-listed traffic, to minimize the likelihood of user mistakes. A good example could be a work VM that might be limited to network connectivity only with select corporate servers and denied all other traffic. This way, when the user receives an email message with an embedded http link (possibly leading to a malicious website) and accidentally clicks on it, nothing bad happens.
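The white-listing idea for a work VM can be sketched with plain iptables rules as follows. This is only an illustration of the policy, not the format Qubes uses in Xen store or the rules its firewall actually generates; the function name and all addresses are hypothetical. As before, the rules are printed rather than applied.

```shell
#!/bin/sh
# Print (not apply) FORWARD rules that let one VM talk only to a list of
# white-listed servers. Illustrative sketch; not the actual Qubes rules.
# Arguments: VM address, followed by one or more allowed server addresses.
print_whitelist_rules() {
    vm_ip=$1
    shift
    # Let replies to already established connections back in
    echo "iptables -A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT"
    # Allow new connections only to the white-listed servers
    for server in "$@"; do
        echo "iptables -A FORWARD -s $vm_ip -d $server -j ACCEPT"
    done
    # Everything else originating from this VM is dropped
    echo "iptables -A FORWARD -s $vm_ip -j DROP"
}

print_whitelist_rules 10.137.2.5 192.0.2.10 192.0.2.11
```

With such a policy, a click on a link pointing anywhere outside the white-list simply results in a connection that never leaves the Proxy VM.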

The current infrastructure doesn't support IPv6 routing, but we will likely add this support in the upcoming Beta 3.

The default networking topology in Qubes OS

When you proceed with the default installation of Qubes Beta 2, your initial networking topology looks like the one in the diagram below:
The default network configuration in Qubes.
So, by default there is one Net VM, called 'netvm', that is automatically assigned all the networking devices in the system. There is also one Proxy VM, called 'firewallvm' that is directly connected to the default Net VM, and which provides networking to all other VMs in the system. This Proxy VM is used for firewall rules enforcement. Each such service VM consumes 200MB of RAM by default.

Network-isolated VMs

For some VMs it might be desirable to completely disconnect them from any kind of networking access. This can easily be done using the following command (issued from Dom0's konsole), where VMNAME stands for the name of the VM:

[dom0]$ qvm-prefs -s VMNAME netvm none

For example I have a 'vault' VM that I use for keeping my master PGP keys, and other secrets, and this machine is not connected to any network.

Using multiple Net VMs for physically isolated networks 

In some scenarios the machine might be connected to two or more physically separate networks (e.g. safe corporate intranet, reachable via ethernet cable on the user's desk, and the unsafe and evil Internet, reachable via WiFi card).

It is easy to use more than one Net VM in Qubes, assign different networking devices to different Net VMs, and also decide which VMs are connected to which Net VMs. The diagram below presents an example of such a setup:
A simple setup with two isolated networks, and one fully isolated domain ('vault').
 
We could create such a setup using the following commands (issued in Dom0):

[dom0]$ qvm-create netvm1 --net --label red
[dom0]$ qvm-create netvm2 --net --label yellow

Currently qvm-create when used with the --net option automatically assigns all networking devices to the just created VM, so in the example above you would want to remove extra devices from each Net VM using qvm-pci -d, leaving only those you really want, e.g.: 

[dom0]$ qvm-pci -l netvm1 # to get a list of currently assigned devices

[dom0]$ qvm-pci -d netvm1 02:00.0

Now we should create the Firewall VMs:

[dom0]$ qvm-create firewallvm1 --proxy --label green
[dom0]$ qvm-create firewallvm2 --proxy --label green

... and connect them to proper Net VMs:

[dom0]$ qvm-prefs -s firewallvm1 netvm netvm1
[dom0]$ qvm-prefs -s firewallvm2 netvm netvm2

And now, for any other VM, just set the appropriate Net VM (either firewallvm1 or firewallvm2, or 'none') to get it assigned to either of the isolated networks, e.g.:

[dom0]$ qvm-prefs -s banking netvm firewallvm1
[dom0]$ qvm-prefs -s xfiles netvm firewallvm2
[dom0]$ qvm-prefs -s vault netvm none
...

This configuration provides very strong isolation between the VMs belonging to network #1 and the VMs belonging to network #2. Specifically, this becomes significant if we are worried about potential remotely exploitable bugs in the client code of the core TCP/IP stack (in this case the Net VM could potentially compromise all the connected VMs -- but the same problem applies even to physically separated machines that use the same network).
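With more than a handful of VMs it may be handy to keep the VM-to-network mapping in one place. The sketch below reads "vm netvm" pairs and prints the corresponding qvm-prefs commands in the same form as used above; it does not run them. The function name is made up for this example, and the VM names are simply the ones from the listing above.

```shell
#!/bin/sh
# Read "vm netvm" pairs from stdin and print the matching qvm-prefs commands.
# Nothing is executed; review the output and run it in Dom0 if it looks right.
plan_netvm_assignments() {
    while read -r vm netvm; do
        # skip blank lines and comment lines
        case "$vm" in ''|'#'*) continue ;; esac
        echo "qvm-prefs -s $vm netvm $netvm"
    done
}

plan_netvm_assignments <<'EOF'
# vm     net vm
banking  firewallvm1
xfiles   firewallvm2
vault    none
EOF
```

Keeping the mapping as data makes it easy to review which VM ends up on which of the isolated networks before anything is changed.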

Setting up Tor Proxy using a Proxy VM

Let's now play a bit with Proxy VMs and see how we can use one to create a simple Tor proxy VM. Such a VM would provide anonymized networking to all its clients, and so would make it easy to create VMs for anonymous Internet access. The simple setup we would like to prepare is depicted in the figure below:

The 'torvm' Proxy VM provides anonymized networking to 'anon-web' and 'anon-bitcoin' VMs. All the traffic generated by the VMs behind 'torvm' is either fed into the Tor network, or discarded. Furthermore, any app running in those VMs is not able to read any global system identifiers, such as the external IP, external MAC address, etc.

Our Tor proxy would forward only the Tor traffic, so we don't have to worry about Tor-unaware applications, or even intentionally malicious ones, compromising the privacy of our connection. This is because such applications have no way to generate traffic to the outside world without going through our Tor proxy (unless they could exploit a hypothetical vulnerability in the Tor process running in the Tor VM). Also, the applications running in any VM behind the Tor proxy are not able to determine any globally identifiable IDs, such as the user's external IP address, the real MAC address used by real NICs, etc.

Interestingly just after writing the above paragraph, I discovered that one of our xenstore keys had wrong permissions and, as a result, any VM could read it and get to know the actual external IP (the key is used by a Net VM to communicate the external IP configuration to the connected Proxy VMs, so they could know when to update the firewall configuration). The fix for this problem is here, and the update (qubes-core-dom0-1.6.32) is now available for Dom0 (just do qvm-dom0-update to get it installed).

 
So, this represents a rather strong setup for use with Tor. Let's now have a look at how to practically create such a configuration, step by step.

First, let's create the VM that will become our Tor proxy:

[dom0]$ qvm-create torvm --proxy --label green

This will create a Proxy VM named 'torvm', based on the default template. We will need to now start the template VM and install the Tor client there:

[dom0]$ qvm-run -a fedora-14-x64 gnome-terminal

Alternatively, if we didn't trust the Tor client rpm package to be non-malicious, specifically its installation scripts, we could have based this VM on a different template, e.g. one used for less trusted VMs, or we could have installed the Tor client in /usr/local, which is backed by the VM's private storage, but this would require compiling Tor from sources.

Now, in the just started template VM, let's install the Tor client and (optionally) the Vidalia graphical frontend:

[fedora-14-x64]$ sudo yum install tor vidalia

And then power off the template VM. Now, every VM based on this template, started after the template shutdown, will also see the Tor binary in its filesystem.

Let's now configure our torvm to properly start Tor proxying at boot:

[dom0]$ qvm-run -a torvm gnome-terminal

Now, we will create the following script for starting up the Tor transparent proxy and setting up traffic redirection using iptables:

[torvm]$ vim /rw/config/start_tor_proxy.sh

...and now paste the following into this file:
#!/bin/sh
# Restart the Tor transparent proxy and redirect all client traffic into it.
killall tor
QUBES_IP=$(xenstore-read qubes_ip)
TOR_TRANS_PORT=9040

if [ -z "$QUBES_IP" ]; then
    echo "Error getting QUBES IP!"
    echo "Not starting Tor, but setting the traffic redirection anyway to prevent leaks."
    QUBES_IP="127.0.0.1"
else
    /usr/bin/tor \
        --SocksPort 0 \
        --TransListenAddress "$QUBES_IP" --TransPort "$TOR_TRANS_PORT" \
        --DNSListenAddress "$QUBES_IP" --DNSPort 53 \
        --RunAsDaemon 1 --ControlPort 9051 \
        || echo "Error starting Tor!"
fi

# Disable forwarding while the rules are being replaced
echo "0" > /proc/sys/net/ipv4/ip_forward
/sbin/iptables -t nat -F
# Redirect DNS queries and all TCP traffic from client VMs to Tor
/sbin/iptables -t nat -A PREROUTING -i vif+ -p udp --dport 53 -j DNAT --to-destination "$QUBES_IP:53"
/sbin/iptables -t nat -A PREROUTING -i vif+ -p tcp -j DNAT --to-destination "$QUBES_IP:$TOR_TRANS_PORT"
/sbin/iptables -I INPUT 1 -i vif+ -p udp --dport 53 -j ACCEPT
/sbin/iptables -I INPUT 2 -i vif+ -p tcp --dport "$TOR_TRANS_PORT" -j ACCEPT
# Remove any existing forwarding rules
/sbin/iptables -F FORWARD

echo "1" > /proc/sys/net/ipv4/ip_forward

Except for the “QUBES_IP=$(xenstore-read qubes_ip)” line that reads the torvm's IP address, there is nothing Qubes-specific in the above listing. It's just a standard way of setting up transparent Tor proxy.
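Once the script has run, it can be reassuring to check that both redirections are really in place. The following sketch inspects rule listings in the format printed by "iptables -t nat -S PREROUTING" and reports whether the DNS and TCP redirects to the given address are present. The function name and the sample address are assumptions made for this example; the exact listing format may differ slightly between iptables versions.

```shell
#!/bin/sh
# Check a rule listing (read from stdin, in "iptables -S" format) for the two
# DNAT rules that the Tor proxy script installs.
# Arguments: the torvm address and the Tor TransPort.
# Note: dots in the address are treated as regex wildcards, which is good
# enough for a sanity check.
check_tor_redirect() {
    ip=$1
    port=$2
    rules=$(cat)
    printf '%s\n' "$rules" | grep -q -- "-p udp .*--dport 53 .*-j DNAT --to-destination ${ip}:53[[:space:]]*\$" || {
        echo "missing DNS redirect"; return 1; }
    printf '%s\n' "$rules" | grep -q -- "-p tcp .*-j DNAT --to-destination ${ip}:${port}[[:space:]]*\$" || {
        echo "missing TCP redirect"; return 1; }
    echo "ok"
}

# Demo on a sample listing (the address is hypothetical)
check_tor_redirect 10.137.2.1 9040 <<'EOF'
-A PREROUTING -i vif+ -p udp -m udp --dport 53 -j DNAT --to-destination 10.137.2.1:53
-A PREROUTING -i vif+ -p tcp -j DNAT --to-destination 10.137.2.1:9040
EOF
```

In the torvm itself one would feed it the live rules, e.g. sudo iptables -t nat -S PREROUTING | check_tor_redirect "$(xenstore-read qubes_ip)" 9040.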

It is important that this file be located in the /rw directory, as this directory is backed by the VM's private storage and will survive VM reboots. The VM's root file-system is read-only and all the changes to it are lost on VM shutdown (VM gets an illusion of the root fs being writeable thanks to Copy-On-Write mechanism, but the actual COW backing device is cleared upon each VM shutdown).

We should also modify the /rw/config/rc.local script, to ensure that our Tor proxy is automatically started -- just paste the following into this script:
#!/bin/sh

# Uncomment this if you would like to use a custom torrc file:
#rm -f /rw/config/log
#ln -sf /rw/config/torrc /etc/tor/torrc

chkconfig qubes_netwatcher off
chkconfig qubes_firewall off
/rw/config/start_tor_proxy.sh

Finally, we should also provide a script that would restart our proxy in case the user dynamically switched the NetVM, which would result in completely different routing. This can be done by creating a script with the predefined name qubes_ip_change_hook within the /rw/config/ directory:
#!/bin/sh
/rw/config/start_tor_proxy.sh

Make sure that all the scripts are executable (chmod +x). And that's all. Now, shut down the torvm:

[dom0]$ qvm-run --shutdown --wait torvm

From now on, every time you start the torvm (or when Qubes starts it in response to start of some other VM that uses torvm as its Net VM), the Tor transparent proxy should be automatically started.

Let's test this by creating a VM that would be using the just created Tor proxy:

[dom0]$ qvm-create anon-web --label black
[dom0]$ qvm-prefs -s anon-web netvm torvm

Now, every time you start the anon-web VM (e.g. by clicking on the Web browser icon in the anon-web's start menu), Qubes will also ensure that torvm is up and running, and this in turn would configure all the Tor proxying for this VM.

For additional control one might want to use Vidalia, the graphical front end for Tor (this should be installed within the template VM that has been used for torvm). We could easily start Vidalia by just typing:

[dom0]$ qvm-run -a torvm vidalia

We should, however, make sure to disable the "Start the Tor software when vidalia starts" option in Settings/General in Vidalia. Otherwise, Vidalia might kill your original Tor (the one that has the transparent proxy open) and start its own without the transparent proxy enabled.

The web browser runs in the 'anon-web' VM that uses 'torvm' for networking access, and thus all the traffic generated by 'anon-web' is routed through the Tor network, or discarded if it is anything other than TCP or DNS traffic.


Of course one can easily create more VMs that would be using torvm as their Net VM, and so would have anonymized network access. The beauty of this solution is that if one of my anonymized VMs gets compromised, the others are not affected. Plus, there is the already mentioned benefit that no matter whether the apps in those VMs are buggy, or even intentionally malicious, they would not be able to leak the user's external IP address.

Creating a WiFi pen-testing VM

Finally let's have some fun and create a WiFi pen-testing VM. The desired config is depicted below:

Because we would like to use all sorts of l33t h4x0r t00lz pen-testing security software in this VM, it would make sense to create it as a Standalone VM, which means that it would get its own copy of the whole file-system (as opposed to just the home directory, /rw and /usr/local, as is the case with regular Qubes VMs). This would ease the installation of all the extra software we would need there, and also ensure that even if the install/build scripts were malicious, the damage would be contained to this very VM and nothing else. Also, for some reason the standard Linux WiFi stack and drivers still don't support injection on most (all?) of the WiFi cards out of the box, so we would need to patch the actual kernel drivers -- yet another reason to use a Standalone VM in this case.

So, let's create the VM first, and assign a WiFi card to it:

[dom0]$ qvm-create wififun --standalone --label yellow
[dom0]$ qvm-prefs -s wififun memory 800 # ensure at least this mem at startup
[dom0]$ qvm-prefs -s wififun kernel none # use own copy of kernel and modules
[dom0]$ qvm-pci -a wififun BDF # BDF = address of the WiFi card to assign

You can easily find the BDF address of any device using the lspci command in Dom0 -- this would be something like e.g. “02:00.0”. You should make sure that this WiFi card is not used by any other VM, specifically by your default Net VM (called 'netvm' in a standard Qubes installation). Ideally you could just use a dedicated Express Card-based WiFi card, leaving the built in WiFi assigned to your default Net VM.
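If the lspci listing is long, a small filter can narrow it down to the candidate devices. The sketch below prints the BDF addresses of all lines that lspci labels as a network or Ethernet controller. The function name is made up, and the sample lines are only illustrative; the exact device descriptions will differ on your machine.

```shell
#!/bin/sh
# Read lspci output from stdin and print the BDF address (first column)
# of every network or Ethernet controller.
list_network_bdfs() {
    awk '/Network controller|Ethernet controller/ { print $1 }'
}

# Demo on sample lspci-style lines (device names are illustrative only)
list_network_bdfs <<'EOF'
00:19.0 Ethernet controller: Intel Corporation 82577LM Gigabit Network Connection (rev 06)
00:1f.2 SATA controller: Intel Corporation SATA AHCI Controller (rev 06)
02:00.0 Network controller: Intel Corporation Centrino Advanced-N 6200 (rev 35)
EOF
```

In Dom0 this would be used as lspci | list_network_bdfs, after which qvm-pci -l netvm shows which of those devices are already assigned to the default Net VM.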

Because it's a Standalone VM, Qubes will make a copy of the whole root filesystem, and thus it would eat about 5GB of your disk (normal VMs would take only as much space as their private fs takes up).

Let's now start the VM...

[dom0]$ qvm-run -a wififun gnome-terminal

... and then install the prerequisite software there, starting with downloading the reasonably new compat-wireless sources, together with the required injection patches, and then building and installing the new kernel modules. All actions below are now executed within the VM. This stuff here is really nothing Qubes- or Xen-specific -- one would do more or less the same on any Linux in order to get injection working (so, treat this as a free bonus WiFi hacking tutorial on Linux).

[wififun]$ wget http://linuxwireless.org/download/compat-wireless-2.6/compat-wireless-2011-07-14.tar.bz2

[wififun]$ wget http://patches.aircrack-ng.org/channel-negative-one-maxim.patch
[wififun]$ wget http://patches.aircrack-ng.org/mac80211-2.6.29-fix-tx-ctl-no-ack-retry-count.patch
[wififun]$ wget http://patches.aircrack-ng.org/mac80211.compat08082009.wl_frag+ack_v1.patch

[wififun]$ sudo yum install kernel-devel patch gcc

[wififun]$ tar xjf compat-wireless-2011-07-14.tar.bz2
[wififun]$ cd compat-wireless-2011-07-14
[wififun]$ patch -p1 < ../channel-negative-one-maxim.patch
[wififun]$ patch -p1 < ../mac80211-2.6.29-fix-tx-ctl-no-ack-retry-count.patch
[wififun]$ patch -p1 < ../mac80211.compat08082009.wl_frag+ack_v1.patch

[wififun]$ make
[wififun]$ sudo make unload
[wififun]$ sudo make install

Now, let's reboot the VM to ensure that all the patched drivers will get properly loaded on each VM boot:

[dom0]$ qvm-run --shutdown --wait wififun
[dom0]$ qvm-run -a wififun gnome-terminal

Let's first see if the WiFi driver got properly loaded and if the interface has been created (look for wlanX interface):

[wififun]$ ifconfig -a

If yes, then proceed with the steps below (if not, then have a look into dmesg and see what the problem was):

[wififun]$ sudo bash
[wififun]# yum install aircrack-ng dnsmasq
[wififun]# airmon-ng start wlan0
[wififun]# iptables -F INPUT
[wififun]# iptables -F FORWARD
[wififun]# echo "1" > /proc/sys/net/ipv4/ip_forward

Note that you don't need to add any explicit masquerading rules, as they are applied by default on Qubes VMs (you can take a look at the nat table in the VM if you want to see by yourself).

Edit the /etc/dnsmasq.conf, so that it contains at least the following:

interface=at0
dhcp-range=192.168.0.50,192.168.0.150,12h

and then start the dnsmasq daemon -- we will use it for providing DHCP to our fake AP (the at0 interface will be created by airbase-ng and emulates the “uplink” of a traditional AP):

[wififun]# /etc/init.d/dnsmasq start

And finally the fake AP:

[wififun]# airbase-ng -e free_wifi mon0

and on another console (before any client connects, but after airbase-ng got started), configure the at0 interface (make sure it matches what you wrote into dnsmasq.conf):

[wififun]# ifconfig at0 192.168.0.1 up

(you can also add a udev rule to do that automatically).
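Since a mismatch between the at0 address and the DHCP range is an easy mistake to make, here is a small sanity check. It assumes a /24 network, as in the example configuration above, and simply compares the first three octets of the address with those of the start of the dhcp-range. The function name is made up for this example.

```shell
#!/bin/sh
# Succeed if the given address lies in the same /24 as the start of the first
# dhcp-range= line of a dnsmasq configuration read from stdin.
# Assumption: a /24 netmask, as in the example configuration above.
# Returns 2 if no dhcp-range= line is found.
same_subnet_as_dhcp_range() {
    addr=$1
    range_start=$(sed -n 's/^dhcp-range=\([0-9.]*\),.*/\1/p' | head -n 1)
    [ -n "$range_start" ] || return 2
    # compare everything up to the last dot, i.e. the first three octets
    [ "${addr%.*}" = "${range_start%.*}" ]
}

if same_subnet_as_dhcp_range 192.168.0.1 <<'EOF'
interface=at0
dhcp-range=192.168.0.50,192.168.0.150,12h
EOF
then
    echo "at0 address matches dhcp-range"
else
    echo "at0 address does not match dhcp-range"
fi
```

Against the real file this would be called as same_subnet_as_dhcp_range 192.168.0.1 < /etc/dnsmasq.conf.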

and just to verify it really is working:

[wififun]# tcpdump -i at0

... and now, just wait for a client to connect to your AP. What you do next is only limited by your imagination... But hey, this article is about Qubes networking and not about 0wning client systems ;)

Here's an innocent example using Moxie's sslstrip (amazing this attack still works so well at the end of 2011...):

My 'wififun' VM in action using a simple sslstrip attack, which surprisingly still works pretty nicely...

Please note that as your wififun VM is a regular Qubes VM, it is automatically connected to the default Net VM, which in turn provides networking to it. That's why it is so easy to create a fully functioning fake AP.

When using custom driver domains, there are currently some catches you should be aware of:

Catch #1: When you start a driver domain late after system boot, i.e. after some days of uptime and extensive use of VMs, Xen might not be able to allocate enough contiguous (in terms of MFNs) memory for the driver domain. PV driver domains, unlike normal domains or HVM driver domains, require MFN-contiguous memory for their DMA buffers (HVM domains do not need that, because the IOMMU can create an illusion of it; even though the IOMMU is also used for PV driver domains, for protection, it doesn't actively translate bus addresses into GMFNs).

This is usually not a big problem in practice, because in most cases all the driver domains are started early at system boot, when there is still plenty of non-fragmented memory available. However, it might become a problem when one wishes to start e.g. the WiFi pen-testing VM at some later time. The workaround is to close as many VMs as possible before starting such a driver domain, and then also to reduce, for a moment, the amount of memory assigned to Dom0:

[dom0]$ xm mem-set 0 1600m

and then starting the driver domain should be fine. Now we can start all other domains, and that should no longer be problematic for the already running driver domain.
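To see why the total amount of free memory is not the whole story in Catch #1, here is a toy model in Python. It is only an illustration of fragmentation, not of how Xen's allocator actually works; the frame lists and the function name are made up for this example.

```python
def longest_free_run(frames):
    """Length of the longest run of consecutive free (True) frames."""
    best = current = 0
    for is_free in frames:
        current = current + 1 if is_free else 0
        best = max(best, current)
    return best

# Shortly after boot: one large free region behind a few allocated frames.
fresh = [False] * 4 + [True] * 12
# After long uptime: the same number of free frames, but scattered.
fragmented = [False, True, True, True] * 4

for name, frames in (("fresh", fresh), ("fragmented", fragmented)):
    print(name, "free frames:", sum(frames),
          "largest contiguous run:", longest_free_run(frames))
```

Both lists have 12 free frames, but only the first one could satisfy a request for, say, 8 contiguous frames. Closing VMs and shrinking Dom0 for a moment is a way of getting back a large enough contiguous run.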

Catch #2: Some network cards, notably Express Cards, might not work well with the 3.0.4 pvops kernel that we use in all VMs by default. In that case you might want to try to use the 2.6.38.3 xenlinux kernel in your WiFi fun VM -- to do that, follow these steps:

[dom0]$ sudo qvm-dom0-update kernel-qubes-vm-2.6.38.3-10.xenlinux.qubes
[dom0]$ cp /var/lib/qubes/vm-kernels/2.6.38.3/* /var/lib/qubes/appvms/wififun/kernels/
[dom0]$ qvm-prefs wififun -s kernelopts "swiotlb=force"

And then, in the VM:

[wififun]$ sudo yum install kernel-devel-2.6.38.3-10.xenlinux.qubes

Then rebuild compat-wireless, unload and install the modules, and load the drivers again.

Summary

As you can see, Qubes Beta 2 now offers a very advanced networking infrastructure that allows more advanced users to create very sophisticated configurations, allowing for pretty good isolation between various domains and networks. Qubes leaves it up to the user (or admin) to figure out what would be the best configuration -- most users would be happy with the default simple setup with just one Net VM and one Firewall VM, while others would go for much more advanced setups.

A bit more advanced networking setup. The usbvm has a 3G modem assigned, and it is possible to dynamically switch between the Net VMs without restarting any other VMs.

Monday, September 19, 2011

Qubes Beta 2 Released!

I'm proud to announce that we have just released Qubes Beta 2! You can view installation instructions and download the ISO here.

We faced quite a few serious problems with this release that were caused by the upgrade to Xen 4.1 (from the Xen 3.4 that we used in Beta 1). But finally we managed to solve all those problems and all in all I'm very happy with this release. It includes many performance optimizations compared to Beta 1 (CPU- and memory-wise) and also many bugfixes.

We also introduced a couple of new features:
  • Generic mechanism for inter-domain services with a centralized policy enforcement (more)
  • Network-less update mechanism for Dom0 (more)
  • VM management improvements: easy device assignment for driver domains, dynamic netvm switching, flexible VM kernel configuration, etc (see the new qvm-prefs utility)
  • Easy management of appmenus (shortcuts in the Start Menu)
  • Update to Xen 4.1 that offers, among other things, better VT-d support and more lightweight management stack (we have ported Qubes to use the new xl now, instead of the slow and heavy xend), and also to 2.6.38-xenlinux kernel for Dom0, and to 3.0.4 pvops kernel for VMs (better hardware compatibility, better power management)

I will write some more posts shortly that will present in detail some of the new features and what cool things one could do with them.

We have also created a dedicated wiki page that enumerates all the security-critical code for Qubes OS. We hope this page would be useful for security researchers that might attempt to find weaknesses in Qubes OS either in our code or in the 3rd party code that we rely on (Xen hypervisor, select Xen backends). Whether your motives are noble (gaining immortal fame, helping create a secure client OS), or not (proving ITL wrong), we would appreciate your efforts! And you might even get a job at ITL.

Speaking of which, I'm happy to announce that Marek Marczykowski, who has effectively become the key Qubes developer over the past few months, has now officially joined ITL :)

Wednesday, September 07, 2011

Anti Evil Maid


Anti Evil Maid is an implementation of a TPM-based static trusted boot whose primary goal is to prevent Evil Maid attacks.

The adjective trusted, in trusted boot, means that the goal of the mechanism is to somehow attest to a user that only desired (trusted) components have been loaded and executed during the system boot. It's a common mistake to confuse it with what is sometimes called secure boot, whose purpose is to prevent any unauthorized component from executing. Secure boot is problematic to implement in practice, because there must be a way to tell which components are authorized for execution. This might be done using digital signatures and some kind of CA infrastructure, but this gets us into problems such as who should run the CA, what should be the policy for issuing certificates, etc.

The adjective static means that the whole chain of trust is anchored in a special piece of code that executes before all other code on the platform, is kept in non-reflashable memory, and whose sole purpose is to make the initial measurement of the next component that is going to be executed, which is the BIOS code. This special code, also known as the Core Root of Trust for Measurement (CRTM), might be part of the BIOS (but kept in a special read-only memory), or might be implemented by some other entity that executes before the BIOS reset vector, such as Intel ME or even the processor microcode. Once measured, the BIOS code is executed, and it is now its turn to measure the platform configuration, Option ROM code, and MBR. Then the loader (stored in the MBR), such as Trusted GRUB, takes over and measures its own next stages (other than the MBR sector), and the hypervisor, kernel, and initramfs images that are to be loaded, together with their configuration (e.g. kernel arguments).
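The "measure, then execute the next stage" chain described above can be sketched in a few lines of Python. The extend operation shown (new PCR = SHA1 of the old PCR concatenated with the SHA1 of the measured component) is how TPM 1.2 PCRs are updated; everything else is a simplification. In particular, the component names are made-up placeholders, and a real platform spreads these measurements over several PCRs rather than folding them all into one register as done here.

```python
import hashlib

def extend(pcr, component):
    """TPM 1.2 style extend: new PCR = SHA1(old PCR || SHA1(component))."""
    measurement = hashlib.sha1(component).digest()
    return hashlib.sha1(pcr + measurement).digest()

def measure_chain(components):
    """Fold the measurements of all boot stages into one PCR value."""
    pcr = bytes(20)  # PCRs start out as all zeroes after a platform reset
    for component in components:
        pcr = extend(pcr, component)
    return pcr

# Placeholder byte strings standing in for the real images.
good_boot = [b"BIOS", b"MBR loader", b"hypervisor", b"kernel", b"initramfs"]
evil_boot = [b"BIOS", b"MBR loader + Evil Maid", b"hypervisor", b"kernel", b"initramfs"]

print(measure_chain(good_boot).hex())
print(measure_chain(evil_boot).hex())
```

The two printed values differ, and because each extend hashes in the previous PCR value, a later stage cannot "undo" the measurement of a modified earlier stage.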

As explained above, trusted boot can only retrospectively tell the user whether correct (trusted) software has booted or not, but cannot prevent any software from executing. But how can it communicate anything reliably to the user, if it might have just been compromised? This is possible thanks to the TPM unseal operation that releases secrets to software only if correct software has booted (as indicated by correct hashes in select PCR registers).

So the idea is that if a user can see the correct secret message (or perhaps a photo) being displayed on the screen, then it means that correct software must have booted, or otherwise the TPM would not release (unseal) the secret. Of course we assume the adversary had no other way to sniff this secret and couldn't simply hardcode it into the Evil Maid – more on this later.
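Here is a toy Python model of that idea. It is not a real TPM interface: the ToyTPM class, its method names and the boot helper are all invented for this sketch, and a real TPM protects the sealed blob cryptographically instead of keeping it in a Python list. The only point is to show that the secret comes back only when the PCR value matches the one recorded at seal time.

```python
import hashlib

class ToyTPM:
    """Toy model of PCR-bound sealing; not a real TPM interface."""

    def __init__(self):
        self.pcr = bytes(20)
        self._sealed = []  # stands in for storage protected by the chip

    def reset(self):
        self.pcr = bytes(20)  # happens only on a platform reset

    def extend(self, component):
        digest = hashlib.sha1(component).digest()
        self.pcr = hashlib.sha1(self.pcr + digest).digest()

    def seal(self, secret):
        """Bind the secret to the current PCR value; return a handle."""
        self._sealed.append((self.pcr, secret))
        return len(self._sealed) - 1

    def unseal(self, handle):
        """Release the secret only if the PCR matches the sealed value."""
        expected_pcr, secret = self._sealed[handle]
        return secret if self.pcr == expected_pcr else None

def boot(tpm, components):
    tpm.reset()
    for component in components:
        tpm.extend(component)

tpm = ToyTPM()
trusted = [b"BIOS", b"loader", b"kernel"]  # placeholder images

boot(tpm, trusted)
handle = tpm.seal(b"my secret phrase")

boot(tpm, trusted)
print(tpm.unseal(handle))  # b'my secret phrase'

boot(tpm, [b"BIOS", b"evil loader", b"kernel"])
print(tpm.unseal(handle))  # None
```

Note that the model says nothing about an attacker who already knows the secret and simply prints it, which is exactly the hardcoding scenario mentioned above.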

Another way to look at it is to realize that Anti Evil Maid is all about authenticating the machine to the user, as opposed to the usual case of authenticating the user to the machine/OS (login and password, decryption key, token, etc). We proceed with booting the machine and entering sensitive information only after we gain confidence that it is still our trusted machine and not some compromised one.

Installing Anti Evil Maid

Anti Evil Maid should work for any Linux system that uses dracut/initramfs, which includes Qubes, Fedora and probably many other distros. You can find the Anti Evil Maid source code in a git repository here. You can also download a tarball with sources and prebuilt rpm packages from here (they all should be signed with the Qubes signing key). Qubes Beta 2, which is coming soon, will have those RPMs already pre-installed.

To install Anti Evil Maid, follow the instructions in the README file.

Some Practical considerations

If you decided to use no password for your TPM SRK key (so, you passed '-z' to tpm_takeownership, see the README), then you should definitely install Anti Evil Maid on a removable USB stick. Otherwise, if you installed it on your disk boot partition, the attacker would be able to just boot your computer and note down the secret passphrase that will be displayed on the screen. Then the attacker can compromise your BIOS/MBR/kernel images however she likes, and just hardcode the secret passphrase to make it look as if your system was fine.

If you decided to use a custom TPM SRK password (so, you did not pass '-z' to tpm_takeownership), then you can install Anti Evil Maid onto your regular boot partition. The attacker would not be able to see your secret passphrase without knowing the SRK password. Now, the attacker can try another Evil Maid attack to steal this password, but this attack is easy to spot and prevent (see the discussion in the next section).

However, there is still a good argument for installing Anti Evil Maid on a separate USB stick rather than on your built-in disk boot partition. This is because you can use Anti Evil Maid as a provider of a keyfile for your LUKS disk encryption (as an additional file unsealable by the TPM). This way you could also stop an adversary who is able to sniff your keystrokes (e.g. using a hidden camera, or an electromagnetic leak) and capture your disk decryption passphrase (see the discussion in the next section).

In any case it probably would be a good idea to make a backup stick that you might want to use in case you lose or somehow damage your primary stick. In that case you should have a way to figure out if your system has been compromised in the meantime or not. Use another stick, with another passphrase, and keep it in a vault for this occasion.

Finally, be aware that, depending on which PCRs you decided to seal your secrets to, you might be unable to see the secret even after you changed some minor thing in your BIOS config, such as e.g. the order of boot devices. Every time you change something in your system that affects the boot process, you would need to reseal your secrets to new PCR values as described in the installation instructions.

Attacks prevented by Anti Evil Maid

The classic Evil Maid attack is fully prevented.

If the attacker is able to steal your Anti Evil Maid stick, and the attacker gets access to your computer, then the attacker would be able to learn your secret passphrase by just booting from the stolen stick. This is not fatal, because the user should get alarmed seeing that the stick has been stolen, use the backup stick to verify the system (with a different secret message, of course), and later create a new stick for everyday use with a new secret message.

A variation of the above attack is when the attacker silently copies the content of the stick, so that the user cannot realize that someone got access to the stick. The attacker then uses the copied stick to boot the user's computer and this way can learn the secret passphrase. Now, the attacker can infect the computer with Evil Maid, and can also bypass Anti Evil Maid verification by just hardcoding the secret message into Evil Maid. So, even though the TPM would know that incorrect software has booted, and even though it would not unseal the secret, the user would have no way of knowing this (as the secret would still be displayed on screen).

In order to protect against this attack, one might want to use a non-default SRK password – see the installation instructions. Now an extra SRK password would be needed to unseal any secret from the TPM (in addition to PCRs being correct). So the attacker, who doesn't know the SRK password, is now not able to see the secret message and cannot prepare the Evil Maid Attack (doesn't know what secret passphrase to hardcode there).

The attacker might want to perform an additional Evil Maid attack targeted at capturing this SRK password, e.g. by infecting the user's stick. This, however, could be immediately detected by the user, because the user would see that after entering the correct SRK password, there was no correct secret passphrase displayed. The user should then assume the stick got compromised together with the SRK password, and should start the machine from the backup stick, verify that the backup secret is correct, and then create a new AEM stick for daily usage.

If an attacker is able to capture the user's keystrokes (hidden camera, electromagnetic leaks), the attacker doesn't need an Evil Maid attack anymore, and so doesn't need to bother with compromising the system boot. This is because the attacker can just sniff the disk decryption password, then steal the laptop and get full access to all user data.

In order to prevent such a “keystroke sniffing” attack, one can use an additional sealed secret on the Anti Evil Maid stick that would be used as a keyfile for LUKS (in addition to the passphrase). In this case the knowledge of the sniffed LUKS passphrase would not be enough for the attacker to decrypt the disk. This has not been implemented, although it would be a simple modification to the dracut-antievilmaid module. If you decide to use this approach, don't forget to also create a backup passphrase that doesn't need a keyfile, so that you don't lock yourself out of your data in case you lose your stick, or upgrade your BIOS, or something! You have been warned, anyway.

Attacks that are still possible

An adversary who is able both to sniff your keystrokes (hidden camera, electromagnetic leak) and to copy/steal/seize your Anti Evil Maid stick cannot be stopped. If a non-democratic government is your adversary, perhaps because you're a freedom fighter in one of those dark countries, then you likely cannot ignore this type of attack. The only thing you can do, I think, is to use some kind of easy-to-destroy USB stick for keeping Anti Evil Maid. A digestible USB stick, anyone?

Another type of attack that is not addressed by Anti Evil Maid is an attack that works by removing the “gears” from your laptop (the motherboard and disk at the very least), putting there a fake board with a transmitter that connects back to the attacker's system via some radio link and proxies all the keyboard/screen events and USB ports back to the original “gears” that execute now under supervision of the attacker. Another way of thinking about this attack is as if we took the motherboard and disk away, but kept all the cables connecting them with the laptop's keyboard, screen, and other ports, such as USB (yes, very long cables). The attacker then waits until the user boots the machine, passes the machine-to-user authentications (however sophisticated it was), and finally enters the disk decryption key. In practice I wouldn't worry that much about such an attack, but just mentioning it here for completeness.

Finally, if our adversary is able to extract secret keys from the TPM somehow, e.g. using an electron microscope, or via some secret backdoor in the TPM, or alternatively is able to install some hardware device on the motherboard that would perform a TPM reset without resetting the platform, then such an attacker would be able to install an Evil Maid program and avoid its detection by SRTM. Still, this doesn't automatically give access to the user data, as the attacker would need to obtain the decryption key first (e.g. using an Evil Maid attack).

Implementation Specific Attacks

In the discussion above we assumed that the trusted boot has been correctly implemented. This might not be true, especially in case of the BIOS. In that case we would be talking about attacks against a particular implementation of your BIOS (or TrustedGRUB), and not against the Anti Evil Maid approach.

One typical problem might be related to how the CRTM is implemented – if it is kept in regular reflashable BIOS memory, then the attacker who can find a way to reflash the BIOS (which might be trivial in case your BIOS doesn't check digital signatures on updates) would be able to install Evil Maid in the BIOS but pretend that all hashes are correct, because the attacker controls the root of trust.

Another possible implementation problem might be similar to the attack we used some years ago to reflash a secure Intel BIOS (that verified digital signatures on updates) by presenting a malformed input to the BIOS that caused a buffer overflow and allowed arbitrary code to be executed within the BIOS. For such an attack to work, however, the BIOS must not measure the input that is used as the attack vector. I think this was the situation with the logo picture that was used in our attack. Otherwise, even if there was a buffer overflow, the chain of trust would be broken and thus the attack detected. In other words, the possibility of such an attack seems to be rather slim in practice.

What about Intel TXT?

Intel TXT takes an alternative approach to trusted boot. It relies on a Dynamic instead of Static Root of Trust for Measurement (DRTM vs. SRTM), which is implemented by the SENTER instruction and special dynamic PCR registers that can be set to zero only by SENTER. Intel TXT doesn't rely anymore on the BIOS or CRTM. This offers a huge advantage that one doesn't need to trust the BIOS, nor the boot loader, and yet can still perform a trusted boot. Amazing, huh?

Unfortunately, this amazing property doesn't hold in practice. As we demonstrated almost 3 years ago (!), it is not really true that Intel TXT can remove the BIOS from the chain of trust. This is because Intel TXT is prone to attacks through a compromised SMM, and anybody who managed to compromise the BIOS would trivially be able to also compromise the SMM (because it is the BIOS that is supposed to provide the SMI handler).

Thus, if one compares SRTM with Intel TXT, then the conclusion is that Intel TXT cannot be more secure than SRTM. This is because if an attacker can compromise the BIOS, then the attacker can also bypass Intel TXT (via an SMM attack). On the other hand, a BIOS compromise alone doesn't automatically allow bypassing SRTM, as discussed in a paragraph above.

It really is a pity, because otherwise Intel TXT would be just a great technology. Shame on you Intel, really!

Alternative approaches to mitigate Evil Maid Attacks

Various people suggested other methods to prevent Evil Maid attacks, so let's quickly recap and discuss some of them...

The most straightforward approach, suggested by most people, has been to disable booting from external devices in the BIOS, together with locking the BIOS setup with an admin password.

There are two problems with such an approach. First, all the BIOSes have a long history of so called default passwords (AKA maintenance passwords). You don't want to rely on the lack of BIOS default passwords when protecting your sensitive data, do you?

Second, even if your BIOS doesn't have a backdoor (maintenance password), it is still possible to just take your disk away, connect it to another laptop and infect its boot partition.

Another suggested approach has been to keep your boot partition on a separate USB stick. This solution obviously doesn't take into account the fact that the attacker might install Evil Maid into your BIOS. Many consumer laptop BIOSes do not require digital signatures on BIOS firmware updates (my Sony Vaio Z, a rather high-end machine, is among them), making it simple to install Evil Maid there (the most trivial attack is to make the BIOS always boot from the HDD instead of whatever other device the user wanted to boot from).

Finally, some people pointed out that many modern laptops come with SATA disks that offer the ability to “lock” the disk so that it can only be used with a specific SATA controller. Using this, combined with setting your BIOS to only boot from your internal disk, plus locking access to BIOS setup, should provide reasonable protection. This solution, of course, doesn't solve the problem of a potential maintenance password in your BIOS. Also, being skeptical and paranoid as I am, I would not trust this mechanism to be really robust – I would expect it to be fairly simple to unlock the disk so that it could be paired with another, unauthorized controller, and that this probably is a matter of NOP-ing a few instructions in the controller firmware... In fact it seems like you can buy software to unlock this mechanism for some $50... And apparently (and not very surprisingly) some drives seem to continue the 'default passwords' tradition.

FAQ 

Q: Bitlocker implemented this already several years ago, right?
A: No.

Q: But, two-factor authentication can also be used to prevent Evil Maid, right?
A: No.

Q: Does it make any sense to use Anti Evil Maid without a full disk encryption?
A: No.

Q: Are you going to answer 'no' for each question I ask?
A: No.

Q: Why are there no negative indicators (e.g. a big scary warning) when the unseal process fails?
A: The lack of negative indicators is intentional. The user should keep in mind that if somebody compromised their computer, then the attacker would be able to display whatever she wants on the screen, and especially to skip displaying of any warning messages. The only thing the attacker would not be able to display would be the secret message. Thus, it would make no sense to use negative indicators, as they would likely not work in case of a real attack. One solution here would be to use the unsealed secret as a keyfile for disk encryption (as discussed above), which would make it impossible to decrypt the user disk (and so generally proceed with the boot) without successfully unsealing the secret from the TPM.