#########################
Tor Without An IP Address
#########################

What's Going On Here?
=====================

This is an attempt to produce a computing environment which is connected to the
Internet but has no idea how that magic is happening.  (Or, well, as little of
an idea as possible.)  In particular, we aim to connect a Linux userland over
Tor without the machine even *having* an IP address, and without any (easy?)
way to discover the addresses of its physical neighbors.

Concretely, this project aims to mitigate end-node vulnerabilities (*e.g.*
sandbox escapes, information disclosures) by ensuring that as little
information as possible is available to *arbitrary* code running on the
end-node.  It is a variant of the `isolating proxies
<https://trac.torproject.org/projects/tor/wiki/doc/TorifyHOWTO/IsolatingProxy>`_
design documented by Tor; it seeks to further minimize the attack surface
available to a compromised machine.

We're going to use two computers to do this.  One we'll allow access to the
Internet via traditional networking; this *gateway* node will run Tor.  The
other, *client* machine will also run Tor but will reach the Tor network (and
the Internet beyond) *over Tor*, as provided by the gateway, over a
non-traditional link: a serial link carrying a Tor "orconn".

Because there seems to be some confusion, perhaps some clarification is in
order.  This setup differs substantially from SLiRP and other IP-in-X
encapsulation techniques;
there is no symmetric notion of identifier for the client and gateway nodes,
though of course the client node has to use some identifier when it opens
streams, and for that it will continue to use DNS and IP addresses.  (As the
Tor network also has a notion of hidden services, the client node can, at its
discretion, establish one or more cryptographic identities for itself and allow
"inbound" connections over its outbound streams.)

Why Network Without A Network Address?
======================================

Traditionally, one reaches the Internet by symmetry, by being assigned an IP
address. [#nat]_  That seems natural enough, given the goal, but it is not
necessarily what one wants.  IP addresses are (more often than not) more
than sufficient to narrow down one's location.

In addition to its usual mode of being a SOCKS proxy, Tor already has some
support for such whole-network asymmetry with
`transparent proxies
<https://trac.torproject.org/projects/tor/wiki/doc/TransparentProxy>`_ as
well as the `isolating proxies`_ extension mentioned earlier.  However, even
these require that one runs a network stack.

Network stacks are *huge*.  They are vast bodies of bizarre historical relics,
such as Linux's tunables ``/proc/sys/net/ipv4/conf/*/arp_filter`` and
``.../arp_announce``; by default Linux will cheerfully respond to ARP requests
for any address assigned to any interface.  Meaning: if you run your Tor
transparent gateway on your network's border gateway node, the *supposedly
anonymous, funneled-into-Tor* network can either probe for or *get told* your
external IP address.  Not ideal.  [#linuxarp]_

By contrast, the only vulnerabilities that might be faced here are those
within the Tor daemon itself, a much smaller and hopefully more auditable
chunk of code.

Tor Issues Making This Nontrivial
=================================

Unfortunately, we're going to have to work around some apparently *intentional*
information disclosure in Tor and deal with the fact that our use case remains
niche even within the niche of Tor.

* Until https://gitlab.torproject.org/tpo/core/tor/-/issues/9498 is resolved,
  even private (i.e., not published in BridgeDB) Tor bridges will reveal their
  public address(es?) to their clients.  This is probably not seen as an issue
  given the usual way that bridges are meant to be used.

* Until
  https://gitweb.torproject.org/torspec.git/tree/proposals/188-bridge-guards.txt
  is implemented, any bridge client can direct that bridge to attempt to
  connect to any Tor relay (or possibly any IP address?) as a nominal next-hop.
  Therefore, a conspiring client and relay operator can trivially reveal the IP
  address of the bridge the client is using.  This is again probably not
  considered a threat to *client users* because usually the client is not also
  running a bridge; I am sort of surprised it's not fixed anyway, despite the
  bridge enumeration risk.  See also
  https://gitlab.torproject.org/legacy/trac/-/issues/7144 and
  https://gitlab.torproject.org/tpo/core/tor/-/issues/9500 and
  https://gitlab.torproject.org/tpo/core/tor/-/issues/40093 .

And, for I assume good reasons, Tor tries to stop you from exiting back into
the Tor network.  So we're going to have to subvert that.

Therefore, what we're going to do is manually construct a rather ludicrous
path:

* Our gateway will run Tor as an ordinary onion *proxy* rather than bridge.
  (For this demo, the gateway is not, itself, configured to use bridges; it
  could be, but that is an orthogonal concern.)

* We will have the gateway connect a serial port to a public Tor *bridge* via
  Tor, using ``socat``.  That is, the bridge will see us connecting from a Tor
  exit node that we will reach by the usual 3-hop circuit (from the gateway).

* The client node will also run Tor, set to use ``localhost`` as its bridge,
  using the key information of the public bridge.

* We will have ``socat`` on the client respond to those loop-back connections
  and bridge to the serial port.

So, until Tor resolves the above issues, this is going to result in a six-hop
network path when we reach out (ideally it would just be four), nine hops when
the client reaches (or is reached as) a hidden service, or *twelve* if two such
clients connect (yikes).
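
Those hop counts are just addition.  A small sketch, assuming the usual three
relays per circuit: three for the gateway's own circuit, three for the client's
circuit (which starts at the public bridge), and three for the far side of a
hidden-service rendezvous.  The variable names are only for illustration::

  gateway=3   # gateway's circuit, ending at the exit that reaches the bridge
  client=3    # client's circuit: the public bridge plus two more relays
  service=3   # a conventionally connected hidden service's side

  outbound=$((gateway + client))        # client reaching the Internet
  to_service=$((outbound + service))    # client reaching a hidden service
  both=$((outbound + outbound))         # two such clients meeting
  echo "$outbound $to_service $both"    # prints "6 9 12"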

Demo How-To
===========

The ingredients for this demo are

* a Raspberry Pi 1 (the lack of WiFi and BT radios is important, so that
  arbitrary code execution cannot be used to scan the wireless neighborhood)

* an old laptop running Tor

* two FTDI USB UART chips, wired in crossover/null-modem configuration

* the ``socat`` command

* a public Tor bridge.  For this demo, I picked one from
  https://gitweb.torproject.org/tor-browser.git/tree/tor-config/tor_obfs4.conf
  since we're not using the bridge to hide our identity and we don't depend
  on local network reachability.

* Tor itself, of course

Serial Link
-----------

You'll want a UART interface that can achieve relatively high baud rates and
understands flow control.  This is a taller order than might be imagined!  I've
had good luck with FTDI FT232RL chips and bad luck with the cheaper Silicon
Labs CP210x [#cp210x]_.  YMMV.

Just as a reminder, a minimal crossover cable with "both levels" of handshaking
pins connects ground to ground and crosses within each of the TXD/RXD, RTS/CTS
[#rts]_, and DTR/DCD pairs.  RI and DSR can be left disconnected.

Gateway
-------

On the gateway, we want to run something like the following in a loop::

  socat /dev/ttyUSB0,b3000000,flock-ex,cfmakeraw,crtscts,hupcl=1 SOCKS4:localhost:${BRIDGE},socksport=9050

In detail, that bridges ``/dev/ttyUSB0`` and the Tor bridge, over the gateway's
Tor proxy (using its SOCKS interface).  We set the serial link to use 3 Mbaud
(as fast as I have found reliable, YMMV), lock the port exclusively, disable
interpretation of the raw bytes, use RTS/CTS flow control, and have the port
signal us to quit when DCD goes low.  Ideally ``socat`` would close and re-open
the connection, rather than exit, but AFAICT it doesn't understand ``fork``-ing
in this use case; looping the command is a sufficient workaround.
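
The loop itself can be as simple as the following sketch.  The function names,
the ``TTY_OPTS`` variable, and the one-second pause are my own choices for
illustration; the second argument is the bridge's ``host:port``, *i.e.* what
``${BRIDGE}`` stands for above::

  # Serial options copied from the socat command above.
  TTY_OPTS="b3000000,flock-ex,cfmakeraw,crtscts,hupcl=1"

  # gateway_args TTY BRIDGE: print the two socat addresses, one per line,
  # without running anything (handy for checking the expansion).
  gateway_args() {
      printf '%s\n' "$1,$TTY_OPTS" "SOCKS4:localhost:$2,socksport=9050"
  }

  # run_gateway TTY BRIDGE: restart socat whenever it exits.
  run_gateway() {
      while :; do
          socat "$1,$TTY_OPTS" "SOCKS4:localhost:$2,socksport=9050"
          sleep 1   # arbitrary pause so a persistent failure doesn't spin
      done
  }

Invoke it as ``run_gateway /dev/ttyUSB0 "$BRIDGE"``; it runs until
interrupted.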

Client
------

And on the client, we will run the reverse::

  socat TCP-LISTEN:9001,reuseaddr,fork /dev/ttyUSB0,b3000000,flock-ex,cfmakeraw,crtscts,hupcl=1

Here we can use ``socat``'s ``fork`` functionality and don't necessarily need
this in a loop, but it can't hurt to do that, too.

And for Tor, we have to tell it to use our bridge, so ``/etc/tor/torrc``
contains (with variables expanded)::

  UseBridges 1
  Bridge obfs4 127.0.0.1:9001 ${BRIDGE_KEY} ${BRIDGE_EXTRA}

Other Tor options (``ControlPort``, ``HiddenServiceDir``, &c) are of course
quite useful.
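
If you'd rather not expand those variables by hand, a tiny helper can print the
two lines for you.  This is only a sketch: ``render_torrc`` is a made-up name,
and its two arguments are whatever fingerprint and trailing parameters your
chosen bridge line carries::

  # render_torrc KEY EXTRA: print the bridge configuration with the
  # bridge's key and extra parameters filled in.
  render_torrc() {
      printf 'UseBridges 1\n'
      printf 'Bridge obfs4 127.0.0.1:9001 %s %s\n' "$1" "$2"
  }

For example, ``render_torrc "$BRIDGE_KEY" "$BRIDGE_EXTRA" >> /etc/tor/torrc``.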

Success
-------

Now, to check that everything's working, try running something like this on
the client::

  torify lynx --source google.com
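
For a check that doesn't involve eyeballing HTML, one can ask the Tor Project's
check service instead.  A sketch, assuming ``curl`` is installed on the client,
the client's Tor listens on the default SOCKS port 9050, and
``https://check.torproject.org/api/ip`` still answers with a small JSON object
containing an ``IsTor`` field; the function names are mine::

  # says_tor: succeed if the JSON on stdin reports IsTor as true.
  says_tor() {
      grep -Eq '"IsTor"[[:space:]]*:[[:space:]]*true'
  }

  # check_tor: fetch the check service's verdict through the local Tor proxy.
  check_tor() {
      curl -s --socks5-hostname localhost:9050 \
          https://check.torproject.org/api/ip | says_tor
  }

Then ``check_tor && echo ok`` reports success only if the request really left
through Tor.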

It's possible to configure ``apt`` to use the Tor proxy by default, too, so the
client can even keep itself up to date easily enough.  In
``/etc/apt/apt.conf.d/90proxy``, write::

  Acquire::Queue-Mode "access";
  Acquire::http::proxy "socks5h://localhost:9050";

The ``Queue-Mode`` cuts down on the number of concurrent connections and,
experimentally, improves reliability.  I presume this is some side-effect of
the very high bandwidth-delay product of our link and Tor's stream
multiplexing.

We can ask the kernel to verify that all of this happens without a routable IP
address::

  pi@raspberrypi:~ $ ip -o a s
  1: lo    inet 127.0.0.1/8 scope host lo\       valid_lft forever preferred_lft forever
  1: lo    inet6 ::1/128 scope host \       valid_lft forever preferred_lft forever
  pi@raspberrypi:~ $
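
The same check can be scripted, so that it can run at boot or before anything
sensitive starts.  A minimal sketch, with hypothetical function names, that
treats any address on an interface other than ``lo`` as a failure::

  # non_loopback: read `ip -o addr show` output on stdin and print the
  # lines belonging to any interface other than lo.
  non_loopback() {
      awk '$2 != "lo"'
  }

  # only_loopback: succeed only if nothing but lo has an address.
  only_loopback() {
      [ -z "$(ip -o addr show | non_loopback)" ]
  }

For example, ``only_loopback || echo "unexpected address present" >&2``.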

Paranoid Device Construction
============================

Thus far I've described the procedure for getting things working assuming that
both ends are already perfectly functional Linux machines.  What if we don't
have a client yet?  We don't really want to connect it to our network, replete
with possibly traceable information flying about, as that might get written
down somewhere.

The simplest approach is to grab a distribution that already has ``tor`` and
``socat`` on its installation media, but, failing that, it's easy enough to use
the serial link itself to move files.  (You can, of course, fetch those files
over Tor assuming you're already set up.) ``stty`` and ``cat`` are your friends
here and will let you (slowly) bootstrap up just about anything you like.  If
you're going to move larger files this way, I highly recommend ``gkermit`` or
even the full ``ckermit`` program.
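
As a rough sketch of the ``stty``-and-``cat`` approach: the helper names, the
115200 baud rate, and the idea of carrying the byte count and checksum across
by hand are assumptions of mine, and ``stty -F`` and ``sha256sum`` assume GNU
coreutils on both ends.  ``head -c`` stands in for ``cat`` on the receiving
side so that the transfer knows when to stop::

  # setup_tty TTY: raw 8-bit mode, no echo, RTS/CTS flow control.
  setup_tty() {
      stty -F "$1" 115200 raw -echo crtscts
  }

  # describe FILE: print the byte count and SHA-256 the receiver will need.
  describe() {
      wc -c < "$1"
      sha256sum < "$1"
  }

  # send_file TTY FILE: run on the sending side.
  send_file() {
      cat "$2" > "$1"
  }

  # recv_file TTY COUNT OUT: start this on the receiving side first; it
  # reads exactly COUNT bytes, since a serial line never signals end-of-file.
  recv_file() {
      head -c "$2" < "$1" > "$3"
  }

Run ``setup_tty`` on both ends and ``describe`` on the sender, start
``recv_file`` on the receiver, then ``send_file`` on the sender, and finally
compare the ``sha256sum`` of the received file against the sender's.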

Footnotes
=========

.. [#nat] Some technologies (e.g. the progression of NAT, NAPT, NPAPT, and
   proxies like SOCKS and those for HTTP) make the connection more or less
   asymmetric, but they tend (I won't claim to have comprehensive
   knowledge) to be reached over a local, "private" Internet, often using the
   deliberately reserved non-routable IP addresses of RFC 1918, rather than being
   reached over a different fabric.

.. [#linuxarp] See
   https://www.kernel.org/doc/Documentation/networking/ip-sysctl.txt .
   I don't mean to pick on Linux; it's just the example I know.
   I'd be very impressed if there were a network stack designed with
   information disclosure vulnerabilities in mind, as that has,
   traditionally, not been a concern.

.. [#cp210x] The Linux driver does not attempt to detect DCD transitions and
   even when applying the requisite trivial patch, the CP2102 chip itself
   appears not to signal such until it also has characters to send, which is
   not useful.

.. [#rts] The pin labeled RTS is properly called RTR these days; see circuit
   133 of `the V.24 spec
   <https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-V.24-199303-S!!PDF-E&type=items>`_.