My name is David Fifield, and on behalf of my coauthors
Cecylia, Arlo, Serene, and Xiaokang,
I'm going to present our research on the Snowflake circumvention system.
We have an online, HTML version of the paper
that is more extensively annotated than what we could afford
to put into the print version.
Précis
Snowflake is a censorship circumvention system—a way
of enabling communication between network endpoints
despite the interference of an intermediary censor.
(Censors may do things like block IP addresses,
send forged TCP RST packets,
or falsify DNS responses.)
Snowflake uses a large pool of
ultra-lightweight, temporary proxies ("snowflakes")
that communicate using WebRTC protocols.
How does Snowflake resist address-based blocking?
Its pool of temporary proxies is large (on the order of 100 K), and varies over time.
How does Snowflake resist content-based blocking?
Transporting traffic in an encrypted WebRTC container.
Snowflake has been in serious deployment for 3+ years.
It is a built-in circumvention option in Tor Browser,
and serves a few tens of thousands of users at any time.
So, in summary, Snowflake is a censorship circumvention system, and what that means is,
it's a way of enabling network communication between two endpoints
despite the presence of some adversary in the middle, a censor in the middle,
who's interfering with the communication.
Now, that's kind of an abstract, scientifically useful definition of censorship,
but this model is, of course, motivated by real-world considerations,
actual censorship people encounter in practice.
It's security and privacy, but it's also tied up with human rights
and freedom of expression, and that's why we do this work.
There are a lot of networks in the world—I won't belabor the point—but
there are a lot of networks where, you want to read some news,
you want to use some app, you want to participate in some discussion group, and you can't.
Or you cannot easily, because there's a censor preventing you from doing so.
And to give you an idea of the types of things we see in practice,
a censor can do stuff like block IP addresses,
it can inject RST packets to tear down TCP connections,
it can give you false answers to DNS queries,
and these are all very commonly seen in practice.
So there are a lot of different circumvention systems, using a variety of different techniques.
What is Snowflake's angle?
In a nutshell, Snowflake uses a large network of very lightweight, temporary proxies,
which we call snowflakes, and they communicate using WebRTC protocols.
So when I say "temporary proxies," what I mean by that
is that these proxies are allowed to appear and disappear at any time.
So the pool is, kind of, constantly changing,
and you don't depend on these proxies to be reliable.
And WebRTC is a suite of protocols that are often used for real-time communication on the web.
So: audio, video, text chat, online games, a lot of these things use WebRTC.
Now we're equipped to answer the following two questions.
And if you are accustomed, if you're used to censorship research,
these answers to these two questions will tell you most of what you need to know
to understand what Snowflake is doing.
And you'll also understand why these are the two critical questions to ask.
If you're not so familiar with this research field, I hope to give you a little bit
of familiarity with why these are important questions through the course of this talk.
The first question is: How does Snowflake resist address-based blocking?
Well, the answer there is the pool of temporary proxies.
It's large, and by "large" you should think, about 100 thousand, and
it's not always the same 100 thousand, which is important.
Making proxies very easy to run is part of achieving this large proxy pool.
The second question is: How does Snowflake resist content-based blocking?
Well, that's WebRTC. Rather than transmit client traffic in the clear,
we wrap it in an encrypted WebRTC container.
Now, our team started the Snowflake project in order to innovate in the circumvention space,
to explore a different combination of parameters and see how well it works.
It turns out, it works quite well.
But this is more than a research prototype.
This has been in serious deployment for three or more years.
We serve actual users on an ongoing basis.
We have to care about operations, and things like that.
It is a built-in circumvention option in Tor Browser, so in Tor Browser
you can just choose "Snowflake" from the menu and you'll be using Snowflake.
And at any given time we're serving an average of a few tens of thousands of users.
The thing that is perhaps most special about Snowflake
is just how easy it is to run a proxy.
And, like, how easy?
Like, this easy.
And now we're running a proxy.
This is real, this is live by the way. This is just an HTML web page,
and I copied the embed code into the slides here.
It's ordinary HTML and JavaScript, no special permissions or anything like that.
Nothing weird.
And this is now ready to help serve censored users.
If you want to see this for yourself, snowflake.torproject.org
has a badge installation just like this.
You can try it. The badge will change color when you get a client,
it'll give you a little bit of a reward for doing that.
And you can run this from a web page, but it turns out most people don't do it that way.
A lot of people prefer to run the Snowflake badges as a web browser extension,
so you can install this and your browser acts as a proxy for censored users
as long as your browser is open.
Or, if you use the Orbot Tor app for mobile phones, and you activate the
"kindness mode," what that does is it turns your phone into a Snowflake proxy,
under certain conditions.
The name, "Snowflake," incidentally, comes from these tiny, like, micro-proxies.
Because, like real snowflakes, they are ephemeral—they don't last forever—and
they are all unique.
And also, there is a protocol inside WebRTC called "ICE," so
a little bit of a pun there.
Now, if you're worried about other people using your computer, or using your phone
as a proxy, the short summary there is that you don't need to be worried.
As I will explain here, as I go through the system components,
and take you through a connection, start to finish.
Snowflake system components
So the first part of this diagram always looks the same.
We draw a line to indicate the boundary of the censor's network.
And then inside the censor's network, there is a censored client,
and the client's goal is to reach some destination
that's outside of the censor's control.
But the client cannot access it directly, because of the censor.
Now, in the Snowflake system, we have a central server component called the broker,
and the broker's job is to match up censored clients with temporary Snowflake proxies.
And, speaking of temporary Snowflake proxies, they're in here too.
There's thousands and thousands of them. There's one or more running in this room right now.
And what these proxies do is, they are constantly polling the broker,
and they are saying, "Do you have a client for me to serve?"
And the broker says, "No clients right now, check back later."
So this whole system of events is kicked off
when the client makes a registration request to the broker
indicating its need for service.
Now let's stop for a minute and examine what just happened.
We just drew an arrow that crossed the censor's boundary,
and whenever we do that, we need to justify why that should be possible.
We need to justify why that connection in particular
should be difficult or expensive for the censor to block.
So I'm going to defer that discussion, for this connection,
for just a few minutes, and for now, suffice it to say that
this connection uses a secure so-called rendezvous method,
and right now we're just going to assume that this cannot be blocked.
So now, when a Snowflake proxy polls the broker and says,
"Do you have a client for me to serve?" the broker says, "Indeed I do,
and here is its information." It's an IP address, it's some cryptographic metadata.
And then the broker also sends the proxy's information back to the client
through the same rendezvous channel.
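The matching step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the deployed broker: the class and method names (`Broker`, `register_client`, `poll`, `answer_for`) and the string placeholders for connection information are hypothetical, and the real exchange happens over the network rather than through in-process calls.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClientOffer:
    """Hypothetical stand-in for what a client registers with the broker."""
    client_id: str
    connection_info: str  # placeholder for an address plus cryptographic metadata


class Broker:
    """Toy broker: queues client registrations and hands them to polling proxies."""

    def __init__(self) -> None:
        self._waiting = deque()  # ClientOffer objects awaiting a proxy
        self._answers = {}       # client_id -> information about the matched proxy

    def register_client(self, offer: ClientOffer) -> None:
        # In the real system this request arrives over the rendezvous channel.
        self._waiting.append(offer)

    def poll(self, proxy_info: str) -> Optional[ClientOffer]:
        # A proxy asks, "Do you have a client for me to serve?"
        if not self._waiting:
            return None  # "No clients right now, check back later."
        offer = self._waiting.popleft()
        # Remember the proxy's information so it can be returned to the client.
        self._answers[offer.client_id] = proxy_info
        return offer

    def answer_for(self, client_id: str) -> Optional[str]:
        # Sent back to the client through the same rendezvous channel.
        return self._answers.pop(client_id, None)


broker = Broker()
first_poll = broker.poll("proxy-A-info")          # no client is waiting yet
broker.register_client(ClientOffer("client-1", "client-1-info"))
second_poll = broker.poll("proxy-B-info")         # matched with client-1
proxy_for_client = broker.answer_for("client-1")  # the client learns about proxy B
```

After this exchange, each side holds the other's information, which is what lets the client and proxy attempt a direct connection.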
Now at this point, the client and the temporary proxy connect to each other,
peer to peer, directly, using WebRTC.
So why does this connection work?
One reason we're allowed to do this is because of the WebRTC protocol.
WebRTC is used for a lot of applications, and if you block WebRTC, things break.
And the second reason is, the IP address that the client is connecting to
is an IP address that is not on the censor's blocklist.
It's a fresh Snowflake proxy.
Now we're not quite ready to exit client traffic to the destination just yet.
And the reason for that is that these Snowflake proxies are untrusted.
Anybody can run one of these proxies.
We don't trust them to be in a position to monitor client traffic.
So instead, what we do is, we introduce another node here, we call the
bridge (this is a long-term, permanent server at a stable address), and
as a Snowflake proxy connects to the client, it also connects to the bridge.
And what we have here is an end-to-end, encrypted connection
between the client and the bridge.
The Snowflake proxy in the middle just copies ciphertext bytes back and forth.
And now, at this point, it's the bridge that actually exits
the client traffic to the destination.
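To make the "proxy just copies ciphertext" idea concrete, here is a minimal Python sketch. It assumes the client and bridge already share a key, and it uses a toy stream cipher built from SHA-256 purely for illustration; that cipher, the fixed key and nonce, and the function names are all assumptions of this sketch and are not the encryption Snowflake actually uses.

```python
import hashlib


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (SHA-256 in counter mode). Illustration only, not for real use."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest())
        counter += 1
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))


def proxy_relay(ciphertext: bytes) -> bytes:
    """The temporary proxy holds no key; it forwards the bytes unchanged."""
    return bytes(ciphertext)


# Fixed values keep the example reproducible; a real system would use random ones.
shared_key = b"k" * 32  # known only to the client and the bridge in this sketch
nonce = b"n" * 16

plaintext = b"GET / HTTP/1.1"
sent_by_client = keystream_xor(shared_key, nonce, plaintext)
seen_by_proxy = proxy_relay(sent_by_client)
received_by_bridge = keystream_xor(shared_key, nonce, seen_by_proxy)
```

The proxy only ever handles `seen_by_proxy`, which differs from the plaintext, while the bridge recovers the original request and can forward it to the destination.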
The bridge also serves one other critical function.
The bridge serves as a repository of session state that is shared with the client.
And this is what makes it possible for Snowflake proxies to be temporary.
When a Snowflake proxy disappears, the client can reconnect, on a different proxy,
and pick up a session exactly where it left off, with no interruption,
and that's because of the way that we have a bridge that is decoupled
from the temporary proxies. The bridge is permanent, the proxies are temporary.
So I promised we'd take a closer look at this rendezvous step. Let's do that now.
Snowflake is not alone in needing an initial rendezvous/signaling/bootstrapping step.
There's a lot of circumvention systems that need something like this.
And it's a pretty well-understood and -studied problem.
Kind of what you need is, just a miniature circumvention system
to bootstrap your full circumvention system.
And the good news is, here, that many things work,
and the design space is wide open.
During rendezvous, you can tolerate things, like high latency and low bandwidth,
that would be unacceptable if you used them for your main circumvention channel.
So a lot of options are open to you.
And we have a somewhat modular rendezvous system in Snowflake.
I'm not going to dwell on this too much,
because rendezvous is not what makes Snowflake special,
but it is something you have to have,
and this is what we have implemented at this point.
There's good old domain fronting.
There's AMP cache, this is a special kind of HTTP proxy used for mobile web pages.
And Amazon SQS, Simple Queue Service, is a communication service by Amazon.
The SQS rendezvous was actually contributed to Snowflake from an outside research team,
and you can read about that in
a paper at this year's FOCI workshop.
And, in general, if you are curious about rendezvous at a broader level,
I recommend this paper,
"Communication Breakdown," from this year's PETS.
It's probably the best survey and overview that exists.
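One way to picture the modular rendezvous described above is as a set of interchangeable channels that all carry the same small registration message to the broker. The Python sketch below is hypothetical throughout: the function names, the method labels, and the stub channels are assumptions made for illustration, and none of the stubs implements actual domain fronting, AMP cache, or SQS transport.

```python
from typing import Callable, Dict, List

# Each rendezvous method is modeled as a function that carries one small
# registration message to the broker and returns the broker's reply.
RendezvousMethod = Callable[[bytes], bytes]


def toy_broker(registration: bytes) -> bytes:
    """Stand-in for the broker: replies with placeholder proxy information."""
    return b"proxy-info-for:" + registration


def make_channel(name: str, log: List[str]) -> RendezvousMethod:
    """Build a stub channel; a real one would tunnel the message in its own way."""
    def exchange(registration: bytes) -> bytes:
        log.append(name)  # record which channel carried the message
        return toy_broker(registration)
    return exchange


used: List[str] = []
METHODS: Dict[str, RendezvousMethod] = {
    "domain-fronting": make_channel("domain-fronting", used),
    "amp-cache": make_channel("amp-cache", used),
    "sqs": make_channel("sqs", used),
}


def rendezvous(method_name: str, registration: bytes) -> bytes:
    """Send one registration through the chosen channel and return the reply."""
    return METHODS[method_name](registration)


replies = {name: rendezvous(name, b"client-1") for name in METHODS}
```

Whichever channel is chosen, the broker sees the same registration and gives the same kind of reply. Because the exchange is a single small message, a slow or low-bandwidth channel is acceptable here.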
Session persistence
When an in-use proxy goes away,
the client does another rendezvous and resumes the session
on a new proxy.
This process uses end-to-end session state
stored at the client and the bridge
(a Turbo Tunnel design).
Temporary Snowflake proxies are just pipes.
See "SpotProxy: Rediscovering the Cloud for Censorship Circumvention"
(USENIX Security 2024) for an active migration
that avoids the need for a repeated rendezvous.
And now a little more about this session persistence feature.
This is really one of the things that does make Snowflake special.
The watchword of a Snowflake proxy is impermanence.
We designed the whole system around the assumption that these proxies
can appear and disappear at any time: someone
closes their laptop, unplugs the wifi, whatever.
Even while a proxy is in use, it can disappear.
Even while it's currently being used by a client.
And when that happens, the client will do another rendezvous,
get another proxy, and reconnect, and resume its session.
Now the way the session resumption works, is we actually embed another layer
inside the WebRTC tunnel, that is a session protocol.
And that's, sort of, the magic that makes this work.
The session state is stored at the two endpoints, the client and the bridge;
the proxies in the middle have nothing to do with that.
They're just temporary conduits for this long-term, ongoing session.
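The idea of keeping session state only at the endpoints can be sketched as follows. This is a toy reliability layer written for illustration, assuming a simple sequence-number-and-acknowledgment scheme; the class names and the `new_proxy` callback (standing in for a fresh rendezvous) are hypothetical, and the real session protocol is more involved.

```python
class Bridge:
    """Holds per-session state that outlives any single proxy."""

    def __init__(self):
        self.sessions = {}  # session_id -> list of payloads received in order

    def receive(self, session_id, seq, payload):
        """Accept a packet arriving via any proxy; return the next expected seq."""
        state = self.sessions.setdefault(session_id, [])
        if seq == len(state):  # in-order, new data
            state.append(payload)
        # Anything else is ignored; the returned value tells the client where to resume.
        return len(state)


class Proxy:
    """A temporary pipe: forwards packets while alive, holds no session state."""

    def __init__(self, bridge):
        self.bridge = bridge
        self.alive = True

    def forward(self, session_id, seq, payload):
        if not self.alive:
            raise ConnectionError("proxy went away")
        return self.bridge.receive(session_id, seq, payload)


class Client:
    def __init__(self, session_id, new_proxy):
        self.session_id = session_id
        self.new_proxy = new_proxy  # stands in for doing a rendezvous
        self.proxy = new_proxy()
        self.outbox = []            # everything sent so far, kept for retransmission
        self.acked = 0              # how many payloads the bridge has confirmed

    def send(self, payload):
        self.outbox.append(payload)
        while self.acked < len(self.outbox):
            try:
                self.acked = self.proxy.forward(
                    self.session_id, self.acked, self.outbox[self.acked]
                )
            except ConnectionError:
                # Another rendezvous, a new proxy, and the same session resumes.
                self.proxy = self.new_proxy()


bridge = Bridge()
client = Client("session-1", lambda: Proxy(bridge))
client.send(b"hello")
first_proxy = client.proxy
first_proxy.alive = False  # the proxy disappears while in use
client.send(b"world")      # delivered through a replacement proxy
```

The bridge ends up with both payloads under the same session identifier even though they arrived through different proxies, which is the property that lets proxies be disposable.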
This type of design pattern is sometimes called "Turbo Tunnel."
In the very next talk, the one immediately following this one,
SpotProxy, which is based in part on Snowflake,
you'll see how, if you have a little bit of advance notice
before a proxy goes away, you can actually do an active migration
using the already established channel, and avoid another rendezvous.
That's a feature that the base Snowflake system doesn't have.
Users and bandwidth
Snowflake users (daily average concurrent) and bandwidth (daily average).
Now, a large part of the paper is devoted to narration and discussion
of the experience of deployment.
As I say, this is a serious, deployed system.
This shows you about three years of history here.
The upper graph is number of users—and this is not unique users per day,
this is average concurrent users per day.
And the lower graph is bandwidth in gigabits per second.
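The difference between unique users and average concurrent users can be shown with a small worked example in Python. The session times below are made up for illustration, and the calculation (total connected time divided by the length of the day) is just one straightforward way to define the average; it is not a description of how the deployment's statistics are actually estimated.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_concurrent_users(sessions):
    """sessions: (start, end) pairs in seconds since the start of the day."""
    connected_seconds = sum(end - start for start, end in sessions)
    return connected_seconds / SECONDS_PER_DAY


# Hypothetical day with three distinct users.
sessions = [
    (0, SECONDS_PER_DAY),                     # connected all day
    (0, SECONDS_PER_DAY // 2),                # connected for the first half
    (SECONDS_PER_DAY // 2, SECONDS_PER_DAY),  # connected for the second half
]

unique_users = len(sessions)                              # 3
average_concurrent = average_concurrent_users(sessions)  # 2.0
```

Three different people used the system that day, but at any given moment only two were connected, so the average concurrent count is 2.0. A concurrent-user figure is therefore typically lower than a count of unique daily users.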
I only have time to go over the barest details here.
But I'll call out two events.
Here, at the end of 2021, there was a mass blocking of
Tor-related protocols in Russia, and at that time,
the number of Snowflake users actually increased.
This is not really unusual, if you've seen enough of these events,
when there's extreme network blocking,
circumvention goes up. And why is that?
Well, more people turn to the things that continue to work
when the normal avenues of Internet access are shut off.
So it actually can be a good sign of the robustness of your circumvention system.
And then this, here, the most conspicuous feature on the graph,
at the end of 2022, this was the Mahsa Amini protests in Iran,
and the extreme network restrictions that followed on these protests.
For whatever reason, Snowflake really took hold in Iran, and almost overnight,
literally within 48 hours or something like that,
the number of users of Snowflake quadrupled, and we were scrambling
to add capacity, and all things like this.
And even today, more than half of Snowflake users are from Iran,
according to geolocation.
It's kind of funny, it was about here [March 2022] where I pitched the rest of the team,
I said, "Hey, maybe we should write a paper about Snowflake.
We have something interesting to report now."
Little knowing what lay in store for us.
All right. Some more details you'll find in the paper.
We'll talk about protocol fingerprinting considerations,
the importance of NAT compatibility testing between clients and proxies,
some investigation of the size and composition of the proxy pool over time,
some of the scaling and engineering challenges that we faced, and also
detailed case studies and experience reports,
fielding a circumvention system against real censors
in Russia, Iran, China, and Turkmenistan.
But that's all I'll say for now.
Thank you for your attention, and I'll take your questions.