Fun with Fuzzers: How I Discovered Three Vulnerabilities (Part 1 of 3)

This is the first in a three-part series that chronicles how I discovered three vulnerabilities over the course of a day. This post discusses the steps leading up to the discovery of the first vulnerability. The first vulnerability is the least impactful of the three, but they get progressively more interesting.

Introduction

I had been working at Canonical on the Ubuntu Security Team for about a year when I decided to try my hand at vulnerability research. All day long, I investigated vulnerabilities that were reported against packages in Ubuntu. I thought it would be interesting to discover one myself. I was surprised by how quickly I found not one, but three vulnerabilities (CVE-2019-13032, CVE-2019-13453, and CVE-2019-13241).

New vulnerabilities are discovered every day by fuzzing applications. Fuzzing is the process of providing a program with random or semi-random input and observing what happens; a fuzzer is a tool that automates this process. To increase the likelihood of discovering a vulnerability, I decided to take the following approach:

  1. Search through the Ubuntu package archive and select an application that looks like it might be simple to fuzz. I didn’t have a ton of free time to devote to this project, so I didn’t want to take on anything overly complicated.
  2. Search the Ubuntu CVE Tracker to see if any vulnerabilities had previously been found in the selected application. If CVEs had already been reported, I’d be less likely to discover new vulnerabilities.
  3. Use american fuzzy lop (AFL) to fuzz the package. Using a fuzzer would allow me to work smarter, not harder.
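
To make the idea of "semi-random input" concrete, here is a minimal sketch of the simplest kind of mutation a fuzzer might apply: flipping a few random bits in a known-good sample. The helper name MutateBits is hypothetical and is not part of AFL, which uses far richer, coverage-guided mutation strategies.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: returns a copy of `seed` with `flips` randomly chosen
// bits inverted. This only illustrates the idea of deriving semi-random
// input from a valid sample; it is not how AFL is implemented.
std::string MutateBits(const std::string& seed, std::size_t flips, std::mt19937& rng)
{
    std::string mutated = seed;
    if (mutated.empty())
        return mutated;  // nothing to flip

    std::uniform_int_distribution<std::size_t> pick_byte(0, mutated.size() - 1);
    std::uniform_int_distribution<int> pick_bit(0, 7);

    for (std::size_t i = 0; i < flips; ++i)
    {
        const std::size_t index = pick_byte(rng);
        const unsigned char original = static_cast<unsigned char>(mutated[index]);
        mutated[index] = static_cast<char>(original ^ (1u << pick_bit(rng)));
    }

    return mutated;
}
```

A fuzzer would feed each mutated buffer to the target program and record any input that makes it crash or hang.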

After a few minutes of searching through the Ubuntu archive I came across FlightCrew, an EPUB validator. FlightCrew can be run as a stand-alone application, as a plugin for Sigil Ebook, or as a library used by third-party software. FlightCrew provides a command line utility for validating EPUB files, making it simple to fuzz. All I had to do was compile it using the AFL compiler, find some sample EPUB files, and sit back and let the fuzzer do all the work. Within minutes, I had discovered the first vulnerability: a NULL pointer dereference.

AFL status

Vulnerability Analysis

It turns out that EPUBs are just ZIP archives that contain a specific set of files. To understand what might be causing FlightCrew to crash, I extracted both the original EPUB and the fuzzed EPUB. Then, I used Meld to compare the contents of the archives.
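
As a rough illustration of the "EPUBs are just ZIP archives" point, the sketch below checks whether a buffer starts like an EPUB container: the ZIP local file header signature, followed at offset 30 by an uncompressed first entry named mimetype whose content is application/epub+zip. The function name LooksLikeEpub is hypothetical, and this is only a heuristic, not a substitute for a real validator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical heuristic: returns true when `bytes` begins like an EPUB
// container. It does not parse the archive or validate its contents.
bool LooksLikeEpub(const std::string& bytes)
{
    // Signature of a ZIP local file header.
    static const std::string zip_magic("PK\x03\x04", 4);

    // The first entry's file name immediately followed by its stored data.
    static const std::string first_entry = "mimetypeapplication/epub+zip";

    // A ZIP local file header has 30 fixed bytes before the file name.
    const std::size_t name_offset = 30;

    return bytes.size() >= name_offset + first_entry.size() &&
           bytes.compare(0, zip_magic.size(), zip_magic) == 0 &&
           bytes.compare(name_offset, first_entry.size(), first_entry) == 0;
}
```

Because the container is an ordinary ZIP archive, standard archive tools can unpack both the original and the fuzzed file for a side-by-side comparison.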

Diff of original and fuzzed EPUB files

Interesting! The href attribute of the item elements had been corrupted, as had the itemref element. I fired up gdbtui (one of my favorite debugging tools) and dove in.

src/Xerces/Xercesc/util/XMLUri.cpp

// NOTE: no check for NULL value of uriStr (caller responsiblilty)
bool XMLUri::isValidURI(const XMLUri* const baseURI
                       , const XMLCh* const uriStr
                       , bool bAllowSpaces/*=false*/)
{
    // get a trimmed version of uriStr
    // uriStr will NO LONGER be used in this function.
    const XMLCh* trimmedUriSpec = uriStr;

    while (XMLChar1_0::isWhitespace(*trimmedUriSpec))
        trimmedUriSpec++;

The actual NULL pointer dereference occurs within the embedded copy of Xerces-C++ — a validating XML parser — on line 2027. The comment at the top of this method states that it is the caller’s responsibility to check whether or not uriStr is NULL.

src/FlightCrew/Framework/ValidateEpub.cpp

fs::path GetRelativePathToNcx( const xc::DOMDocument &opf )
{
    std::vector< xc::DOMElement* > items = xe::GetElementsByQName( 
        opf, QName( "item", OPF_XML_NAMESPACE ) );

    foreach( xc::DOMElement* item, items )
    {
        std::string href       = fromX( item->getAttribute( toX( "href" )       ) );
        std::string media_type = fromX( item->getAttribute( toX( "media-type" ) ) );

        if ( xc::XMLUri::isValidURI( true, toX( href ) ) &&
             media_type == NCX_MIME )
        {
            return Util::Utf8PathToBoostPath( Util::UrlDecode( href ) );  
        }
    }

    return fs::path();
}

Working our way up the stack, we can see that href is initialized on line 118. Using gdbtui, I found that href was an empty string. This makes sense, as the item elements in the fuzzed EPUB file no longer have href attributes. We can see on line 121 that href is passed into toX(), and the result is then passed to xc::XMLUri::isValidURI().

src/XercesExtensions/ToXercesStringConverter.h

#define toX( str ) XercesExt::ToXercesStringConverter( (str) ).XercesString()

src/XercesExtensions/ToXercesStringConverter.cpp

ToXercesStringConverter::ToXercesStringConverter( const char* const utf8_string )
{
    if ( utf8_string )
    {
        size_t string_length = strlen( utf8_string );

        if ( string_length > 0 )
        {
            xc::TranscodeFromStr transcoder( 
                (const XMLByte*) utf8_string, string_length, "UTF-8" );

            m_XercesString = transcoder.adopt();
        }

        else
        {
            m_XercesString = NULL;
        }
    }

    else
    {
        m_XercesString = NULL;
    }
}
...
const XMLCh* ToXercesStringConverter::XercesString() const
{
    return m_XercesString;
}

Here is where it all comes together: It turns out that toX() is a macro that constructs a temporary ToXercesStringConverter and calls its XercesString() method. We can see on line 51 that, if the input string has a zero length, the constructor will set the pointer m_XercesString to NULL. The ToXercesStringConverter::XercesString() function simply returns m_XercesString. This NULL pointer is passed to xc::XMLUri::isValidURI() without being checked, resulting in a NULL pointer dereference.
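
The chain described above can be modelled in a few self-contained lines. This is a simplified sketch, not FlightCrew or Xerces code and not the upstream patch: ToyConverter, ToyIsValidUri, and IsUsableHref are hypothetical stand-ins, char16_t stands in for XMLCh, and the conversion handles ASCII only. The last function shows one way a caller could guard against the NULL result.

```cpp
#include <string>

// Stand-in for Xerces' XMLCh (assumption: a 16-bit character type).
using XmlChar = char16_t;

// Simplified model of ToXercesStringConverter: an empty input string
// yields a null pointer rather than an empty string.
class ToyConverter
{
public:
    explicit ToyConverter(const std::string& utf8)
        : m_buffer(utf8.begin(), utf8.end())  // ASCII-only for brevity
    {
    }

    const XmlChar* XercesString() const
    {
        return m_buffer.empty() ? nullptr : m_buffer.c_str();
    }

private:
    std::u16string m_buffer;
};

// Model of the callee: like XMLUri::isValidURI, it dereferences its
// argument without a null check and leaves that to the caller.
bool ToyIsValidUri(const XmlChar* uri)
{
    while (*uri == u' ')  // would crash here if uri were null
        ++uri;

    return *uri != u'\0';
}

// Guarded caller: checks the converted pointer before handing it on.
bool IsUsableHref(const std::string& href)
{
    const ToyConverter converted(href);
    const XmlChar* uri = converted.XercesString();
    return uri != nullptr && ToyIsValidUri(uri);
}
```

With the guard in place, an empty href is simply treated as invalid instead of reaching the dereference.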

Security Impact

This vulnerability is not particularly concerning within the context of Sigil and FlightCrew. While it fulfills the letter of the law and thereby qualifies as a security vulnerability (it's a memory-safety error that degrades availability), it doesn't really have any security impact. If my EPUB editor were to crash I might be mildly inconvenienced, but I wouldn't feel that a security violation had occurred; applications crash all the time. While the security impact to Sigil may be minimal, to consider this crash only within the context of Sigil and FlightCrew is to miss the bigger picture.

When assessing the security risk of this bug, it’s important to consider two factors:

  1. FlightCrew exists within an open source ecosystem where anyone may integrate it, in whole or in part, into their own software. As FlightCrew can function as a library or standalone EPUB validator — as used in check-all-the-things — bugs in FlightCrew can affect third-party software packages that depend on it.
  2. FlightCrew is a validator. Its sole purpose is to accept and properly handle malformed input.

Third-party users of FlightCrew will have the expectation that they can provide it with malformed input without crashing their applications. Such crashes in these third-party applications may have greater security impact than in FlightCrew alone. For these reasons, I submitted this vulnerability to MITRE so it could be assigned a CVE.

Remediation

As of July 15, 2019, Ubuntu users have access to a patched version of FlightCrew 0.7.2. Other FlightCrew users can apply the following patches and rebuild:

Disclosure Timeline

A NULL pointer dereference vulnerability is discovered, with the help of american fuzzy lop.

Upstream developers are notified via a secure channel. A coordinated release date (CRD) of July 22, 2019 is proposed.

Upstream developers respond, requesting that the issue be made public on GitHub. Upstream developers patch the vulnerability the same day.

CVE-2019-13032 is assigned to this vulnerability by MITRE.

Further Reading

Part 2 and Part 3 of this series chronicle the next two vulnerabilities that I discovered while fuzzing FlightCrew.

CVE: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-13032

GitHub issue: https://github.com/Sigil-Ebook/flightcrew/issues/53