Programmatically Retrieve RPM Package Details

Recently, I needed to retrieve details about software packages installed on Fedora 15, Red Hat Enterprise Linux, CentOS and other distributions which distribute their software packages using the RPM Package Manager. To my surprise, what should have been a relatively simple task turned out to be quite messy because of changes in the RPM library APIs and internal format over the last few years. In this post, I demonstrate how to retrieve information about RPM packages using C and Python.

RPM is a command-line or API driven package management system capable of installing, uninstalling, verifying, querying, and updating Linux or Unix software packages. Each software package consists of an archive of files together with information about the package such as its version number, a summary and a description, and dependency information. There is also a library API to enable developers to manage such transactions from compiled programming languages such as C or scripting languages such as Python. Package files are written to disk in network byte order. If necessary, RPM automatically converts data to host byte order when the package file is read.

RPM was initially developed in 1997 by Erik Troan and Marc Ewing for use in the Red Hat Linux distribution. For many years it was an open source project which did not receive much love or attention. That picture changed in early 2007 when two separate (and competing) development communities were launched.

The more prominent RPM development community is rpm.org which is led by Red Hat. According to their website:

After a long development break rpm.org was relaunched in 2007 with the goal to reclaim the position as upstream home of RPM. As a first step patches that had piled up in the different distributions have been integrated into the code base as far as possible. We want RPM not be the province of one company, or a small set of developers. It needs to be developed in an open community, consumed and contributed to by many companies, users, distributions, and developers. We therefore welcome any and all contributors.
….
RPM will stay backward compatible to 4.4.2 for a quite long time. It is essential that third party packages can be installed without the need to recompile them – especially for enterprise distributions.

In May 2007 Red Hat hired Panu Matilainen to work on the RPM project. The first major code revision was in July 2007; version 4.8 was released in January 2010, and 4.9 in March 2011. This version is used by distributions such as Fedora, Red Hat Enterprise Linux, openSUSE, SUSE Linux Enterprise and CentOS.

The other RPM development community is rpm5.org which is led by Jeff Johnson, who was a maintainer of RPM whilst an employee of Red Hat. RPM version 5.0 was released in May 2007. Their latest version is 5.3.11 dated 02-Jun-2011. This version of RPM is used by distributions such as Unity Linux and cAos Linux, and also by the OpenPKG project which provides packages for some other Unix-like platforms. Apparently Mandriva has recently switched to it as well, although there seems to be some controversy about that particular decision.

The format of an RPM package is binary and consists of four sections in the following order:

  • The lead section identifies the file as an RPM file. It contains a number of obsolete headers which in previous versions of RPM were used to store information used internally by RPM. Today, however, the lead section’s only purpose is to make it easy to identify an RPM package file.
  • The signature section contains information that can be used to verify the integrity, and optionally, the authenticity of the majority of the package. This section is implemented using a header structure (see below).
  • The header section contains the package metadata such as name, version, architecture, list of files included and suchlike. It too is implemented as a header structure.
  • The final section contains the actual file archive, which is usually in cpio format compressed with gzip. More recent versions of RPM can also use bzip2, lzma or xz compression, and xar (XML Archive) is supported by RPM 5.0.
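As a rough illustration of the compression choices listed for the final section, here is a minimal Python 3 sketch that guesses the compressor from the first bytes of a payload. The function name guess_compressor and the MAGIC table are my own for this example and are not part of RPM; the lzma entry in particular is only a heuristic because that format has no true magic number. In a real package the PAYLOADCOMPRESSOR tag in the header is the authoritative source.

```python
import bz2
import gzip
import lzma

# Well-known leading bytes for each compressed stream format.
# The "lzma" entry is a heuristic (assumption): legacy .lzma streams have
# no real magic number, but commonly begin with these property bytes.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x5d\x00\x00", "lzma"),
]


def guess_compressor(payload):
    """Return a best guess at the compressor used for a payload."""
    for magic, name in MAGIC:
        if payload.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    data = b"pretend this is a cpio archive"
    print(guess_compressor(gzip.compress(data)))
    print(guess_compressor(bz2.compress(data)))
    print(guess_compressor(lzma.compress(data)))  # Python defaults to the xz container
```

Sniffing bytes like this is only useful when you are handed a bare payload; when the header is available, reading the tag is simpler and more reliable.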

Here is what the lead contains. Do not use anything from the lead except the major number and signature type.

struct rpmlead_s {
    unsigned char magic[4];
    unsigned char major;
    unsigned char minor;
    short type;
    short archnum;
    char name[66];
    short osnum;
    short signature_type;       /*!< Signature header type (RPMSIG_HEADERSIG) */
    char reserved[16];          /*!< Pad to 96 bytes -- 8 byte aligned! */
};
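To make the layout concrete, here is a minimal Python 3 sketch that unpacks the 96-byte lead with the struct module and returns only the two fields worth trusting. The format string is derived from the C structure above. The magic value and the helper names parse_lead and make_sample_lead are assumptions of this sketch, and the sample lead is synthetic rather than taken from a real package.

```python
import struct

# Field order follows struct rpmlead_s; ">" selects network byte order
# and disables any alignment padding.
LEAD_FORMAT = ">4sBBhh66shh16s"
LEAD_MAGIC = b"\xed\xab\xee\xdb"  # assumption: the conventional RPM lead magic


def parse_lead(data):
    """Return the major number and signature type, or raise ValueError."""
    size = struct.calcsize(LEAD_FORMAT)  # 96 bytes
    if len(data) < size:
        raise ValueError("lead is truncated")
    (magic, major, _minor, _type, _archnum,
     _name, _osnum, signature_type, _reserved) = struct.unpack(
        LEAD_FORMAT, data[:size])
    if magic != LEAD_MAGIC:
        raise ValueError("not an RPM package file")
    # Everything else in the lead is obsolete and deliberately ignored.
    return {"major": major, "signature_type": signature_type}


def make_sample_lead():
    """Build a synthetic lead for demonstration purposes only."""
    return struct.pack(LEAD_FORMAT, LEAD_MAGIC, 3, 0, 0, 1,
                       b"example-1.0-1", 1, 5, b"")


if __name__ == "__main__":
    print(parse_lead(make_sample_lead()))
```

To try it on a real package, read the first 96 bytes of the file in binary mode and pass them to parse_lead.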


The header structure concept is RPM’s solution to the problem of easily manipulating information in a standardized way. The purpose of a header structure is to contain zero or more pieces of data. There are three sections to each header structure. The first section is known as the header structure header. The header structure header is used to identify the start of a header structure, its size, and the number of data items it contains. Following the header structure header is an area called the index.

The header structure’s index is made up of zero or more index entries. Each entry is sixteen bytes long. The first four bytes contain a tag — a numeric value that identifies what type of data is pointed to by the entry. There are a large number of header tags defined in rpmtag.h. Here are a few of them:

typedef enum rpmTag_e {
    ....
    RPMTAG_NAME                 = 1000, /* s */
    RPMTAG_VERSION              = 1001, /* s */
    RPMTAG_RELEASE              = 1002, /* s */
    RPMTAG_EPOCH                = 1003, /* i */
    RPMTAG_SUMMARY              = 1004, /* s{} */
    RPMTAG_DESCRIPTION          = 1005, /* s{} */
    RPMTAG_BUILDTIME            = 1006, /* i */
    RPMTAG_BUILDHOST            = 1007, /* s */
    RPMTAG_INSTALLTIME          = 1008, /* i */
    RPMTAG_SIZE                 = 1009, /* i */
    RPMTAG_DISTRIBUTION         = 1010, /* s */
    RPMTAG_VENDOR               = 1011, /* s */
    .....
   /* tags 1997-4999 reserved */
    RPMTAG_FILENAMES            = 5000, /* s[] extension */
    RPMTAG_FILEPROVIDE          = 5001, /* s[] extension */
    RPMTAG_FILEREQUIRE          = 5002, /* s[] extension */
    RPMTAG_FSNAMES              = 5003, /* s[] (unimplemented) */
    RPMTAG_FSSIZES              = 5004, /* l[] (unimplemented) */
    RPMTAG_TRIGGERCONDS         = 5005, /* s[] extension */
    RPMTAG_TRIGGERTYPE          = 5006, /* s[] extension */
    RPMTAG_ORIGFILENAMES        = 5007, /* s[] extension */
    RPMTAG_LONGFILESIZES        = 5008, /* l[] */
    RPMTAG_LONGSIZE             = 5009, /* l */
} rpmTag;


Following each tag is a four-byte type, which is a numeric value that describes the format of the data pointed to by the entry. Here is the current list of types defined in rpmtag.h:

typedef enum rpmTagType_e {
    RPM_NULL_TYPE               =  0,
    RPM_CHAR_TYPE               =  1,
    RPM_INT8_TYPE               =  2,
    RPM_INT16_TYPE              =  3,
    RPM_INT32_TYPE              =  4,
    RPM_INT64_TYPE              =  5,
    RPM_STRING_TYPE             =  6,
    RPM_BIN_TYPE                =  7,
    RPM_STRING_ARRAY_TYPE       =  8,
    RPM_I18NSTRING_TYPE         =  9,
} rpmTagType;


Most of these types should be self-explanatory. The difference between a STRING type and a STRING_ARRAY type is that the former is a regular null-terminated string whereas the latter is a collection of strings. RPM_I18NSTRING_TYPE is deprecated.

Next is a 4-byte offset value that contains the actual position of the data, relative to the beginning of the store. Finally, there is a four-byte count that contains the number of data items pointed to by the index entry. STRING data always has a count of 1, while STRING_ARRAY data has a count equal to the number of strings contained in the store.
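The sixteen-byte index entry described above (tag, type, offset, count) maps directly onto a fixed-size record. Here is a minimal Python 3 sketch that decodes such entries. The names ENTRY and parse_index are mine, and the sample bytes are synthetic; they are not copied from a real package.

```python
import struct

# tag, type, offset, count: four big-endian 32-bit fields, 16 bytes in total.
ENTRY = struct.Struct(">IIiI")


def parse_index(index_bytes, count):
    """Decode `count` sixteen-byte index entries into a list of dicts."""
    entries = []
    for i in range(count):
        tag, type_, offset, n = ENTRY.unpack_from(index_bytes, i * ENTRY.size)
        entries.append({"tag": tag, "type": type_, "offset": offset, "count": n})
    return entries


# Synthetic index: RPMTAG_NAME (1000) as RPM_STRING_TYPE (6) at offset 0,
# and RPMTAG_SIZE (1009) as RPM_INT32_TYPE (4) at offset 8.
sample = ENTRY.pack(1000, 6, 0, 1) + ENTRY.pack(1009, 4, 8, 1)

if __name__ == "__main__":
    for entry in parse_index(sample, 2):
        print(entry)
```

The offsets returned here are relative to the start of the store, so they cannot be used until you know where the index ends.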

After the index comes the store. It is in the store that actual data items are kept. The data in the store is packed together as closely as possible in network byte order, i.e. most significant byte first. STRING data is terminated with a null byte. Integer data is stored at the natural boundary for its type, i.e. a 32-bit integer is stored on a 4-byte boundary.
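Here is a minimal Python 3 sketch of reading values out of a store, given the type, offset and count from an index entry. It handles only three of the types, and the helper names read_strings and read_value are hypothetical. The store itself is synthetic and built by hand so that the 32-bit integer lands on a 4-byte boundary.

```python
import struct

# Type numbers taken from the rpmTagType enum shown earlier.
RPM_INT32_TYPE, RPM_STRING_TYPE, RPM_STRING_ARRAY_TYPE = 4, 6, 8


def read_strings(store, offset, count):
    """Read `count` consecutive null-terminated strings starting at offset."""
    strings = []
    for _ in range(count):
        end = store.index(b"\x00", offset)
        strings.append(store[offset:end].decode("utf-8", "replace"))
        offset = end + 1
    return strings


def read_value(store, type_, offset, count):
    """Decode one entry's data from the store (only three types handled)."""
    if type_ == RPM_INT32_TYPE:
        return list(struct.unpack_from(">%dI" % count, store, offset))
    if type_ == RPM_STRING_TYPE:
        return read_strings(store, offset, 1)[0]
    if type_ == RPM_STRING_ARRAY_TYPE:
        return read_strings(store, offset, count)
    raise ValueError("type %d is not handled in this sketch" % type_)


# Synthetic store: "zlib\0", three padding bytes, a 32-bit integer at
# offset 8, then a two-element string array at offset 12.
store = b"zlib\x00" + b"\x00" * 3 + struct.pack(">I", 199629) + b"a\x00b\x00"

if __name__ == "__main__":
    print(read_value(store, RPM_STRING_TYPE, 0, 1))
    print(read_value(store, RPM_INT32_TYPE, 8, 1))
    print(read_value(store, RPM_STRING_ARRAY_TYPE, 12, 2))
```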

Okay, time to talk about how to programmatically access RPM packages. However, before you start using the RPM library, you need to figure out what version of the RPM library you are using. If you are writing a shell script, this is easy to do. For example on Fedora 15:

$ /usr/lib/rpm/rpmdeps --version
RPM version 4.9.0


The RPM library from rpm5.org has the rpmlibVersion API which can be used to retrieve the library version or you can simply access the RPMVERSION string as shown below.

#include <stdio.h>
#include <rpm/rpmlib.h>

int
main()
{
    fprintf(stderr, "RPM Version: %s\n", RPMVERSION);
}


Unfortunately the RPM library from rpm.org does not have anything similar. Note to rpm.org developers! Please add the rpmlibVersion API to your list of supported public APIs so that applications that need to programmatically access RPM packages and databases can easily figure out which flavor and release of an RPM library they are dealing with and adjust their code accordingly.

On Fedora 15, the version string that is returned is 4.9.0 which indicates a version of RPM that was released in March 2011. On CentOS 5.6, the version string that is returned is 4.4.2.3 which indicates a significantly older version of RPM released in April 2008. There are major differences between these versions. In particular, C code written for one of these two RPM versions will probably not work with the other without modification.
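If a script needs to branch on the RPM version, one simple approach is to turn the version string into a tuple of integers and compare tuples. The Python 3 sketch below does that for the two version strings mentioned above. The function name parse_rpm_version is hypothetical, and the sketch assumes plain dotted numeric versions; it makes no attempt to reproduce RPM's own version comparison rules.

```python
import re


def parse_rpm_version(text):
    """Extract a dotted numeric version from text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


if __name__ == "__main__":
    fedora = parse_rpm_version("RPM version 4.9.0")  # rpmdeps --version output
    centos = parse_rpm_version("4.4.2.3")
    print(fedora, centos)
    # Tuples compare element by element, so 4.4.2.3 sorts before 4.9.0.
    print("CentOS 5.6 has the older library:", centos < fedora)
```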

It gets worse, by the way, as Fedora 15 and Red Hat Enterprise Linux 6 use the newer RPM library which also includes a new format and SHA hash. This causes problems when you try to install an RPM built on one of these platforms on an older platform such as CentOS 5.6 which does not know about the new format and hash. As a result, CentOS 5.6 will complain that it cannot confirm the integrity of the RPM package.

You also need to be able to determine what features a particular version of RPM supports. Here is one way of doing it:

#include <stdio.h>
#include <stdlib.h>

#include <rpm/rpmlib.h>
#include <rpm/rpmds.h>


int
main(int argc, char *argv[])
{
    const char *DNEVR;
    rpmds ds = NULL;
    int rc;

    rpmReadConfigFiles(NULL, NULL);

    rc = rpmdsRpmlib(&ds, NULL);
    ds = rpmdsInit(ds);

    fprintf(stdout, "Supported features:\n");
    while (rpmdsNext(ds) >= 0) {
        if ((DNEVR = rpmdsDNEVR(ds)) != NULL)
            fprintf(stdout, "%s\n", DNEVR + 2);
    }
    ds = rpmdsFree(ds);

    exit(0);
}


You have to have the RPM development package (rpm-devel) installed in order to be able to compile the above code.

This code simply prints out the supported RPM feature set. Here is the output for the RPM library on Fedora 15:

Supported features:
rpmlib(BuiltinLuaScripts) = 4.2.2-1
rpmlib(CompressedFileNames) = 3.0.4-1
rpmlib(ConcurrentAccess) = 4.1-1
rpmlib(ExplicitPackageProvide) = 4.0-1
rpmlib(FileCaps) = 4.6.1-1
rpmlib(FileDigests) = 4.6.0-1
rpmlib(HeaderLoadSortsTags) = 4.0.1-1
rpmlib(PartialHardlinkSets) = 4.0.4-1
rpmlib(PayloadFilesHavePrefix) = 4.0-1
rpmlib(PayloadIsBzip2) = 3.0.5-1
rpmlib(PayloadIsLzma) = 4.4.2-1
rpmlib(PayloadIsXz) = 5.2-1
rpmlib(ScriptletExpansion) = 4.9.0-1
rpmlib(ScriptletInterpreterArgs) = 4.0.3-1
rpmlib(VersionedDependencies) = 3.0.3-1


By the way, there is draft documentation for an RPM Guide on the Fedora Project website. I do not know when this guide was produced (probably 2003) but the latest copyright notice includes 2010. I cannot speak for the rest of this guide but the chapters on Programming RPM with C and Programming RPM with Python are frankly rubbish and downright misleading.

For example, listing 16-1 (rpm1.c), shown below, will not even compile on Fedora 15 or even on CentOS 5.6 which has a far older version of RPM.

#include <stdio.h>
#include <stdlib.h>
#include <rpmlib.h>

int 
main(int argc, char * argv[]) 
{
   int status = rpmReadConfigFiles( (const char*) NULL, (const char*) NULL);
   if (status != 0) {
      printf("Error reading RC files.\n"); 
      exit(-1);
   } else {
      printf("Read RC OK\n");
   }

   rpmSetVerbosity(RPMMESS_NORMAL);
   rpmShowRC( stdout );

   exit(0);
}


The listing is not even nicely formatted in the guide. The above formatting is all mine. The RPMMESS_* defines were removed way back in 2007. As an aside, where did the convention to use exit(-1) come from? That looks more like something from the Microsoft Windows world! The compilation instructions are also weird.

$ cc -I/usr/include/rpm -o rpm1 rpm1.c -lrpm -lrpmdb -lrpmio -lpopt


Why the need to reference librpmdb and libpopt? No routines from either of these libraries are used in the above code.

Here is the above example rewritten to work on Fedora 15:

#include <stdio.h>
#include <stdlib.h>
#include <rpm/rpmlib.h>
#include <rpm/rpmlog.h>

int
main(int argc, char * argv[])
{
    int status;

    if ((status = rpmReadConfigFiles( (const char*) NULL, (const char*) NULL))) {
       printf("ERROR: reading RC files\n");
       exit(1);
    }

    rpmSetVerbosity(RPMLOG_NOTICE);
    rpmShowRC(stdout);

    exit(0);
}


You can compile the above code using gcc -o rpm1 rpm1.c -lrpm -lrpmio.

The following example shows the traditional way of generating a list of the installed RPM packages on a system. This is the approach you will typically find if you search the Internet for such code.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>

#include <rpm/rpmlib.h>
#include <rpm/header.h>
#include <rpm/rpmdb.h>

int
main(int argc, char *argv[])
{
    rpmdbMatchIterator mi;
    int type, count;
    char *name;
    rpmdb db;
    Header h;

    rpmReadConfigFiles( NULL, NULL );
    if (rpmdbOpen( "", &db, O_RDONLY, 0644 ) != 0) {
        fprintf( stderr, "ERROR: Cannot open RPM database\n");
        exit(1);
    }

    mi = rpmdbInitIterator(db, RPMDBI_PACKAGES, NULL, 0);
    while ((h = rpmdbNextIterator(mi))) {
        headerGetEntry(h, RPMTAG_NAME, &type, (void **) &name, &count);
        printf("%s\n", name);
    }

    rpmdbFreeIterator(mi);
    rpmdbClose(db);

    exit(0);
}


It uses rpmdbOpen and rpmdbClose to open and close the RPM database, and an rpmdbMatchIterator to iterate through every package header in the RPM database, retrieving RPMTAG_NAME from each one.

This works for CentOS 5.6 but is not supported in Fedora 15. In RPM 4.9.0, almost all low-level rpmdb manipulation functions including the rpmdbOpen, rpmdbClose and rpmdbInitIterator routines used above were removed or internalized. See the RPM 4.9.0 Release Notes for full details.

The following example works on Fedora 15 and outputs a list of the name (RPMTAG_NAME) and size (RPMTAG_SIZE) of each installed package.

#include <stdio.h>
#include <stdlib.h>

#include <rpm/rpmlib.h>
#include <rpm/header.h>
#include <rpm/rpmts.h>
#include <rpm/rpmdb.h>

int
main()
{
    rpmts ts = NULL;
    Header h;
    rpmdbMatchIterator mi;
    rpmtd td, tn;
    char time_buffer[512];
    int rc1, rc2;

    td = rpmtdNew();
    tn = rpmtdNew();
    ts = rpmtsCreate();

    rpmReadConfigFiles( NULL, NULL );

    mi = rpmtsInitIterator( ts, RPMDBI_PACKAGES, NULL, 0);
    while (NULL != (h = rpmdbNextIterator(mi))) {

        h = headerLink(h);
        rc1 = headerGet(h, RPMTAG_NAME, tn, HEADERGET_EXT);
        rc2 = headerGet(h, RPMTAG_SIZE, td, HEADERGET_EXT);

        // output installed package name and size
        fprintf(stdout, "%s (%llu)\n", rpmtdGetString(tn), (unsigned long long) rpmtdGetNumber(td));

        rpmtdReset(td);
        rpmtdReset(tn);
        headerFree(h);
    }

    rpmdbFreeIterator(mi);
    rpmtsFree(ts);

    exit(0);
}


The following example shows how to print out more information about each of the installed packages on your system.

#include <stdio.h>
#include <stdlib.h>

#include <rpm/rpmlib.h>
#include <rpm/header.h>
#include <rpm/rpmts.h>
#include <rpm/rpmdb.h>

int
main(int argc, char *argv[])
{
    rpmts ts = NULL;
    Header h;
    rpmdbMatchIterator mi;
    char *n, *v, *r, *g, *a;

    ts = rpmtsCreate();
    rpmReadConfigFiles( NULL, NULL );

    mi = rpmtsInitIterator( ts, RPMDBI_PACKAGES, NULL, 0);
    while (NULL != (h = rpmdbNextIterator(mi))) {
        h = headerLink( h );
        headerGetEntry( h, RPMTAG_NAME, NULL, (void**)&n, NULL);
        headerGetEntry( h, RPMTAG_VERSION, NULL, (void**)&v, NULL);
        headerGetEntry( h, RPMTAG_RELEASE, NULL, (void**)&r, NULL);
        headerGetEntry( h, RPMTAG_GROUP, NULL, (void**)&g, NULL);
        headerGetEntry( h, RPMTAG_ARCH, NULL, (void**)&a, NULL);

        fprintf(stdout, "%s-%s-%s.%s\n", n, v, r, a);

        headerFree(h);
    }
    rpmdbFreeIterator(mi);
    rpmtsFree(ts);

    exit(0);
}


This example works on both CentOS 5.6 and Fedora 15. Here is some sample output:

iso-codes-0.53-1.noarch
zlib-1.2.3-3.x86_64
libstdc++-4.1.2-50.el5.x86_64
db4-4.3.29-10.el5_5.2.x86_64
info-4.8-14.el5.x86_64
gawk-3.1.5-14.el5.x86_64
libgcrypt-1.4.4-5.el5.x86_64
libfontenc-1.0.2-2.2.el5.x86_64
libieee1284-0.2.9-4.el5.x86_64
grep-2.5.1-55.el5.x86_64
....


By the way, RPM has Python, Perl and Lua support. Here is the equivalent code written in Python:

#!/usr/bin/python

import rpm

ts=rpm.ts()

mi=ts.dbMatch()
for hdr in mi:
    print "%s-%s-%s.%s" % (hdr['name'], hdr['version'], hdr['release'], hdr['arch'])


As you can see Python can greatly simplify things when you wish to work with RPM packages.

The following example demonstrates how to print out a number of tags in XML format for each installed package.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <inttypes.h>

#include <rpm/rpmlib.h>
#include <rpm/header.h>
#include <rpm/rpmts.h>
#include <rpm/rpmdb.h>


struct tag {
   int  tagno;
   char *element;
};

struct tag tags[] = {
   { RPMTAG_NAME, "Name"},
   { RPMTAG_VERSION, "Version" },
   { RPMTAG_RELEASE, "Release" },
   { RPMTAG_SUMMARY, "Summary" },
   { RPMTAG_BUILDTIME, "BuildTime"},
   { RPMTAG_BUILDHOST, "BuildHost" },
   { RPMTAG_INSTALLTIME, "InstallTime" },
   { RPMTAG_SIZE, "Size" },
   { RPMTAG_LICENSE, "License"},
   { RPMTAG_URL, "SourceUrl" },
   { RPMTAG_PAYLOADFORMAT, "PayloadFormat" },
   { RPMTAG_PAYLOADCOMPRESSOR, "PayloadCompressor" }
};

#define NTAGS (sizeof(tags)/sizeof(struct tag))

int
main(int argc, char *argv[])
{
    rpmdbMatchIterator mi;
    rpmts ts = NULL;
    rpmtd td;
    Header h;
    int i;

    td = rpmtdNew();
    ts = rpmtsCreate();
    rpmReadConfigFiles(NULL, NULL);

    printf("<InstalledPackages>\n");
    mi = rpmtsInitIterator( ts, RPMDBI_PACKAGES, NULL, 0);
    while (NULL != (h = rpmdbNextIterator(mi))) {
        h = headerLink(h);
        printf("   <Package>\n");
        for ( i = 0; i < NTAGS; i++) {
            headerGet(h, (rpm_tag_t)tags[i].tagno, td, HEADERGET_ALLOC | HEADERGET_EXT);
            if (td->data) {
                switch(td->type) {
                    case RPM_NULL_TYPE:
                        break;
                    case RPM_CHAR_TYPE:
                        printf("      <%s>%s</%s>\n", tags[i].element, td->data, tags[i].element);
                        break;
                    case RPM_INT8_TYPE:
                    case RPM_INT16_TYPE:
                        printf("      <%s>%d</%s>\n", tags[i].element, td->data, tags[i].element);
                        break;
                    case RPM_INT32_TYPE:
                        if ((tags[i].tagno == RPMTAG_BUILDTIME) ||
                           (tags[i].tagno == RPMTAG_INSTALLTIME)) {
                             printf("      <%s>%s</%s>\n", tags[i].element, rpmtdFormat(td, RPMTD_FORMAT_DATE , NULL), tags[i].element);
                        } else if (tags[i].tagno == RPMTAG_SIZE) {
                             printf("      <%s>%" PRIu64 "</%s>\n", tags[i].element, rpmtdGetNumber(td), tags[i].element);
                        } else {
                             printf("      <%s>%u</%s>\n", tags[i].element, td->data, tags[i].element);
                        }
                        break;
                    case RPM_INT64_TYPE:
                        printf("      <%s>%" PRIu64 "</%s>\n", tags[i].element, rpmtdGetNumber(td), tags[i].element);
                        break;
                    case RPM_STRING_TYPE:
                        printf("      <%s>%s</%s>\n", tags[i].element, rpmtdGetString(td), tags[i].element);
                        break;
                    case RPM_BIN_TYPE:
                        printf("      <%s>%x</%s>\n", tags[i].element, td->data, tags[i].element);
                        break;
                    case RPM_STRING_ARRAY_TYPE:
                    default:
                        break;
                }
            }

            rpmtdReset(td);
        }
        headerFree(h);

        printf("   </Package>\n");
    }
    printf("</InstalledPackages>\n");

    rpmdbFreeIterator(mi);
    rpmtsFree(ts);

    exit(0);
}


This works on Fedora 15. Here is example output:

<InstalledPackages>
   <Package>
      <Name>file-roller</Name>
      <Version>3.0.2</Version>
      <Release>1.fc15</Release>
      <Summary>Tool for viewing and creating archives</Summary>
      <BuildTime>Wed May 25 19:57:10 2011</BuildTime>
      <BuildHost>x86-06.phx2.fedoraproject.org</BuildHost>
      <InstallTime>Thu Jun  2 17:45:29 2011</InstallTime>
      <Size>5928015<Size>
      <License>GPLv2+</License>
      <SourceUrl>http://download.gnome.org/sources/file-roller/</SourceUrl>
      <PayloadFormat>cpio</PayloadFormat>
      <PayloadCompressor>xz</PayloadCompressor>
   </Package>
   <Package>
      <Name>expect</Name>
      <Version>5.45</Version>
      <Release>3.fc15</Release>
      <Summary>A program-script interaction and testing utility</Summary>
      <BuildTime>Wed Mar 16 09:58:49 2011</BuildTime>
      <BuildHost>x86-12.phx2.fedoraproject.org</BuildHost>
      <InstallTime>Mon Jun 13 10:24:25 2011</InstallTime>
      <Size>559676<Size>
      <License>Public Domain</License>
      <SourceUrl>http://expect.nist.gov/</SourceUrl>
      <PayloadFormat>cpio</PayloadFormat>
      <PayloadCompressor>xz</PayloadCompressor>
   </Package>
   .....
   <Package>
      <Name>libxkbfile-devel</Name>
      <Version>1.0.7</Version>
      <Release>2.fc15</Release>
      <Summary>X.Org X11 libxkbfile development package</Summary>
      <BuildTime>Tue Feb  8 08:03:57 2011</BuildTime>
      <BuildHost>x86-13.phx2.fedoraproject.org</BuildHost>
      <InstallTime>Thu Jun  2 17:56:48 2011</InstallTime>
      <Size>38055<Size>
      <License>MIT</License>
      <SourceUrl>http://www.x.org</SourceUrl>
      <PayloadFormat>cpio</PayloadFormat>
      <PayloadCompressor>xz</PayloadCompressor>
   </Package>
</InstalledPackages>
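One thing to watch with hand-written printf calls like those in the C listing is that nothing escapes characters such as & or < in a summary, and it is easy to get a closing tag wrong. As a rough alternative, here is a Python 3 sketch that builds the same document shape with the standard xml.etree.ElementTree module, which handles both. The function name packages_to_xml is hypothetical. The first sample package reuses values from the output above and the second is invented purely to show escaping; neither is read from the RPM database.

```python
import xml.etree.ElementTree as ET


def packages_to_xml(packages):
    """Build an <InstalledPackages> document from (element, value) pairs."""
    root = ET.Element("InstalledPackages")
    for fields in packages:
        package = ET.SubElement(root, "Package")
        for element, value in fields:
            # ElementTree escapes &, < and > in text when serializing.
            ET.SubElement(package, element).text = str(value)
    return ET.tostring(root, encoding="unicode")


# The second entry is hypothetical and exists only to demonstrate escaping.
sample = [
    [("Name", "file-roller"), ("Version", "3.0.2"), ("Size", 5928015)],
    [("Name", "example"), ("Summary", "Tools & <utilities>")],
]

if __name__ == "__main__":
    print(packages_to_xml(sample))
```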


Note that the version of RPM on CentOS 5.6 does not support RPM_INT64_TYPE. As a result, dates and suchlike are stored as RPM_INT32_TYPE.

Let us now turn to examining individual RPM package files. The following simple example shows you how to query a package file:

#include <stdio.h>
#include <stdlib.h>

#include <rpm/rpmlib.h>
#include <rpm/header.h>
#include <rpm/rpmts.h>
#include <rpm/rpmdb.h>
#include <rpm/rpmlog.h>

int
main(int argc, char *argv[])
{
    int i;
    rpmts ts;

    FD_t fd;
    rpmRC rc;
    Header hdr;
    char *pkg_name, *pkg_version, *pkg_release;
    rpmVSFlags vsflags = 0;

    rc = rpmReadConfigFiles(NULL, NULL);
    if (rc != RPMRC_OK) {
        rpmlog(RPMLOG_NOTICE, "Unable to read RPM configuration.\n");
        exit(1);
    }

    fd = Fopen(argv[1], "r.ufdio");
    if ((!fd) || Ferror(fd)) {
       rpmlog(RPMLOG_NOTICE, "Failed to open package file (%s)\n", Fstrerror(fd));
       if (fd) {
           Fclose(fd);
       }
       exit(1);
    }

    ts = rpmtsCreate();

    vsflags |= _RPMVSF_NODIGESTS;
    vsflags |= _RPMVSF_NOSIGNATURES;
    vsflags |= RPMVSF_NOHDRCHK;
    (void) rpmtsSetVSFlags(ts, vsflags);

    rc = rpmReadPackageFile(ts, fd, argv[1], &hdr);
    if (rc != RPMRC_OK) {
        rpmlog(RPMLOG_NOTICE, "Could not read package file\n");
        Fclose(fd);
        exit(1);
    }
    Fclose(fd);

    if (headerNVR(hdr, (const char **) &pkg_name,
                       (const char **) &pkg_version,
                       (const char **) &pkg_release))
    {
         rpmlog(RPMLOG_NOTICE, "Header read failed\n");
    } else {
         printf("Package is: %s-%s-%s\n", pkg_name, pkg_version, pkg_release);
         headerFreeData(pkg_name, RPM_STRING_TYPE);
         headerFreeData(pkg_version, RPM_STRING_TYPE);
         headerFreeData(pkg_release, RPM_STRING_TYPE);
    }

    headerFree(hdr);
    rpmtsFree(ts);

    exit(0);
}


You can compile the above example using gcc -o example example.c -lrpm -lrpmio. This compiles with no errors or warnings on CentOS 5.6 but issues a warning that headerNVR is deprecated on Fedora 15.

Suppose you want to find out the list of requirements a particular installed RPM package has. One way is to use rpm -qR with the package name. Another way is to use find-requires or rpmdeps as shown below:

$ rpm -ql rpm-devel | /usr/lib/rpm/find-requires 
libacl.so.1()(64bit)
libbz2.so.1()(64bit)
libcap.so.2()(64bit)
libc.so.6()(64bit)
libc.so.6(GLIBC_2.14)(64bit)
libc.so.6(GLIBC_2.2.5)(64bit)
libc.so.6(GLIBC_2.3.4)(64bit)
libdb-4.8.so()(64bit)
libdl.so.2()(64bit)
libelf.so.1()(64bit)
liblua-5.1.so()(64bit)
liblzma.so.5()(64bit)
libm.so.6()(64bit)
libnss3.so()(64bit)
libpopt.so.0()(64bit)
libpthread.so.0()(64bit)
librpmio.so.2()(64bit)
librpm.so.2()(64bit)
librt.so.1()(64bit)
libselinux.so.1()(64bit)
libz.so.1()(64bit)

$ rpm -ql rpm-devel | /usr/lib/rpm/rpmdeps -R
/usr/bin/pkg-config
libacl.so.1()(64bit)
libbz2.so.1()(64bit)
libc.so.6()(64bit)
libc.so.6(GLIBC_2.14)(64bit)
libc.so.6(GLIBC_2.2.5)(64bit)
libc.so.6(GLIBC_2.3.4)(64bit)
libcap.so.2()(64bit)
libdb-4.8.so()(64bit)
libdl.so.2()(64bit)
libelf.so.1()(64bit)
liblua-5.1.so()(64bit)
liblzma.so.5()(64bit)
libm.so.6()(64bit)
libnss3.so()(64bit)
libpopt.so.0()(64bit)
libpopt.so.0(LIBPOPT_0)(64bit)
libpthread.so.0()(64bit)
librpm.so.2()(64bit)
librpmbuild.so.2()(64bit)
librpmio.so.2()(64bit)
librpmsign.so.0()(64bit)
librt.so.1()(64bit)
libselinux.so.1()(64bit)
libz.so.1()(64bit)
rtld(GNU_HASH)


Here is an example of a C program that will print out the equivalent information together with installed files, conflicts and obsoletes:

#include <stdio.h>
#include <stdlib.h>

#include <rpm/rpmlib.h>
#include <rpm/rpmds.h>
#include <rpm/rpmts.h>
#include <rpm/rpmdb.h>


int
main(int argc, char *argv[])
{
    const char *DNEVR;
    rpmdbMatchIterator mi;
    rpmds ds = NULL;
    Header h;
    rpmtd td_name, td_version, td_release, td_size, td_group, td_installtime;
    rpmts ts = NULL;
    rpmfi fi;

    if (argc != 2) {
        fprintf(stderr, "ERROR: No RPM specified on command line.\n");
        exit(1);
    }

    td_name = rpmtdNew();
    td_version = rpmtdNew();
    td_release = rpmtdNew();
    td_size = rpmtdNew();
    td_group = rpmtdNew();
    td_installtime = rpmtdNew();

    ts = rpmtsCreate();

    rpmReadConfigFiles(NULL, NULL);

    mi = rpmtsInitIterator(ts, RPMTAG_NAME, argv[1], 0);
    if (NULL != (h = rpmdbNextIterator(mi))) {
        h = headerLink(h);
        headerGet(h, RPMTAG_NAME, td_name, HEADERGET_EXT);
        headerGet(h, RPMTAG_VERSION, td_version, HEADERGET_EXT);
        headerGet(h, RPMTAG_RELEASE, td_release, HEADERGET_EXT);
        headerGet(h, RPMTAG_SIZE, td_size, HEADERGET_EXT);
        headerGet(h, RPMTAG_GROUP, td_group, HEADERGET_EXT);
        headerGet(h, RPMTAG_INSTALLTIME, td_installtime, HEADERGET_EXT);

        printf("%-20s: %s-%s-%s\n", "Package", rpmtdGetString(td_name), rpmtdGetString(td_version), rpmtdGetString(td_release));
        printf("%-20s: %s\n", "Group", rpmtdGetString(td_group));
        printf("%-20s: %llu\n", "Size", (unsigned long long) rpmtdGetNumber(td_size));
        printf("%-20s: %s\n", "Installed on", rpmtdFormat(td_installtime, RPMTD_FORMAT_DATE, NULL));

        fi = rpmfiNew(NULL, h, RPMTAG_BASENAMES, RPMFI_KEEPHEADER);
        if (fi) {
            fprintf(stdout, "\nFiles Provided:\n");
            while (rpmfiNext(fi) != -1)
                 fprintf(stdout, "  %s\n", rpmfiFN(fi));
            fi = rpmfiFree(fi);
        }

#if EXTRA
        ds = rpmdsNew(h, RPMTAG_PROVIDENAME, 0);
        if (ds) {
            fprintf(stdout, "\nProvides:\n");
            while (rpmdsNext(ds) >= 0) {
                 if ((DNEVR = rpmdsDNEVR(ds)) != NULL)
                    fprintf(stdout, "  %s\n", DNEVR + 1);
            }
            ds = rpmdsFree(ds);
        }
#endif

        ds = rpmdsNew(h, RPMTAG_REQUIRENAME, 0);
        if (ds) {
            fprintf(stdout, "\nRequires:\n");
            while (rpmdsNext(ds) >= 0) {
                 if ((DNEVR = rpmdsDNEVR(ds)) != NULL)
                    fprintf(stdout, "  %s\n", DNEVR + 1);
            }
            ds = rpmdsFree(ds);
        }

        ds = rpmdsNew(h, RPMTAG_OBSOLETENAME, 0);
        if (ds) {
            fprintf(stdout, "\nObsoletes:\n");
            while (rpmdsNext(ds) >= 0) {
                 if ((DNEVR = rpmdsDNEVR(ds)) != NULL)
                    fprintf(stdout, "  %s\n", DNEVR + 1);
            }
            ds = rpmdsFree(ds);
        }

        ds = rpmdsNew(h, RPMTAG_CONFLICTNAME, 0);
        if (ds) {
            fprintf(stdout, "\nConflicts:\n");
            while (rpmdsNext(ds) >= 0) {
                 if ((DNEVR = rpmdsDNEVR(ds)) != NULL)
                    fprintf(stdout, "  %s\n", DNEVR + 1);
            }
            ds = rpmdsFree(ds);
        }

        headerFree(h);
    }

    rpmdbFreeIterator(mi);
    rpmtsFree(ts);

    exit(0);
}


This works on Fedora 15 but not on CentOS 5.6. However, it is relatively easy to modify to get it to work on CentOS 5.6 and thus I will leave that as an exercise for you. The main changes relate to the use of the rpmtd type. These need to be eliminated as the version of RPM on CentOS 5.6 does not support the rpmtd type.

Here is the output for the xorg-x11-xkb-utils package:

./rpminfo.py xorg-x11-xkb-utils
Package             : xorg-x11-xkb-utils-7.5-3.fc15
Group               : User Interface/X
Size                : 199629
Installed on        : Thu Jun  2 17:39:31 2011

Files Provided:
  /usr/bin/setxkbmap
  /usr/bin/xkbcomp
  /usr/share/man/man1/setxkbmap.1.gz
  /usr/share/man/man1/xkbcomp.1.gz

Requires:
   libX11.so.6()(64bit)
   libc.so.6()(64bit)
   libc.so.6(GLIBC_2.2.5)(64bit)
   libc.so.6(GLIBC_2.3)(64bit)
   libc.so.6(GLIBC_2.3.4)(64bit)
   libc.so.6(GLIBC_2.4)(64bit)
   libc.so.6(GLIBC_2.7)(64bit)
   libxkbfile.so.1()(64bit)
   rpmlib(CompressedFileNames) <= 3.0.4-1
   rpmlib(FileDigests) <= 4.6.0-1
   rpmlib(PayloadFilesHavePrefix) <= 4.0-1
   rtld(GNU_HASH)
   rpmlib(PayloadIsXz) <= 5.2-1

Obsoletes:
   XFree86
   xorg-x11


Here is how to do the same thing using Python. It is loosely based on the example given in listing 17-3 in the Fedora Project RPM Guide which, by the way, does not work and, I suspect, never did! The output produced by this script is somewhat more comprehensive than that of the C version even though the script has fewer lines of code.

#!/usr/bin/python

import rpm, sys

def stringfromds(ds):
    retlist=[]
    for dataset in ds:
        t=dataset[0]
        values=t.split(" ")[1:]
        retlist.append(" ".join(values))
    return retlist

def printEntry(header, label, format, extra):
    value = header.sprintf(format).strip()
    print "%-20s: %s %s" % (label, value, extra)

def printHeader(h):
    if h[rpm.RPMTAG_SOURCEPACKAGE]:
        extra = " source package"
    else:
        extra = " binary package"

    printEntry(h, 'Package', "%{NAME}-%{VERSION}-%{RELEASE}", extra)
    printEntry(h, 'Group', "%{GROUP}", '')
    printEntry(h, 'Summary', "%{Summary}", '')
    printEntry(h, 'Arch-OS-Platform', "%{ARCH}-%{OS}-%{PLATFORM}", '')
    printEntry(h, 'Vendor', "%{Vendor}", '')
    printEntry(h, 'URL', "%{URL}", '')
    printEntry(h, 'Size', "%{Size}", '')
    printEntry(h, 'Installed on', "%{INSTALLTID:date}", '')
    print "%-20s: %s" % ("Description", h['Description'])

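    print "\nFiles Provided:"
    # Each file entry is a tuple: index 0 is the path, index 1 the size
    # in bytes and index 12 the file digest.
    for fi in h.fiFromHeader():
        print "  ", fi[0], "  ", fi[1], "  ", fi[12]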

    ds = rpm.ds(h, 'requires')
    if ds:
        print "\nRequires:"
        for d in stringfromds(ds):
            print "  ", d

    ds = rpm.ds(h, 'obsoletes')
    if ds:
        print "\nObsoletes:"
        for d in stringfromds(ds):
            print "  ", d

    ds = rpm.ds(h, 'conflicts')
    if ds:
        print "\nConflicts:"
        for d in stringfromds(ds):
            print "  ", d


def main(argv):
    ts = rpm.TransactionSet()
    for h in ts.dbMatch( 'name', argv[1]):
        printHeader(h)

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print "ERROR: No RPM specified on command line."
        sys.exit(1)
    else:
        main(sys.argv)


Here is the output for the xorg-x11-xkb-utils package:

$ ./rpminfo.py xorg-x11-xkb-utils
Package             : xorg-x11-xkb-utils-7.5-3.fc15  binary package
Group               : User Interface/X 
Summary             : X.Org X11 xkb utilities 
Arch-OS-Platform    : x86_64-linux-x86_64-redhat-linux-gnu 
Vendor              : Fedora Project 
URL                 : http://www.x.org 
Size                : 199629 
Installed on        : Thu Jun  2 17:38:11 2011 
Description         : X.Org X11 xkb core utilities

Files Provided:
   /usr/bin/setxkbmap    19224    e5e5757d15ca331474c43a0319d0307920fb17b3ed9d04c041e13b56c4aa08e4
   /usr/bin/xkbcomp    176960    e6b265d05cd6432859d7a24fb8f6d6ccec3cbd668abd54afd9606322d3dd9a9e
   /usr/share/man/man1/setxkbmap.1.gz    1753    4a91c43a425699e4209880cf80a28eeae565763ece6cffaa8308a92139fd3250
   /usr/share/man/man1/xkbcomp.1.gz    1692    29d16f6c864bf62b7f3e3e0400e121def3f85e6d64676dffb6d92c54503a0c76

Requires:
   libX11.so.6()(64bit)
   libc.so.6()(64bit)
   libc.so.6(GLIBC_2.2.5)(64bit)
   libc.so.6(GLIBC_2.3)(64bit)
   libc.so.6(GLIBC_2.3.4)(64bit)
   libc.so.6(GLIBC_2.4)(64bit)
   libc.so.6(GLIBC_2.7)(64bit)
   libxkbfile.so.1()(64bit)
   rpmlib(CompressedFileNames) <= 3.0.4-1
   rpmlib(FileDigests) <= 4.6.0-1
   rpmlib(PayloadFilesHavePrefix) <= 4.0-1
   rtld(GNU_HASH)
   rpmlib(PayloadIsXz) <= 5.2-1

Obsoletes:
   XFree86
   xorg-x11


Once again, you can see it is far easier to use Python than C when working with RPM package internals.
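
One thing the Python output adds over the C version is a digest for each file. The values shown are 64 hexadecimal characters long, which is consistent with SHA-256, so the following Python 3 sketch assumes that algorithm. That is my assumption based on the length alone; packages built with a different file digest algorithm, such as the MD5 used by older packages, would need a different hash function. The helper names are hypothetical and the sketch uses only the standard library, with a throwaway temporary file standing in for an installed file.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_digest(path, recorded_hex):
    """Compare a file on disk with a digest string such as those listed above."""
    return sha256_of(path) == recorded_hex.lower()


# Demonstration on a throwaway file rather than a real installed file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello rpm\n")
    demo_path = tmp.name

recorded = hashlib.sha256(b"hello rpm\n").hexdigest()
print(matches_recorded_digest(demo_path, recorded))  # prints True
os.unlink(demo_path)
```

To check a real file, you would pass its path and the digest string printed next to it in the Files Provided list.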

Before I sign off on this post, I have to say that the Linux world does not need two different development communities around RPM. Yes, competition between the two communities has led to some improvements in the RPM Package Manager, but it has also led to well-known, documented APIs such as rpmdbOpen and rpmdbClose being removed in recent releases. These two communities should bury the hatchet and merge their efforts. Sometimes forking a community project is good and leads to really useful innovation. In this case, in my opinion, it is not!

I also have to say that the documentation for the RPM library APIs is frankly atrocious, and both RPM developer communities seem to make no serious attempt to inform users of their libraries about major API changes, other than obscure references in their respective release notes. This needs to be corrected. A step in the right direction would be to provide simple examples of API usage whenever a new API is introduced or an old API is removed.

Enjoy!

P.S. 1/17/2017: Mandriva no longer exists as a distribution. It ran into financial difficulties in 2015. A number of Mandriva forks, including OpenMandriva and Mageia, still exist.

1 comment to Programmatically Retrieve RPM Package Details

  • shipr

    Should one of the parties fully document their API, this would go a very long way toward making the other obsolete, since this would make it possible for developers to write code against that standard while it would remain difficult to do against the other.

    How about it, folks? Can we have some documentation?