Thursday, October 06, 2016 • Estimated Reading Time: 6 minutes
The python-bugzilla library is used to interact with Bugzilla. It's not tied
to any specific Bugzilla instance and works across various versions. It's not
the most intuitive library to use and the documentation is pretty much
non-existent. What follows are a few gotchas that I've been bitten by
when using this library.
As a prerequisite, let's initialize a Bugzilla object to work with throughout the examples:
>>> from bugzilla import RHBugzilla
>>> url = 'https://bugzilla.redhat.com'
>>> # u and p are placeholders for your Bugzilla username and password
>>> bz = RHBugzilla(url=url, user=u, password=p)
Note: in these examples I'm using Red Hat's Bugzilla instance but they should apply to any other Bugzilla instance out there.
Let's say you want to query your Bugzilla instance for a set of bugs.
We can do that with the getbugs() call:
>>> bz.getbugs([1291176, 1312219])
[<Bug #1291176 on https://bugzilla.redhat.com/xmlrpc.cgi at 0x2b50f50>,
<Bug #1312219 on https://bugzilla.redhat.com/xmlrpc.cgi at 0x2b50190>]
The returned bugs include a whole lot of information that we may not need
(you can run dir() on one of the returned bugs to see all the data it
includes). That overhead matters most when you're pulling information for a
large set of bugs. We can limit the data that is returned with the
include_fields parameter:
>>> res = bz.getbugs([1291176, 1312219], include_fields=['id', 'status'])
>>> res[0].status
'CLOSED'
Gotcha #1: you may be inclined to think that if you only care about the
statuses of the queried bugs, you can limit the include_fields parameter
to only return status. However, doing so results in a not-so-helpful
KeyError: 'id' error. This is because the id field is a mandatory value.
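One way to avoid tripping over Gotcha #1 is to never build the include_fields list by hand. Here's a minimal sketch of that idea; with_required_fields is a hypothetical helper of mine, not part of python-bugzilla, and it only manipulates a plain list:

```python
def with_required_fields(fields, required=("id",)):
    """Return a field list that always contains the required fields.

    Hypothetical helper, not part of python-bugzilla. The required
    fields come first, followed by the requested ones, without duplicates.
    """
    result = list(required)
    for name in fields:
        if name not in result:
            result.append(name)
    return result


fields = with_required_fields(["status"])
# fields is now ['id', 'status'] and can be passed along, e.g.:
# bz.getbugs([1291176, 1312219], include_fields=fields)
```

The result can be passed straight to getbugs() as include_fields, so forgetting id is no longer possible.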
Gotcha #2: if for some reason you try to access an attribute of a bug that
was not originally queried for, you may be surprised that the operation
succeeds and the value is returned. What's the point of include_fields
then, you may ask? The fact is that when you ask a bug for an attribute it
doesn't have loaded, it requests it from Bugzilla for you (along with the
rest of the attributes) and returns it as if nothing happened:
>>> res = bz.getbugs([1291176, 1312219], include_fields=['status', 'id'])
>>> 'priority' in dir(res[0]), len(dir(res[0]))
(False, 52)
>>> res[0].priority
'high'
>>> 'priority' in dir(res[0]), len(dir(res[0]))
(True, 126)
Lesson learned: always double check that you're not using any attributes that you haven't initially requested. Otherwise, each access to an attribute that wasn't loaded triggers a separate request for that bug.
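If you'd rather have your script fail loudly than silently refetch, you can route attribute access through a small guard. This is only a sketch of the idea: make_strict_getter is a hypothetical helper, and the SimpleNamespace object below is a stand-in for a real bug so the example runs without a Bugzilla connection:

```python
from types import SimpleNamespace


def make_strict_getter(requested_fields):
    """Build a getter that refuses fields missing from include_fields.

    Hypothetical helper, not part of python-bugzilla.
    """
    requested = frozenset(requested_fields)

    def get(bug, name):
        # Raise before touching the bug, so no lazy refresh can happen.
        if name not in requested:
            raise AttributeError("%r was not in include_fields" % name)
        return getattr(bug, name)

    return get


# Stand-in for a bug returned by getbugs(); the values are made up.
fake_bug = SimpleNamespace(id=1291176, status="CLOSED")

get = make_strict_getter(["id", "status"])
status = get(fake_bug, "status")   # fine, the field was requested
# get(fake_bug, "priority") would raise AttributeError
```

The guard checks the name before it ever touches the bug object, so an attribute you forgot to request shows up as an error instead of as extra traffic.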
If you're interested in querying your Bugzilla instance for comment #0 (the
initial comment, also called Description) for a large number of bugs, it's
smart to do so with a single request rather than iterating over a list of
bugs one by one. The get_comments() function allows us to do just that:
>>> res = bz.get_comments([1291176, 1312219])
>>> type(res)
<type 'dict'>
Note that the returned object is a dictionary with two keys: bugs and
comments. I have yet to figure out what the comments dictionary is
supposed to contain (hint: not comments, those are in bugs). The bugs
dictionary contains the queried bugs and their comments; each comment is
accompanied by its metadata such as creation time, author, text, etc.
To get to comment #0, use:
>>> res['bugs']['1291176']['comments'][0]['text']
'The following flaw, reported by...
Gotcha #3: Curiously, bugs exist that do not have comment #0! For example,
a bug in Novell's Bugzilla instance: 873992. Thus, you should always check
if the returned res['bugs'] is an empty dictionary before trying to extract
any comments out of it.
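A defensive way to read comment #0 is to walk the response with .get() calls and fall back to a default when something is missing. The sketch below assumes the response layout shown earlier (res['bugs'][bug_id]['comments']); first_comment_text is a hypothetical helper and the sample response is made up:

```python
def first_comment_text(response, bug_id, default=None):
    """Return the text of comment #0 for bug_id, or default if absent.

    Hypothetical helper; it expects the dictionary layout returned by
    get_comments() as described above.
    """
    bug_entry = response.get("bugs", {}).get(str(bug_id), {})
    comments = bug_entry.get("comments") or []
    if not comments:
        return default
    return comments[0].get("text", default)


# Made-up response in the shape described above.
sample = {
    "bugs": {
        "1291176": {"comments": [{"text": "example description"}]},
        "1312219": {"comments": []},
    },
    "comments": {},
}

text = first_comment_text(sample, 1291176)
missing = first_comment_text(sample, 1312219, default="")
```

This covers an empty bugs dictionary, a bug ID that is missing from it, and a bug with an empty comment list.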
So since we have a fancy function that returns comments for a list of bugs,
let's try and run it on a thousand bugs! Uh, no... I quickly learned that
literally no Bugzilla instance can handle a response that big and even if
it could, it would take a long, long time. If you find yourself wanting to
pull a large number of comments, you can split your load across several
requests (adjust bulk as needed):
bulk = 100
comments = {}
# bugs is the full list of bug IDs to fetch comments for
for idx in range(0, len(bugs), bulk):
    res = bz.get_comments(bugs[idx:idx + bulk])['bugs']
    comments.update(res)
(Side note: please be mindful that you don't hammer a Bugzilla instance of your choice with non-stop requests that have ridiculously large responses like the one above.)
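The batching loop can also be wrapped into a reusable function. Here's a minimal sketch: chunked and collect_comments are hypothetical names, and the fetch argument is assumed to be something like bz.get_comments, i.e. a callable that takes a list of bug IDs and returns a dictionary with a bugs key:

```python
def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def collect_comments(fetch, bug_ids, bulk=100):
    """Call fetch once per batch and merge the per-bug results.

    fetch is assumed to behave like bz.get_comments: it takes a list of
    bug IDs and returns a dict with a 'bugs' key.
    """
    merged = {}
    for batch in chunked(bug_ids, bulk):
        merged.update(fetch(batch)["bugs"])
    return merged


# Usage with a real connection would look like:
# comments = collect_comments(bz.get_comments, bugs, bulk=100)
```

Passing the fetch function in as an argument keeps the batching logic testable without a live Bugzilla instance.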
If you need to build out a very specific bug query, it helps if you can use
the web UI to do this. You may end up with a list of bugs and a pretty long
URL. Now, if you want to use that same query in a script, you can use the
handy url_to_query() method to get the dictionary representation of that
query:
>>> url = ('https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW'
...        '&classification=Red%20Hat&component=openssl'
...        '&product=Red%20Hat%20Enterprise%20Linux%207&query_format=advanced')
>>> bz.url_to_query(url)
{'bug_status': 'NEW', 'product': 'Red Hat Enterprise Linux 7', 'component': 'openssl',
'query_format': 'advanced', 'classification': 'Red Hat'}
>>> bz.query(bz.url_to_query(url))
[<Bug #1255248...
Similarly, you can use the build_query() function to build out a
dictionary that can be passed to the query method. Very handy!
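To get a feel for what the URL-to-dictionary conversion amounts to, here is a rough standard-library approximation. It is not the library's implementation, just a sketch that assumes Python 3 and produces the same kind of dictionary for the example URL; query_dict_from_url is a hypothetical name:

```python
from urllib.parse import parse_qs, urlsplit


def query_dict_from_url(url):
    """Turn the query string of a buglist URL into a plain dictionary.

    Rough approximation for illustration only, not python-bugzilla code.
    Parameters that appear once map to a string, repeated ones to a list.
    """
    params = parse_qs(urlsplit(url).query)
    return {
        key: values[0] if len(values) == 1 else values
        for key, values in params.items()
    }


url = ('https://bugzilla.redhat.com/buglist.cgi?bug_status=NEW'
       '&classification=Red%20Hat&component=openssl'
       '&product=Red%20Hat%20Enterprise%20Linux%207&query_format=advanced')
query = query_dict_from_url(url)
```

In a real script you would still use url_to_query(); the sketch only shows that the long URL is nothing more than a set of key/value pairs.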