- 1 Introduction
- 2 Debugging using GDB
- 3 Finding file access errors using strace
- 4 Handling emerge Errors
- 5 Searching Using Bugzilla
- 6 Reporting Bugs
- 7 Working With Your Bug
- 8 Testing Ebuilds
- 9 Acknowledgements
One of the factors that delay a bug being fixed is the way it is reported. By creating this guide, we hope to help improve the communication between developers and users in bug resolution. Getting bugs fixed is an important, if not crucial part of the quality assurance for any project and hopefully this guide will help make that a success.
You are emerging a package or working with a program and suddenly the worst happens -- you hit a bug. Bugs come in many forms like emerge failures or segmentation faults. Whatever the cause, the fact still remains that such a bug must be fixed. Here are a few examples of such bugs.
- A run-time error
- An emerge failure
These errors can be quite troublesome. However, once you find them, what do you do? The following sections look at two important tools for handling run-time errors. After that, we'll take a look at compile errors and how to handle them. Let's start with the first tool for debugging run-time errors: gdb.
Debugging using GDB
GDB, or the (G)NU (D)e(B)ugger, is a program used to find run-time errors that normally involve memory corruption. First off, let's take a look at what debugging entails. One of the main things you must do in order to debug a program is to emerge the program with FEATURES="nostrip". This prevents the stripping of debug symbols. Why are programs stripped by default? The reason is the same as that for having gzipped man pages: saving space. Here's how the size of a program varies with and without debug symbol stripping.
Just for reference, bad_code is the program we'll be debugging with gdb later on. As you can see, the program without debugging symbols is 3140 bytes, while the program with them is 6374 bytes. That's close to double the size! Two more things can be done for debugging. The first is adding -ggdb to your CFLAGS and CXXFLAGS. This flag adds more debugging information than is generally included. We'll see what that means later on. This is how /etc/portage/make.conf might look with the newly added flags.
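As a minimal sketch, the debugging-related part of make.conf could look like the fragment below. The "-O2 -pipe" base flags are an assumption made for illustration; keep whatever flags you already use and append -ggdb to them.

```shell
# /etc/portage/make.conf (fragment)
# Assumption: "-O2 -pipe" stands in for your existing compiler flags.
CFLAGS="-O2 -pipe -ggdb"
CXXFLAGS="${CFLAGS}"
# Keep debug symbols in the installed binaries.
FEATURES="nostrip"
```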
Lastly, you can also add debug to the package's USE flags. This can be done with the package.use file:
Then we re-emerge the package with the modifications we've done so far as shown below.
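The two steps above can be sketched as follows. The atom sys-apps/foobar2 is a placeholder, and PACKAGE_USE is a variable invented for this example: it defaults to a scratch file so the sketch can be tried without root, and should point at /etc/portage/package.use on a real system.

```shell
# Assumption: PACKAGE_USE and ATOM are names made up for this sketch.
PACKAGE_USE="${PACKAGE_USE:-$(mktemp)}"
ATOM="sys-apps/foobar2"

# Enable the debug USE flag for this one package only.
echo "${ATOM} debug" >> "${PACKAGE_USE}"

# Then rebuild with symbols kept (as root, on a Gentoo system):
#   FEATURES="nostrip" emerge --oneshot sys-apps/foobar2
```

If package.use is a directory on your system, write the line to a file inside that directory instead.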
Now that debug symbols are set up, we can continue with debugging the program.
Running the program with GDB
Let's say we have a program here called "bad_code". Some person claims that the program crashes and provides an example. You go ahead and test it out:
It seems this person was right. Since the program is obviously broken, we have a bug at hand. Now it's time to use gdb to help solve this matter. First we run gdb with --args, then give it the full program with arguments as shown:
You should see a prompt that says "(gdb)" and waits for input. First, we have to run the program. We type in run at the prompt and receive a notice like:
Here we see the program starting, as well as a notification of SIGSEGV, or Segmentation Fault. This is GDB telling us that our program has crashed. It also gives the last function it could trace when the program crashed. However, this isn't too useful, as there could be multiple strcpy() calls in the program, making it hard for developers to find which one is causing the issue. In order to help them out, we do what's called a backtrace. A backtrace runs backwards through all the functions that were called during program execution, up to the function at fault. Functions that return (without causing a crash) will not show up on the backtrace. To get a backtrace, type bt at the (gdb) prompt. You will get something like this:
You can see the trace pattern clearly. main() is called first, followed by run_it(), and somewhere in run_it() lies the strcpy() at fault. Things such as this help developers narrow down problems. There are a few exceptions to the output. First off is forgetting to enable debug symbols with FEATURES="nostrip". With debug symbols stripped, the output looks something like this:
This backtrace contains a large number of ?? marks. This is because without debug symbols, gdb doesn't know how the program was run. Hence, it is crucial that debug symbols are not stripped. Now remember a while ago we mentioned the -ggdb flag. Let's see what the output looks like with the flag enabled:
Here we see that a lot more information is available for developers. Not only is function information displayed, but even the exact line numbers of the source files. This method is the most preferred if you can spare the extra space. Here's how much the file size varies between debug, strip, and -ggdb enabled programs.
As you can see, -ggdb adds about 13178 more bytes to the file size over the one with debugging symbols. However, as shown above, this increase in file size can be worth it when presenting debug information to developers. The backtrace can be saved to a file by copying and pasting from the terminal (on a non-X terminal, you can use gpm; to keep this guide simple, we recommend reading the gpm documentation to see how to copy and paste with it). Now that we're done with gdb, we can quit.
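Instead of copying from the terminal, the backtrace can also be written straight to a file by running gdb non-interactively. The following is a sketch, assuming a gdb that accepts the -batch and -ex options; save_backtrace and backtrace.txt are names made up for this example.

```shell
# save_backtrace OUTFILE PROGRAM [ARGS...]
# Runs PROGRAM under gdb, asks for a backtrace once it stops, and stores
# everything gdb printed in OUTFILE.
save_backtrace() {
    outfile="$1"
    shift
    gdb -batch -ex run -ex bt --args "$@" > "${outfile}" 2>&1
}

# Usage (the argument is a placeholder for whatever triggers the crash):
#   save_backtrace backtrace.txt ./bad_code some_argument
```

The resulting file can then be attached to the bug report as is.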
This ends the walk-through of gdb. We hope that you will be able to use gdb to create better bug reports. However, there are other types of errors that can cause a program to fail during run time. One of them is improper file access. We can find those using a nifty little tool called strace.
Finding file access errors using strace
Programs often use files to fetch configuration information, access hardware or write logs. Sometimes, a program attempts to reach such files incorrectly. A tool called strace was created to help deal with this. strace traces system calls (hence the name), which include calls that use memory and files. For our example, we're going to take a program called foobar2. This is an updated version of foobar. However, during the changeover to foobar2, you notice all your configuration is missing! In foobar version 1, you had it set up to say "foo", but now it's using the default "bar".
Our previous configuration specifically had it set to foo, so let's use strace to find out what's going on.
Using strace to track the issue
Let's have strace log the results of the system calls. To do this, we run strace with the -o[file] argument. Let's use it on foobar2 as shown.
This creates a file called strace.log in the current directory. We check the file, and shown below are the relevant parts from the file.
Aha! So there's the problem. Someone moved the configuration directory to .foobar2 instead of .foobar. We also see the program reading in "bar" as it should. In this case, we can recommend that the ebuild maintainer add a warning about it. For now though, we can copy over the config file from .foobar and modify it to produce the correct results.
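The tracing and the search through the log can be sketched as two small helpers. The function names are made up for this example; only the -o option of strace is relied on.

```shell
# trace_to_log LOGFILE PROGRAM [ARGS...]
# Records the system calls made by PROGRAM in LOGFILE.
trace_to_log() {
    logfile="$1"
    shift
    strace -o "${logfile}" "$@"
}

# calls_mentioning TEXT LOGFILE
# Lists every logged call whose arguments contain TEXT, for example part
# of a configuration path.
calls_mentioning() {
    grep -F -- "$1" "$2"
}

# Usage (foobar2 is the example program from the text):
#   trace_to_log strace.log foobar2
#   calls_mentioning foobar strace.log
```

Searching for a fragment of the expected path keeps the output short, which is how an unexpected directory such as .foobar2 stands out.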
Now we've taken care of finding run-time bugs. These bugs prove to be problematic when you try to run your programs. However, run-time errors are the least of your concerns if your program won't compile at all. Let's take a look at how to address emerge compile errors.
Handling emerge Errors
emerge errors, such as the one displayed earlier, can be a major cause of frustration for users. Reporting them is considered crucial for maintaining the health of Gentoo. Let's take a look at a sample ebuild, foobar2, which contains some build errors.
Evaluating emerge Errors
Let's take a look at this very simple emerge error:
The program is compiling smoothly when it suddenly stops and presents an error message. This particular error can be split into three sections: the compile messages, the build error, and the emerge error message, as shown below.
The compilation messages are what lead up to the error. In most cases, it's good to include at least 10 lines of compile information so that the developer knows where the compilation was when the error occurred.
Please make sure you always include error messages in English, even when your system language is set to something else. You can temporarily switch to the English locale by prepending LC_ALL=C to the emerge command like this:
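A sketch of the locale override, wrapped in a helper so it is not forgotten. emerge_english is a made-up name and the atom is a placeholder.

```shell
# emerge_english ARGS...
# Forwards its arguments to emerge with the locale forced to C, so that
# compiler and build messages come out in English.
emerge_english() {
    LC_ALL=C emerge "$@"
}

# Usage:
#   emerge_english sys-apps/foobar2
# Equivalent one-off form:
#   LC_ALL=C emerge sys-apps/foobar2
```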
Make errors are the actual error and the information the developer needs. When you see "make: ***", this is often where the error has occurred. Normally, you can copy and paste 10 lines above it and the developer will be able to address the issue. However, this may not always work and we'll take a look at an alternative shortly.
The emerge error is what emerge throws out as an error. Sometimes, this might also contain some important information. Often people make the mistake of posting the emerge error and nothing else. This is useless by itself, but together with the make error and compile information, a developer can tell which application and which version of the package is failing. As a side note, make is commonly used as the build process for programs (but not always). If you can't find a "make: ***" error anywhere, then simply copy and paste 20 lines before the emerge error. This should take care of almost all build system error messages. Now let's say the errors seem to be quite large. 10 lines won't be enough to catch everything. That's where PORT_LOGDIR comes into play.
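If the build output has been saved to a file, the "make: ***" line and the lines leading up to it can be pulled out with grep. This is a sketch: make_error_context and build.log are hypothetical names, and the -m and -B options are assumed to be available (they are in GNU grep).

```shell
# make_error_context LOGFILE [LINES]
# Prints the first "make: ***" line found in LOGFILE together with the
# LINES lines before it (10 by default).
make_error_context() {
    grep -F -m 1 -B "${2:-10}" 'make: ***' "$1"
}

# Usage:
#   make_error_context build.log
#   make_error_context build.log 20
```

The -F option makes grep treat the asterisks literally instead of as pattern characters.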
emerge and PORT_LOGDIR
PORT_LOGDIR is a portage variable that sets up a log directory for separate emerge logs. Let's take a look and see what that entails. First, run your emerge with PORT_LOGDIR set to your favorite log location. Let's say we have a location /var/log/portage . We'll use that for our log directory:
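A sketch of such a run, with the directory created first in case it does not exist yet. emerge_with_log is a name invented for this example.

```shell
# emerge_with_log LOGDIR ARGS...
# Creates LOGDIR if needed, then runs emerge with PORT_LOGDIR pointing
# at it so that a separate log is kept for the build.
emerge_with_log() {
    logdir="$1"
    shift
    mkdir -p "${logdir}" && PORT_LOGDIR="${logdir}" emerge "$@"
}

# Usage (as root; foobar2 is the example package from the text):
#   emerge_with_log /var/log/portage foobar2
```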
Now the emerge fails again. However, this time we have a log we can work with, and attach to the bug later on. Let's take a quick look at our log directory.
The log files have the format [category]:[package name]-[version]:[date].log. A quick look at the log file will show the entire emerge process. This can be attached later on as we'll see in the bug reporting section. Now that we've safely obtained our information needed to report the bug we can continue to do so. However, before we get started on that, we need to make sure no one else has reported the issue. Let's take a look at searching for bugs.
Searching Using Bugzilla
Bugzilla is what we at Gentoo use to handle bugs. Gentoo's Bugzilla is reachable by HTTPS and HTTP. HTTPS is available for those on insecure networks or simply paranoid :). For the sake of consistency, we will be using the HTTPS version in the examples to follow. Head over to Gentoo Bugzilla to see how it looks.
One of the most frustrating things for developers and bug-wranglers is finding duplicate bug reports. These cost them valuable time that they could otherwise use to work on more important bugs. Often, this can be prevented by a few simple search methods. So we're going to see how to search for bugs and find out if you have one that's similar. For this example, we're going to use the xclass emerge error that was used earlier.
So to begin searching, we head over to the Bugzilla Homepage .
We'll click on "Query Existing bug reports". The reason why we choose this over the basic bug search is because the basic bug search tends to give vague results and often hinders users from looking through the results and finding the duplicate bug. Once we click on the query screen, we reach the next page:
Proceed by clicking on the "Advanced Search" link to bring up the Advanced Search page. While it may seem overwhelming at first, we're going to look at a few simple areas to narrow down the rather vague searches bugzilla returns.
- The first field is the summary of the bug. Here we're simply going to put the name of the package that's crashing. If bugzilla doesn't return results, try removing the package name, just in case someone didn't put that in the summary (highly unlikely, but we've seen a fair share of strange bug reports).
- Product, Component, and Version should all be set to the default. This prevents us from being too specific and missing all the bugs.
- Comment is the important part. Use the comment field to list what appears to be a specific instance of the error. Basically, don't use anything like the beginning of the build error; find a line before it that states a true error. Also, you'll want to filter out any punctuation to prevent bugzilla from interpreting the comment the wrong way.
Let's look at our example from the xclass emerge error again, and notice that it is specific enough to where we'll find the bug without wading through other xclass compile failure candidates:
- URI, Whiteboard, and Keywords can all be left alone. What we've entered so far should be enough to find our bug. Let's take a look at what we have filled out.
Now we click on the Search button and look at the results. If our search criteria are specific enough, the list of results is short and a lot easier to deal with. Chances are that the issue we found on bugzilla is exactly the problem we've hit, and that it has also been resolved. By checking the last comment we see the solution and know what to do in order to resolve it.
Let's say that you have searched and searched but still can't find a bug. You've found yourself a new bug. Let's take a look at the bug reporting process for submitting your new bug.
In this chapter, we'll figure out how to use Bugzilla to file a shiny, new bug. Head over to Gentoo Bugs and click on "Report a Bug - Using the guided format". As you can see, major emphasis has been placed on putting your bug in the right place. Gentoo Linux is where a large majority of bugs go.
Despite this, some people will file ebuild bugs in portage development (assumption that portage team handles the portage tree) or infra (assumption that infra has access to mirrors and rsync and can fix it directly). This is simply not how things work.
Another common misconception occurs with our Documentation bugs. For example, a user finds a bug with the Catalyst Docs . The general tendency is to file a bug under Docs-user, which gets assigned to the GDP , when it should actually go to a member of the Release Engineering team. As a rule of thumb, only documentation under http://www.gentoo.org/doc/* is under the GDP. Anything under http://www.gentoo.org/proj/* is under the respective teams.
Our bug goes in Gentoo Linux as it's an ebuild bug. We head over there and are presented with the multi-step bug reporting process. Let us now proceed with Step 1...
The first step here is really important (as the red text tells you). This is where you search to see that someone else hasn't hit the same bug you have, yet. If you do skip this step and a bug like yours already exists, it will be marked as a DUPLICATE thus wasting a large amount of QA effort. To give you an idea, the bug numbers that are struck out above are duplicate bugs. Now comes step 2, where we give the information.
Let us take a closer look at what's what.
- First, there's the Product. The product will narrow down the bug to a specific area of Gentoo like Bugzilla (for bugs relating to bugs.gentoo.org), Docs-user (for User Documentation) or Gentoo Linux (for ebuilds and the like).
- Component is where exactly the problem occurs, more specifically which part of selected product the bug comes under. This makes classification easier.
- Hardware platform is what architecture you're running. If you were running SPARC, you would set it to SPARC.
- Operating System is what Operating System you're using. Because Gentoo is considered a "Meta-distribution", it can run on other operating systems besides Linux.
So, for our example bug, we have :
- Product - Gentoo Linux (Since it is an ebuild issue)
- Component - Application (It is an application at fault, foobar2)
- Hardware Platform - All (This error could occur across architectures)
- Operating System - All (It could occur on all types of systems)
- Build Identifier is basically the User Agent of the browser that is being used to report the bugs (for logging purposes). You can just leave this as is.
- URL is optional and is used to point to relevant information on another site (upstream bugzilla, release notes on package homepage etc.). You should never use URL to point to pastebins for error messages, logs, emerge --info output, screenshots or similar information. Instead, these should always be attached to the bug.
- In the Summary, you should put the package category, name, and version number.
Not including the category in the summary really isn't too bad, but it's recommended. If you don't include the package name, however, we won't know what you're filing a bug for, and will have to ask you about it later. The version number is important for people searching for bugs. If 20 people filed bugs and not one put a version number, how would people looking for similar bugs be able to tell if one was theirs? They'd have to look through every single bug, which isn't too hard, but if there are, say, 200 bugs, it's not that easy. After all the package information, you'll want to include a small description of the incident.
These simple rules can make handling bugs a lot easier. Next are the details in which we put in the information about the bug. With this in place, the developer knows why we're filing the bug. They can then try to reproduce it. Reproducibility tells us how often we were able to make the problem recur.
The next step is to explain what results we got and what we think they should actually be. We can then provide additional information. This could be things such as stack traces or relevant sections of strace logs (the whole log is usually big and of little use), but most importantly, your emerge --info output.
Lastly we select the severity of the bug. Please look this over carefully. In most cases it's OK to leave it as is and someone will raise/lower it for you. However, if you raise the severity of the bug, please make sure you read it over carefully and make sure you're not making a mistake. A run down of the various levels is given below.
- Blocker - The program just plain doesn't want to emerge or is a major problem to the system. For example, a baselayout issue which prevents a system from booting up would be a sure candidate to be labeled blocker.
- Critical - The program has loss of data or severe memory leaks during runtime. Again, an important program like, say, net-tools failing to compile could be labeled critical. It won't prevent the system from starting up, but is quite essential for day to day stuff.
- Major - The program crashes, but nothing that causes your system severe damage or information loss.
- Minor - Your program crashes here and there with apparent workarounds.
- Normal - The default. If you're not sure leave it here unless it's a new build or cosmetic change, then read below for more information.
- Trivial - Things such as a misspelled word or whitespace clean up.
- Enhancement - A request to enable a new feature in a program, or more specifically new ebuilds .
Now we can submit the bug report by clicking on the Submit Bug Report box. You will now see your new bug come up. See Bug 97561 for what the result looks like. We've reported our bug! Now let's see how it's dealt with.
Zero-day bump requests
So far, we've shown what to do when filing a bug. Now let's take a look at what not to do.
Suppose that you've eagerly been following an upstream project's schedule, and when you check their homepage, guess what? They just released a new version a few minutes ago! Most users would immediately rush over to Gentoo's bugzilla to report the new version is available; please bump the existing version and add it to Portage, etc. However, this is exactly what you should not do. These kinds of requests are called zero-day (or 0-day) bump requests, as they're made the same day that a new version is released.
Why should you wait? First, it's quite rude to demand that Gentoo developers drop everything they're doing just to add a new release that came out 15 minutes ago. Your zero-day bump request could be marked as INVALID or LATER, as developers have plenty of pressing issues to keep them busy. Second, developers are usually aware of pending new releases well in advance of users, as they must follow upstream quite closely. They already know a new version is on its way. In many cases, they will have already opened a bug, or might even have already added it to Portage as a masked package.
Be smart when testing and requesting new versions of packages. Search bugzilla before posting your bump request -- is there already a bug open? Have you synced lately; is it already in Portage? Has it actually been released by upstream? Basic common sense will go a long way, and will endear you to developers that already have a lot to do. If it's been several days since release and you're sure that there are no open requests for it (and that it's not in Portage), then you can open up a new bug. Be sure to mention that it compiles and runs well on your arch. Any other helpful information you provide is most welcome.
Want to see the newest version of your favorite package in Portage? File smart bugs.
Working With Your Bug
Looking at the bug, we see the information we provided earlier. You will notice that the bug has been assigned to firstname.lastname@example.org. This is the default location for Application component bugs. The details we entered about the bug are available as well.
However, bug-wranglers (usually) won't fix our bugs, so we'll reassign it to someone that can (you can let bug-wranglers re-assign it for you as well). For this we use the package's metadata.xml. You can normally find them in /usr/portage/category/package/metadata.xml.
Notice the maintainer section. This lists the maintainer of the package, which in this case is Chris White. The email listed is email@example.com. We will use this to re-assign the bug to the proper person. To do this, click the bubble next to Reassign bug to, then fill in the email.
Then hit the Commit button for the changes to take place. The bug has been reassigned to the correct developer. Shortly afterward, you notice (by email usually) that the developer responded to the bug. For instance, he might have asked to see an strace log to figure out how the program is trying to reach the configuration file. You follow the instructions on using strace and obtain an strace log. Now you need to attach it to the bug. In order to do this, click on "Create A New Attachment":
- File - This is the location of the file in your machine. In this example, the location of strace.log . You can use the "Browse..." button to select the file, or enter the path directly in the text field.
- Description - A short one liner, or a few words describing the attachment. We'll just enter strace.log here, since that's quite self-explanatory.
- Content Type - This is the type of the file we're attaching to the bug.
- Obsoletes - If there were attachments submitted to the bug before the current one, you have an option of declaring them obsoleted by yours. Since we have no prior attachments to this bug, we need not bother.
- Comment - Enter comments that will be visible along with the attachments. You could elaborate on the attachment here, if needed.
With respect to Content Type, here are a few more details. You can check the "patch" check box if you're submitting a patch. Otherwise, you could ask Bugzilla to "auto-detect" the file type (not advisable). The other option is "select from list", which is the most frequently used. Use plain text (text/plain) for most attachments except binary files like images (which can use image/gif, image/jpeg or image/png depending on type) or compressed files like .tar.bz2, which would use application/octet-stream as content type.
Submit strace.log and it is reflected on the bug report.
We've mentioned before that sometimes ebuilds will tell you to attach a file in the emerge error. An example can be seen below.
Please attach any file mentioned like this to your bug report.
Sometimes a developer might ask you to attach a diff or patch for a file. Standard diff files can be obtained through:
For C/C++ source files, the -p flag is added to show what function calls the diff applies to:
The documentation team will require the flag combination -Nt as well as -u . This mainly has to do with tab expansion. You can create such a diff with:
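The three variants can be sketched as follows. The file names and contents are made up so that the example is self-contained; the scratch directory only exists to give diff two files to compare.

```shell
# Build two small example files in a scratch directory.
workdir="$(mktemp -d)"
printf 'setting=bar\n' > "${workdir}/config.orig"
printf 'setting=foo\n' > "${workdir}/config"

# Standard unified diff. diff exits with status 1 when the files differ,
# which is the expected case here, so only other statuses count as errors.
diff -u "${workdir}/config.orig" "${workdir}/config" > "${workdir}/config.patch" || [ "$?" -eq 1 ]
cat "${workdir}/config.patch"

# For C/C++ sources, add -p to show the enclosing function:
#   diff -up foobar.c.orig foobar.c > foobar.patch
# For the documentation team, add -Nt:
#   diff -Nt -u doc.xml.orig doc.xml > doc.patch
```

Keeping an untouched copy with an .orig suffix before editing is one common way to have something to diff against.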
And your diff is created. While we're doing all this, suppose another person finds your bug by searching through bugzilla and wants to keep track of it. They may do so by putting their email in the Add CC field of the bug as shown below. You could also keep track of other bugs by following the same method.
After all this work, the bug can undergo various status markings. This is usually done by the Gentoo Developers and sometimes by the reporter. The following are the various possible states a bug may go through during its lifetime.
- UNCONFIRMED - You're generally not going to see this too often. This means that a bug reporter has opened a bug using the advanced method and is uncertain his or her bug is an actual bug.
- NEW - Bugs that are first opened are considered new.
- ASSIGNED - When the person you've assigned the bug to validates your bug, it will often receive ASSIGNED status while they figure out the issue. This lets you know that they've accepted your bug as a real bug.
- REOPENED - Someone has resolved a bug and you think the solution is not feasible or the problem still persists. At this point, you may re-open the bug. Please do not abuse this. If a developer closes the bug a second or third time, chances are that your bug is closed.
- RESOLVED - A firm decision has been taken on the bug. Usually goes onto FIXED to indicate the bug is solved and the matter closed although various other resolutions are possible. We'll look into those a little later.
- VERIFIED - The steps taken to work the bug are correct. This is usually a QA thing.
- CLOSED - Basically means RIP for the bug and it's buried under the never ending flow of new bugs.
Shortly afterward, the developer finds the error in the strace log, fixes the bug, and marks it as RESOLVED FIXED, mentioning that there was a change in the location of configuration files and that the ebuild will be updated with a warning about it. The bug now becomes resolved.
If you open the bug, you'll notice you can still change the bug status. For instance, there is a link to REOPEN. This gives you the option of Reopening the bug if you wish to (i.e. the developer thinks it's resolved but it's really not to your standards).
The following is an overview of possible resolutions:
- FIXED - The bug is fixed, follow the instructions to resolve your issue.
- INVALID - You did not do something specifically documented, causing the bug.
- DUPLICATE - You didn't use this guide and reported a duplicate bug.
- WORKSFORME - Developer/person assigned the bug cannot reproduce your error.
- CANTFIX - Somehow the bug cannot be solved because of certain circumstances. These circumstances will be noted by the person taking the bug.
- WONTFIX - This is usually applied to new ebuilds or feature requests. Basically the developer does not want to add a certain feature because it is not needed, a better alternative exists, or it's just plain broken. Sometimes you may be given a solution to get said issue resolved.
- UPSTREAM - The bug cannot be fixed by the Gentoo development team, and they have requested that you take the problem upstream (the people that actually made the program) for review. Upstream has a few ways of handling bugs. These include mailing lists, IRC channels, and even bug reporting systems. If you're not sure how to contact them, ask in the bug and someone will point you in the right direction.
Sometimes, before the bug can be resolved, a developer may request that you test an updated ebuild. In the next chapter we'll take a look at testing ebuilds.
Getting The Files
Let's say that you reported a bug for the foobar2 compile fix from earlier. Now developers might find out what the problem is and might need you to test the ebuild for them to be sure it works for you as well:
Some rather confusing vocabulary is used here. First off, let's see what an overlay is. An overlay is a special directory like /usr/portage, the difference being that when you emerge --sync, files contained within it will not be deleted. Luckily, a special /usr/local/portage directory is created for that purpose. Let's go ahead and set our portage overlay in /etc/portage/make.conf. Open make.conf in your favorite editor and add this towards the end.
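A minimal sketch of the line in question. PORTDIR_OVERLAY is the traditional make.conf variable for this; newer Portage versions configure overlays through repos.conf instead, so treat this as matching the setup the guide describes.

```shell
# /etc/portage/make.conf (fragment)
# Tell Portage to also look for ebuilds in the local overlay.
PORTDIR_OVERLAY="/usr/local/portage"
```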
Now we'll want to create the appropriate directories to put our test ebuild files in. In this case, we're supposed to put them in sys-apps/foobar2. You'll notice that the second comment asks for a files directory for the patch. This directory holds other required files that aren't included with the standard source archive (patches, init.d scripts, etc). This is a subdir in the package directory called files . Go ahead and create these directories:
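Creating both directories takes a single mkdir call, sketched here as a helper that receives the overlay root as its argument. create_overlay_dirs is a made-up name.

```shell
# create_overlay_dirs OVERLAY
# Creates the package directory for sys-apps/foobar2 inside OVERLAY,
# including the files subdirectory that will hold the patch.
create_overlay_dirs() {
    mkdir -p "$1/sys-apps/foobar2/files"
}

# Usage (as root):
#   create_overlay_dirs /usr/local/portage
```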
Ok now, we can go ahead and download the files. First, download the ebuild into /usr/local/portage/sys-apps/foobar2 , and then add the patch to /usr/local/portage/sys-apps/foobar2/files . Now that we have the files, we can begin working on testing the ebuild.
Testing The ebuild
The process to create an ebuild that can be used by emerge is fairly simple. You must create a Manifest file for the ebuild. This can be done with the ebuild command. Run it as shown.
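A sketch of the Manifest step followed by a pretend run. check_overlay_ebuild is a name invented for this example, and the version in the ebuild file name is a placeholder for whatever version the developer attached.

```shell
# check_overlay_ebuild EBUILD ATOM
# Generates the Manifest for EBUILD, then asks emerge to show (without
# installing anything) which ebuild it would use for ATOM.
check_overlay_ebuild() {
    ebuild "$1" manifest && emerge -pv "$2"
}

# Usage (run from the package directory in the overlay):
#   cd /usr/local/portage/sys-apps/foobar2
#   check_overlay_ebuild foobar2-1.0.ebuild foobar2
```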
Now let's test to see if it works as it should.
It does seem to have worked! You'll notice the overlay marker next to the [ebuild] line. That points to /usr/local/portage, which is the overlay we created earlier. Now we go ahead and emerge the package.
In the first section we see that the emerge started off as it should. The second section shows our patch being applied successfully by the "[ ok ]" status message to the right. The last section tells us the program compiled ok. The patch works! Now we can go and let the developer know that their patch works fine, and that they can commit the fix to portage.
This concludes the howto on working with Bugzilla. Hopefully you find this useful.
We would like to thank the following authors and editors for their contributions to this guide:
- Chris White
- Shyam Mani
Special thanks go to moreon for his notes on -g flags and compile errors, the people at #gentoo-bugs for helping out with bug-wrangling, Griffon26 for his notes on maintainer-needed, robbat2 for general suggestions and fox2mike for fixing up the doc and adding stuff as needed.