Wednesday, May 8, 2013

Some Features are Easy, or, How to Get Involved in Open Source

I made this contribution to wxMaxima today. I was initially going to file a feature request in the tracker and come back to it later, since I'm in the middle of finals right now, but I realized it was a small thing and I already knew how to fix it. (Ironically, I'm about to spend an hour writing this blog post. Oh well.)

It surprised me that it only took me about 10 minutes. Two attempts to build and run the program with the change and it was working. Submitted pull request.

Don't get me wrong: I'm not trying to brag about this. It was a small thing, and easy to add, especially since I already knew my way around that part of the software from a previous contribution.

The point is that some features are easy once you know your way around the code, and that's really empowering to hear if you're new to open source. Think about software you've worked on for yourself. You know the code really well, and there are a lot of things you change or add as you go that take almost no time at all. Maybe those same things would have taken hours or days for someone who had never seen your code before.


Getting Involved in Open Source Software


There are a lot of ways you can get involved in Open Source, and most people who want to contribute have no idea where to start. So, I thought I would write a little guide about how to get involved in Open Source, focusing on the fears that inexperienced developers face when they first get started.

I'll start with a list of what I believe are the biggest of these fears.
  1. The perception that developers/maintainers are hostile toward new contributors.
  2. The code base is huge and confusing, and it's hard to know where to start.
  3. "I don't have the skills to contribute."
  4. "No one will care about my contribution."
  5. Feeling that your project isn't good enough.



The perception that developers/maintainers are hostile toward new contributors.


This is, unfortunately, true in too many cases. The big open source projects that people know about, like Linux, Java, Eclipse, and GCC, are difficult to get started in because they are so well-known that the maintainers can't devote extra attention to your contributions, or they have plans and aren't very friendly to bug fixes or feature contributions without some kind of intense bureaucracy and decision-making about the future of the project. Even if you file a good bug, and even provide a good fix, it could be months or years before it is finally accepted, if it is not ignored entirely because some procedure wasn't followed.

The best thing I can say to this is: If you want to get Open Source experience quickly, don't get involved in a huge, well-established project with a visible bureaucracy.


The code base is huge and confusing, and it's hard to know where to start.


There's not much you can do about this, especially with large projects. One nice thing about many GNU projects, though, is that they follow the Unix philosophy that every program should do one thing and do it well. Getting involved in a project like that will likely make this problem seem a lot smaller.

The truth is, though, that when you want to start participating in a project, you're not going to go from zero to expert in no time.

It helps if you have a vested interest in the program. That is, you should know something about what it does, how it does it, and what kind of users there are. If the program is something like GIMP, it would help if you know something about digital graphics or artwork (or any of the other uses of GIMP) and you have used a similar product like Photoshop, which you aspire to help GIMP become. It also helps if you have used a similar program with fewer features or users (or more bugs), so you know what mistakes to avoid and so that you can say you are dedicating your time to a project where your investment will be more visible, and thus, help more people.

After all, contributing to Open Source is largely humanitarian. We work on open source as volunteers because we want to improve the lives of people like us, who use this product and want or need it to be better, to be more stable, or to do more than it currently does. You will be able to do this most effectively if you care about the product you are working on.

So, go out and start using Open Source programs, learn their intricacies, discover and report their bugs, make note of any features you find are missing. Talk to the developers through the issue tracker. Look up existing bugs in the issue tracker and try to reproduce those bugs. If you find that a bug can't be reproduced, comment on the issue and say what version of the project you are using, and give some basic details about your system. Just because it works for you doesn't mean it works for everyone. But knowing that it works for someone may help the developers get an idea of the severity, and may even help track down the issue.

One great place to start learning the code base is to find text in the source code. Menu labels, messages, UI elements, portions of program output. The great thing about static text is that you can search for it in the program. Once you have found it, you can look at how that text is involved in the program, and figure out how interacting with that text is processed by the program. Before you know it, you can find the code you need to change to fix a bug in a Menu function, or begin to add your own Menu Items. Through that process, you'll learn more about the rest of the code. You won't become an expert in everything, but if you stick around long enough, you will be surprised how much you can learn.
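As a rough illustration of the text-search idea above, here is a minimal Python sketch that walks a source tree and prints every line containing a given piece of UI text. The function name `find_text`, the default file extensions, and the example label "Save As" are all made up for illustration; a tool like grep or your editor's project-wide search does the same job.

```python
import os


def find_text(root, needle, extensions=(".cpp", ".h")):
    """Yield (path, line_number, line) for each source line containing needle.

    The default extensions are an assumption for a C++ project; pass a
    different tuple for other languages.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the search going on files with odd encodings.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if needle in line:
                        yield path, number, line.rstrip()


if __name__ == "__main__":
    # "Save As" is a hypothetical menu label; use text you actually see in the UI.
    for path, number, line in find_text(".", "Save As"):
        print(f"{path}:{number}: {line}")
```

Each hit gives you a file and line number to open, which is usually close to the code that handles that menu item or message.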

The steepest part of the learning curve is finding (and maybe even fixing) your first bug. After that, armed with what you know, it is so much easier to contribute that you may never stop. Welcome to Open Source.


"I don't have the skills to contribute."


Another problem is that people believe that it will be difficult to add a feature they want. They take one look at the code and say, "I have no idea what's going on here!" And they never come back. Well, of course you don't know what's going on. You've never seen the code before. You didn't help design it. It has probably been around for some time, accumulating bug fixes and platform-specific code segments, and all other kinds of nonsense that makes it look utterly frightening at first.

Following the advice in the section above, you may be able to learn exactly where the issue is, or how to add the feature, and then it becomes a lot easier. Something you learn with experience is that if a problem seems really big, just add a layer of abstraction and fill in the details later. It's a lot easier to manage your own code that way, so it follows that it would be a lot easier to think about an unfamiliar code base in a similar way.
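Here is a small Python sketch of what "add a layer of abstraction and fill in the details later" can look like. Every name in it (`add_menu_item`, `locate_menu`, `build_handler`, `attach`) is hypothetical and not taken from any real project; the point is only that the top-level function reads like a plan, and each step starts out as a simple placeholder.

```python
def add_menu_item(label):
    """Top-level plan: each step is a named helper to flesh out later."""
    menu = locate_menu()
    handler = build_handler(label)
    return attach(menu, label, handler)


def locate_menu():
    # Placeholder: a real project would look up an existing menu object.
    return []


def build_handler(label):
    # Placeholder: a real handler would do the actual work of the feature.
    return lambda: f"{label} clicked"


def attach(menu, label, handler):
    # Placeholder: record the new item so the plan can be exercised end to end.
    menu.append((label, handler))
    return menu
```

You can read an unfamiliar code base the same way: treat each function call as a black box with a sensible name, and only open the boxes that matter for your change.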

If the problem is that you don't know the programming language, or you've never done GUI work before, or you feel like you're missing a programming skill, that's okay. You have time. And you'll never learn unless you get started, so why not try starting now? Start by filing your issue or feature request in the issue tracker for the project. That way, someone else may be able to get to it, or at least you will be reminded of the feature when you finally get back to it. Then, either go and get experience with whatever you felt you were lacking, or just dive in and start making mistakes. If the project uses version control such as Git (as every project on GitHub does), you can always revert your repository to the last working state. Don't worry about breaking things.

If you're a Java programmer and you want to get involved in a C++ project, don't worry about it. The syntax is pretty similar. There is stuff you won't know, sure. But you can still probably figure out how to add an if statement to handle that special case that will fix your bug. In a larger project, developers tend to rely less and less on library functions, and more on calls to other parts of the same code base. Take advantage of that. You don't have to learn the STL to add C++ code to a project. Just figure out what function calls to make and you'll be good to go.


"No one will care about my contribution."


It occurred to me that sometimes people are afraid to contribute because they think no one will care about their contributions. This is simply not true, especially when working on projects that have a small number of developers. Those developers can't think of everything. Just as with literature, developers design and write what they know. If it never occurred to them that someone needed a feature, it will never be added. Even if it's a really easy feature to add.

This is the great benefit of crowd-sourcing feedback from the users of a project through an issue tracker. You find out what people want, and many of those features are surprisingly easy to implement. You find out about bugs that you never would have noticed because your software gets used in a way you never intended, and therefore never tested.

So, add bugs and feature requests to an issue tracker. That's the easiest way to get involved.

Also, remember that modifying open source isn't just about getting your code added to a project. It's about adding a feature you need, or fixing a bug that is driving you crazy, so that you can continue to use the software as you would like. You should make the contribution in your fork of the repository, and submit a pull request so the main project is aware of your contribution, and then not worry about whether it gets accepted. You've solved your problem for yourself, and eventually others may benefit from it, too.

In wxMaxima, we've had this pull request pending for a while now. The contributor had realized that there was a problem related to one of his primary use cases, and he wanted to modify the project to fix it. The fix apparently worked for him, but you can see our conversation about how I would have liked it to be made more general so that it doesn't potentially negatively affect other users of wxMaxima. I actually thought it was a good contribution and cloned his branch in my fork so that I could explore it a bit more. Ultimately, the feature didn't seem like a big deal to me, and would take a lot of effort to make viable for integration, so I've focused my energy on other features.

But there's an example of a guy who fixed his own problem and didn't worry about whether it was accepted to the main repository at the end of the day.

A lot of people forget that the original reason for Open Source was not simply to allow public development on the same project, but to allow people to take source code and put it to use for themselves. Adding your feature in a branch of the project which belongs only to you is a perfectly reasonable thing to do.

Don't be afraid to be the developer who made it happen for himself.


Feeling that your project isn't good enough.


If you have created a project that you think other people could use, and you don't foresee a way to make a profit from it, make it open source--allow the world to benefit. If you "only" get 10 users, you didn't fail. You helped 10 people, which is so much more than you would have done if you let the project rot on your hard drive.

Now you have 10 users and a reason to keep adding features and fixing bugs. Eventually you might have 100 users, and some other developers might join in. Doesn't that sound awesome?