Hacking on Hacktoberfest


Have you ever wanted to contribute to Open Source software but weren’t sure where to start?

Do you like winning T-Shirts?

Then I have something for you!

What is it?

Hacktoberfest is a month-long celebration of open source software, focused on contributing to the packages we all use.

It is a great chance to jump into open source and contribute back to it! It is a great way to improve projects or to learn a new skill. The best part is you don’t have to be an expert or a developer to make a contribution.

How can I help out?

It is super easy!

One misconception is that you have to be a developer in order to contribute. This isn’t the case! There’s plenty of need for documentation, testing, and just plain discussions.

This year I decided to try to learn a new language by working through an issue (or two) for Hacktoberfest. I have been really curious about Clojure and JWT, and I noticed that the Clojure library for JWT had some features that were missing.

As it happened, I was also able to open some pull requests for some work-related libraries, which was a really nice bonus!

How do I find things to work on?

If there are any programs or libraries you use on a regular basis, I would encourage you to go check those out first. Browse the issues tab and see if there’s anything there that looks interesting!

Some projects will mark issues with special tags like “hacktoberfest” or “beginner”, and those are great issues to look at first. They are typically things that would really help the project, so if you can jump in there you will make the world a little bit better.

One of the lesser known things about GitHub is the powerful search engine. It can be used to find a lot of different things such as specific code chunks, or conversations about a topic, or even specific types of files!

We can use this to find all kinds of interesting and easy-to-work-on problems.

For example, you can search for issues labeled “hacktoberfest” to see things that maintainers have specifically marked as good for contributing.

Searching pro-tips

Here’s a good base search to help get you started:

is:issue is:open label:hacktoberfest

This search will look for any open issues that are labeled “hacktoberfest”. As I type this there are over 29,000 open issues!

To help narrow it down, try selecting a language from the list on the left side of the screen. This will narrow the list significantly to only the language you are interested in.

I used that to look for Clojure and Python issues which helped me get to interesting problems faster.

Bonus tip: If you are nervous about contributing code, many projects also welcome documentation updates! Try searching for commonly misspelled words and you will find tons of repositories that you could make a pull request to.

Wrapping up

Open Source software is awesome. Whether you know it or not, you are probably using it right now.

Why not try and contribute to it? It’s easy, and can be a lot of fun!

And winning a free t-shirt is always awesome. 🙂

Testing AppEngine cron jobs locally

Lately I’ve been doing a lot with Google AppEngine. It has a lot of great features, but to get those you need to give up a few things. Sadly, I discovered that includes the ability to locally run “protected” API endpoints. At least until I discovered this one strange trick to make everything work…

The setup

So AppEngine applications need an app.yaml file that defines a lot of things needed to run the code. It also defines the routing for the app’s endpoints, and who is allowed to access them. (Basically either administrators or the whole world.)

My app makes use of the cron.yaml file to periodically ping certain endpoints in the app. The catch is that I don’t want just anyone hitting those endpoints; a bad actor could hammer a sensitive endpoint and kill my API access.
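As a sketch, a cron.yaml entry looks something like this (the endpoint path, description, and schedule here are made up for illustration):

```yaml
cron:
- description: refresh cached data (hypothetical job)
  url: /tasks/refresh
  schedule: every 30 minutes
```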

Did someone say "Bad Actor"?

Thankfully, Google recognized this and allows you to set up endpoints in the app.yaml file with a login: parameter. Setting this to “admin” tells AppEngine that only logged-in users who have admin rights to the domain are allowed to hit that endpoint.
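For example, a handler entry along these lines locks the cron endpoints down (the URL pattern and script reference are hypothetical):

```yaml
handlers:
- url: /tasks/.*     # matches the cron endpoints
  script: main.app   # hypothetical WSGI app reference
  login: admin       # only domain administrators may hit these URLs
```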

Yay! I don’t have to write any custom login/user management code. But….

The problem

If you are running the code locally, say doing development, you are probably going to need to hit those end points to make sure the damn thing is working. Right?

Well, the dev_appserver.py script doesn’t know about who is and isn’t logged into Google… because it is only running on localhost! Therefore having the login set to “admin” means you will never be able to access that endpoint.

Boo Hoo, HTTP 302 for you.

So, what do we do? Commenting out the login: field will let you access it locally, but what if you accidentally deploy that into production? (Spoiler alert: you are screwed.)

Run to the console

Although dev_appserver.py is the cause of our problems, it also turns out to be the solution too!

When dev_appserver.py boots, it not only starts your app, but it also starts a lightweight admin app too. This app by default runs on localhost:8000 and provides all kinds of useful tools like a DataStore viewer and… a cron utility!

Going to localhost:8000/cron brings up a page that lists all of the cron jobs registered by the AppEngine application, what schedule they are set up to run on, and…. wait for it…. a button to kick off that job!

Yes, by clicking on that button the admin console will trigger your cron job for you so that you can run and see the results locally! Yay for debugging locally not in production!

Other tricks

The admin console is pretty awesome and has lots of other useful tricks up its sleeve. Here’s some of what I use it for:
* Doing quick checks on entities stored in the DataStore
* Faking incoming XMPP and SMTP messages (I’ve never tried this, but it looks pretty cool for one off testing)
* A memcache viewer/editor
* An interactive console

That last one is pretty sweet. Since I can’t seem to start up an IPython terminal AND connect it to my app, this is the next best thing. From the web page you can type in some Python code and it will execute it for you.

Perfect for those times when you just want to delete all of your entries because you had a horrible misspelling in one of the field names.

Not that I’ve ever done that.

If you are curious to see the app I built using AppEngine, check out RemoteMatcher! It is a remote job aggregator that scans a bunch of job sites and only emails you the ones that match your interests. No more scanning tons of boards, instead just check your inbox for the best matches.

The curse of knowledge: Finding os.getenv()

Recently I was working with a co-worker on an unusual nginx problem. While working on the nginx issue we happened to look at some of my Python code. My co-worker normally does not do a lot of Python development; she tends to do more on the node.js side. But this look at the Python code led to a rather interesting conversation.

The code we were looking at had some initialization stuff that made my coworker ask, “Hey, why are you using os.environ.get() to read in some environment variables? Why aren’t you using os.getenv()?” I stared blankly for a second and said “Huh?”

I was a bit puzzled by this question because this developer is really good with node and also with Ruby. Perhaps she was thinking of a function in a different language and not Python, I thought to myself. Together we looked it up real quick and, much to my surprise, I discovered there actually is a function in the standard library called os.getenv(), and it does exactly what you would think: it gets an environment variable if it exists, and returns None (or a specified default) if it doesn’t.
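A quick comparison (the variable names here are made up):

```python
import os

# os.environ is a mapping, so .get() works like any dict lookup:
debug_flag = os.environ.get("APP_DEBUG")  # None if the variable is unset

# os.getenv() is the shorter spelling of the same lookup:
debug_flag = os.getenv("APP_DEBUG")       # also None if unset

# Both accept a default for when the variable is missing:
mode = os.getenv("APP_MODE", "development")
print(mode)
```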

Using os.getenv() is a few characters shorter than using os.environ.get() and in the code we were looking at it just looked better. Since the code didn’t need to modify the environment variables, it just made sense to use it. But it got me thinking: I’ve been working in Python for a few years now, how did I not know about this?

You don’t know what you don’t know

For me this was a real educational moment. It is very easy to think that we know it all, especially with things that you use day-in and day-out. But, you should never think that you know everything about a language even if you are an expert. There are people around you who, even though they might be experts in different languages or technology, still have something interesting to offer to you and your code.

Have a conversation with someone who is either junior or senior to your skill level. Very quickly one of you will discover something new. For example, the junior person could discover a new approach to solving a problem. And a senior person can get a new perspective.

The second situation is one that I really identify with. As you become more “senior” in most things, you begin to suffer from “the curse of knowledge”. This means your knowledge advances to a point where you can no longer recognize what is beyond a beginner. The danger is that you develop a new set of assumptions about everything and you stop questioning things the way you used to.

If you are not aware of this, it can lead to some nasty things. (Think arrogance, blind spots in the code/system, etc.) It also can lead to conversations that unintentionally intimidate others from participating in your development process in an effective manner. No matter how you slice it, this is a very bad thing.

Having a second set of eyes, especially those that come from a different background, can really help surface issues in your code. That is always useful. In this case I was very fortunate and was able to get some insight into code that was working but perhaps a little bit inefficient. Now I have code that looks a lot better when it gets to the code review.

Learn from this

So, today go and talk with someone who has different areas of knowledge or experience levels than you. Something good will probably come of it soon.


pip and private repositories: vendoring python

At work I am on a project to migrate a series of Python apps into the cloud. Docker is a perfect fit for some of the apps, but one problem we ran into is getting our apps to build when they have a dependency on a private repository. Using a technique called vendoring, we are able to work around this problem and ensure that our dependencies are well known. Let’s look at vendoring Python code.

Vendoring Python: The basic problem

When docker builds an image, we have it execute pip install -r requirements.txt to install all of our Python dependencies. Inside our requirements.txt file we have normal dependencies like this:
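For example (the package names and pinned versions below are just placeholders):

```
# placeholder public dependencies, pinned by version
requests==2.20.0
Flask==1.0.2
```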


But we also have some dependencies that live in private repositories and those have entries that look like this:

-e git+https://github.com/company-name/private-python-utils.git

This line tells pip to go to GitHub and pull down that project. The catch is that for a private repo, pip needs an SSH key that has access. If you run pip from the command line, the operating system supplies that SSH key and pip is able to install the project.

When docker runs pip, it does not have access to those ssh keys. As a result, the pip install fails because it can’t see the repository.

Shopping local with people you trust (image source: https://flic.kr/p/52ZAMB)

It might be possible to add a key to docker to allow it access, but then this becomes a management pain: everything that tries to run docker build is going to have to be set up with that key. (Think about CI services, new developers, etc.)

Instead, a better solution is to “vendor” the code. This means taking a specific snapshot of the project and putting it into your project, as in checking it into git. I first saw this technique being used by people in the Go community. They were doing it as a way to guarantee they were working with a “known” piece of code. (“known” meaning that they had done a security audit on it, etc.)

Let’s walk through the high level steps and then discuss the reasons and details.

Package up the dependency

In Python, there is a special file called setup.py that lives in the root directory of a project. For libraries this is a useful file to have, it describes the project and its dependencies. (Side note: if you are going to put a project into pypi.python.org having this file is a requirement)

For details about setup.py I will refer you to this excellent article. This will get you up and running with a bare-bones file which is good enough for this exercise.

With that file in place, the next step is to package up your code using the command:

python setup.py sdist

That will create a directory called dist which holds a copy of your project in an installable form. I work almost exclusively on Linux systems, where by default it seems to produce .tar.gz files.

Adding the dependency

The next step is to take that distributable file and put it into a directory in the base of your project. As a convention, most people will call this directory “vendor”. This identifies it as things that are external-yet-essential to the project.

Once the distributable file is there, the next step is to add it to version control. By doing this you guarantee that your code is now working against a known version of the dependency. This is a big deal in environments where immutability and repeatable builds are valuable.

Updating the requirements.txt

The final step is to update the requirements.txt file so that pip will be able to find and install the library. This is surprisingly easy to do. Simply change the line (see above) to:
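Assuming the sdist step produced a file named something like private-python-utils-0.1.0.tar.gz (the exact filename depends on your package name and version), the entry becomes:

```
vendor/private-python-utils-0.1.0.tar.gz
```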


And now when pip runs, it will look in the vendor directory for that file and then install it from there. At this point you are vendoring python! The code should be ready to go.


One thing I like to do when creating a setup.py file for a library is to include something to get the current git tag and commit information. This can be included in the name of the distributable file, which helps identify which version of the library you are working with.

Sometimes a gist is worth a thousand words, so here’s an example of how to do this. (If you are not using git as your source control there is probably a similar way to do this.)
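A minimal sketch of that idea, assuming git is on the PATH (the package name in the comment and the fallback version string are hypothetical):

```python
import subprocess

def git_version():
    """Build a version string from the current git tag/commit, e.g. '1.2.0-3-gabc1234'."""
    try:
        out = subprocess.check_output(
            ["git", "describe", "--tags", "--always", "--dirty"],
            stderr=subprocess.DEVNULL,
        )
        return out.decode().strip()
    except (OSError, subprocess.CalledProcessError):
        # Not a git checkout (e.g. building from an sdist): fall back to a fixed string.
        return "0.0.0"

# In setup.py this would feed the version field:
#   setup(name="private-python-utils", version=git_version(), ...)
print(git_version())
```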

Wrapping up vendoring python

By this point you should have everything in place for an “external” system like docker or a CI server to be able to build your project. As long as it can run pip it should be able to find the dependency and install it.

If you want to see another example of vendoring packages from github repositories, check out this link here for a great overview of using some of pip’s lesser known features.

With this in place you should be able to feel more secure about the code you are running, because now the version is really locked down.

Cleaning up legacy python code

Python is growing in popularity, which is a great thing! And with that growth we are now seeing more and more legacy Python projects. Occasionally you are going to inherit one of these “legacy” projects. Here are some tips on how to get one under control.

What is a legacy project?

Over the years I have heard lots of definitions of what a legacy project is. Here’s a few of the gems I’ve heard used to describe these projects:

  • An “older” project that has been around forever
  • A code base without any kind of tests
  • The project that no one wants to work on
  • “Everyone who worked on this left the company years ago…”

And sometimes the code isn’t old. A lot of times a project will be done real quick and put into production before everyone realizes that there’s a better way (e.g. using a different framework). By the time that happens, the “legacy” project is doing an adequate job and management is afraid to touch it. This is a pretty legitimate concern for higher-ups in a company: “If it ain’t broke, why try to fix it?”

So, what are the best things to do with a legacy project?

Make sure it is in version control

Before you do anything, make sure the project is in some kind of version control system. Too often I have seen “proof of concept” programs that were thrown together and pushed into production without any thought given to making sure that the code was somewhere safe.

This is doubly true if the code is in any way important to business operations: you do not want to be the last person who was seen with the only copy of the code. If for some reason the project is not in git/subversion/etc., put it in there NOW. If your company doesn’t have a version control system, beg your manager to invest in one ASAP.

Delete commented out code

Once a project is under version control, one of my favorite tasks is to delete any commented out code.

The older the code base the more commented out code there tends to be. I’m not talking about 1 or 2 lines here and there. I encounter dozens to hundreds of lines of code commented out on a fairly regular basis.

Commented out code is a waste of cognitive time. New developers (like you) who look at the code will see all of it and waste time trying to understand why it is there, hidden in a block of comments.

In my experience it is there because something in the project changed and someone is hedging their bets that the old code will be needed again. So it gets commented out and then haunts the code base forever and ever.

Once the code is under source control, the fact that it got deleted will be recorded in the repository. If for some reason it becomes necessary to revive that code, any developer on the team can go and revisit the commit history and pull out just what is needed. (Spoiler alert: Most of the time you will never need that commented out code.)

Running tests/adding tests

Now that your legacy Python project is under version control, we can start to do some more interesting things. One of the first things I like to do is try to run any unit tests that might be there.

A WORD OF CAUTION: Sometimes a project will have unit tests that aren’t quite “unit” tests. In other words, beware of any tests that might reach out and talk to a live system. I was bitten by this recently when I ran a set of unit tests that were doing destructive things to a production cache system. Thankfully we were able to recover it quickly, but I still get grief about it every few weeks.

If there are no tests, this is a great time to add some. Most bosses are cool with tests because they usually don’t impact the existing code. Of course, check first before you add anything.

As you are running the tests, consider using coverage.py to see how well the code base is being covered by the tests. If there are any “critical” spots where the tests aren’t hitting, those should be the first spots you should write tests for.


Run a linter

Something to consider doing at this point is running some type of linter over the code to see how “healthy” it is.

A linter is a bit of software that examines code and looks for anything suspicious or “wrong”. Pylint is a great Python-specific linter that can look at Python code and offer some suggestions.

By default pylint is pretty verbose and will flag all kinds of things that might not really be that important (such as lines longer than 79 characters). There are ways to control the output, and you should check here for more information about that.
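As a sketch, a minimal .pylintrc can keep the noise down while you triage a legacy project (the checks disabled here are just illustrative):

```
[MESSAGES CONTROL]
# Silence the noisiest style checks while triaging a legacy code base
disable=line-too-long,
        missing-docstring
```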

So what should you look for in the pylint output? Personally I like to look for:

  • Unused variables
  • Anything that is noted as a potential bug

If this project is going to be worked on, then these are things you should consider fixing as you add features. Removing unused variables is a no-brainer: it reduces visual noise and helps developers reason about the code.

Things that are flagged as potential bugs: These need to be handled carefully. Sometimes the code is working in spite of the bug.

Flake8? Not so fast!

Since I’ve mentioned pylint and its awesomeness, I should also mention style tools like pep8 and my favorite, flake8, plus the auto-formatters built on them (such as autopep8). These tools can be used to check and reformat Python code to make it more PEP 8 compliant.

While this is normally a good thing, I don’t recommend doing it on a legacy project right away. If the code is working (especially if it is in production), any changes made to it should be minimal so that you preserve the code as it is.

While most of the time the tools will not modify the code in any destructive way, I have seen strings that were too long get mangled a little bit. This can lead to confusion about what the line was actually doing.

My personal approach would be to have unit tests in place first, and then apply flake8. Also, if you are developing new features to add to the code base, then you should be using flake8.

More idiomatic python

By this point you probably have a really good grip on your legacy project. Depending on what the future holds for the project (new features, or just maintenance) you might want to consider revisiting some of the suggestions from pylint.

One thing I have seen pop up in legacy code projects is code that is “unpythonic”. This includes things like badly formed for loops, checking if “something == None”, and other spots where things could just be more idiomatic.

This is a topic I really enjoy learning more about as I feel I am still learning the true way. If you would like to learn more, I highly recommend reading Effective Python as it is full of great examples of pythonic coding techniques.

Wrapping up

With the exploding popularity of python we are now starting to see more and more legacy python projects. Thanks to some basic tools and the beauty of the language itself, this doesn’t have to be a scary proposition like it is with other languages. Java, I am looking directly at you.

Pssst… Quick favor?

I’m putting together a course on Debugging Python code. If you want to get in on this, sign up below to learn more!

Sign up to learn more!

Becoming a better programmer

Becoming a better programmer one step at a time

Navy SEALs jump from the ramp of a C-17 Globemaster III over Fort Picket Maneuver Training Center, Va. (Air Force photo by Staff Sgt. Brian Ferguson)

In the Navy SEALs they have a saying: every day you have to earn your trident. The trident is the symbol that sailors earn as they complete the training that makes them part of the elite SEALs. It is possible to lose one’s trident, however. To prevent behavior that might cause this, the SEALs remind themselves that every day they have to “earn (the right to wear) their trident”.

As I spend more time in the programming world I have come to realize there is great wisdom in this approach. A good friend of mine once told me that “Experience and skills are expiring assets.” In other words, if you don’t use them, you lose them. Just because your job title has the word “Senior” in it, you don’t automatically get a pass. You need to earn that title every day.

So as a programmer, how can you earn your place? How can you improve who you were yesterday? What does it take to make sure you are becoming a better programmer? Here’s what I’ve been doing. Continue reading

My experiences with 5 different Continuous Integration servers

I’ve become the Johnny Appleseed of Continuous Integration servers.

After I was bitten by the testing bug, I quickly developed an interest in CI and began setting up servers at the places I worked. The benefits of letting a computer run the tests automatically are so appealing. It cuts down on the number of “dumb” bugs that get created. It also helps ensure that the code is still working the way it used to.

What is a “dumb” bug? It is one of those bugs that is very simple (like a missing parens) and is found as soon as another person takes a look at your code. The discovery usually leads to a facepalm by the developer who did it.

Over the years as I’ve traveled to different companies, I’ve been able to help introduce various Continuous Integration servers to other developers. If you would like to learn more about why CI is important, I will point you to THE resource that I learned from, Continuous Delivery by Jez Humble. It’s a big read, but it really covers the topic well and offers a lot of useful strategies.

Here’s a rundown of my experiences with 5 different Continuous Integration servers. Continue reading

On moving from Java into Python

Before coming to Python, I did a lot of work in Java. Java is a pretty good language and environment, but it is different from Python. Beyond the language syntax there are a ton of little differences to be aware of. Sometimes when we move from Java into Python, it shows in some of the things that we do.

Here are some things I’ve learned over the years (or things that I’ve stubbed my toes on recently). Continue reading

Agile mindset: Making agile more than just a development methodology

Mindset is everything

To software developers, the word “agile” usually conjures up thoughts of project tracking, story points, and team velocity. To truly be the most effective developer that you can be, the word agile should also remind you that change is constant and that you must adapt. An agile mindset is the greatest tool a software developer can possess.

Too many software developers fall into the routine of their process. We work our stories, finish our sprints, and move on to the next iteration. While this does allow software to be produced in a predictable manner, it can also shackle developers. When facing a new challenge that conflicts with their existing mindset, these shackles can hold a developer back. Continue reading

Debugging like Elon Musk

Every time I think I’m doing pretty well with my projects, I take a look over and see what Elon Musk is doing. Then I instantly feel like I’m wasting my life. That guy does such huge things, and he does them so quickly, it is mind-boggling. What is his secret? Can I be like that?

The answer is yes, if you use “First Principles” reasoning. I have found that when you apply first principles to debugging software, you get to better solutions. So what are first principles? Here’s Elon explaining the how and what of it:

So here’s the takeaway:

Continue reading