Fixing up your gits with Python (Part 2)

Brad Johnson
3 min read · Nov 25, 2019

Checking on Github

Here’s a quick example of how I used Python to fix a mistake I had made with my git repos. I presented one version of a solution earlier. This one is a bit more advanced, using both the Python git and Github API packages.

The problem, and how to fix it manually

An early version of githubtools.clone_repos() set the remote ‘origin’ to be the git url (‘git://’) of the Github repo, not the clone url (‘https://’). The git:// protocol is read-only, so I couldn’t git push my work.

And I had run that early version to generate hundreds of local repos, all with the wrong remote ‘origin’.

A manual fix is:

  • traverse to the directory of the local repo
  • look up the origin push url with git remote -v
  • if the url begins with git://..., change it using git remote set-url origin https://....
    (We could just change the push url by setting the --push flag.)
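The manual steps above can be sketched in Python by shelling out to git. This is only a rough sketch: the function names to_https and fix_origin_manually are made up for illustration, and it uses git remote get-url --push instead of parsing the output of git remote -v.

```python
import subprocess

def to_https(url):
    """Naively swap a git:// prefix for https://; other urls are returned unchanged."""
    prefix = "git://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url

def fix_origin_manually(repo_dir):
    """Run the same git commands as the manual fix, inside repo_dir (hypothetical helper)."""
    # look up the origin push url
    current = subprocess.run(
        ["git", "remote", "get-url", "--push", "origin"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout.strip()
    fixed = to_https(current)
    # only touch the remote if the url actually began with git://
    if fixed != current:
        subprocess.run(
            ["git", "remote", "set-url", "origin", fixed],
            cwd=repo_dir, check=True,
        )
    return fixed
```

Note that to_https only rewrites the prefix. It has no idea whether the repo still exists on Github.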

Our first programmatic Python solution was nice and straightforward, but it assumed that we could simply replace git:// with https://. It didn’t check that the remote exists on Github, nor did it check what the correct clone_url is.

It seems like a good idea to do that. Our stripped-down dummy code for our function looks like:

def fix_remote_origin(dirName):
    # find the repo at dirName
    # origin_repo = find the Github repo at repo.remotes['origin']
    # clone_url = origin_repo.clone_url
    # git remote set-url origin clone_url

But first we need to figure out how to connect to Github.

Connecting to Github

If we want to be able to check what’s going on at Github, we need access to Github! First, we need to have an access token that can see our repositories (and it should be stored securely, not in your script code). I’m not going to get into that here.

For Python to have access to Github, we’ll be using the packages PyGithub and githubtools.

I’m not going to go into detail here on how to install PyGithub or how to set your ACCESS_TOKEN. You can look at the actual code of githubtools and fixorigin.py for more.

from github import Github
from githubtools import *

# ... set ACCESS_TOKEN
g = Github(ACCESS_TOKEN)
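One common way to keep the token out of the script is to read it from an environment variable. Here is a minimal sketch of that idea. The variable name GITHUB_ACCESS_TOKEN and the helper load_access_token are assumptions for illustration, not what fixorigin.py actually does.

```python
import os

def load_access_token(env_var="GITHUB_ACCESS_TOKEN", environ=None):
    """Return the token stored in env_var, or raise if it is missing (hypothetical helper)."""
    # environ can be passed in to try this out without touching the real environment
    environ = os.environ if environ is None else environ
    token = environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before running this script.")
    return token
```

With something like this in place, the connection line could become g = Github(load_access_token()).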

Getting a Github repo from a local repo’s ‘origin’ remote

Your local repo only knows about the Github url, but the Github API identifies repos by their name (of the form :owner/:repo). So we’re going to use githubtools.get_github_repo_from_url to get the Github repo from the url, and then ask that repo what its clone_url is.

Without any error checking, our code now looks like this:

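def fix_remote_origin(repo):
    # look up the Github repo that the local 'origin' remote points to
    origin_repo = get_github_repo_from_url(g, repo.remotes['origin'].url, config)
    clone_url = origin_repo.clone_url
    # point 'origin' at the https clone url
    repo.remotes.set_url('origin', clone_url)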

What does get_github_repo_from_url look like? Fortunately, it’s pretty simple (using the existing module giturlparse). The Github object g has the method get_repo().

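import giturlparse

def get_github_repo_from_url(g, url, config):
    p = giturlparse.parse(url)  # split the url into owner and repo
    if config.get('--test'):  # .get() avoids a KeyError when there is no --test option
        print(f"Retrieving {p.owner}/{p.repo}")
    return g.get_repo(p.owner + "/" + p.repo)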
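To see what the url-to-name step amounts to, here is a rough standard-library sketch that pulls owner/repo out of a url. It is not how giturlparse works internally, the name full_name_from_url is made up, and it only handles urls with a scheme (git:// or https://), not the git@github.com:owner/repo.git form.

```python
from urllib.parse import urlparse

def full_name_from_url(url):
    """Return 'owner/repo' for a git:// or https:// repo url (hypothetical helper)."""
    parsed = urlparse(url)
    if not (parsed.scheme and parsed.netloc):
        raise ValueError(f"Can't parse {url!r} as a repo url")
    path = parsed.path.strip("/")
    # drop a trailing .git if there is one
    if path.endswith(".git"):
        path = path[: -len(".git")]
    parts = path.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Can't find owner/repo in {url!r}")
    return "/".join(parts)
```

The string it returns is the form that g.get_repo() expects.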

Error-checking

We need to do error-checking both on our local and Github steps.

  • make sure that there is an ‘origin’ remote set
  • make sure the remote repo exists at Github
  • check if the ‘origin’ remote is wrong before we set it

def fix_remote_origin(repo):
    try:
        repo.remotes['origin']  # check if remote origin exists
    except KeyError:
        print(f"{repo.path} doesn't have a remote origin.")
    else:
        try:
            origin_repo = get_github_repo_from_url(g, repo.remotes['origin'].url, config)
        except Exception:
            print(f"Can't find {repo.remotes['origin'].url} on Github.")
        else:
            # only rewrite the remote if it differs from the clone url
            if repo.remotes['origin'].url != origin_repo.clone_url:
                repo.remotes.set_url('origin', origin_repo.clone_url)
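The three checks can also be tried out without git or a network connection by separating the decision from the side effects. This is just a sketch under my own assumptions: origin_action is a hypothetical helper, a plain dict stands in for repo.remotes, and a callable stands in for the Github lookup.

```python
def origin_action(remotes, lookup_clone_url):
    """Decide what to do about 'origin' without touching git or Github.

    remotes: mapping of remote name -> url (stand-in for repo.remotes)
    lookup_clone_url: callable taking a url and returning its clone_url,
        raising LookupError when the repo can't be found
    Returns a (status, new_url) tuple.
    """
    # 1. make sure that there is an 'origin' remote set
    if "origin" not in remotes:
        return ("no-origin", None)
    current = remotes["origin"]
    # 2. make sure the remote repo exists
    try:
        clone_url = lookup_clone_url(current)
    except LookupError:
        return ("not-found", None)
    # 3. check if the 'origin' remote is wrong before we set it
    if current == clone_url:
        return ("ok", None)
    return ("update", clone_url)
```

Only the "update" case would lead to a call to repo.remotes.set_url().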

Solution

Let’s put this all together.

I’m eliding the ACCESS_TOKEN code, so this doesn’t actually run. See fixorigin.py on Github for the actual full code.

"""Usage:
  fixorigin.py DIR
  fixorigin.py (-h|--help)

Search for Github repositories in DIR; for every git found under DIR make sure remote origin is set to clone_url.

Arguments:
  DIR  root directory to search for gits

Options:
  -h --help  show this screen.
"""
import os

import pygit2
from docopt import docopt
from github import Github
from githubtools import *

# ... set ACCESS_TOKEN
g = Github(ACCESS_TOKEN)

def fix_remote_origin(repo):
    try:
        repo.remotes['origin']  # check if remote origin exists
    except KeyError:
        print(f"{repo.path} doesn't have a remote origin.")
    else:
        try:
            origin_repo = get_github_repo_from_url(g, repo.remotes['origin'].url, config)
        except Exception:
            print(f"Can't find {repo.remotes['origin'].url} on Github.")
        else:
            # only rewrite the remote if it differs from the clone url
            if repo.remotes['origin'].url != origin_repo.clone_url:
                repo.remotes.set_url('origin', origin_repo.clone_url)

if __name__ == '__main__':
    config = docopt(__doc__)
    topDir = config['DIR']  # get starting directory
    for root, dirs, files in os.walk(topDir):
        repo_path = pygit2.discover_repository(root)
        if repo_path:
            repo = pygit2.Repository(repo_path)
            fix_remote_origin(repo)
            dirs[:] = []  # ignore subdirectories
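The dirs[:] = [] line is worth a closer look: os.walk reads the dirs list again after each step, so emptying it in place stops the walk from descending any further. Here is a small standalone sketch of that pruning. It is simplified on purpose: find_repo_roots is a hypothetical helper, and it just looks for a .git entry rather than calling pygit2.discover_repository.

```python
import os

def find_repo_roots(top_dir):
    """Yield each directory under top_dir that directly contains a .git entry."""
    for root, dirs, files in os.walk(top_dir):
        # .git is usually a directory, but it can be a file (e.g. in submodules)
        if ".git" in dirs or ".git" in files:
            yield root
            # empty the list in place so os.walk skips everything below this repo;
            # rebinding with dirs = [] would have no effect on the walk
            dirs[:] = []
```

Anything nested inside a repo that has already been found is never visited.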
