Fixing up your gits with Python (Part 2)
Checking on Github
Here’s a quick example of how I used Python to fix a mistake I had made with my git repos. I presented one version of a solution earlier. This one is a bit more advanced, using both the Python git and Github API packages.
The problem, and how to fix it manually
An early version of githubtools.clone_repos() set the remote ‘origin’ to be the git url (‘git://’) of the Github repo, not the clone url (‘https://’). That meant that I couldn’t git push my work.
And I had run that early version to generate hundreds of local repos, all with the wrong remote ‘origin’.
A manual fix is:
- traverse to the directory of the local repo
- look up the origin push url with git remote -v
- if the url begins with git://..., change it using git remote set-url origin https://...
(We could change just the push url by setting the --push flag.)
Our first programmatic Python solution was nice and straightforward, but it assumed that we can simply replace git:// with https://. It didn’t check on Github that the remote exists, nor did it check what the correct clone_url is.
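To make that assumption concrete, here is a minimal sketch of the naive replacement. This is not the code from Part 1; the function name naive_https_url and the example url are made up for illustration. It only swaps the scheme and never asks Github anything, which is exactly the weakness described above.

```python
def naive_https_url(url):
    """Swap a leading git:// for https://; leave any other url untouched."""
    prefix = "git://"
    if url.startswith(prefix):
        return "https://" + url[len(prefix):]
    return url


# hypothetical url, for illustration only
print(naive_https_url("git://github.com/someowner/somerepo.git"))
```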
It seems like a good idea to do that. Our stripped-down dummy code for our function looks like:
def fix_remote_origin(dirName):
    # find the repo at dirName
    # origin_repo = find the Github repo at repo.remotes['origin']
    # clone_url = origin_repo.clone_url
    # git remote set-url origin clone_url
    pass  # placeholder so the skeleton is valid Python
But the first thing we need to figure out is how to connect to Github.
Connecting to Github
If we want to be able to check what’s going on at Github, we need access to Github! First, we need an access token that can see our repositories. It should be saved securely, not in your script code.
For Python to have access to Github, we’ll be using the packages PyGithub and githubtools.
I’m not going to go into detail here on how to install PyGithub or how to set your ACCESS_TOKEN. You can look at the actual code of githubtools and fixorigin.py for more.
from github import Github
from githubtools import *

# ... set ACCESS_TOKEN
g = Github(ACCESS_TOKEN)
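One common way to keep the token out of the script is to read it from an environment variable. The sketch below is an assumption on my part, not how githubtools or fixorigin.py actually does it; the helper name load_access_token and the variable name GITHUB_TOKEN are hypothetical.

```python
import os


def load_access_token(env=os.environ, name="GITHUB_TOKEN"):
    """Return the token stored in the environment variable `name`.

    Both the helper and the default variable name are illustrative choices.
    """
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return token
```

With something like this in place, the elided step could become ACCESS_TOKEN = load_access_token(), and the token never appears in the source file.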
Getting a Github repo from a local repo’s ‘origin’ remote
Because your local repo just knows about the Github url, but the Github API identifies repos based on their name (of the form :owner/:repo), we’re going to use githubtools.get_github_repo_from_url to look up the Github repo from the url before we can ask it what its clone_url is.
Without any error checking, our code now looks like this:
def fix_remote_origin(repo):
    origin_repo = get_github_repo_from_url(g, repo.remotes['origin'].url, config)
    clone_url = origin_repo.clone_url
    repo.remotes.set_url('origin', clone_url)
What does get_github_repo_from_url look like? Fortunately, it’s pretty simple (using the existing module giturlparse). The Github object g has the method get_repo().
import giturlparse

def get_github_repo_from_url(g, url, config):
    p = giturlparse.parse(url)
    if config.get('--test'):  # .get() so a missing '--test' option isn't an error
        print(f"Retrieving {p.owner}/{p.repo}")
    return g.get_repo(p.owner + "/" + p.repo)
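If you want to see the url-to-name step without installing giturlparse, here is a standard-library sketch. It is not how giturlparse works internally, and the name full_name_from_url is made up; it only handles urls shaped like scheme://host/owner/repo, with or without a trailing .git, and rejects everything else (including ssh-style git@host:owner/repo urls).

```python
from urllib.parse import urlparse


def full_name_from_url(url):
    """Return 'owner/repo' for urls shaped like scheme://host/owner/repo(.git)."""
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"Not a scheme://host/... url: {url!r}")
    path = parsed.path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    parts = path.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Can't read owner/repo from {url!r}")
    return "/".join(parts)
```

The returned string is in the :owner/:repo form that g.get_repo() expects.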
Error-checking
We need to do error-checking both on our local and Github steps.
- make sure that there is an ‘origin’ remote set
- make sure the remote repo exists at Github
- check if the ‘origin’ remote is wrong before we set it
def fix_remote_origin(repo):
    try:
        origin = repo.remotes['origin']  # check if remote origin exists
    except KeyError:
        print(f"{repo.path} doesn't have a remote origin.")
    else:
        try:
            origin_repo = get_github_repo_from_url(g, origin.url, config)
        except Exception:  # e.g. the repo can't be found on Github
            print(f"Can't find {origin.url} on Github.")
        else:
            if origin.url != origin_repo.clone_url:  # only change a wrong url
                repo.remotes.set_url('origin', origin_repo.clone_url)
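The three checks can also be written as a pure function that decides what to do without touching git or Github, which makes a dry run (or a unit test) easy. This is a sketch under my own assumptions, not part of githubtools: the name check_origin, the status strings, and the lookup_clone_url callable (which stands in for the Github lookup and is assumed to raise LookupError when the repo isn’t found) are all hypothetical.

```python
def check_origin(remotes, lookup_clone_url):
    """Decide what to do about a repo's 'origin' remote.

    remotes: mapping of remote name -> url (a stand-in for the local repo)
    lookup_clone_url: callable url -> clone_url (a stand-in for the Github
        lookup); assumed to raise LookupError if the repo can't be found
    Returns a (status, url) pair.
    """
    if "origin" not in remotes:
        return ("no-origin", None)
    current = remotes["origin"]
    try:
        clone_url = lookup_clone_url(current)
    except LookupError:
        return ("not-found", current)
    if current == clone_url:
        return ("ok", current)  # already correct, nothing to set
    return ("fix", clone_url)  # origin should be set to this url
```

Only the "fix" case would lead to a call to set_url; the other three statuses map onto the messages printed above or onto doing nothing.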
Solution
Let’s put this all together.
I’m eliding the ACCESS_TOKEN code, so this doesn’t actually run as-is. See fixorigin.py on Github for the actual full code.
"""Usage:
  fixorigin.py DIR
  fixorigin.py (-h|--help)

Search for Github repositories in DIR; for every git found under DIR, make sure remote origin is set to clone_url.

Arguments:
  DIR  root directory to search for gits

Options:
  -h --help  show this screen.
"""
import os

import pygit2
from docopt import docopt
from github import Github
from githubtools import *

# ... set ACCESS_TOKEN
g = Github(ACCESS_TOKEN)

def fix_remote_origin(repo):
    try:
        origin = repo.remotes['origin']  # check if remote origin exists
    except KeyError:
        print(f"{repo.path} doesn't have a remote origin.")
    else:
        try:
            origin_repo = get_github_repo_from_url(g, origin.url, config)
        except Exception:  # e.g. the repo can't be found on Github
            print(f"Can't find {origin.url} on Github.")
        else:
            if origin.url != origin_repo.clone_url:  # only change a wrong url
                repo.remotes.set_url('origin', origin_repo.clone_url)

if __name__ == '__main__':
    config = docopt(__doc__)
    topDir = config['DIR']  # get starting directory
    for root, dirs, files in os.walk(topDir):
        repo_path = pygit2.discover_repository(root)
        if repo_path:
            fix_remote_origin(pygit2.Repository(repo_path))
            dirs[:] = []  # ignore subdirectories
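The dirs[:] = [] line is the part that is easy to miss: os.walk (in its default top-down mode) lets you prune the walk by editing dirs in place. Here is a standard-library sketch of just that traversal. The name find_repo_roots is made up, and checking for a .git entry directly inside each directory is a simplification of what pygit2.discover_repository does.

```python
import os


def find_repo_roots(top_dir):
    """Yield each directory under top_dir that directly contains a .git entry."""
    for root, dirs, files in os.walk(top_dir):
        # .git is normally a directory, but can be a file (e.g. in a submodule)
        if ".git" in dirs or ".git" in files:
            yield root
            dirs[:] = []  # prune in place so os.walk skips everything below this repo
```

Assigning dirs = [] instead would only rebind the local name, and os.walk would keep descending, so the slice assignment matters.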