My Git Talk At Austin On Rails

Posted on December 7th, 2009 by benhamill | No Comments »

At the last Austin on Rails meeting (Nov 17), I gave a talk entitled Practical Git Quickstart (Prezi link). The slides don’t have a lot of content and mostly underscored what I hoped to talk about. I blew through them in about ten minutes or less. The short of it is that I feel like a lot of git tutorials and introductions start off with the high-level stuff and that, especially for people new to git, that’s confusing. My goal was to give git newbies the most basic commands they’d need to be able to use git on a daily basis so that they could build their own abstractions before diving into the more heady stuff. I was aiming for an 80% solution to that, anyway.

After I finished the slides, I fired up a command line and an editor and just worked through some stuff. This post should sum up what I talked about, more or less. I started out covering the same stuff I covered in my previous git tutorial post, so maybe go check that out first. It should get you through setting up a new repository, adding files to the staging area, making a commit, checking your status and pushing to a remote repository.

So let’s pick up there, with remote repositories. The way you get code up to your repo is with git push origin master. Once it’s up there, other people can get at it. If you recall, you told git where your remote repo was going to be with git remote add origin git@github.com:<username>/<project>.git. Someone who wants their own local copy of your repo does so with the clone command like so: git clone git@github.com:<username>/<project>.git. That will create a directory wherever the command is issued, named <project>, and pull down the current state of the remote repo. Then, that person will be able to push their own changes, etc. This is all, of course, assuming they’ve got permission to do so.
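If you want to try that round trip without a GitHub account, here is a rough sketch. It assumes a local bare repository can stand in for git@github.com:<username>/<project>.git; the directory names (mine, theirs), the file and the demo identity are all made up for the example.

```shell
# Sketch only: a local bare repo plays the part of the GitHub remote.
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"

# First person: an existing local repo, pointed at the remote and pushed
git init -q "$demo/mine"
cd "$demo/mine"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m 'Initial commit.'
git remote add origin "$demo/remote.git"
git push -q origin master

# Second person: clone creates a directory and pulls down the current state
cd "$demo"
git clone -q -b master "$demo/remote.git" theirs
cat theirs/notes.txt
```

Against a real host you would drop the bare repo and use the address the host gives you; everything else stays the same.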

So this new second person makes some changes and pushes them on up. How do you get them? Well, sensibly, the opposite of push is pull, so you issue git pull origin master. This is actually a two-step process that’s just for convenience. I don’t want to get into the plumbing too much, but it basically grabs the state of the remote repo (git fetch) and then attempts to merge (git merge) it with your local stuff. So that’s the most basic case of working with someone else on a project, or working alone on one using different machines, if you like. I use that case all the time.
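To see the two steps separately, something like the following should do it. Again this is a sketch under assumptions: a local bare repo stands in for the remote, and the commit helper, names and file contents are invented for the demo.

```shell
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"

# Hypothetical helper for this demo: commit with a throwaway identity
commit() { git -c user.name="Demo" -c user.email="demo@example.com" commit -q -m "$1"; }

# Your repo, pushed up
git init -q "$demo/mine"
cd "$demo/mine"
git symbolic-ref HEAD refs/heads/master
echo "one" > notes.txt
git add notes.txt
commit 'Initial commit.'
git remote add origin "$demo/remote.git"
git push -q origin master

# The second person clones, changes something and pushes
git clone -q -b master "$demo/remote.git" "$demo/theirs"
cd "$demo/theirs"
echo "two" >> notes.txt
git add notes.txt
commit 'Second person change.'
git push -q origin master

# Back in your repo: the long way round for `git pull origin master`
cd "$demo/mine"
git fetch -q origin master      # grab the remote state; it lands in FETCH_HEAD
git merge -q FETCH_HEAD         # merge it into your local master
cat notes.txt
```

Running git pull origin master in place of the last fetch and merge should leave you in the same spot.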

So what about conflicts? If you both make a change to the same file and they push it first, you’ll not be allowed to push, because the remote now has commits you don’t have and git won’t merge them for you on the remote’s side. Similarly, if you try to pull, it will do the fetch part, but be unable to finish the merge on its own and will tell you so. You can use git diff to see what the changes were and do the merge yourself. You can also use git difftool which is awesome, but takes some setup, so you should look into it later on (I skipped it in my presentation).

Once you handle the conflicts, you’ll add the conflicting files to the staging area and make a commit. With merges (fast-forwards aside), I should note, git makes a commit just for the merge, so when you have conflicts, it’ll have staged the things it can merge on its own and left the conflicts unstaged. As you fix them, you stage them and then you commit the merge commit. Git doesn’t know if you really fixed the conflicts, so you can git add whatever version of the file you want, even a broken, not-conflict-resolved one. Just be aware.
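Here is a small sketch of that whole dance. To keep it short it assumes the other person’s work is sitting on a local branch called theirs rather than coming down from a remote; the branch name, the phone numbers and the demo identity are made up.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "phone: 555-0100" > contact_info.txt
git add contact_info.txt
git commit -q -m 'Initial commit.'

# Someone else's change, on a branch standing in for their pushed work
git checkout -q -b theirs
echo "phone: 555-0199" > contact_info.txt
git commit -q -a -m 'Their number.'

# Your change to the same line
git checkout -q master
echo "phone: 555-0123" > contact_info.txt
git commit -q -a -m 'My number.'

# The merge stops and leaves conflict markers in the file
git merge theirs > /dev/null 2>&1 || echo "merge stopped with conflicts"
git diff --name-only --diff-filter=U     # lists the files still in conflict

# Fix the file by hand (here: just pick a version), stage it, commit the merge
echo "phone: 555-0123" > contact_info.txt
git add contact_info.txt
git commit -q -m 'Merge their changes, keeping my number.'
git log --oneline
```

Note that git would have accepted the git add even with the conflict markers still in the file, which is the trap mentioned above.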

That was more or less the end of my ordered presentation. There were some questions afterward and I’m going to attempt to sum up the discussion that followed, here:

First off, I wanted to mention how you ignore files in git. Unlike subversion, there is no git ignore. If you want git to ignore a file, you have to add it to a .gitignore file. This file is a list of patterns that git will ignore for the directory it’s in and all directories below it. So you might have one for a python project like this:

tmp/*
*.pyc

This will ignore all compiled python code (*.pyc) and everything in your tmp/ directory. I was baffled by this when I first came to git, but it’s not really that hard. Note that you generally commit your .gitignore so that others can share it. If there’s something you want to ignore on a per-machine basis, rather than a per-project basis, then you need to turn to my next topic.
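If you want to check that the patterns do what you expect, a throwaway repo works. This is only a sketch; the file names (app.py, app.pyc, tmp/scratch.log) are invented for the demo.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/project"
cd "$demo/project"

mkdir tmp
touch app.py app.pyc tmp/scratch.log

# The same two patterns as above
printf 'tmp/*\n*.pyc\n' > .gitignore

# Untracked files git still sees: app.py and .gitignore itself
git status --porcelain

# Ask git which pattern is responsible for ignoring a given path
git check-ignore -v app.pyc tmp/scratch.log
```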

Which is global git preferences. On Linux and Mac, git will look for a file in your home directory called .gitconfig and take global behaviors from it (it’s tricky on Windows, and I haven’t figured it out to my own satisfaction, sorry. If someone asks about it, I’ll try to sum up what I know in the comments). In my other git post, I had gone through setting up a repo on GitHub and said to follow the directions there. Two of those steps were these:

git config --global user.name "<your name>"
git config --global user.email <your_email>

Those created entries in your ~/.gitconfig telling git your name and email address. You can also declare a global ignore file there. I like to call mine .gitignore. This is shockingly original, I know. On the machine I’m typing on right now, my ~/.gitconfig looks like this:

[user]
email = blah@blah.blah
name = Ben Hamill
[core]
excludesfile = /home/ben/.gitignore

I bet you can guess it, but just in case, you can either put your excludesfile in manually or do git config --global core.excludesfile /whatever/file/path/you/want. For reference, my ~/.gitignore looks like this:

*.kpf
*.swp

A .kpf file is a project file created by Komodo Edit, which I used to use for all my code editing needs, but not since I switched to vim, which is what creates *.swp files.
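To try the global ignore file without touching your real settings, you can point HOME at a scratch directory first. That trick is an assumption of this sketch, not something from the talk, and the project and file names are made up.

```shell
set -e
# A throwaway HOME so the demo does not touch your real ~/.gitconfig
HOME=$(mktemp -d)
export HOME
unset GIT_CONFIG_GLOBAL XDG_CONFIG_HOME

# Same idea as the config above, with a path inside the scratch HOME
git config --global core.excludesfile "$HOME/.gitignore"
printf '*.kpf\n*.swp\n' > "$HOME/.gitignore"

git init -q "$HOME/project"
cd "$HOME/project"
touch notes.txt notes.kpf .notes.txt.swp

git status --porcelain      # only notes.txt should be listed as untracked
cat "$HOME/.gitconfig"      # shows the [core] excludesfile entry
```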

Finally, someone had asked about git stash. It’s what I’d consider a more advanced command, but a lot of git fanboys sell it hard because it’s cool and svn doesn’t have it. However, as cool as it is, I think it can get you into a lot of trouble. Basically, you can be working on something and issue git stash and git will store whatever changes you’re in the middle of and hide them away, putting your repo back in the state it was right after the last commit. You can then work on something more pressing, make commits, merges, new branches, whatever and when you’re done, issue git stash pop and it applies your changes back (if it can).

The really hairy bit is that you can name stashes and so have more than one stash going at once. While a super organized developer might find this really useful, I find that it’s easy to get stuff lost in there. You don’t want to have tons and tons of stuff stashed and not remember, anymore, what changes were in which stash, etc. I advise, as a basic rule of thumb, that if you’ve already got one thing stashed and find yourself wanting to stash something else, then you should be looking at branching, not stashing.
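For the simple one-stash case, the flow looks roughly like this. The file names and contents are invented for the sketch.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "outline v1" > outline.txt
git add outline.txt
git commit -q -m 'Initial commit.'

# Half-finished work you are not ready to commit
echo "half-finished idea" >> outline.txt

git stash -q                 # hide it away; the file is back to the last commit
cat outline.txt

# ...do the more pressing thing and commit it...
echo "urgent fix" > general.txt
git add general.txt
git commit -q -m 'Urgent fix.'

git stash list               # one entry, stash@{0}
git stash pop -q             # bring the half-finished work back
cat outline.txt
```

If git stash list ever shows more than a line or two, that is probably the cue to reach for a branch instead.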

I think that about covers it. I think someone recorded audio of my talk or maybe video. If it ends up posted somewhere, I’ll come edit this post with a link to it. If you were at my talk and notice something I talked about then that I haven’t covered here, let me know and I’ll try to amend. Or, if you weren’t there and feel there’s a topic you have questions about, drop it in the comments and I’ll do what I can.

Git Tutorials Suck, A Sucky Git Tutorial

Posted on March 18th, 2009 by benhamill | No Comments »

Context… Perhaps Too Much Of It

So I was reading this blog post about learning and explaining because @carl_youngblood tweeted about it. I think Carl’s right: I had a hard time learning git (by which I don’t mean to imply I’m some sort of expert now, but the learning is going easier now).

I think the main problem that I had was this: Having learned Subversion, with its central repository, it was a hard abstract thing to understand. And some (I feel many of the ones I read, anyway) of the tutorials out there try to start at the abstract. Little help that did me (see above-linked article. Really, it’s very good). And even ignoring those, I had to read a lot lot lot of the practical ones before things started sinking in.

So I’ve sort of come to understand that, actually, the tutorials don’t suck; learning abstract things just takes time and, at the time, that can be frustrating. So I’m going to offer my own little sucky tutorial, which will focus on the practical aspects and, if you read this and don’t get it, you can follow some links at the end to other articles I found helpful and maybe, after roughly a week, you’ll have your ‘ah-Ha!’ moment and think about how git is just like monads… whatever the heck those are.

A lot of tutorials for git newbies start out explaining the Staging Area with some kind of metaphor so that it seems friendly or, I suspect, out of some subconscious wish to actually obscure it from Subversion converts so that git seems more familiar–more like SVN, which it is not very much like at all. I’m not going to really talk about it much. When we get to the commands that affect it (shortly, here), I’ll explain what they do. You can make the abstraction yourself.

I’m intentionally writing this off the top of my head for two reasons: If I have to look up a command, then you might as well read whatever tutorial I looked it up on and if I have to look it up, then I clearly don’t use it all the time and thus, you don’t need to know it to get going on Git.

The Tutorial

I’ve got six sections to this thing with (I hope) at least vaguely descriptive names. They are:

  1. Setup
  2. Initial Commit
  3. SitRep
  4. Staging Area
  5. Remote Repo
  6. Conclusion/Links

Setup

You have a project you just started in a directory called ‘notes’. This isn’t even code, it’s just notes about something that you want to version control and back up. It’s a collection of text files and the directory structure is something like this.

$ pwd
~/notes/
$ ls
contact_info.txt  general.txt  outline.txt

After installing git as appropriate for your operating system, you start out by typing in the command line git init. This will create a directory called .git in notes/. There’s some stuff in there, but for the most part, you can ignore this for now. Suffice to say it’s where git does its book-keeping. What you’ve got now is a local git repository or, as the kids say, a “local repo”, but nothing’s in it.

Initial Commit

So you do a git add . (note the trailing period). This will toss everything (that’s what the period means) in notes/ into the staging area (including stuff that’s in directories that’re in directories that’re in notes/ etc.). The repo is still empty. To actually save stuff once it’s been staged, you do like this:

$ git commit -m 'Initial commit.'
[master (root-commit)]: created 7db8343: "Initial commit."
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 contact_info.txt
create mode 100644 general.txt
create mode 100644 outline.txt

The -m option says you’re going to specify your commit message right after. Sometimes, you’ll want to leave a longer message, in which case, you forget the -m and git will automatically fire up a default text editor where you can put in longer stuff. Since a lot of that varies widely from OS to OS, I’m going to skip it and you can read more details on other tutorials (see below). Notice that you get a list of what’s changed (you created 3 new files in the repo) and you get your comment back in the output. Splendid.

SitRep

Now you’ve made your initial commit, and your stuff is in version control. Go into contact_info.txt and add something (doesn’t matter what for these purposes). Imagine you’ve made that change and then walked away and forgotten about it. You can use git status to see what’s new, thusly:

$ git status
# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#       modified:   contact_info.txt
#
no changes added to commit (use "git add" and/or "git commit -a")

Using git status is just like a reminder. It doesn’t tell you much, but it can jog your memory about what you’ve already staged or what you changed and didn’t stage or what files you added. To get the real scoop about how a file changed, you use git diff. When you run git diff contact_info.txt the output will vary depending on what you had initially and what you added, but the gist is this: It will show you the changes (all of them) with a + before the line for additions and a - before the line for deletions. Generally, it gives a few lines before and after a change for context.
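Since the exact output depends on your file, here is a sketch with made-up contact info so you can see the + and - lines for yourself. The names and numbers are invented.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/notes"
cd "$demo/notes"
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'Alice: 555-0100\nBob: 555-0101\n' > contact_info.txt
git add contact_info.txt
git commit -q -m 'Initial commit.'

# Change Bob's number and add Carol
printf 'Alice: 555-0100\nBob: 555-0199\nCarol: 555-0102\n' > contact_info.txt

git status --short           # " M contact_info.txt": modified, not staged
git diff contact_info.txt    # - lines were removed, + lines were added
```

A changed line shows up as one - line (the old version) followed by one + line (the new one), with the untouched Alice line printed as context.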

So let’s add our new contact_info change to the staging area and commit it, yeah? Do git add contact_info.txt and then git commit -m 'Updated contact info' or similar. Whatever comment you write is fine. Note we could’ve used git add . but I wanted to show the single-file syntax.

Staging Area

Now let’s put in some stuff into the outline.txt. Whatever you want. Just some stuff. Save it. But wait! We should also add some stuff to the general notes, just a quick overview at least, so put some stuff in there. We’ll finish the outline changes in a second. This is so much more pressing. Obviously.

Now, it’s good repo etiquette to only commit stuff atomically, which is to say that all the changes have to do with each other. Some people will say that you should only commit stuff that works (code compiles or whatever), but with git that’s less of a concern. I’ll come back to this point. What I’m getting at now is that you started one change and realized another needed to be made before you finished the first one. Now you want to commit only the second one, right? Simple: git add general.txt then git commit -m 'Added overview'. Because you never staged the outline (with your half-way-made changes), it doesn’t get committed. Later, if you need to revert that commit or whatever, you won’t have to worry that something else is mixed in there. Now, go ahead and finish your outline changes, and commit them. You should be able to do it on your own now.
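The scenario above can be sketched like so, with placeholder file contents. The point is that only what you staged lands in the commit.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/notes"
cd "$demo/notes"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "outline" > outline.txt
echo "general" > general.txt
git add .
git commit -q -m 'Initial commit.'

echo "half an idea" >> outline.txt      # unfinished
echo "quick overview" >> general.txt    # finished and pressing

# Stage and commit only the finished change
git add general.txt
git commit -q -m 'Added overview'

git show --name-only --format=%s HEAD   # the commit touches only general.txt
git status --short                      # outline.txt is still just modified
```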

Remote Repo

So, then… we’re version controlling this stuff. What if you want to get at it from another computer or let someone else get at it or… something? Pop on over to GitHub which is my remote repo host of choice. There are others. Shop around, if you like. After you create an account, you can create a new remote repo called whatever you want. You’ll then be shown a page with some directions. Follow the ones under the heading “Existing Git Repo?”

The git remote add origin git@github.com:<username>/<project>.git command basically tells git where your remote repo is. You can have more than one if you like and, actually, do all sorts of crazy things with naming if you like, but I just want to handle the default, assumed case with this tutorial. One interesting thing: GitHub gives you two addresses for each repository (other hosts may do the same, I don’t know). The one that starts git@github.com is your read/write address and there’s one that starts git://github.com which is your read-only address. Since this is your own repo, you want to make sure to use the read/write address.

The git push origin master command is what actually moves your commits to the remote repo. This is where I recommend you adhere to the “only stuff that works” doctrine. If this is code, and you’re sharing the repo with your team or whatever, this is where they can get at it, so you don’t want to hand them broken stuff or half-finished ideas or whatever. So only push code that compiles/works. Pushing your code updates the remote repo with all the commits you’ve made since your last push.

The way you (or someone else) gets commits out of a repo is by using git pull. It takes the same arguments as git push. It will pull the commits down and then try to reconcile those changes with any that you’ve made since the last time your local repo was in the same state as the remote repo.

Conclusion/Links

I feel like this has gotten pretty long and I don’t want to put too much information all at once. That should be enough to get you started and, really, just try it out for a while and get comfortable with the basics. Don’t be afraid, if you get something out of whack and realize you’ve done something wrong, to kill your .git directory (which will delete the local repo) and start again from the top. I’ve intentionally left a lot of stuff out (push/pull, branches and multiple remote repos can all get kind of hairy), so here’s some documentation, blog posts and articles that I’ve found helpful. These are in no particular order and some are more advanced than others, so just start clicking and see what you like:

If you want to ask me about git or whatever, feel free to email me or leave something in the comments. Also, if you spot a mistake or something here doesn’t make sense, _please_ let me know. Hope this is helpful to someone.

Version Control Your Computer

Posted on February 12th, 2009 by benhamill | No Comments »

I’ve mentioned @carl_youngblood here before. Someone once was trying to buy him something with his name on it. I think it was a key chain. You know the kind, right? However, they didn’t have “Carl”, only “Carlos”. So we joked that, one day, he needs to write an operating system and name it CarlOS. Aren’t we funny? I know. I’m sorry. Anyway, the other day, we actually got into some OS discussion that I thought had some interesting enough ideas to post here.

So how many computers do you own and use? I’ve got a desktop at home, a laptop and a machine at work. It’s sort of a bummer to have different stuff or different versions of stuff, or stuff with different preferences on different computers. At least, for me it can really jack up my work flow. Especially if there is some application I use a lot with non-default preferences. Man, that bugs me! Remembering it all, bleh.

One thing Carl’s fantasized about is having a computing environment the same everywhere you go. That’s sort of a mainframe or dumb-workstation idea, which is not new at all. However, what if your whole computer were version controlled? You could branch it (so you don’t have your work apps at home, etc.) and merge changes from one branch to another, if you wanted. You could check out a different branch on one machine and it would feel like you were on another.

Clearly an OS would have to be built from the ground up for this idea. You’d also have to have some kind of provision about storing the non-checked out branches locally. Also cloning the repo would be a hassle at current average (even high speed) connection speeds. But how cool would it be to install, say, Textmate at work and get all your settings right, etc. and then go home and merge that change in (You could merge it from work, I guess and then just pull from home. Whatever.)? You could get diff data (hard to implement, but with metadata not impossible):

$ os diff gaming HEAD
+ Steam
+ Half-Life 2
+ X-Fire
- Textmate

Or whatever. You get the idea. Reverting would make backing up and creating, uh… what does Windows call them? Recovery Points? It would make all that easy and moot. Clearly Linus Torvalds needs to be in on this “project”; he has the experience in both OS design and version controlling that would be invaluable. Not that, you know, Carl or I are actually considering doing anything with this idea. It’s an interesting thought experiment, though.

Save Versus GitHub!

Posted on January 9th, 2009 by benhamill | No Comments »

I’m reaching a point where I want everything in my life to be version controlled. I made a desktop making fun of my friend in the GIMP the other day and realized that I wanted it version controlled. I don’t think Git will handle .xcf’s. Pity, though.

My most recent action on this front doesn’t actually have to do with Dungeons & Dragons, but that’s only because my system of choice is GURPS. I game master role playing games as a hobby and keeping track of campaign ideas has, in the past, been very disorganized and messy. I had scraps of paper all over and various emails to myself. If I have some brilliant idea at work, I can’t just incorporate it into my notes or whatever, I’d have to email it to myself and then hope it was clear enough to remember what the actual idea was later, etc. If I was on the bus, I had my laptop (which I use to assist in GMing) and I could put it right into the notes, but recently I had some major stability issues with that machine and so became concerned about backups, etc.

Thus, I had an idea. I’ve converted the essentials from my current campaign and put them in a repo and I’m working on notes for my next campaign there. There are several benefits, here:

The first is that before, I was using OpenOffice documents for my notes. This allowed for some pretty formatting, but when I had my timeline, my NPC list, my session notes and my location notes all open, well… OpenOffice isn’t a lean program and my lappy isn’t the beefiest of machines. So now everything is a .txt and that’s super lean. Yay. I’m aware, by the way, that this is very tangentially related to version controlling my notes, but still.

Secondly, I can check out a copy on any machine I’m sitting at when I have an idea. Or, if I really want to, I can edit them right on GitHub. Neat. As a sort of corollary to this is the fact that if my lappy were to get dropped, say, off a mountain, I could borrow anyone else’s laptop and be ready to roll in about 20 minutes as long as I had internet access.

Thirdly, since I’m using git as opposed to, say, SVN, I don’t have to have internet access. Local repos means I can make a commit on my laptop while on the bus and then push when I get home. Very handy since I do a lot of my thinking about campaigns on the bus.

Fourthly (this list is getting longer than I thought it would), is character data. So there is a piece of software that you can buy to help you create and track GURPS characters (whether non-player character or player character). Handily, it saves them in plaintext (I think it’s actually .Net code or something unhelpful, not YAML or XML or similar, but nothing’s perfect), so I can version control the characters, too, not just notes.

Fifthly (good grief), a friend of mine and sort of my GM mentor moved away and doesn’t have a gaming group. In order to get his fix, he’s convinced me (for the good and the bad of it) to let him help me plan and brainstorm my next campaign. He can check out a copy, branch it, issue a pull request (or I’ll give him push access, not sure). Collaborative GMing is something that often can go wrong, but this tool, teamed up with some other guidelines we’ve adopted, will help ensure that we get only the benefits out of this.

About the only things that I’d use for a campaign that it won’t version control are pictures and sound files, but I don’t expect to do a lot of changing of those over the course of things and, anyway, it will save them, so it at least acts as a backup. While we’re talking about negatives… My players could snoop the notes. Oh noes! In reality, I’ll have to just trust them to stay out. They’d only be ruining their own fun, anyway.

So, a sixth, I guess, benefit is that I can share my notes with the world and if someone else sees something cool they want to steal or sees something sucky that they can do better then I’ve inspired them or at least helped them out a bit. If you like (and aren’t one of my players), feel free to check it out. If you have questions about anything in there, feel free to shoot me an email. I make no promises that anything in there will be better than total suck.

A Hub for Gits

Posted on December 16th, 2008 by benhamill | No Comments »

I’ve recently started using git to version control my personal projects. I’ve also recently started using GitHub for hosting remote repos of that stuff. So I’m new to it all and I might be wrong, here. But, having read a few articles here and there and talked to some other people (most notably, another git newbie @carl_youngblood), aren’t cherry picking and rebasing really, really horrible things to do to a repo? Even if it’s just your local one? They destroy history, which is sort of the point of version control, no?

I’ve seen, in the last few days, two articles on GitHub that make me wonder which of us (me or GitHub) doesn’t get it. My inclination is to assume it’s me who’s missing something. If so, I’d love for someone to tell me exactly where I’ve missed a step.

The first article I want to talk about is the Fork Queue announcement. It’s basically a tool that makes it really easy to see which of the people that’ve forked your project have pushed commits you don’t have in your repo and then to cherry pick them in. You can pick your branch, etc. This is to keep you from having to create a lot of remotes, I guess. It’s supposed to “[allow] you to do a email patch style workflow without actually having to deal with patches over email”. I thought that part of the point was that that work-flow was a pain? I also feel like it’s missing the part where the person making the patch tells you about it, rather than you going and getting it from them. A pull-request is much more like that.

I don’t know… I sort of feel like we should be putting roadblocks in the way of cherry picking; make it harder for people to adopt work-flows where cherry-picking is common. My understanding (and again let me stress that this may be incorrect) is that the best work-flow is for the patcher to fetch your code (because it represents some kind of “core” or “official” release, yes?), merge it into a new branch, make his(her, etc.) changes, test them, fetch your code again to make sure he has the latest, perhaps retest, then issue a pull request. When you’re acting on the pull request, you fetch his stuff down to a new branch, test, possibly merge in any changes you’ve made to master since he issued the pull request, then _merge_ into master. This preserves all the history. It’s all fetches and merges.

So the other article is the Changing Git History article. This one is about going and messing with old commits. I’ll try not to rant as much on this one. I’m not as adamant, but I do find it kind of silly, this idea of making a commit a perfect little gem. I can see fixing a typo in the commit you just made, so git commit --amend doesn’t seem so bad. However, the git rebase -i portion after that… bleh. I realize this isn’t a GitHub feature like the above; it’s a part of Git, but I wonder why. If the commits were that horrible, revert them all and do it over. If they weren’t that bad, just live with the typo. No? Even if you haven’t pushed it, rebasing just seems icky and to be avoided, especially if it’s just to fix commit messages.
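For what it’s worth, even the mild git commit --amend case replaces the commit rather than editing it, which is easy to see in a scratch repo. This is a sketch with a made-up typo and a demo identity, not anything from the GitHub article.

```shell
set -e
demo=$(mktemp -d)
git init -q "$demo/repo"
cd "$demo/repo"
git config user.name "Demo"
git config user.email "demo@example.com"

echo "notes" > general.txt
git add general.txt
git commit -q -m 'Added overveiw'          # typo in the message
before=$(git rev-parse HEAD)

git commit -q --amend -m 'Added overview'  # fix the message of the last commit
after=$(git rev-parse HEAD)

git log --format=%s -n 1
[ "$before" != "$after" ] && echo "same change, different commit id"
```

That new id is why amending (or rebasing) something you have already pushed causes trouble for anyone who pulled the old one.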

Okay… So, maybe I’m out of line or off base or insane. It’s entirely possible. Some would say probable. It wouldn’t surprise me if I’ve failed in some very basic way to understand the Git philosophy. It also might be that I’ve misunderstood what GitHub’s written and that the clash between these articles and Git philosophy is all imagined by me. I’m open to these possible realities. Correct me or soliloquize or slam me in the comments, if you like. I am all ears… or eyes. Whatever.

Update: Before anyone gets the wrong idea, here… I’ve been loving using git and GitHub. They’re both spectacular. Without them, I wouldn’t have found enki for use in powering this blog. This post is about trying to understand something confusing in something great; I’m not trying to imply that either should be done away with or that I could do better. I just wanted to head off the most major take-it-the-wrong-way that occurred to me on the way in this morning.

New Blog

Posted on December 13th, 2008 by benhamill | No Comments »

Well, I finally got this thing working. Don’t bookmark it yet, I plan on changing the URL over, but have to deal with registration transfer, etc. and don’t feel like mucking with that yet. Please, if you like, comment on the aesthetics or if you notice anything doesn’t work.

I’ve got the source code for this blog up on GitHub, which you can check out if you like. I’m sure people will say that I’ve got some stuff in source control that I shouldn’t. If you post something like that in the comments, I’ll at least have a look at it and consider. Deploying’s a pain if you leave important stuff out of version control, no?

Tagged git, meta | No Comments »