A Smarter has_many :through?

Posted on August 25th, 2011 | 1 Comment

At work, I’ve been expending a lot of effort on this complicated search functionality where you can enter a search phrase that will full-text search over one model’s fields (we’re using texticle [github], which is awesome) and limit the results by which other models are involved relationally. Sort of like searching Amazon for “green converse” and choosing the “shoes” category.

The object graph behind this is pretty complicated and it’s been a real education in SQL trying to make sure the query that gets generated is both reasonably speedy and right. Several times, I’ve gotten it “working” only to realize I was joining in some table more than once and so either returning some record twice or excluding it when I shouldn’t've or joining all rows against all rows and, thus, making everything pass all constraints. My SQL skill has leveled up several times throughout, though, which has been really awesome. This is mostly because I was hand-writing a lot of the join SQL with table aliases and whatnot.

Example

The other day, I realized that Rails 3 (or, anyway, the 3.1 release candidates, which is what this app is using) will let you do something that earlier versions would not: do a has_many :through relation on another has_many :through. Say you’ve got Departments composed of Employees. Employees work in groups to create Widgets, which, in turn, get Tags. You can do this number:

class Widget < AR::Base
  has_many :tags
  has_many :employees
 
  has_many :departments, :through => :employees
end
 
class Tag < AR::Base
  belongs_to :widget
 
  has_many :employees, :through => :widget
  has_many :departments, :through => :employees
end
 
class Employee < AR::Base
  belongs_to :widget
  belongs_to :department
 
  has_many :tags, :through => :widget
end
 
class Department < AR::Base
  has_many :employees
 
  has_many :widgets, :through => :employees
  has_many :tags, :through => :widgets
end

Which enables stuff like:

Department.joins(:tags).where(:tags => { :id => params[:tag_id] })

The SQL

So, though, since I was hand-writing my JOIN statements before, I’m clearly concerned with what, exactly, it’s going to execute against the database. So I pulled out good ol’ ActiveRecord::Relation#to_sql to see. Here’s what I got (edited to drop all the quoting and add newlines):

SELECT departments.* FROM departments
INNER JOIN employees ON employees.department_id = departments.id
INNER JOIN widgets ON widgets.id = employees.widget_id
INNER JOIN tags ON tags.widget_id = widgets.id
WHERE tags.id = 3

Hopefully, that query is pretty straightforward and you can see how ActiveRecord has decided how to make all those joins. However, something struck me: I’m joining through the widgets table, but both employees and tags already have widget_id on them. I’d rather have seen something like:

SELECT departments.* FROM departments
INNER JOIN employees ON employees.department_id = departments.id
INNER JOIN tags ON tags.widget_id = employees.widget_id
WHERE tags.id = 3

The result set should be the same and it’s slightly faster. In this example, joining through the extra table wouldn’t be a big hit, probably, but if we’ve got more objects all related to Widgets and many are, like Departments, related through some other object, we might be (and in my case often are) joining many more tables, so if we can eliminate middle-man joins, it can have an appreciable effect on the query’s speed.
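To make the "same result set" claim concrete, here is a minimal sketch in plain Ruby (no database, no ActiveRecord) that runs both join shapes over a handful of made-up rows. The data and the two helper methods, via_widgets and skipping_widgets, are illustrative assumptions rather than anything from the app. The two shapes agree as long as every widget_id points at a real row in widgets; the last few lines show the one case where they would diverge.

```ruby
# Toy rows standing in for the four tables. All data here is made up.
departments = [{ id: 1, name: "Tools" }, { id: 2, name: "Toys" }]
widgets     = [{ id: 10 }, { id: 11 }]
employees   = [
  { id: 100, department_id: 1, widget_id: 10 },
  { id: 101, department_id: 2, widget_id: 11 }
]
tags = [{ id: 3, widget_id: 10 }, { id: 4, widget_id: 11 }]

# departments -> employees -> widgets -> tags (the shape ActiveRecord generated).
def via_widgets(departments, employees, widgets, tags, tag_id)
  departments.flat_map do |d|
    employees.select { |e| e[:department_id] == d[:id] }.flat_map do |e|
      widgets.select { |w| w[:id] == e[:widget_id] }.flat_map do |w|
        tags.select { |t| t[:widget_id] == w[:id] && t[:id] == tag_id }.map { d }
      end
    end
  end
end

# departments -> employees -> tags, matching on the shared widget_id.
def skipping_widgets(departments, employees, tags, tag_id)
  departments.flat_map do |d|
    employees.select { |e| e[:department_id] == d[:id] }.flat_map do |e|
      tags.select { |t| t[:widget_id] == e[:widget_id] && t[:id] == tag_id }.map { d }
    end
  end
end

long_way  = via_widgets(departments, employees, widgets, tags, 3)
short_way = skipping_widgets(departments, employees, tags, 3)

# An orphaned widget_id (no matching widgets row) is where the two diverge:
# the longer join drops the row, the shorter one keeps it.
orphan_employees = employees + [{ id: 102, department_id: 2, widget_id: 99 }]
orphan_tags      = tags + [{ id: 5, widget_id: 99 }]
orphan_long  = via_widgets(departments, orphan_employees, widgets, orphan_tags, 5)
orphan_short = skipping_widgets(departments, orphan_employees, orphan_tags, 5)
```

So the shortcut leans on referential integrity: if widget_id can never dangle, dropping the middle table changes nothing but the amount of work.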

How We Do It

So it turns out you can make ActiveRecord generate the above SQL. You don’t want has_many :through for the second association. If you do like this:

class Department < AR::Base
  has_many :employees
 
  has_many :widgets, :through => :employees
  # has_many :tags, :through => :widgets
  has_many :tags, :foreign_key => :widget_id, :primary_key => :widget_id
end

You can use the same ActiveRecord query syntax from above to generate the second SQL example. It’s a lot of typing, though, so I wondered: Wouldn’t it be awesome if ActiveRecord knew when you had this matching-middle-man-foreign-key situation in a query and generated the leaner SQL?

I’m not sure if there are pitfalls to this I’m not seeing (especially related to uses outside what I’m doing with it right now), but I’ve started digging around in the Rails source to see where it’s thinking about these kinds of things (led me to lib/active_record/associations/join_dependency/join_association.rb:72 so far). I’d love some thoughts and feedback on these ideas or guidance in my code-diving efforts. I expect I may end up in the Arel source at some point… we’ll see where it takes me.

Falsiness and Null Objects

Posted on June 1st, 2011 | 2 Comments

Recently, I went to RailsConf. I saw a bunch of talks and met some cool people. This is not a RailsConf post-mortem post (if you’re interested, though, I’ve collected some notes from myself and some others here). This post is about what, in retrospect, was probably the best talk I went to (and I went to several really awesome talks). I’ve been mulling over it since I got back, basically, and that seems like a good result from a talk. That talk was Avdi Grimm’s Confident Code (slides, my notes).

One thing in particular sort of caught at the edge of my thought patterns: The Null Object Pattern. I don’t have a CS degree and so I’m missing a lot of the formal training about design patterns that many programmers have (and, probably, forget), so I’d never heard of it. When Avdi started talking about how ActiveSupport’s try method is a code smell, I was like, “Yes!” I would not say that I hate it, but I have seen several times some line of code that looks like this:

@user.try(:posts).try(:recent).try(:first)

I mean… bleh. But I didn’t know a way that looked any better to me, really. Anyway, you can look at the notes and slides to learn about what Avdi says about the Null Object Pattern. I thought it was awesome and so when I went home, I decided to take it for a test drive.

The Test Course

So in a project at work, we have Users and they may or may not have one Subscription. Hopefully, you can picture this complex object graph. Subscriptions may or may not be “current” based on various business rules mostly to do with whether you paid us or not. So, naturally, we have a Subscription#current? method. But we’re using the User as a sort of presenter for Subscriptions. So you don’t want to call @user.subscription.current?. That’s a code smell. So on User we had this method:

def current?
  subscription.try(:current?)
end

There’s that rascal try. “This,” I thought, “is a perfect spot for that Maybe method from Avdi’s talk.” So I rewrote it thusly:

def current?
  Maybe(subscription).current?
end

W00t, right? Wrong. The accompanying NullObject class looks like this:

class NullObject
  def method_missing(*args, &block)
    self
  end
 
  def nil?
    true
  end
end

That method_missing treatment, so handy in avoiding the chain of trys in my first example, is the gotcha. It means that if I have a User without a Subscription for whatever reason, calling User#current? returns an instance of NullObject, which is truthy and so will pass, say, the boolean clause of an if statement.
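The gotcha is easy to reproduce outside of Rails. Below is a minimal sketch: NullObject is the class shown above, while the Maybe helper is an assumption about how the talk's version works (wrap nil in a NullObject, pass anything else through), and the User struct is a hypothetical stand-in for the real model.

```ruby
# NullObject as in the post.
class NullObject
  def method_missing(*args, &block)
    self
  end

  def nil?
    true
  end
end

# Assumed shape of the Maybe helper: nil becomes a NullObject,
# everything else is returned untouched.
def Maybe(value)
  value.nil? ? NullObject.new : value
end

# Hypothetical stand-in for a User, not the app's ActiveRecord model.
User = Struct.new(:subscription) do
  def current?
    Maybe(subscription).current?
  end
end

user   = User.new(nil)                        # a User with no Subscription
answer = user.current?                        # a NullObject, not false
status = answer ? "current" : "lapsed"        # ends up "current"
```

Only nil and false are falsey in Ruby, so the NullObject that comes back sends the subscription-less user down the "current" branch.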

So, not sure as to whether I’d misunderstood something, was making some dumb mistake or what, I emailed Avdi. He said, basically, “Awesome question. I will answer it in a blog post.” And, lo, he did. Go read that post to see what he said. The comments also have some good ideas.

Noodles

If you read my comment, I said I was going to noodle on stuff and post again. I had more thoughts than it seemed like would fit in a blog comment. Hence this post. So my initial thought was disappointment. It turned out the Null Object Pattern wasn’t as powerful (in Ruby) as I’d hoped, since if you might have something (hence calling Maybe), the chances that you’ll have some conditional asking a boolean business-rule question about it are not low.

So I thought about how to get around that. You could, for instance, make a more complex method_missing definition that grepped the message name for /\?$/ and returned false. That’s fail, though. It falls down the moment you have something like this:

if Maybe(@posts).empty?
  # Intuitively, you'd expect NullObject#empty? to have put you in here.
end
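For the record, here is roughly what that predicate-matching method_missing could look like. This is a sketch under my own naming (PredicateNullObject is hypothetical, not something from Avdi's talk), and it exists mainly to show where the idea breaks.

```ruby
# A null object that answers false to any message ending in "?"
# and keeps returning itself for everything else.
class PredicateNullObject
  def method_missing(name, *args, &block)
    name.to_s =~ /\?$/ ? false : self
  end

  def nil?
    true
  end
end

null = PredicateNullObject.new

is_current = null.current?      # false, which is what User#current? wanted
is_empty   = null.empty?        # also false, so "no posts" reads as "has posts"
chained    = null.posts.recent  # non-predicate calls still chain to the same object
```

Whether false is the right answer depends on the question being asked, and a generic null object has no way of knowing that.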

But then I realized that Avdi was making a higher-level point: since it is not possible to make your own objects look falsey in Ruby, you have to have another solution. Trying to define various question-mark methods on NullObject is trying to untie the knot, but I should be looking for a way to cut it. So it got me thinking: Why the hell do I have Users without Subscriptions, anyway? Shouldn’t User#current? express that business logic clearly, rather than just express the logic that enforces it? Yes. Yes, it should.

We have some Users who are also admins, who have special rights. It’s also conceivable that we could give away a free account for whatever reason. So, really, we want something like this:

def current?
  self.free_account? || subscription.current?
end

But this thought is incomplete. It expresses the business logic cleanly: The User is current if they’re flagged as free or if their Subscription is up to date. However, if the weird case of a User who is neither free nor has an associated Subscription crops up, we still have to hunt down the “Undefined method ‘current?’ for nil” error. It sort of has me reaching for Maybe again.

Or maybe (heh, you see what I did there) I want to steal another trick from Avdi’s presentation and have User#subscription return :no_subscription_defined_for_user so that the error message makes some more sense. I don’t like redefining ActiveRecord’s default accessor methods, though, to transparently return the symbol if the real object is missing.

If you’ve got any thoughts, I’d love to hear them.

Release: twitter_atm 1.0

Posted on January 14th, 2011 | No Comments

Holy crap. Did I really start this blog in December 2008? That would make it more than 2 years old. Wow. Well, good on me, I guess. That feels sort of absurd. Anyway…

An Itch…

The other day at work, I had to get the OAuth credentials for a Twitter account that our application would use to send programmatic tweets from. For those of you not familiar with OAuth, a brief description: The usual way you OAuth with Twitter is that you have a web page where a user clicks something indicating they’d like to OAuth you to their account. You then send your consumer key and secret off to Twitter to get a request token and, using that, you send the user off to a url over on twitter.com.

Once there, they sign in (or are already signed in) and click “Allow”. Twitter then hits your callback url with some more tokens, which you use to make a final request and then they respond with the access token and secret you’ll need to do whatever it is you’re doing with the user’s account. If the user changes username or password, you’re still authorized and if they want to revoke your access, they can without changing their username or password. Great! (If you’re thinking, “What?!? Not great! That made no sense!” then maybe the image in this article will help).

However, when your program is a desktop client or when you’ve only got one account you’ll ever be tweeting from (or maybe a small handful), it’s not really practical to build a web interface and a callback url to hit so that you can do the whole dance and get the tokens. So Twitter has an alternate path that replaces the redirects in the middle of the dance. Instead of redirecting users over to twitter.com, you show them the URL and they go there manually. When they click “Allow”, they’re given a PIN, which they then give back to you and you can then finish off the dance as if the PIN were the callback.

…Scratched

So at work, I hacked around in the console for a bit and eventually figured out how to work the PIN-based method, ran it for the account I wanted and then got the access token and secret for our account. But, I thought, I shouldn’t have to hunt all around to figure out how it works (the documentation is almost all focused on the callback path). Heck, if I know my consumer key and secret and I own the account, I shouldn’t need to know how it works at all. So, having figured out how it works already, I decided I’d write a little command to do it for me and publish it as a gem. Thus was born twitter_atm.

The Tool

As the README states, it’s pretty simple. You invoke twitter_atm get_creds with your consumer key and secret as arguments, then it interactively gives you directions on how to finish out the process and spits out the access token and secret at the end. I do want to note, about the name, it’s not about cash. It’s about inputting a PIN and getting something in return. It’s not very exciting to use, so I won’t talk about it much. I’d rather move on to…

An Old God of Asgard

This was the first time I’d written a program with a command line interface, so I asked around a little about gems that were good at that and Jonathan Otto (of Dealzon) pointed me at thor. In short, thor seems awesome. It’s got a nice DSL for describing the various subcommands of your application and it looks deep enough to handle something more complex than my purposes with twitter_atm.

As a brief example, consider git pull --rebase origin master. If you were writing something that would support this syntax in thor, it would look something like this (I made up the git commands inside off the top of my head, so it’s a bit naive):

class Git < Thor
  desc "pull REMOTE BRANCH", "Fetches and merges stuff into the current branch."
  method_options :rebase => :boolean
  def pull(remote, branch)
    `git fetch #{remote}`
 
    if options[:rebase]
      `git rebase #{remote}/#{branch}`
    else
      `git merge #{remote}/#{branch}`
    end
  end
end

You can also declare different types of options, default values, etc. Have a look at the fairly extensive readme and look at how I used it in twitter_atm if that helps. I quite recommend the gem and already have another project I’d like to use it on.

bundle gem twitter_atm

There’s this little project–I don’t know if you’ve heard of it–called bundler. It was started by a couple of up-and-coming young programmers who really might go somewhere some day. Bundler is great for managing gems in a big project and it does this really impressive dependency resolution thing. But there’s a lesser known command that I’ve fallen in love with: bundle gem <gem_name>. It just makes a skeleton for a gem project for you. Unlike Jeweler, it only gives you the bare minimum and really just gets out of your way. You manage your own version number and write your own gemspec (gasp!).

It has three handy rake tasks with obvious functions: rake build, rake install and rake release. Each of those, for the most part, just issues various gem commands. I basically like it because it builds you a little foundation and then doesn’t really manage anything else for you. One thing that’s important to note: The current version of bundler doesn’t add Gemfile.lock to the .gitignore that it generates (but future versions will), and it is important that you do so. Yehuda has a blog post explaining why.

So, based on this experience, here’s what I took away: Thor is good to use for making a CLI, bundle gem is good to use for making a gem and sometimes you can make something small and cool for yourself in one sitting which gives you a good feeling and is a well invested 5 hours.

My First Ruby Gem

Posted on May 7th, 2010 | 1 Comment

I just released my first ruby gem, twitter_alert. It’s intended to be a component of the side project I mentioned forever ago, HeyGoVote. It grabs all the followers for a Twitter account and DMs them a message. The messages have dates on them because they’re intended to be scheduled ahead of time. I’ll crib from the readme near the end, but if you want, you can go read the whole thing for yourself. First, I’m going to talk a bit about my experience writing my first gem.

Jeweler

I used the jeweler gem to create a gem template for this guy. It’s very simple and was a huge boon considering this was my first time. It scaffolds out the directory structure for you, sets up some handy rake tasks and generally gives you guidance on how to structure your gem. The readme for jeweler is very helpful. I won’t bother to restate what it says, but seriously, new to gems or not, you should have a look at jeweler if you haven’t already. One of the nicest things is that it’ll generate your gemspec file for you and also handle version bumping in a semantic versioning compatible way. It also integrates with git and GitHub in some interesting ways.

Gemcutter

Jeweler will also handle publishing your gem to “gemcutter” for you, which is nice. I say that in quotes because, of course, gemcutter.org is no longer a thing and their stuff all got officially adopted by rubygems.org, so it actually publishes to there. The jeweler documentation, though, all acts as if gemcutter were still a separate thing. This could be potentially confusing, I guess, but it ends up working right, so it’s fine. It does use the gemcutter gem to manage the publishing, so you’ll have to have that installed, and make sure you have a rubygems.org account and api key set up. The api key goes in ~/.gem/credentials as you’ll learn on your rubygems profile.

twitter_alert

I’ll crib from my readme example to show the most basic setup:

require 'twitter_alert'
 
account = TwitterAlert::Account.new :user_name => 'benhamill', :password => 'thisisnotmyrealpassword'
 
class Alert
  include TwitterAlert::Alert
end
 
alert = Alert.new 'Very important message.', DateTime.now
 
account.announce alert

In the wild, I don’t imagine that your Alert class will be so simple. For instance, when I plug this into HeyGoVote, it’ll be included in an ActiveRecord model so I can run a cron job that pulls out tweets that should go out today (based on the date) and sends them all.

I haven’t plugged this into code yet, and note the version number 0.1. The tests pass, but they may not be comprehensive and I might have botched something up in my publishing, but it looks like everything’s working to me. My next step is to start building up HeyGoVote and using twitter_alert in it, which might reveal some needed features. In the meantime, I welcome feedback. Leave a comment here, or fork it and issue me a pull request, if you have an idea.

My Twitter Project: atreply

Posted on February 10th, 2009 | No Comments

I use Twitterfox to read and create tweets most of the time. I follow enough people that, when I open my browser for the first time for the day, more than 20 tweets have accumulated and, really, I don’t want to go back and read all 60-odd or whatever that have accumulated overnight. Twenty, I should note, is just what Twitterfox picks up when it first turns on.

Occasionally, I’ll come in and see the last few tweets in a conversation between two people I’m following (I only see @replies by others who are to people I’m also following). If it seems interesting enough, I’ll go back and page through to see what they were talking about, reading in reverse order. Sort of like reading a chat log written by the guys that made Memento. It’s not horrible, but neither is it ideal.

So I had an idea about it and I’ve started work. Twitter tracks what tweet (technically called a “Twitter status”, apparently) any given tweet was a reply to. And, I figured, it would be relatively simple to, given a Twitter status ID, recursively follow the reply chain back and get the whole conversation. Turns out, I was right.

A proof of concept:

require 'rubygems'
require 'twitter'
 
class Reply
 attr_accessor :text, :author, :in_reply_to, :time, :atreply
 
 def initialize status_id
   status = Twitter::Client.new.status :get, status_id
 
   self.text = status.text
   self.author = if status.user.name then status.user.name else status.user.screen_name end
   self.time = status.created_at
   self.in_reply_to = status.in_reply_to_status_id
   self.atreply = Reply.new self.in_reply_to unless self.in_reply_to.nil?
 end
 
 def each_reply &block
   reply_chain.each do |reply|
     yield reply
   end
 end
 
 def to_s
   self.author + ' - ' + self.time.to_s + "\n" + self.text
 end
 
 protected
 
 def reply_chain
   return [self] unless self.atreply
 
   self.atreply.reply_chain << self
 end
end

This has a dependency on Joshuamiller’s version of twitter4r. My medium-term plan is to make a one-trick-website that will take an ID or twitter URL and give you the replies all pretty-like. Maybe make a bookmarklet for convenience’s sake. I plan on using Rails, even though that’s overkill because I figure it’ll be a good learning experience on that front. Find it on Github.
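The chain-following idea can also be tried without the network or the twitter4r gem. The sketch below is an offline illustration only: the STATUSES hash is made-up data, the :in_reply_to key plays the role of in_reply_to_status_id, and the conversation method is a hypothetical helper, not part of the proof of concept. It walks the chain with a loop instead of recursion, which sidesteps deep call stacks on long conversations.

```ruby
# Made-up statuses keyed by ID; :in_reply_to stands in for
# in_reply_to_status_id from the Twitter API.
STATUSES = {
  1 => { author: "alice", text: "Anyone tried Rails yet?", in_reply_to: nil },
  2 => { author: "bob",   text: "@alice Yes, it's fun.",   in_reply_to: 1 },
  3 => { author: "alice", text: "@bob Good to know!",      in_reply_to: 2 }
}

# Follow in_reply_to links back to the first tweet, then return the
# conversation oldest-first.
def conversation(status_id, statuses = STATUSES)
  chain = []
  while status_id
    status = statuses.fetch(status_id)  # raises KeyError on an unknown ID
    chain.unshift(status)               # older tweets go to the front
    status_id = status[:in_reply_to]
  end
  chain
end

thread = conversation(3).map { |s| "#{s[:author]}: #{s[:text]}" }
puts thread
```

Starting from the newest tweet (ID 3), this prints the three tweets in the order they were written, which is the reading order the Memento-style paging gets backwards.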