Vanity Profile URLs in Rails

One feature shared by many social networking sites is "vanity" short profile URLs. My Twitter page could easily have been the RESTfully predictable http://twitter.com/users/riscfuture, but thanks to short profile URLs it is http://twitter.com/riscfuture.

Even Facebook got in the game recently with their "Facebook Usernames" feature. Of course, in classic Facebook style, getting the vanity URL is a multi-step process with an application and the associated land-grab. At Scribd I kept it a little simpler, and I'm assuming you'd like to keep it simple for your Rails website as well.

In order for this system to work, we're going to have to lay down a few ground rules:

  • No user whose username conflicts with a controller name can have a short URL. You can't sign up on Scribd with the username "documents" and prevent anyone from seeing their document list.
  • No user whose username conflicts with another defined route can have a short URL. Remember that the routes file defines named or custom routes and resources, but with the default routes, normal controllers do not need an entry in that file.
  • Users with reserved characters in their names must have these characters escaped or dealt with. If I sign up with the username "foo/bar", that slash can't be left unescaped, or the router will misunderstand the address.
  • Usernames must be case-insensitively unique. Users expect scribd.com/foo to be the same as scribd.com/FOO.
  • Any user who cannot be given a short URL for the above reasons must have a fallback URL. This is where you fall back to your less pretty /users/123 URL. (Or perhaps /users/123-foo-bar for SEO purposes.)
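
For the SEO-friendly fallback in the last rule, a common Rails idiom is to override to_param so that the numeric id leads the slug. The sketch below is plain Ruby so that it runs on its own; FallbackUser is a hypothetical stand-in for a model, not a class from Scribd's code. It relies on String#to_i ignoring everything after the leading digits, which is why a finder can still recover the id from "123-foo-bar".

```ruby
# Hypothetical stand-in for a User model; only id and login matter here.
FallbackUser = Struct.new(:id, :login) do
  # Build an "id-slug" path segment such as "123-foo-bar".
  def to_param
    slug = login.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
    slug.empty? ? id.to_s : "#{id}-#{slug}"
  end
end

user = FallbackUser.new(123, 'Foo/Bar')
segment = user.to_param       # "123-foo-bar"
recovered_id = segment.to_i   # 123, because to_i stops at the first non-digit
```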

Note that it's not enough to simply build a list of your controllers and stick them in a validates_exclusion_of validation. You want to be able to claim new routes for yourself even if users have already signed up with conflicting logins, and gracefully revert those users to a fallback profile URL.

Ultimately the question we need to answer is this: Given a user name, will a vanity URL conflict with an existing route? There are a lot of really hard ways of going about this, many of which will break over time. I opted to go with a reliable (if somewhat slow) way of doing this: I build a list of known routes, strip them down to their first path component, then build an array of these reserved names. A known route might be, for instance, /documents/:id; its first path component is "documents." Thus, a user whose login is "documents" cannot have a vanity URL.

There are some points to note for this system:

  • You'll get a few false positives. If /documents/:id is a valid route, but /documents is not (say you had no index action), this system would still disallow a user named "documents". You can easily solve this by tweaking the code below, though.
  • No attention is paid to HTTP methods. Theoretically, if you had a route like /upload whose only acceptable method is POST, you could still use GET /upload to refer to a user named "upload". I have intentionally avoided doing this, however; good web design dictates that varying the HTTP method of a request only varies the manner in which you interact with the resource represented by the URL; a single URL should represent the same resource regardless of which method is used in the request.
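
As a rough illustration of the tweak mentioned in the first point, the sketch below reserves a name only when some route consists of that single path component, so /documents/:id on its own would not reserve "documents". It works on hand-written path strings rather than real route objects (an assumption, so that it runs without Rails), and the method name is hypothetical. It also ignores optional segments and the default :controller/:action/:id routes, which a real version would have to account for.

```ruby
# Sample route paths, written by hand for illustration.
SAMPLE_PATHS = ['/documents/:id', '/ratings', '/ratings.:format', '/upload/', '/:login']

# Return the names that are reachable as a one-component URL like "/ratings".
def single_component_names(paths)
  names = paths.map do |path|
    # Drop a trailing format placeholder, then the leading and trailing slashes.
    path.sub(/\.:format\z/, '').sub(/\A\//, '').sub(/\/\z/, '')
  end
  # Discard the root, multi-component paths, and parameters or wildcards.
  names.reject { |n| n.empty? || n.include?('/') || n.start_with?(':', '*') }.uniq
end

single_component_names(SAMPLE_PATHS)  # ["ratings", "upload"]
```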

In order to eke speed out wherever we can, we generate the list of reserved routes once, at launch, and cache it for the lifetime of the process. We do this in a module in lib/:

 
module FancyUrls
  def self.generate_cached_routes
    # Find all routes we have, take the first part (/xxx/) and remove some unwanted ones
    @cached_routes = ActionController::Routing::Routes.routes.map do |route|
      segs = route.segments.inject("") { |str, s| str << s.to_s }
      segs.sub! /^\/(.*?)\/.*$/, '\\1'
 
      # Some routes accept a :format parameter (ratings.:format).
      segs.sub! /\.:format$/, ''
      segs
    end
 
    # All possible controllers for /:controller/:action/:id route
    @cached_routes += ActionController::Routing.possible_controllers.map do |c|
      # Use only the first path component for controllers with multiple path components
      c.sub /^(.*?)\/.*$/, '\\1'
    end
    @cached_routes.uniq!
    # Remove routes whose first path component is a variable or wildcard
    @cached_routes.reject! { |route| route.starts_with?(':') or route.starts_with?('*') }
    # Remove the root route.
    @cached_routes.delete '/'
  end
 
  def self.cached_routes
    @cached_routes
  end
end

The top method combines two arrays: the first, a list of routes from the defined routes, and the second, a list of the app's controllers. It then filters out some non-applicable routes and stores the list in an instance variable. The list consists of only the first path component of a route.
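
To see what the list ends up containing, here is the same pipeline applied to hand-written strings in plain Ruby. The sample paths and controller names are made up, and the sketch assumes each path string ends with a slash, which is what the first regular expression expects. The first_components name is hypothetical.

```ruby
# Hand-written stand-ins for the strings built from route.segments.
route_paths = ['/documents/:id/', '/ratings.:format/', '/users/:id/', '/:login/', '/*path/', '/']
# Hand-written stand-ins for the controller list.
controllers = ['documents', 'admin/reports']

def first_components(route_paths, controllers)
  # Keep the first path component and drop any trailing format placeholder.
  reserved = route_paths.map do |path|
    path.sub(/^\/(.*?)\/.*$/, '\\1').sub(/\.:format$/, '')
  end
  # Namespaced controllers contribute only their first component.
  reserved += controllers.map { |c| c.sub(/^(.*?)\/.*$/, '\\1') }
  reserved.uniq!
  # Parameters and wildcards do not reserve a literal name.
  reserved.reject! { |name| name.start_with?(':', '*') }
  reserved.delete('/')
  reserved
end

first_components(route_paths, controllers)
# ["documents", "ratings", "users", "admin"]
```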

The method is called generate_cached_routes because it's called when the server process starts, as part of the environment.rb file. The cached results are accessed with the cached_routes method.

So given this method, how do we test if a user is eligible for URL "vanitization?" It's simple:

 
module FancyUrls
  def user_name_valid_for_short_url?(login)
    not FancyUrls.cached_routes.include?(login)
  end
end

The method is simple: If the user's name is in our list of reserved routes, then it's not valid for URL shortening. Easy peasy.
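
One caveat: Array#include? compares strings exactly, so a login of "Documents" would slip past a reserved "documents". Given the case-insensitivity rule above, a lowercase comparison may be worth adding. A minimal sketch in plain Ruby follows; the RESERVED_NAMES list and the method name are hypothetical stand-ins for FancyUrls.cached_routes and the method above.

```ruby
# Hypothetical reserved list; in the article this comes from FancyUrls.cached_routes.
RESERVED_NAMES = ['documents', 'ratings', 'users']

# Compare in lowercase so that "Documents" is treated like "documents".
def login_free_of_reserved_names?(login, reserved = RESERVED_NAMES)
  !reserved.map { |name| name.downcase }.include?(login.downcase)
end

login_free_of_reserved_names?('Documents')   # false
login_free_of_reserved_names?('riscfuture')  # true
```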

So now we can reasonably quickly determine whether or not a user gets a vanity profile URL. The next step is to write a user_profile_url method that, given a user, returns either the vanity or full profile URL, as appropriate. To do this, first we will need to add our vanity URLs to the bottom of our routes.rb file:

 
# Install the non-vanity user profile route above the vanity route so people
# who don't have shortenable logins can still have a URL to their profile page.
map.long_profile 'users/:id', :controller => 'users', :action => 'show', :conditions => { :method => :get }
# Install the vanity user profile route above the default routes but below all
# resources.
map.short_profile ':login', :controller => 'users', :action => 'show', :conditions => { :method => :get }
 
# Install the default routes as the lowest priority.
map.connect ':controller/:action/:id'
map.connect ':controller/:action/:id.:format'

What's going on here? Well, at the very bottom of the routes.rb file, we are installing the old Rails standby, the :controller/:action routes. Newer Rails ideology is often to leave these routes out, so adjust your routes file as appropriate. Above those routes, but otherwise of the lowest priority, is our vanity route. Anywhere above that route is our traditional profile URL. (If you have a RESTful users controller, you could of course replace the top route with a resources call.)

At first glance there's a chicken-and-egg problem: We're checking if a user is "vanitizable" using the routes file, but now the routes file contains the vanity URL route. We solved this problem earlier in the generate_cached_routes method:

 
# Remove routes whose first path component is a variable or wildcard
@cached_routes.reject! { |route| route.starts_with?(':') or route.starts_with?('*') }

This line of code filters out any routes that start with a parameter or wildcard, among them the short_profile named route.

With the routes squared away, we move on to the problem of users with logins containing reserved characters. RFC 1738 defines what characters must be encoded in a URL:

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.

Characters aside from these in usernames must either be encoded or otherwise dealt with. Beyond RFC 1738, we should additionally consider the dollar sign and plus characters ("$" and "+") reserved because they often serve special roles in URLs as well. And because this is a Rails app, we should consider the period (".") reserved as well, as it is used by Rails to indicate the format parameter.

So if a user has any reserved character in his login, what do we do? The obvious solution is to percent-encode it, creating a string like "foo%2Fbar", but some might find that ugly. You could also replace these characters with dashes (or some other stand-in character), creating "foo-bar", but then you run into trouble if someone actually signs up with the username "foo-bar". If you're making a new website, you may opt to disallow these characters from usernames. At Scribd we use a combination of approaches: Some reserved characters (like spaces) are simply not allowed in usernames; others are allowed but by using one of these characters you "give up" your vanity URL, instead using the fallback profile URL.
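
The trade-off between the first two options can be seen in a few lines of plain Ruby. This sketch uses ERB::Util.url_encode from the standard library for the percent-encoding; the dash substitution pattern is only an illustrative choice, not what Scribd uses.

```ruby
require 'erb'

login = 'foo/bar'

# Percent-encoding keeps the login reversible, at the cost of an uglier URL.
encoded = ERB::Util.url_encode(login)        # "foo%2Fbar"

# Replacing reserved characters with dashes is prettier but lossy.
dashed = login.gsub(/[^A-Za-z0-9_-]/, '-')   # "foo-bar"

# The dashed form is indistinguishable from a user who really is named "foo-bar".
collides = (dashed == 'foo-bar')             # true
```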

If you choose to allow certain reserved characters in your usernames, but deny those users vanity URLs, you will have to modify the user_name_valid_for_short_url? method like so:

 
def user_name_valid_for_short_url?(login)
  not (login.include?('.') or FancyUrls.cached_routes.include?(login))
end

This example allows users to have periods in their login, but disallows those users their vanity URLs.
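
If several characters should cost a user the vanity URL, a whitelist pattern may be easier to maintain than a chain of include? calls. The following is a sketch in plain Ruby under that assumption; the constant names, the method name and the reserved list are hypothetical, and the exact character set is a judgment call for your site.

```ruby
# Hypothetical reserved list standing in for FancyUrls.cached_routes.
RESERVED_ROUTE_NAMES = ['documents', 'ratings', 'users']

# Logins made only of these characters need no escaping in a path segment.
# Period, dollar sign and plus are deliberately left out, as discussed earlier.
VANITY_SAFE_LOGIN = /\A[A-Za-z0-9_-]+\z/

def vanity_eligible?(login, reserved = RESERVED_ROUTE_NAMES)
  !!(login =~ VANITY_SAFE_LOGIN) && !reserved.include?(login)
end

vanity_eligible?('riscfuture')  # true
vanity_eligible?('foo.bar')     # false, contains a period
vanity_eligible?('ratings')     # false, conflicts with a route
```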

With our vanity routes defined, we can implement the user_profile_url method:

 
module FancyUrls
  def user_profile_url(person, options={})
    login = login_for_user(person)
    raise ArgumentError, "No such user #{person.inspect}" unless login
 
    if user_name_valid_for_short_url?(login) then
      # The short_profile route is ':login', so the login goes in :login, not :id.
      short_profile_url options.merge(:login => login)
    else
      long_profile_url options.merge(:id => person)
    end
  end
 
  private
 
  def login_for_user(user_or_id)
    return (if user_or_id.is_a?(User) then
      user_or_id.login
    else
      Rails.cache.fetch("login:#{user_or_id}") { User.find_by_id(user_or_id, :select => 'login').try(:login) }
    end)
  end
end

The method is simple enough: We check if the user can have a vanity URL, and if so, we return it; otherwise we return the standard profile URL. I included two small optimizations: We cache the login to avoid database lookups with each method call, and we only select the fields we care about from our users table.

And with that, we've got our URLs! Simply include your module as a helper and call user_profile_url to generate profile URLs as opposed to url_for or the named resource routes or whatever else you might have been using.

We're not quite done yet, though. What happens when a user who haplessly registered the username "ratings" gets screwed because we just launched our ratings feature? With the system I've shown above, the moment we deploy our new feature, any links to that user's profile page would automatically revert to the normal profile URLs.

Good web practice teaches us that when we change the URL for a resource, we should respond with a 301 to any client that tries to access the old URL. Obviously, since the /ratings URL now points to a different web page, we can't do that. Any users who visit external web pages and click a link to that user's profile URL will find themselves on your brand new ratings page. I have implemented no particular fix for this problem, as I believe most websites add very, very few controllers and named routes in comparison to the number of users they have. In other words, the problem is small enough that it's probably not worth solving.

We can solve the flip side of this problem, though: Once a website launches its vanity URL feature, there will still be bunches of external links to the old, longer profile URLs. We can respond to these requests with 301s to inform people that those links are now outdated. This also helps with SEO, getting people's new profile URLs into the Google index and the old ones out of it.

We do this by including code in the profile page's controller action to redirect if necessary:

 
class UsersController < ApplicationController
  def show
    if params[:id] then
      # Reached through long_profile; redirect if a vanity URL is available.
      @user = User.find(params[:id])
      return head(:moved_permanently, :location => user_profile_url(@user)) if user_name_valid_for_short_url?(@user.login)
    elsif params[:login] then
      # Reached through short_profile.
      @user = User.with_login(params[:login]).first || raise(ActiveRecord::RecordNotFound)
    else
      raise ActiveRecord::RecordNotFound
    end
  end
end

We have this if statement at the start of our show method because the method is doing double duty: It responds to both the long_profile and short_profile named routes. In the former, the variable portion of the URL is stored in the id parameter; in the latter, the login parameter. You could of course opt to dispatch the two URLs to two separate actions; either way, make sure you respond to unnecessarily long profile URLs with a 301.

And with that, you've got your vanity URLs. All it comes down to is a little bit of route-foo and some speed optimizations here and there. The solution here is tailored to the needs of Scribd; I've done my best to outline those needs and how they impacted our code. You should think about how you want to do vanity URLs on your website and take this code as a guide to implementing your own solution. Vanity URLs take a little extra time to implement, but in return you are rewarded with users who are more willing to share their profile pages, improved SEO, and that glowy feeling you get when you increase your site's Web 2.0-ishness.

9 responses to “Vanity Profile URLs in Rails”

  1. Nathan Phelps

    I wanted to point out the very well done Friendly ID Gem that does a lot of this for you. You’ll still have to manage your routes a bit to get the user profile to application root but you can use FriendlyId’s exclusion capability to exclude some of the standard URLs in your application.

