We have been running an internal 'production' deployment of LearnHub.com for over a month. With the private beta starting in a couple of weeks, we need to scrub the database of bad objects that accumulated during development and testing.
The most common source of this db cruft is destroying objects when the ActiveRecord associations aren't set up to do cascading deletes (i.e. :dependent => :destroy). We are now better at writing tests for these (it's easy to overlook), but that doesn't fix the current state of our database.
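To make the failure mode concrete, here is a plain-Ruby sketch (no Rails required) of how orphans appear when a parent is deleted without cascading. The Course and Lesson structs, the in-memory "tables", and the helper methods are all hypothetical stand-ins for illustration, not LearnHub code; in a real app the cascading behaviour would come from the association option mentioned above.

```ruby
# Hypothetical stand-ins for two associated models; not real LearnHub code.
Course = Struct.new(:id)
Lesson = Struct.new(:id, :course_id)

# Delete a course WITHOUT cascading: its lessons are left behind.
def delete_course(courses, course_id)
  courses.delete(course_id)
end

# Delete a course and its lessons together, roughly what
# :dependent => :destroy arranges for you.
def destroy_course(courses, lessons, course_id)
  courses.delete(course_id)
  lessons.reject! { |lesson| lesson.course_id == course_id }
end

# Lessons whose parent course no longer exists.
def orphaned_lessons(courses, lessons)
  lessons.reject { |lesson| courses.key?(lesson.course_id) }
end

courses = { 1 => Course.new(1), 2 => Course.new(2) }
lessons = [Lesson.new(10, 1), Lesson.new(11, 1), Lesson.new(12, 2)]

delete_course(courses, 1)
orphans = orphaned_lessons(courses, lessons)
puts "#{orphans.size} orphaned lessons: #{orphans.map(&:id).join(', ')}"
```

If a Lesson validates that its course exists, each of those orphans is exactly the kind of invalid object we want to find.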
So here's a little rake task that we whipped up to help our scrubbing:
require "db_tasks"

namespace :models do
  desc "Report any invalid ActiveRecord objects in the database."
  task :find_invalid => :environment do
    # Iterate over all top-level constants and keep just the ActiveRecord models
    Object.constants.each do |c|
      # const_get accepts both strings (Ruby 1.8) and symbols (Ruby 1.9+)
      klass = Object.const_get(c)
      if klass.is_a?(Class) && klass < ActiveRecord::Base
        # Load every record, keep the ones that fail validation, and collect their IDs
        invalid_object_ids = klass.find(:all).reject { |o| o.valid? }.collect { |o| o.id }
        puts "#{invalid_object_ids.size} invalid #{klass.to_s} objects: #{invalid_object_ids.join(', ')}" unless invalid_object_ids.empty?
      end
    end
  end
end
It runs through all your ActiveRecord models and, for each one, prints the IDs of any invalid objects.
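If you want to try the filter-and-report step without a Rails app, here is a minimal sketch in plain Ruby. The Post struct, its valid? rule, and the invalid_report helper are assumptions made up for this illustration; only the reject/collect chain and the message format mirror the task.

```ruby
# Hypothetical stand-in for a model: valid only when it has a non-blank title.
Post = Struct.new(:id, :title) do
  def valid?
    !title.nil? && !title.strip.empty?
  end
end

# Same filter-and-report step as the rake task, for one "model".
# Returns the report line, or nil when every record is valid.
def invalid_report(name, records)
  invalid_ids = records.reject { |r| r.valid? }.collect { |r| r.id }
  return nil if invalid_ids.empty?
  "#{invalid_ids.size} invalid #{name} objects: #{invalid_ids.join(', ')}"
end

posts = [Post.new(1, "Intro"), Post.new(2, nil), Post.new(3, "  ")]
puts invalid_report("Post", posts)
```

Returning the line instead of printing it directly makes the logic easy to test, and easy to redirect somewhere other than the console later.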
We even found some invalid objects in our fixtures! And we run a pretty tight ship. I bet you'll find some in your project.
We will likely integrate this into a weekly routine to monitor the health of our production database. We could even run it post-deployment with Capistrano, and perhaps email the results.

