Wednesday, November 14

Partitioning, Shards, or Federating with ActiveRecord?

Dear Lazyweb,

My Google-fu has proven to be weak sauce and I've not been able to find a nice clean solution to the following dilemma.

From Wikipedia's database partitioning definition...

"Horizontal partitioning involves putting different rows into different tables."

"Range partitioning selects a partition by determining if the partitioning key is inside a certain range."

It seems that acts_as_partitioned might be able to do this for a single table across multiple databases, but I need to partition data across multiple tables in a single database, and it gets worse...

For my particular needs, I want to get even a little bit crazier than the standard partitioning strategies. I want new records to be inserted into tables specific to the date on which they are created. For example, if I create a new Address record on November 12th, 2007, I want that record inserted into a table named ADDRESSES_20071112. OK, that's admittedly a bit screwy, but there's a big gotcha there: that table might not exist, which means I need the ActiveRecord to create tables on an as-needed basis!

Am I completely nuts here? Wait, don't answer that, but do answer me this: can it be done? Based on the amazing things I've seen done with Ruby thus far, I'm pretty confident this is possible, but my Ruby-fu hasn't yet reached black belt status. I'll try building it myself if I must, but I'm hoping to avoid reinventing any wheels.

4 comments:

Blake said...

Creating the tables isn't a problem, just grant your user the permissions and then use the ActiveRecord helpers or raw SQL to mint the new tables.

The more interesting problem is going to be in dealing with the model classes themselves. Off the top of my head, you have two options. 1) Use a naming convention and reflect on the tables and create new model classes on the fly at run-time. or 2) Check for the existence of a real model source file and create it if it doesn't exist.

With option 1, you can use a module to define the behavior or inheritance. Then just call reset_column_information on the new class to cause ActiveRecord to reflect on the table and give you the column accessors. The down-side here is that all such tables _must_ behave exactly the same, which may be fine if you are doing data warehousing.

And if you are doing data warehousing, you might want to look at http://activewarehouse.rubyforge.org/ and leverage their work.

Option 2 just seems insanely fragile to me and although workable, I can't see any real use case.

Cheers,
Blake

Anonymous said...

Yeah. You can just leverage const_missing? to create the AR model on the fly. I've seen some code that does that. It sucks, but it might be what you need.

Allan said...

Yes, it can be done, though it's a pain at times.

I have database tables named like:

VAL_MODEL_DM19_REGRESSIONS
VAL_MODEL_DM19_RESULTS
...
VAL_MODEL_DM20_REGRESSIONS
VAL_MODEL_DM20_RESULTS
...

and DM21, DM22, etc.

I just use a base class which has a method that changes the table name based on the prefix (passed into the method) and the uppercased, tableized name of the model.

I did toy with the approach of creating a model for each table but I couldn't find a way to create the associations:

regression has_many results

needs to be:

VALMODELDM19REGRESSIONS has_many VALMODELDM19RESULTS

and even though I fiddled with the associations I couldn't find a way to override them to what they needed to be.

Good luck.

Damon said...

Hi, did you ever come up with a good solution to this? I'm trying to do something similar, but I've ended up just banging my head against a wall.

Thanks!