Project Hydraulics

Workflow management for digital libraries

Serialized Attributes != NULL

To determine which Orders are ready for delivery, it is useful to have a scope that tests two attributes. If an email exists (i.e. email is not null) and the customer has not been notified (i.e. date_customer_notified is null), then it can reasonably be inferred that the order is complete and awaiting delivery.

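class Order < ActiveRecord::Base
  # Store the sent email as a serialized Ruby object (YAML in the database).
  serialize :email
  # Orders with an email on record whose customer has not yet been notified.
  scope :ready_for_delivery, where("email is not null").where(:date_customer_notified => nil)
  ...
end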

In order to store a copy of the email sent to the customer, we want to serialize the email attribute so that it can natively store a Ruby Mail object. However, there is a critical bug in all current versions of Rails; it has been fixed on rails:master, but not in any released version.

In order to accomplish our test of “email is not null”, we are presuming that the data in the underlying database is stored as NULL. (Wouldn’t it be nice if we could do something directly in Rails like :email != nil? But I digress.) However, given the current Rails implementation, the value of email in all new objects is stored as the YAML representation of nil rather than as NULL. Note the update query:

1.9.3p125 :006 > o.save!
   (0.1ms)  BEGIN
   (0.6ms)  UPDATE `orders` SET `updated_at` = '2012-06-15 22:36:31', `email` = '--- \n...\n' WHERE `orders`.`id` = 5784
   (10.7ms)  COMMIT
 => true 

Rails believes :email to be nil, but MySQL knows differently: the column holds a YAML string rather than NULL. Consequently, Order.where('email is not null') matches orders that have no email at all, and the scope above cannot work as intended.
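The mismatch can be reproduced outside Rails with Ruby's standard YAML library alone. The following is a minimal sketch (the variable name dumped_nil is only illustrative); the exact text produced may vary between Ruby versions, but it is always a non-empty string:

```ruby
require 'yaml'

# Serializing nil does not yield nil; it yields a short YAML document.
dumped_nil = YAML.dump(nil)

puts dumped_nil.inspect              # a non-empty String, so the column is not NULL
puts dumped_nil.nil?                 # false
puts YAML.load(dumped_nil).inspect   # nil, which is why Rails still reports nil
```

This is why the Ruby side and the database side disagree about whether the attribute is empty.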

To solve this, I’ve cribbed from the master branch of Rails. Here is the commit.

In Tracksys, the local app, I’ve reopened ActiveRecord’s YAMLColumn class to apply the same fix.

Create lib/fix_null_in_serialized_attributes.rb

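module ActiveRecord
  module Coders
    class YAMLColumn
      # Return nil for nil values so the column is written as NULL
      # rather than as the YAML representation of nil.
      def dump(obj)
        YAML.dump(obj) unless obj.nil?
      end
    end
  end
end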

Create config/initializers/active_record_extensions.rb

require 'fix_null_in_serialized_attributes'
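To see the effect of the guard without booting Rails, here is a small sketch using a stand-in class. NilSafeYAMLCoder is a hypothetical name for illustration only; it is not Rails’ YAMLColumn, and its load method is an assumption about how a NULL column would be read back:

```ruby
require 'yaml'

# Hypothetical stand-in for the patched coder; this is not Rails' YAMLColumn.
class NilSafeYAMLCoder
  # Return nil for nil so the column would be written as NULL;
  # otherwise return the YAML text for the object.
  def dump(obj)
    YAML.dump(obj) unless obj.nil?
  end

  # Treat NULL (nil) from the database as nil; otherwise parse the YAML.
  def load(yaml)
    return nil if yaml.nil?
    YAML.load(yaml)
  end
end

coder = NilSafeYAMLCoder.new
puts coder.dump(nil).inspect                  # nil
puts coder.load(coder.dump('hello')).inspect  # "hello"
```

With nil passed through untouched, a condition such as “email is not null” only matches rows that hold real serialized content.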

Welcome to Hydraulics (Again)

For the past seven months, Hydraulics has undergone significant change, so much in fact that a new blogging interface comes with it! There is a lot of ongoing work at the University of Virginia Library to make the Hydraulics engine (no longer a plugin, folks, because we are using Rails 3.2) available for public use and collaboration. Through this blog, I will discuss ongoing changes and development decisions, and keep the greater community abreast of our progress.

There are a significant number of large changes to the Hydraulics code since my original presentation at Open Repositories 2011:

  • Upgrade to Rails 3.2.2
  • Upgrade to Ruby 1.9.3
  • Release as a Rails Engine so as to make the core code portable
  • Code comprises only models, migrations, processors and helpers. This gives outside institutions the ability to develop local UI and implementations.
  • The UVA Library version, Tracksys, will serve as a model implementation of the Hydraulics code, but local UVA decisions will not govern how other institutions can use the core code.

Code for both Hydraulics and Tracksys is available on Github. Documentation both there and on this blog is ongoing, so please stay tuned to both locations.

Welcome to Hydraulics

Digital Curation Services of the University of Virginia Library is proud to announce a public beta release of a digital workflow and asset management system, Hydraulics. In development since 2007, Hydraulics is a Ruby on Rails application that integrates a request module, a management system for digital production and a series of automated workflows for archiving, quality assurance, patron delivery and ingestion into a Fedora repository.

On June 11th, 2011, Andrew Curley presented The Hydraulics Project: Empowering Communities to Build a Digital Library Utilizing Fedora and an Event-Driven Service-Oriented Messaging Framework. This paper, available through the University of Virginia’s institutional repository Libra, is a high-level overview of the history of the project and the code, with a brief synopsis of how objects are ingested from the University of Virginia’s instance of Hydraulics (named Tracksys) into a production Fedora repository and made discoverable through the library’s online catalogue VIRGO. The code, available through Github, is in beta release; UVA is actively working on developing a generalized version of the code and will continue documenting it, increasing the scope of the test suite and expanding its usefulness for a wider audience.