Ruby, Rails, fixtures and fails

After a long time working on top of Java and its beautiful JVM and robust ecosystem my professional path lead me to another powerful web development system – Ruby and Rails. So how did I felt during the ‘transformation’ or did I metamorphose into a vermin ?

From programmsers point of view, very comfortable. Ruby was/is too easy to learn and grasp especially if you have worked with Groovy in the last couple of years. On the other hand Rails paradigms are very close and similar to that of Grails (it’s MVC and it was inspired by Rails so) and most of my web development is based on it. However changing the language and the framework happened to be more challenging from another aspect. I’m talking about the tools, helpers, libraries/gems and what the community have built until now in the field of helping and improving the development processes. Both Groovy & Grails and Ruby & Rails are open source, so you know what I mean.

As I dived more deeply into the presented problems I found about a feature that makes Rails great tool for TDD (Test Driven Development) – fixtures. Basically, a set of YAML files that contain mock data that can be loaded during the testing process. That means tables, rows, associations presented in a YAML format. Pretty cool.

But the transformation from YAML data into RDS principles for storing data isn’t so simple. One of it’s issues is the foreign key references. The problem that occurred for me with fixtures is the loading order of the files and their mapping into a database. Let’s assume that we have three tables: items, users, and user_items and the latter contains references to both user and item table. With fixtures we will have three different files: users.yml, items.yml, and user_items.yml that contain the mock data. With the start of the test process the Active Record fixture module starts loading this files into memory and executes the queries responsible for inserting data into the database. And everything is cool, no problem, but happens if we first have the user_items.yml loaded into the DB? Well it will fail. Why? Because of the database referential constraint system we will face the problem of non existing foreign key values for the user and item references.

Well Rails developers weren’t so stupid and they thought of this. If you dive into the fixtures implementation it will be revealed that Rails is invoking the method disable_referential_integrity. That means that Rails will try to remove the constraints for the test database and just insert the data. But on most RDS system the database users need to have super user privileges to execute those commands.

Since I stumbled upon on this problem and it reflected both locally and on the CI system, I needed to find a ‘workaround’ solution. So I started thinking that if we control the order of loading of the YAML files then we can control the inserting flow and like a consequence bypass the logical problem of referential integrity (the default loading is randomly alphabetical, but I’m not sure). Then started googling, reading blogs, scanning through stackoverflow and you know all of that monkey attempts to solve your problem. And luckily then – eureka! I found the solution: override fixtures method for loading yaml files and control the order of deleting the data (since that is influenced by the referential constraint too).

The solution is simple and can be found in this gist. Extending the ActiveSupport:FixtureSet class with the purpose to override the method create_fixtures (but not totally reimplement it thanks to the help of ruby aliases) that is responsible for the obvious, creating the fixtures, nails it. We implement this code sample in a file that is required by the tests. With the UserItem.delete_all we care that all user items are first deleted before the User and Item tables are dropped. The variable fs_names holds the names of the items/files that will be loaded and gives priority to users and items before any other. That means that they will be processed and loaded before the user_items yaml file and there won’t be any referential integrity issue.

Till I came to this which is just modified version of a proposed solution in the stackoverflow network I read and peeked into a lot of online resources. With this in mind I hope someone with similar problem will face this blog post first and save his/her time for something more productive that scraping half of the internet for a solution.

Cheers, I.


Used resources:

Deciphering an API documentation

JavaScript on the Web has a lot of APIs to work with. Some of them are fully supported some are still in draft. One of the things I worked with lately is the FileSystemDirectoryReader Interface of the File and Directory Entries API. It defines only one method called readEntries that returns an array containing some number of the directory’s entries. This is draft proposed API and is not supposed to be fully browser compatible. However I’ve tested it almost on all the latest versions of browsers and it works fine on each one except for the Safari browser (I think ;)).

This post will focus on one example that shows that sometimes reading an API documentation can be a little tricky. So the example in the documentation shows a common use of this API where the source of the FileSystemEntry items that we read from are passed with a (drag and) drop event in the browser. The FileSystemEntry can be either a file or a directory. What we want to do is build a file system tree of the dropped item . If the dropped item is a directory then the item is actually FileSystemDirectoryEntry object than defines the createRender method that creates the FileSystemDirectoryReader object on which we will call the readEntries method.

The demo example can be tested in this fiddle. What I want you to do is to drop a directory that contains more than 100 files. If you do that you can notice that the readEntries method returns only the first 100 queued files. That is the main reason for writing this post. The description on the successCallback argument of the readEntries method is a little bit confusing, it says: “A function which is called when the directory’s contents have been retrieved. The function receives a single input parameter: an array of file system entry objects, each based on FileSystemEntry. Generally, they are either FileSystemFileEntry objects, which represent standard files, or FileSystemDirectoryEntry objects, which represent directories. If there are no files left, or you’ve already called readEntries() on this FileSystemDirectoryReader, the array is empty.”

In their example we can see the scanFiles method that reads the items and creates html elements:

function scanFiles(item, container) {
        var elem = document.createElement("li");
        elem.innerHTML = item.name;
        container.appendChild(elem);

        if (item.isDirectory) {
            var directoryReader = item.createReader();
            var directoryContainer = document.createElement("ol");
            container.appendChild(directoryContainer);

            directoryReader.readEntries(function (entries) {
                entries.forEach(function (entry) {
                    scanFiles(entry, directoryContainer);
                });
            });
        }
    }

It seems that the successCallback functions returns the entries partially in a packages of 0 to max 100 items.  If we use this function we will never iterate more that 100 items in the given directory. What we need to do is to decipher this part: “If there are no files left, or you’ve already called readEntries() on this FileSystemDirectoryReader, the array is empty.”.  Translated this into JavaScript code is:

function scanFiles(item, container) {
        var elem = document.createElement("li");
        elem.innerHTML = item.name;
        container.appendChild(elem);

        if (item.isDirectory) {
            var directoryReader = item.createReader();
            var directoryContainer = document.createElement("ol");
            container.appendChild(directoryContainer);

            var fnReadEntries = (function () {
                return function (entries) {
                    entries.forEach(function (entry) {
                        scanFiles(entry, directoryContainer);
                    });
                    if (entries.length > 0) {
                        directoryReader.readEntries(fnReadEntries);
                    }
                };
            })();

            directoryReader.readEntries(fnReadEntries);
        }
    }

The change that we need to apply is to check after the iteration of the entries if the length of the entries array is bigger than 0.  Case that’s the truth then we should call the readEntries method again. If the entries size is zero, then the iteration is finished – all the file system items are iterated.

This fiddle has the improved version of the scan file method that will list all the files in the directory (more that 100), and won’t trick you.

Now finally we can conclude what the part “returns an array containing some number of the directory’s entries” meant. 🙂

Cheers