As Linux administrators, we generally host a range of multi-purpose scripts designed to duct-tape the infrastructure together. Nobody else in the organization knows how, when, or where 95% of these scripts run, or on what interval; it’s a great tactic for job security. Inevitably, we’re going to get a request at some point (and most of us already have, that’s for sure) to have a script watch a folder on a server for a file to appear, and then move it somewhere else (usually with some data or filename massaging in the process). So, all of us have our own variation of this “move-file” script… Some are written in Bash, some in Perl; the more programming-oriented admins have theirs in Python… It doesn’t matter what it’s written in as long as it gets the job done. And pretty much the way that we’ve agreed to do it is by putting a script into a loop and diffing the directory listing every 10 seconds or so… That’s cool, right? Not anymore!

Kernel 2.6.13 introduced us to something entirely new: inotify, a kernel interface for filesystem event notifications (not to be confused with iNotify, which would be a really Apple-esque spelling mistake… Remember, folks, we’re far more elite than even Apple…). What inotify allows us to do is have a script register a watch on a directory of our choosing and then listen for events about file operations in it. We’re waiting on the kernel here, not repeatedly scanning the filesystem, so the I/O and load cost of constantly re-reading the directory goes away. Also, when we have a huge directory, we don’t have to worry about getting the initial directory listing, and that makes everybody happy… We’re told by the kernel when things happen to files; the kernel hands back the watch and the file’s name, and the Perl module below stitches those into a full path for us… Brilliant. This is great too, because we get near-instant notification instead of waiting for the next pass of the loop.

Let’s do this with Perl…

The bare metal:

#!/usr/bin/perl

use strict;
use warnings;

# http://search.cpan.org/~mlehmann/Linux-Inotify2-1.21
use Linux::Inotify2;

# Autoflush ON
$|++;

# Variables -- what directory are we going to look for file operations on
my $directory = "/tmp/dan";

# We're object oriented, so let's start with a new object
my $inotify = Linux::Inotify2->new
                or die "Unable to create new inotify object: $!";

# I really only care when a process has finished writing to the file
# and closed it, so I will only use the IN_CLOSE_WRITE hook to watch for file
# modifications
$inotify->watch($directory, IN_CLOSE_WRITE,
                    # Anonymous functions for the win... You can also
                    # call back another function with a reference. \&doStuff, for example.
                    sub {
                        # The event object is passed in by Linux::Inotify2.
                        # We get information about the file from this object,
                        # but since we're hackers and not real programmers, we
                        # probably only care about the full path to the file...
                        # We'll give it the $name variable...
                        my $event = shift;
                        my $name = $event->fullname;

                        # I'm going to do something super elite here...
                        #   ... But, what?
                    }
) or die "watch creation failed: $!";

# Loop: poll blocks until events arrive, then fires the callback for each one.
1 while $inotify->poll;
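
To fill in the "something super elite" part: here is a minimal sketch of the move step, assuming all we want is to relocate the finished file into another directory with its name unchanged. The helper name move_to_outbox and the /tmp/dan-outbox path are made up for illustration; File::Copy, File::Basename, and File::Spec ship with Perl.

```perl
use strict;
use warnings;

use File::Basename qw(basename);
use File::Copy     qw(move);
use File::Spec;

# Hypothetical helper: move a finished file into $dest_dir and return the
# new path. Dies if the move fails so the caller can log or retry.
sub move_to_outbox {
    my ($path, $dest_dir) = @_;

    # Keep the original filename; any filename massaging would go here.
    my $target = File::Spec->catfile($dest_dir, basename($path));

    move($path, $target)
        or die "Unable to move $path to $target: $!";

    return $target;
}

# Inside the IN_CLOSE_WRITE callback this would be called roughly as:
#   my $name = $event->fullname;
#   move_to_outbox($name, "/tmp/dan-outbox") if -f $name;
```

The destination should live outside the watched directory, otherwise the script ends up reacting to its own handiwork.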

For the purposes of this example, as I stated in the code’s comments, I only care when a process is done writing to a file: a new file that’s just been created and written, or an existing file that’s been opened for writing and closed again… you get the idea. (An attribute change such as a chmod is a different event, IN_ATTRIB, and a file renamed into the directory shows up as IN_MOVED_TO.) Here is the list of hooks that we can jump in on, as documented by the Linux::Inotify2 module:

IN_ACCESS            object was accessed
IN_MODIFY            object was modified
IN_ATTRIB            object metadata changed
IN_CLOSE_WRITE       writable fd to file / to object was closed
IN_CLOSE_NOWRITE     readonly fd to file / to object closed
IN_OPEN              object was opened
IN_MOVED_FROM        file was moved from this object (directory)
IN_MOVED_TO          file was moved to this object (directory)
IN_CREATE            file was created in this object (directory)
IN_DELETE            file was deleted from this object (directory)
IN_DELETE_SELF       object itself was deleted
IN_MOVE_SELF         object itself was moved
IN_ALL_EVENTS        all of the above events
IN_ONESHOT           only send event once
IN_ONLYDIR           only watch the path if it is a directory
IN_DONT_FOLLOW       don't follow a sym link
IN_MASK_ADD          not supported with the current version of this module
IN_CLOSE             same as IN_CLOSE_WRITE | IN_CLOSE_NOWRITE
IN_MOVE              same as IN_MOVED_FROM | IN_MOVED_TO
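
Each of these names is just a bit in an integer mask, which is why they can be OR-ed together when creating a watch. The sketch below shows the idea without needing the module installed; the bit values are copied from the kernel's linux/inotify.h header, and build_mask and event_names are hypothetical helper names. In a real script the constants come from "use Linux::Inotify2;" instead of being redefined by hand.

```perl
use strict;
use warnings;

# Event bits as defined in <linux/inotify.h> (redefined here only so the
# sketch runs without Linux::Inotify2).
my @EVENT_BITS = (
    [ IN_ACCESS        => 0x001 ],
    [ IN_MODIFY        => 0x002 ],
    [ IN_ATTRIB        => 0x004 ],
    [ IN_CLOSE_WRITE   => 0x008 ],
    [ IN_CLOSE_NOWRITE => 0x010 ],
    [ IN_OPEN          => 0x020 ],
    [ IN_MOVED_FROM    => 0x040 ],
    [ IN_MOVED_TO      => 0x080 ],
    [ IN_CREATE        => 0x100 ],
    [ IN_DELETE        => 0x200 ],
    [ IN_DELETE_SELF   => 0x400 ],
    [ IN_MOVE_SELF     => 0x800 ],
);

# Name-to-value lookup built from the table above.
my %VALUE_OF;
$VALUE_OF{ $_->[0] } = $_->[1] for @EVENT_BITS;

# OR the named events together into a single watch mask.
sub build_mask {
    my $mask = 0;
    for my $name (@_) {
        die "Unknown event name: $name" unless exists $VALUE_OF{$name};
        $mask |= $VALUE_OF{$name};
    }
    return $mask;
}

# List the event names whose bits are set in a mask.
sub event_names {
    my ($mask) = @_;
    return map { $_->[0] } grep { $mask & $_->[1] } @EVENT_BITS;
}

my $mask = build_mask(qw(IN_CLOSE_WRITE IN_MOVED_TO));
printf "mask 0x%03x = %s\n", $mask, join(" | ", event_names($mask));
# prints: mask 0x088 = IN_CLOSE_WRITE | IN_MOVED_TO
```

With the real module, the equivalent watch mask is simply IN_CLOSE_WRITE | IN_MOVED_TO, which catches files written in place as well as files renamed into the directory.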

And there you have it! Scrap that old filesystem-loop script that you have, and get with the times!

-dan

© 2013 Dan's Blog