Line Disciplines


  Maintainer: Simon Cozens <simon@brecon.co.uk>
  Date: 25 Sep 2000
  Mailing List: perl6-language-io@perl.org
  Number: 311
  Version: 1
  Status: Developing


This RFC explains what line disciplines are and proposes an interface for attaching them to filehandles.


Line disciplines have been a much-vaunted feature of Perl 5.6, despite the fact that nobody actually got around to implementing them. This time, for sure!

First things first, what are they? Line disciplines are a way of specifying how data gets into Perl. They're based on the concept of streams, invented by Dennis Ritchie, and first appeared in Korn and Vo's sfio library. Stevens writes that

    I/O streams are generalized to represent both files and regions of
    memory, processing modules can be written and stacked on an I/O
    stream to change the operation of a stream, and better exception
    handling.

How's this different from, for instance, generalizing source filters? Well, that's how I first tried to implement them in Perl, but line disciplines actually give you far, far more control over the file handling; your processing modules may dictate how line endings are parsed, whereas source filters have to go either before or after the data is split up into lines. Line discipline processing modules may alter the buffering behaviour of the stream, which you can't do in standard IO. (That's a hint that we're going to have to provide our own IO library to get these things working.)

OK, back to Perl. We'll want it to be possible to add these processing modules onto filehandles from Perl, and (maybe) to create them in Perl. We started doing this with use open and the extensions to the binmode syntax. Benjamin Stuhl has done lots of good work on this (and this RFC owes a huge amount to his suggestions) and he's come up with the following API. From C, a processing module is registered like this:

    PerlIO_register_discipline(char * name, int level, VTABLE functable,
    void * data);

(We'll look at what level means when we come to the implementation.)

Once registered, a processing module can be attached to a filehandle through binmode:

    binmode($FH, ":+name");

(Note: BKS originally suggested +:name, but I reversed this. Seemed a good idea at the time.)

Here are a few examples:

    open ($FH, "<", "japanese.euc.gz");
    binmode($FH, ":+decompress");
    binmode($FH, ":+euc_to_utf8");
    $foo = <$FH>; # This is now UTF8.

Note that due to the concept of levels, this will still work:

    open ($FH, "<", "japanese.euc.gz");
    binmode($FH, ":+euc_to_utf8");
    binmode($FH, ":+decompress");
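
One way to picture why the order of the binmode calls does not matter: if every attached module carries a level, the handle can keep its modules sorted by level regardless of when they were attached. The C sketch below only illustrates that idea. The struct and function names, the fixed-size array, and the level numbers used for decompress and euc_to_utf8 are all assumptions made for the example; none of them are part of the proposed API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MODULES 8

/* Hypothetical record of one processing module attached to a handle. */
struct module {
    const char *name;
    int level;          /* assumption: lower levels see input data first */
};

/* Hypothetical per-handle list of modules, kept sorted by level. */
struct handle {
    struct module mods[MAX_MODULES];
    size_t count;
};

/* Attach a module, keeping the array sorted by ascending level.
 * Modules that share a level keep the order in which they were attached. */
void attach(struct handle *h, const char *name, int level)
{
    assert(h->count < MAX_MODULES);

    size_t i = h->count;
    /* Shift higher-level modules up to open a slot for the new one. */
    while (i > 0 && h->mods[i - 1].level > level) {
        h->mods[i] = h->mods[i - 1];
        i--;
    }
    h->mods[i].name = name;
    h->mods[i].level = level;
    h->count++;
}
```

With this arrangement, attaching euc_to_utf8 (say, level 3) before decompress (say, level 1) leaves the handle in the same state as attaching them the other way round, which is the behaviour the two examples above rely on.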

I also propose that user-definable "sets" of processing modules can be specified on the open line:

    use open 'decompress_euc' => [ '+decompress', '+euc_to_utf8' ];
    open ($FH, "< :decompress_euc", "japanese.euc.gz");


Benjamin has identified 5 different types of transformation. Imagine that the data goes through 5 "rooms" before it gets to Perl-space. Each room can, in theory, have any number of processing modules in it, but that's not actually workable in practice. Only levels 1 and 3 can have several modules in them, and these modules will be implemented as a stack.

Perl also needs to provide a default module for each "room", and we'll explain that as we look into the rooms.

(The example given is for input; simply walk through the rooms backwards for output.)
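
The forwards-for-input, backwards-for-output rule can be sketched in C as follows. This is a toy model under stated assumptions: a "room" is reduced to a pair of hypothetical callbacks, the data is a single int rather than a buffer, and the example uses two rooms instead of five. It is meant only to show why output has to visit the rooms in reverse order if it is to undo what input did.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical room: one transform for reading, its inverse for writing. */
struct room {
    int (*on_input)(int);
    int (*on_output)(int);
};

/* Toy transforms standing in for real processing modules. */
int add_one(int x)   { return x + 1; }
int sub_one(int x)   { return x - 1; }
int times_two(int x) { return x * 2; }
int halve(int x)     { return x / 2; }

/* Input: data passes through the first room first and the last room last. */
int read_through(const struct room *rooms, size_t n, int value)
{
    assert(rooms != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        value = rooms[i].on_input(value);
    return value;
}

/* Output: the same rooms, walked from the last back to the first. */
int write_through(const struct room *rooms, size_t n, int value)
{
    assert(rooms != NULL || n == 0);
    for (size_t i = n; i > 0; i--)
        value = rooms[i - 1].on_output(value);
    return value;
}
```

Given the rooms { add_one, sub_one } followed by { times_two, halve }, reading a value and then writing the result returns the original value only because write_through walks the array from the end.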


W. Richard Stevens: "Advanced Programming in the UNIX Environment".

D. Ritchie: "A Stream Input-Output System", AT&T Bell Labs Technical Journal, vol. 63, no. 8, pp. 1897-1910.

Korn and Vo: "SFIO: Safe/Fast String/File IO", Proceedings of the 1991 Summer USENIX Conference, pp. 235-255.

The sfio library.