This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Additions to 'use strict' to fix syntactic ambiguities

VERSION

   Maintainer: Nathan Wiger <nate@wiger.org>
   Date: 24 Sep 2000
   Last Modified: 30 Sep 2000
   Mailing List: perl6-language@perl.org
   Number: 278
   Version: 2
   Status: Frozen

ABSTRACT

Several RFCs and many people have voiced concerns with different parts of Perl's syntax. Most take issue with syntactic ambiguities and the inability to easily tokenize Perl.

This RFC shows how these all boil down to three central issues, and how they can be solved with some simple additions to use strict. By default, Perl should remain as flexible as possible. By adding these flags to use strict, those who desire them can have all the benefits of a stricter syntax, without hurting those who like these features.

DESCRIPTION

The Problems

Indirect Objects

RFC 244 proposes eliminating the bareword indirect object syntax because this:

   print STDERR @stuff;

Can be parsed as either of these:

   STDERR->print(@stuff);
   print(STDERR(@stuff));

Depending on how STDERR is used elsewhere in your program. However, many of us like writing:

   $q = new CGI;

Quite a bit, and consider this DWIMish.
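For reference, the two spellings are interchangeable in Perl 5 today whenever the parser picks the method-call reading. The following is a minimal sketch, assuming a hypothetical Counter class (it does not come from any RFC) and a perl on which the indirect object feature has not been disabled:

```perl
use strict;
use warnings;

# Hypothetical class, defined inline so the example is self-contained.
package Counter;

sub new {
    my ($class, %args) = @_;
    return bless { count => $args{start} || 0 }, $class;
}

sub count {
    my ($self) = @_;
    return $self->{count};
}

package main;

# Indirect object syntax: method name first, then the class.
my $indirect = new Counter(start => 3);

# Arrow syntax: the same call with the invocant spelled out.
my $arrow = Counter->new(start => 3);

printf "%s %d\n", ref($indirect), $indirect->count();
printf "%s %d\n", ref($arrow), $arrow->count();
```

Both objects are built by the same constructor; only the spelling of the call differs.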

Barewords vs. Functions

RFC 244 and others mention several problems with barewords such as:

   Class->stuff(@args);   # Class()->stuff or 'Class'->stuff ?

Again, the fact that Perl can figure this out correctly is quite DWIMish, and this functionality should not be removed by default.
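The ambiguity can be observed in Perl 5 as it stands. The sketch below assumes two hypothetical classes, Widget and Gadget, plus a function in main that happens to be named Widget; none of these names come from the RFCs. When such a function is already declared, the bareword form calls it first, while the quoted and trailing-colon forms always name the class:

```perl
use strict;
use warnings;

# Two hypothetical classes with a method of the same name.
package Widget;
sub describe { return 'Widget::describe' }

package Gadget;
sub describe { return 'Gadget::describe' }

package main;

# A function that happens to share its name with the Widget class.
sub Widget { return 'Gadget' }

my $bare   = Widget->describe();     # parsed as Widget()->describe()
my $quoted = 'Widget'->describe();   # always the class Widget
my $colons = Widget::->describe();   # always the class Widget

print "$bare\n$quoted\n$colons\n";
```

Here the bareword call ends up in Gadget, because Widget() returns the string 'Gadget' and the method is then looked up in that class.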

Special Cases

Many special cases abound, such as the bare // mentioned in RFC 135. Again, this is stuff that makes Perl fun, and should not be taken out of the language.

The Solutions

At first, these may not seem related. However, they very much are, and in fact all boil down to only three issues which can be resolved with additions to use strict.

Function Parens - use strict 'words'

This imposes a very simple restriction: barewords are not allowed. They must be either quoted or followed by parens to indicate they are functions. Note that this solves the %SIG problem described in the Camel book:

   use strict 'words';
   $SIG{PIPE} = Plumber;    # syntax error
   $SIG{PIPE} = "Plumber";  # use main::Plumber
   $SIG{PIPE} = Plumber();  # call Plumber()

In addition, this also forces users to disambiguate certain functions:

   use strict 'words';

   name->stuff(@args);      # syntax error
   'name'->stuff(@args);    # 'name'->stuff
   name::->stuff(@args);    # ok too, same thing
   name()->stuff(@args);    # name()->stuff

   $result = value + 42;    # syntax error
   $result = value() + 42;  # value() + 42
   $result = value(  + 42); # value(42)
   $result = 'value' + 42;  # ok, if you think this is Java...

It's simple: barewords are not allowed.

Indirect Objects - use strict 'objects'

Another major problem is ambiguous indirect objects. Under use strict 'objects', the indirect object must be surrounded by braces:

   use strict 'objects';
   no strict 'words';

   print STDERR @stuff;     # print(STDERR(@stuff))
   print 'STDERR' @stuff;   # syntax error
   print {'STDERR'} @stuff; # 'STDERR'->print(@stuff)

   print $fh @junk;         # syntax error
   print {$fh} @junk;       # $fh->print(@junk)

This eliminates the possibility of ambiguity with indirect objects. When combined with strict 'words', code becomes even less ambiguous:

   use strict qw(words objects);

   $q = new 'CGI';          # syntax error
   $q = new {'CGI'};        # 'CGI'->new
   $q = new ('CGI');        # new('CGI')
   $q = new (CGI());        # new(CGI())

   $q = new 'CGI' @args;    # syntax error
   $q = new {'CGI'} (@args);# 'CGI'->new(@args)
   $q = new (CGI (@args));  # new(CGI(@args))
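The braced filehandle form is already legal Perl 5, so the equivalence claimed above for print {$fh} can be checked without the proposed flag. This is a small sketch, assuming an in-memory filehandle and an illustrative @junk list; loading IO::Handle explicitly is a precaution for older perls:

```perl
use strict;
use warnings;
use IO::Handle;    # provides the print method on filehandles

# Write to an in-memory buffer so the result can be inspected.
my $buffer = '';
open(my $fh, '>', \$buffer) or die "Can't open in-memory handle: $!";

my @junk = ('alpha', 'beta');

print {$fh} @junk;    # braced indirect object
$fh->print(@junk);    # the arrow form of the same call

close($fh) or die "Can't close in-memory handle: $!";

print "$buffer\n";
```

Both statements append the same data to the buffer, which is the behavior strict 'objects' would make mandatory to spell out with braces.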

Syntactic Problems - use strict 'syntax'

There are many other "little ambiguities" throughout Perl. Adding strict 'syntax' would remove these and require the user to resolve them explicitly. This category includes the bare // problem mentioned in RFC 135, as well as several common "bugs" (mistakes).

Under this rule, the following would apply:

   1. No more // by itself, you must use m//

   2. Trailing conditionals would require parens

   3. Precedence other than for basic math and boolean ops would
      not apply

This is designed to force you to write clean, unambiguous code that borders on being non-Perlish:

   use strict 'syntax';
   next if /^#/ || /^$/;       # syntax error
   next if m/^#/ || m/^$/;     # syntax error
   next if (m/^#/ || m/^$/);   # ok

   use strict 'syntax';
   $data = $a + $b / $c - $d || $default or die;      # no way
   $data = ($a + $b / $c - $d) || $default or die;    # nope
   ($data = ($a + $b / $c - $d) || $default) or die;  # ok

Basically, the idea is to impose a truly unambiguous style so that people don't get carried away with precedence and special cases.
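To see what the extra parens buy, here is a sketch in plain Perl 5 with made-up values (the variable names and numbers are illustrative only). It compares the precedence-dependent form with the fully grouped form, and with a different grouping that a reader might wrongly assume:

```perl
use strict;
use warnings;

my ($w, $x, $y, $z) = (2, 6, 3, 4);
my $default = 10;

# Relies on precedence: / first, then + and -, then ||, then =, then or.
my $implicit = $w + $x / $y - $z || $default or die "no data\n";

# The same grouping, spelled out with parens.
my $explicit;
($explicit = (($w + ($x / $y)) - $z) || $default) or die "no data\n";

# A different grouping gives a different answer.
my $regrouped = $w + $x / ($y - $z || $default);

print "$implicit $explicit $regrouped\n";
```

The first two results agree because the arithmetic comes to 0 and falls through to the default, while the regrouped expression never reaches the default at all.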

Combining all these together

Let's look at an example of how all these would be combined. With the additions to use strict proposed in this RFC, the existing Perl 5 code snippet:

   $r = Apache->request;

   sub geturl {
       shift->{DATA}->{uri};
   }

   sub seturl {
       shift->{DATA}->{uri} = $_[0];
   }

   seturl $r->uri;
   print STDERR geturl or die "Can't print to STDERR: $!\n";

Would have to be rewritten:

   use strict;

   my $r = Apache::->request();

   sub geturl {
       shift()->{DATA}->{uri};
   }

   sub seturl {
       shift()->{DATA}->{uri} = $_[0];
   }

   seturl($r->uri());
   print {'STDERR'} (geturl()) or die("Can't print to STDERR: $!\n");

The syntax remains consistent, even though it's much more strict. If you don't like a particular aspect, such as indirect objects requiring braces, you can always turn that off with no strict.

Problems with RFC 244 and special-casing ->

While syntactic ambiguities are problems in Perl, addressing them the way RFC 244 proposes - by special-casing -> and bareword indirect objects - is not the solution. That approach introduces new syntactic ambiguities, where sometimes a bareword is a function and other times it's a word, depending on context:

   print STDERR @data;            # error?
   print 'STDERR' @data;          # syntax error
   print {'STDERR'} @data;        # Joe Scripter: "huh?!"

   $u = Apache->request->uri;     # error?
   $u = Apache->request()->uri;   # huh? parens just in the middle?

Particularly annoying is that last one, since the second to last cannot involve ambiguity because of its context. However, an approach like RFC 244's would require us to explicitly use the last form, even where it wouldn't be needed.

Instead, we need to attack the real problem: The ambiguity throughout Perl between unquoted barewords and functions.

Perl 6 should remain as flexible as Perl 5 by default, so that people don't run away screaming. To tighten up code on important apps, the above rules can simply be invoked with use strict:

   use strict;

   print {'STDERR'} @data;              # works, is 'STDERR'->print
   my $u = Apache::->request()->uri();  # works, and more consistent

Plus, use strict is lexically scoped, allowing us to have blocks of loosely-evaluated code mixed with strict code, and vice versa.
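Lexical scoping of this kind can be seen with the strict categories that Perl 5 already has. The sketch below uses the existing 'refs' category as a stand-in, since the categories proposed here do not exist in Perl 5; the function names are hypothetical:

```perl
use strict;
use warnings;

our $greeting = 'hello';

# Loose block: symbolic references are allowed inside this sub only.
sub loose_lookup {
    my ($name) = @_;
    no strict 'refs';
    return ${"main::$name"};
}

# Strict block: the same expression dies at run time, so trap it.
sub strict_lookup {
    my ($name) = @_;
    my $value = eval { ${"main::$name"} };
    return defined $value ? $value : 'refused';
}

print loose_lookup('greeting'), "\n";
print strict_lookup('greeting'), "\n";
```

The no strict line affects only the enclosing sub body; the rest of the file stays strict.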

IMPLEMENTATION

New hooks in use strict.

MIGRATION

Simplest case: For those scripts that have a use strict, add a no strict qw(words objects syntax) right after it. The rest of the code then doesn't have to be changed.

REFERENCES

RFC 244: Method calls should not suffer from the action on a distance

RFC 277: Method calls SHOULD suffer from ambiguity by default

perl6storm issues #0003, #0035, #0050