Synopsis 3: Perl 6 Operators
Luke Palmer <luke@luqui.org>
Maintainer: Larry Wall <larry@wall.org>
Date: 8 Mar 2004
Last Modified: 7 Nov 2008
Number: 3
Version: 146
For a summary of the changes from Perl 5, see "Changes to Perl 5 operators".
Not counting terms and terminators, Perl 6 has 23 operator precedence levels (same as Perl 5, but differently arranged). Here we list the levels from "tightest" to "loosest", along with a few examples of each level:
    A  Level             Examples
    =  =====             ========
    N  Terms             42 3.14 "eek" qq["foo"] $x :!verbose @$array
    L  Method postfix    .meth .+ .? .* .() .[] .{} .<> .«» .:: .= .^ .:
    L  Autoincrement     ++ --
    R  Exponentiation    **
    L  Symbolic unary    ! + - ~ ? | +^ ~^ ?^ \ ^ =
    L  Multiplicative    * / % +& +< +> ~& ~< ~> ?& div mod
    L  Additive          + - +| +^ ~| ~^ ?| ?^
    L  Replication       x xx
    X  Concatenation     ~
    X  Junctive and      & also
    X  Junctive or       | ^
    L  Named unary       sleep abs sin
    N  Nonchaining infix but does <=> leg cmp .. ..^ ^.. ^..^
    C  Chaining infix    != == < <= > >= eq ne lt le gt ge ~~ === eqv !eqv
    X  Tight and         &&
    X  Tight or          || ^^ // min max
    L  Conditional       ?? !! ff fff
    R  Item assignment   = := ::= => += -= **= xx= .=
    L  Loose unary       true not :by(2)
    X  Comma operator    , p5=>
    X  List infix        Z minmax X X~X X*X XeqvX ...
    R  List prefix       : print push say die map substr ... [+] [*] any $ @
    X  Loose and         and andthen
    X  Loose or          or xor orelse
    N  Terminator        ; <==, ==>, <<==, ==>>, {...}, unless, extra ), ], }
Using two ! symbols below generically to represent any pair of operators
that have the same precedence, the associativities specified above
for binary operators are interpreted as follows:
    Assoc     Meaning of $a ! $b ! $c
    =====     =========================
    L left    ($a ! $b) ! $c
    R right   $a ! ($b ! $c)
    N non     ILLEGAL
    C chain   ($a ! $b) and ($b ! $c)
    X list    infix:<!>($a; $b; $c)
For unaries this is interpreted as:
    Assoc     Meaning of !$a!
    =====     =========================
    L left    (!$a)!
    R right   !($a!)
    N non     ILLEGAL
Note that list associativity (X) only works between identical operators.
If two different list-associative operators have the same precedence,
they are assumed to be left-associative with respect to each other.
For example, the X cross operator and the Z zip operator both
have a precedence of "list infix", but:
@a X @b Z @c
is parsed as:
(@a X @b) Z @c
Similarly, if the only implementation of a list-associative operator is binary, it will be treated as left associative.
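The grouping rules for binary operators can be modeled outside of Perl 6. The following is a minimal Python sketch, not part of the specification: the function name group and its string output are illustrative assumptions, and it only shows how a run of same-precedence operators would be grouped under each associativity letter.

```python
from functools import reduce

def group(operands, assoc, op="!"):
    """Show how `a op b op c ...` is grouped for one associativity letter.

    Hypothetical helper: expects at least two operand strings.
    """
    if assoc == "L":    # left: fold from the left
        return reduce(lambda acc, x: f"({acc} {op} {x})", operands)
    if assoc == "R":    # right: fold from the right
        return reduce(lambda acc, x: f"({x} {op} {acc})", reversed(operands))
    if assoc == "N":    # non: a second operator at this level is illegal
        if len(operands) > 2:
            raise SyntaxError("non-associative operator cannot be repeated")
        return f"({operands[0]} {op} {operands[1]})"
    if assoc == "C":    # chain: pairwise comparisons joined by `and`
        pairs = zip(operands, operands[1:])
        return " and ".join(f"({a} {op} {b})" for a, b in pairs)
    if assoc == "X":    # list: one call that receives every operand
        return f"infix:<{op}>(" + "; ".join(operands) + ")"
    raise ValueError(f"unknown associativity {assoc!r}")

print(group(["$a", "$b", "$c"], "L"))   # (($a ! $b) ! $c)
print(group(["$a", "$b", "$c"], "X"))   # infix:<!>($a; $b; $c)
```

Two different list-associative operators at the same precedence would take the "L" branch of this sketch, per the rule above.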
If you don't see your favorite operator above, the following sections cover all the operators in precedence order. Basic operator descriptions are here; special topics are covered afterwards.
Terms are not really a precedence level, but they are listed here because no operator can have tighter precedence than a term. See S02 for longer descriptions of various terms. Here are some examples.
Int literal
42
Num literal
3.14
Non-interpolating Str literal
'$100'
Interpolating Str literal
"Answer = $answer\n"
Generalized Str literal
q["$100"]
qq["$answer"]
Heredoc
qq:to/END/
Dear $recipient:
Thanks!
Sincerely,
$me
END
Array composer
[1,2,3]
Provides list context inside. (Technically, it really provides a "semilist" context, which is a semicolon-separated list of statements, each of which is interpreted in list context and then concatenated into the final list.)
Hash composer
{ }
{ a => 42 }
The contents must be either empty or a single list starting with a pair or a hash;
otherwise you must use hash() or %() instead.
Closure
{ ... }
When found where a statement is expected, executes immediately. Otherwise always defers evaluation of the inside scope.
Capture composer
\(@a,$b,%c)
An abstraction representing an argument list that doesn't yet know its context.
Sigiled variables
$x
@y
%z
$^a
$?FILE
@@slice
&func
&div:(Int, Int --> Int)
Sigils as contextualizer functions
$()
@()
%()
&()
@@()
Regexes in quote-like notation
/abc/
rx:i[abc]
s/foo/bar/
Transliterations
tr/a..z/A..Z/
Note that ranges use .. rather than -.
Type names
Num
::Some::Package
Subexpressions circumfixed by parentheses
(1+2)
Parentheses are parsed on the inside as a semicolon-separated list
of statements, which (unlike the statements in a block) returns the results
of all the statements concatenated together as a List of Capture.
How that is subsequently treated depends on its eventual binding.
Function call with parens:
a(1)
Pair composers
:by(2)
:!verbose
Signature literal
:(Dog $self:)
Method call with implicit invocant
.meth # call on $_
.=meth # modify $_
Note that this may occur only where a term is expected. Where a
postfix is expected, it is a postfix. If only an infix is expected
(that is, after a term with intervening whitespace), .meth is a
syntax error. (The .=meth form is allowed there only because there
is a special .= infix assignment operator that is equivalent in
semantics to the method call form but that allows whitespace between
the = and the method name.)
Listop (leftward)
4,3, sort 2,1 # 4,3,1,2
As in Perl 5, a list operator looks like a term to the expression on its left, so it binds tighter than comma on the left but looser than comma on the right--see List prefix precedence below.
0-ary functions
self
undef
rand
All method postfixes start with a dot, though the dot is optional for subscripts. Since these are the tightest standard operators, you can often think of a series of method calls as a single term that merely expresses a complicated name.
See S12 for more discussion of single dispatch method calls.
Standard single-dispatch method calls
$obj.meth
Variants of standard single-dispatch method call
$obj.+meth
$obj.?meth
$obj.*meth
In addition to the ordinary . method invocation, there are variants
.*, .?, and .+ to control how multiple related methods of
the same name are handled.
Class-qualified method call
$obj.::Class::meth
$obj.Class::meth # same thing, assuming Class is predeclared
As in Perl 5, tells the dispatcher which class to start searching from, not the exact method to call.
Mutating method call
$obj.=meth
The .= operator does in-place modification of the object on the left.
Meta-method call
$obj.^meth
The .^ operator calls a class metamethod;
foo.^bar is short for foo.HOW.bar.
Method-like postcircumfixes
$routine.()
$array.[]
$hash.{}
$hash.<>
$hash.«»
The dotless forms of these have exactly the same precedences.
Dotted form of any other postfix operator
$x.++ # postfix:<++>($x)
Dotted postfix form of any other prefix operator
$x.:<++> # prefix:<++>($x)
There is specifically no infix:<.> operator, so
$foo . $bar
will always result in a compile-time error indicating the user should
use infix:<~> instead. This is to catch an error likely to
be made by Perl 5 programmers learning Perl 6.
As in C, the autoincrement (++) and autodecrement (--) operators increment or decrement the object in question either before or after the value is taken from the object, depending on whether the operator is placed before or after it. Also as in C, multiple references to a single mutating object in the same expression may result in undefined behavior unless some explicit sequencing operator is interposed. See "Sequence points".
As with all postfix operators in Perl 6, no space is allowed between a term and its postfix. See S02 for why, and for how to work around the restriction with an "unspace".
As mutating methods, all these operators dispatch to the type of
the operand and return a result of the same type, but they are legal
on value types only if the (immutable) value is stored in a mutable
container. However, a bare undefined value (in a suitable Scalar
container) is allowed to mutate itself into an Int in order to
support the common idiom:
say $x unless %seen{$x}++;
Increment of a Str (in a suitable container) works similarly to
Perl 5, but is generalized slightly.
A scan is made for the final alphanumeric sequence in
the string that is not preceded by a '.' character. Unlike in Perl 5, this
alphanumeric sequence need not be anchored to the beginning of the
string, nor does it need to begin with an alphabetic character;
the final sequence in the string matching <!after '.'> <rangechar>+
is incremented regardless of what comes before it.
The <rangechar> character class is defined as that subset of
\w that Perl knows how to increment within a range, as defined
below.
The additional matching behaviors provide two useful benefits: for its typical use of incrementing a filename, you don't have to worry about the path name or the extension:
$file = "/tmp/pix000.jpg";
$file++; # /tmp/pix001.jpg, not /tmp/pix000.jph
Perhaps more to the point, if you happen to increment a string that ends with a decimal number, it's likely to do the right thing:
$num = "123.456";
$num++; # 124.456, not 123.457
Character positions are incremented within their natural range for any Unicode range that is deemed to represent the digits 0..9 or that is deemed to be a complete cyclical alphabet for (one case of) a (Unicode) script. Only scripts that represent their alphabet in codepoints that form a cycle independent of other alphabets may be so used. (This specification defers to the users of such a script for determining the proper cycle of letters.) We arbitrarily define the ASCII alphabet not to intersect with other scripts that make use of characters in that range, but alphabets that intersperse ASCII letters are not allowed.
If the current character in a string position is the final character in such a range, it wraps to the first character of the range and sends a "carry" to the position left of it, and that position is then incremented in its own range. If and only if the leftmost position is exhausted in its range, an additional character of the same range is inserted to hold the carry in the same fashion as Perl 5, so incrementing '(zz99)' turns into '(aaa00)' and incrementing '(99zz)' turns into '(100aa)'.
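The scan-and-carry procedure described above can be sketched in Python. This is only an illustrative model under stated assumptions: it handles the three ASCII cycles alone, the names CYCLES, RUN, cycle_of and str_succ are hypothetical, and leaving a string with no incrementable run unchanged is a guess, since the text does not cover that case.

```python
import re

# ASCII-only stand-ins for the <rangechar> cycles (an assumption of this sketch).
CYCLES = ("0123456789",
          "abcdefghijklmnopqrstuvwxyz",
          "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

# A maximal run of rangechars that is not preceded by '.'.
RUN = re.compile(r"(?<![.0-9A-Za-z])[0-9A-Za-z]+")

def cycle_of(ch):
    """Return the cycle that contains ch."""
    return next(c for c in CYCLES if ch in c)

def str_succ(s):
    """Increment the final rangechar run of s that does not follow a '.'."""
    runs = list(RUN.finditer(s))
    if not runs:
        return s  # assumption: the text does not say what happens here
    last = runs[-1]
    chars = list(last.group())
    pos = len(chars) - 1
    while pos >= 0:
        cyc = cycle_of(chars[pos])
        nxt = cyc.index(chars[pos]) + 1
        if nxt < len(cyc):
            chars[pos] = cyc[nxt]   # no carry needed, so we are done
            break
        chars[pos] = cyc[0]         # wrap around and carry to the left
        pos -= 1
    else:
        # The leftmost position overflowed: insert a character of its range.
        cyc = cycle_of(last.group()[0])
        chars.insert(0, "1" if cyc[0] == "0" else cyc[0])
    return s[:last.start()] + "".join(chars) + s[last.end():]

print(str_succ("/tmp/pix000.jpg"))  # /tmp/pix001.jpg
print(str_succ("(zz99)"))           # (aaa00)
```

The negative lookbehind excludes both a preceding '.' and a preceding rangechar, so a run is always taken whole and an extension such as "jpg" is never picked up partway through.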
The following Unicode ranges are some of the possible rangechar ranges. For alphabets we might have ranges like:
A..Z # ASCII uc
a..z # ASCII lc
Α..Ω # Greek uc
α..ω # Greek lc (presumably skipping U+03C2, final sigma)
א..ת # Hebrew
etc. # (XXX out of my depth here)
For digits we have ranges like:
0..9 # ASCII
٠..٩ # Arabic-Indic
०..९ # Devanagari
০..৯ # Bengali
੦..੯ # Gurmukhi
૦..૯ # Gujarati
୦..୯ # Oriya
etc.
Other non-script 0..9 ranges may also be incremented, such as
⁰..⁹ # superscripts (note, cycle includes latin-1 chars)
₀..₉ # subscripts
０..９ # fullwidth digits
Conjecturally, any common sequence may be treated as a cycle even if it does not represent 0..9:
Ⅰ..Ⅻ # clock roman numerals uc
ⅰ..ⅻ # clock roman numerals lc
①..⑳ # circled digits 1..20
⒜..⒵ # parenthesized lc
⚀..⚅ # die faces 1..6
❶..❿ # dingbat negative circled 1..10
etc.
While it doesn't really make sense to "carry" such numbers when they reach the end of their cycle, treating such values as incrementable may be convenient for writing outlines and similar numbered bullet items. (Note that we can't just increment unrecognized characters, because we have to locate the string's final sequence of rangechars before knowing which portion of the string to increment. Note also that all character increments can be handled by lookup in a single table of successors since we've defined our ranges not to include overlapping cycles.)
Perl 6 also supports Str decrement with similar semantics, simply by
running the cycles the other direction. However, leftmost characters
are never removed, and the decrement fails when you reach a string like
"aaa" or "000".
Increment and decrement on non-Str types are defined in terms of the
.succ and .pred methods on the type of object in the Scalar
container. More specifically,
++$var
--$var
are equivalent to
$var.=succ
$var.=pred
If the type does not support these methods, the corresponding increment or decrement operation will fail. (The optimizer is allowed to assume that the ordinary increment and decrement operations on integers will not be overridden.)
Increment of a Bool (in a suitable container) turns it true.
Decrement turns it false regardless of how many times it was
previously incremented. This is useful if your %seen hash is
actually a KeySet, in which case decrement actually deletes the key
from the KeySet.
Autoincrement prefix:<++> or postfix:<++> operator
$x++
++$x
Autodecrement prefix:<--> or postfix:<--> operator
$x--
--$x
infix:<**> exponentiation operator
$x ** 2
If the right argument is not a non-negative integer, the result is likely to
be an approximation. If the right argument is of an integer type,
exponentiation is at least as accurate as repeated multiplication on
the left side's type. (From which it can be deduced that Int**UInt
is always exact, since Int supports arbitrary precision.) If the
right argument is an integer represented in a non-integer type, the
accuracy is left to the implementation provided by that type; there is
no requirement to recognize an integer to give it special treatment.
prefix:<?>, boolean context
?$x
Evaluates the expression as a boolean and returns True if expression
is true or False otherwise.
See "true" below for a low-precedence alternative.
prefix:<!>, boolean negation
!$x
Returns the opposite of what ? would.
See "not" below for a low-precedence alternative.
prefix:<+>, numeric context
+$x
Unlike in Perl 5, where + is a no-op, this operator coerces to
numeric context in Perl 6. (It coerces only the value, not the
original variable.) For values that are not already considered
numeric, the narrowest appropriate type of Int, Num, or
Complex will be returned; however, a string containing two integers
separated by a / will be returned as a Rat. Exponential notation
and radix notations are recognized.
prefix:<->, numeric negation
-$x
Coerces to numeric and returns the arithmetic negation of the resulting number.
prefix:<~>, string context
~$x
Coerces the value to a string. (It only coerces the value, not the original variable.)
prefix:<|>, flatten object into arglist
| $capture
Interpolates the contents of the Capture (or Capture-like) value
into the current argument list as if they had been specified literally.
prefix:<+^>, numeric bitwise negation
+^$x
Coerces to integer and then does bitwise negation (complement) on the number.
prefix:<~^>, string bitwise negation
~^$x
Coerces to string buffer and then does bitwise negation (complement) on each element.
prefix:<?^>, boolean bitwise negation
?^$x
Coerces to boolean and then flips the bit. (Same as !.)
prefix:<\>, Capture constructor
\$x
\@x
\%x
\($invocant: $pos1, $pos2, :named($arg))
Defers the contextualization of its argument or arguments till it is bound into some other context.
prefix:<^>, upto operator
^$limit
Constructs a range of 0 ..^ $limit or locates a metaclass as a shortcut
for $limit.HOW. See "Range semantics".
prefix:<=>, iterate iterator
=$iterator
Unary = reads lines from a filehandle or filename, or
iterates an iterator, or in general causes a scalar to explode its guts
when it would otherwise not. How it does that is context sensitive.
For instance, =$iterator is item/list context sensitive and will
produce one value in item context but many values in list context.
The production of values is abstract here, so a lazy list merely
remembers that it should iterate the iterator to completion upon
demand. Use an eager list to force completion.
Use @$iterator to produce a list of all the values even in item
context, and $$iterator to produce a single value even
in list context. On the other hand, =$capture produces all
parts of the capture that make sense in the current list context,
depending on what controls that list context. [Conjecture: the
previous sentence is nonsensical.]
infix:<*>
$x*$y
Multiplication, resulting in the wider type of the two.
infix:</>
$numerator / $denominator
If either operand is of Num type, converts both operands to Num
and does division returning Num. If the denominator is zero,
returns an object representing either +Inf, NaN, or -Inf
as the numerator is positive, zero, or negative. (This is construed
as the best default in light of the operator's possible use within
hyperoperators and junctions. Note however that these are not
actually the native IEEE non-numbers; they are undefined values of the
"unthrown exception" type that happen to represent the corresponding
IEEE concepts, and if you subsequently try to use one of these values
in a non-parallel computation, it will likely throw an exception at
that point.)
If both operands are of integer type, you still get a Num, but the
Num type is allowed to do the division lazily; internally it may
store a Rat until the time a value is called for. If converted
to Rat directly no division ever need be done.
infix:<div>, generic division
$numerator div $denominator
Dispatches to the infix:<div> multi most appropriate to the operand
types. Policy on what to do about division by zero is up to the type,
but for the sake of hyperoperators and junctions those types that
can represent overflow (or that can contain an unthrown exception)
should try to do so rather than simply throwing an exception. (And in
general, other operators that might fail should also consider their
use in hyperops and junctions, and whether they can profitably benefit
from a lazy exception model.)
Use of div on two Int values results in a ratio of the Rat type.
infix:<%>, modulus
$x % $mod
Always uses floor semantics, operating on Num or Int.
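Floor semantics means the result takes the sign of the divisor, which differs from the truncating remainder found in C. A small Python sketch of the difference follows; floor_mod is a hypothetical name, and Python's // happens to floor already, so this only illustrates the rule rather than any Perl 6 implementation.

```python
import math

def floor_mod(x, m):
    """Modulus with floor semantics: x minus m times floor(x / m)."""
    return x - m * (x // m)

print(floor_mod(-7, 3))    # 2, same sign as the divisor
print(floor_mod(7, -3))    # -2
print(math.fmod(-7, 3))    # -1.0, the truncating remainder for contrast
```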
infix:<mod>, generic modulus
$x mod $mod
Dispatches to the infix:<mod> multi most appropriate to the operand types.
infix:{'+&'}, numeric bitwise and
$x +& $y
Converts both arguments to integer and does a bitwise numeric AND.
infix:{'+<'}, numeric shift left
$integer +< $bits
infix:{'+>'}, numeric shift right
$integer +> $bits
By default, signed types do sign extension, while unsigned types do not, but
this may be enabled or disabled with a :signed or :!signed adverb.
infix:<~&>, buffer bitwise and
$x ~& $y
infix:{'~<'}, buffer bitwise shift left
$buf ~< $bits
infix:{'~>'}, buffer bitwise shift right
$buf ~> $bits
Sign extension is not done by default but may be enabled with a :signed
adverb.
infix:<?&>, boolean bitwise and
$x ?& $y
Any bit shift operator may be turned into a rotate operator with the
:rotate adverb. If :rotate is specified, the concept of
sign extension is meaningless, and you may not specify a :signed adverb.
infix:<+>, numeric addition
$x + $y
Microeditorial: As with most of these operators, any coercion or type
mismatch is actually handled by multiple dispatch. The intent is that
all such variants preserve the notion of numeric addition to produce a
numeric result, presumably stored in a suitably "large" numeric type to
hold the result. Do not overload the + operator for other purposes,
such as concatenation. (And please do not overload the bitshift
operators to do I/O.) In general we feel it is much better for you
to make up a different operator than overload an existing operator for
"off topic" uses. All of Unicode is available for this purpose.
infix:<->, numeric subtraction
$x - $y
infix:<+|>, numeric bitwise inclusive or
$x +| $y
infix:<+^>, numeric bitwise exclusive or
$x +^ $y
infix:<~|>, buffer bitwise inclusive or
$x ~| $y
infix:<~^>, buffer bitwise exclusive or
$x ~^ $y
infix:<?|>, boolean bitwise inclusive or
$x ?| $y
infix:<?^>, boolean bitwise exclusive or
$x ?^ $y
infix:<x>, string/buffer replication
$string x $count
Evaluates the left argument in string context, replicates the resulting string value the number of times specified by the right argument and returns the result as a single concatenated string regardless of context.
If the count is less than 1, returns the null string.
The count may not be * because Perl 6 does not support
infinite strings. (At least, not yet...) Note, however, that an
infinite string may be emulated with cat($string xx *).
infix:<xx>, list replication
@list xx $count
Evaluates the left argument in list context, replicates the resulting
Capture value the number of times specified by the right argument and
returns the result in a context dependent fashion. If the operator
is being evaluated in ordinary list context, the operator returns a
flattened list. In slice (@@) context, the operator converts each Capture
to a separate sublist and returns the list of those sublists.
If the count is less than 1, returns the empty list, ().
If the count is *, returns an infinite list (lazily, since lists
are lazy by default).
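The difference between the two replication operators can be modeled in Python. This is a rough sketch under assumptions: str_repeat and list_repeat are hypothetical names, count=None stands in for a * count, and the slice_context flag is an invented way to imitate slice (@@) context.

```python
from itertools import cycle, islice

def str_repeat(string, count):
    """Model of x: always one concatenated string, empty if count < 1."""
    return str(string) * max(int(count), 0)

def list_repeat(items, count=None, slice_context=False):
    """Model of xx; count=None plays the role of * (lazy and infinite)."""
    items = list(items)
    if count is None:
        copies = cycle([items])                      # never ends
    else:
        copies = iter([items] * max(int(count), 0))
    if slice_context:
        return (list(c) for c in copies)             # one sublist per copy
    return (x for c in copies for x in c)            # flattened

print(str_repeat("ab", 3))                    # ababab
print(list(list_repeat([1, 2], 2)))           # [1, 2, 1, 2]
print(list(islice(list_repeat([1, 2]), 5)))   # [1, 2, 1, 2, 1]
```

Both forms of list_repeat return generators, so the infinite case produces values only on demand, in the spirit of the lazy lists described above.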
infix:<~>, string/buffer concatenation
$x ~ $y
infix:<&>, all() operator
$a & $b & $c ...
infix:<also>, short-circuit junctional and operator
EXPR also EXPR also EXPR ...
Can be used to construct ANDed patterns with the same semantics as
infix:<&>, but with left-to-right evaluation guaranteed, for use
in guarded patterns:
$target ~~ MyType also .mytest1 also .mytest2
This is useful when later tests might throw exceptions if earlier tests don't pass. This cannot be guaranteed by:
$target ~~ MyType & .mytest1 & .mytest2
infix:<|>, any() operator
$a | $b | $c ...
infix:<^>, one() operator
$a ^ $b ^ $c ...
Functions of one argument
int
sleep
abs
sin
... # see S29 Functions
Note that, unlike in Perl 5, you must use the .meth forms to default
to $_ in Perl 6.
There is no unary rand function in Perl 6, though there is a .rand
method call and a 0-ary rand term.
prefix:<int>
Coerces to type Int. Floor semantics are used for fractional
values, including strings that appear to express fractional values.
That is, int($x) must have the same result as int(+$x) in all
cases. All implicit conversions to integer use the same semantics.
(Note that, despite the fact that int is a valid native type
name, this function does not express conversion to that native type.
Such subtype conversions are done automatically upon assignment to
a subtyped container, and fail if the container cannot hold the value.)
prefix:<sleep>
Suspends the current thread of execution for the specified number of seconds, which may be fractional.
prefix:<abs>
Returns the absolute value of the specified argument.
infix:<but>
$value but Mixin
infix:<does>
$object does Mixin
Sort comparisons
$num1 <=> $num2
$str1 leg $str2
$obj1 cmp $obj2
These operators compare their operands using numeric, string,
or eqv semantics respectively, and depending on the order return
one of Order::Increase, Order::Same, or Order::Decrease
(which numerify to -1, 0, or +1). See "Comparison semantics".
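The three-valued result of the sort comparisons might be modeled as follows. This is a Python sketch only: the helper names compare, num_cmp and leg are illustrative, Python's code point string ordering is assumed to stand in for Perl 6 string comparison, and the eqv-based cmp is not modeled.

```python
from enum import IntEnum

class Order(IntEnum):
    """Stand-in for the Order enumeration; members numerify to -1, 0, +1."""
    Increase = -1
    Same = 0
    Decrease = 1

def compare(a, b):
    """Three-way comparison of two already-coerced values."""
    return Order((a > b) - (a < b))

def num_cmp(a, b):
    """Model of <=>: coerce both sides to numbers first."""
    return compare(float(a), float(b))

def leg(a, b):
    """Model of leg: coerce both sides to strings first."""
    return compare(str(a), str(b))

print(num_cmp("10", "9"))   # Order.Decrease
print(leg(10, 9))           # Order.Increase
```

The same pair of operands orders differently under the two operators because "10" sorts before "9" as a string but after it as a number.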
Range object constructor
$min .. $max
$min ^.. $max
$min ..^ $max
$min ^..^ $max
Constructs Range objects, optionally excluding one or both endpoints. See "Range semantics".
Note that these differ:
0 ..^ 10 # 0 .. 9
0 .. ^10 # 0 .. (0..9)
(It's not yet clear what the second one should mean, but whether it succeeds or fails, it won't do what you want.)
All operators on the chaining infix precedence level may be chained; see "Chained comparisons".
infix:<==> etc.
== != < <= > >=
As in Perl 5, converts to Num before comparison. != is short for !==.
infix:<eq> etc.
eq ne lt le gt ge
As in Perl 5, converts to Str before comparison. ne is short for !eq.
Generic ordering
$a before $b
$a after $b
Smart match
$obj ~~ $pattern
Perl 5's =~ becomes the "smart match" operator ~~, with an
extended set of semantics. See "Smart matching" for details.
To catch "brainos", the Perl 6 parser defines an infix:<=~>
operator which always fails at compile time with a message directing
the user to use ~~ or ~= (string append) instead if they meant
it as a single operator, or to put a space between if they really
wanted to assign a stringified value as two separate operators.
A negated smart match is spelled !~~.
Container identity
VAR($a) =:= VAR($b)
Value identity
$x === $y
For objects that are not value types, their identities are their values.
(Identity is returned by the .WHICH metamethod.) The actual contents of
the objects are ignored. These semantics are those used by hashes that
allow objects for keys. See also "Comparison semantics".
Canonical equivalence
$obj1 eqv $obj2
Compares two objects for canonical equivalence. For value types compares the values. For object types, compares current contents according to some scheme of canonicalization. These semantics are those used by hashes that allow only values for keys (such as Perl 5 string-key hashes). See also "Comparison semantics".
Negated relational operators
$num !== 42
$str !eq "abc"
"foo" !~~ /^ <ident> $/
VAR($a) !=:= VAR($b)
$a !=== $b
$a !eqv $b
infix:<&&>, short-circuit and
$a && $b && $c ...
Returns the first argument that evaluates to false, otherwise
returns the result of the last argument. In list context forces
a false return to mean (). See "and" below for a low-precedence
version.
infix:<||>, short-circuit inclusive-or
$a || $b || $c ...
Returns the first argument that evaluates to a true value, otherwise returns the result of the last argument. It is specifically allowed to use a list or array both as a boolean and as a list value produced if the boolean is true:
@a = @b || @c; # broken in Perl 5; works in Perl 6
In list context this operator forces a false return to mean ().
See "or" below for a low-precedence version.
infix:<^^>, short-circuit exclusive-or
$a ^^ $b ^^ $c ...
Returns the true argument if there is one (and only one). Returns
Bool::False if all arguments are false or if more than one argument
is true. In list context forces a false return to mean ().
See "xor" below for a low-precedence version.
This operator short-circuits in the sense that it does not evaluate any arguments after a 2nd true result. Closely related is the reduce operator:
[^^] a(), b(), c() ...
but note that reduce operators are not macros but ordinary list operators, so c() is always called before the reduce is done.
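The short-circuit behavior of ^^ can be sketched in Python by passing each operand as a zero-argument callable, so that evaluation order is visible. The names xor_short_circuit and probe are hypothetical, and Python's False stands in for Bool::False.

```python
def xor_short_circuit(*thunks):
    """Model of infix:<^^>: return the single true value, or False."""
    found = False
    winner = False
    for thunk in thunks:
        value = thunk()
        if value:
            if found:
                return False    # second true value: stop evaluating here
            found = True
            winner = value
    return winner

calls = []

def probe(name, value):
    """Build a thunk that records when it is evaluated."""
    def thunk():
        calls.append(name)
        return value
    return thunk

result = xor_short_circuit(probe("a", 1), probe("b", 2), probe("c", 3))
print(result, calls)    # False ['a', 'b']
```

In this model the third operand is never evaluated once two true values have been seen. A reduce-style call, by contrast, would have computed every operand before the function ran at all.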
infix:<//>, short-circuit default operator
$a // $b // $c ...
Returns the first argument that evaluates to a defined value, otherwise
returns the result of the last argument. In list context forces a
false return to mean (). See "orelse" below for a similar but
not identical low-precedence version.
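The distinction between "defined" and "true" is the whole point of //. A minimal Python sketch follows, assuming None plays the role of an undefined value and that operands are passed as zero-argument callables; defined_or is a hypothetical name.

```python
def defined_or(*thunks):
    """Model of infix:<//>: first defined result, else the last result."""
    result = None
    for thunk in thunks:
        result = thunk()
        if result is not None:
            return result       # defined, even if false: stop here
    return result

print(defined_or(lambda: None, lambda: 0, lambda: 1))   # 0
print(None or 0 or 1)                                   # 1, the || analogue
```

A defined but false value such as 0 is returned by this model of //, whereas the truth-based || analogue skips past it.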
Minimum and maximum
$a min $b min $c ...
$a max $b max $c ...
These return the minimum or maximum value. See also the
minmax listop.
Not all types can support the concept of infinity. Therefore any
value of any type may be compared with +Inf or -Inf values,
in which case the infinite value stands for "larger/smaller than any
possible value of the type." That is,
"foo" min +Inf # "foo"
"foo" min -Inf # -Inf
"foo" max +Inf # +Inf
"foo" max -Inf # "foo"
All orderable object types must support +Inf and -Inf values
as special forms of the undefined value. It's an error, however,
to attempt to store an infinite value into a native type that cannot
support it:
my int $max;
$max max= -Inf; # ERROR
Conditional operator
say "My answer is: ", $maybe ?? "yes" !! "no";
Also known as the "ternary" or "trinary" operator, but we prefer "conditional" just to stop people from fighting over the terms. The operator syntactically separates the expression into three subexpressions. It first evaluates the left part in boolean context, then based on that selects one of the other two parts to evaluate. (It never evaluates both of them.) If the conditional is true it evaluates and returns the middle part; if false, the right part. The above is therefore equivalent to:
say "My answer is: ", do {
if $maybe {
"yes";
}
else {
"no";
}
};
It is a syntax error to use an operator in the middle part that binds
looser in precedence, such as =.
my $x;
hmm() ?? $x = 1 !! $x = 2; # ERROR
hmm() ?? ($x = 1) !! ($x = 2); # works
Note that both sides have to be parenthesized. A partial fix is even wronger:
hmm() ?? ($x = 1) !! $x = 2; # parses, but WRONG
That actually parses as:
(
hmm() ?? ($x = 1) !! $x
) = 2;
and always assigns 2 to $x (because ($x = 1) is a valid lvalue).
And in any case, repeating the $x forces you to declare it earlier.
The best don't-repeat-yourself solution is simply:
my $x = hmm() ?? 1 !! 2; # much better
infix:<?>
To catch likely errors by people familiar with C-derived languages
(including Perl 5), a bare question mark in infix position will
produce an error suggesting that the user use ?? !! instead.
Flipflop ranges
start() ff end()
start() ^ff end()
start() ff^ end()
start() ^ff^ end()
Flipflop ranges (sed style)
start() fff end()
start() ^fff end()
start() fff^ end()
start() ^fff^ end()
infix:<=>
$x = 1, $y = 2;
With simple lvalues, = has this precedence, which is tighter than comma.
(List assignments have listop precedence below.)
infix:<:=>, run-time binding
$signature := $capture
A new form of assignment is present in Perl 6, called binding, used in
place of typeglob assignment. It is performed with the := operator.
Instead of replacing the value in a container like normal assignment, it
replaces the container itself. For instance:
my $x = 'Just Another';
my $y := $x;
$y = 'Perl Hacker';
After this, both $x and $y contain the string "Perl Hacker",
since they are really just two different names for the same variable.
There is also an identity test, =:=, which tests whether two names
are bound to the same underlying variable. $x =:= $y would return
true in the above example.
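One way to picture the difference between assignment and binding is a table of names that point at containers. The Python sketch below is an illustrative model, not how any Perl 6 implementation works; the Scalar and Pad classes and their method names are assumptions made for the example.

```python
class Scalar:
    """A mutable container holding one value."""
    def __init__(self, value=None):
        self.value = value

class Pad:
    """Hypothetical lexical scope that maps names to containers."""
    def __init__(self):
        self._names = {}

    def declare(self, name, value=None):    # my $x = value
        self._names[name] = Scalar(value)

    def assign(self, name, value):          # $x = value
        self._names[name].value = value     # replaces the value

    def bind(self, name, other):            # $x := $other
        self._names[name] = self._names[other]   # replaces the container

    def get(self, name):
        return self._names[name].value

    def same_container(self, a, b):         # $a =:= $b
        return self._names[a] is self._names[b]

pad = Pad()
pad.declare("x", "Just Another")
pad.bind("y", "x")
pad.assign("y", "Perl Hacker")
print(pad.get("x"))                   # Perl Hacker
print(pad.same_container("x", "y"))   # True
```

Had y been declared and assigned a copy instead of bound, the later assignment would have left x untouched and the identity test would be false.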
The binding fails if the type of the variable being bound is sufficiently inconsistent with the type of the current declaration. Strictly speaking, any variation on
my Any $x;
$x := [1,2,3];
should fail because the type being bound is not consistent with
Scalar of Any, but since the Any type is not a real instantiable
type but a generic (non)constraint, and Scalar of Any is sort of
a double non-constraint similar to Any, we treat this situation
specially as the equivalent of binding to a typeless variable.
infix:<::=>, compile-time binding
$signature ::= $capture
This does the same as := except it does it at compile time. (This implies
that the expression on the right is also evaluated at compile time; it does
not bind a lazy thunk.)
infix:{'=>'}, Pair constructor
foo => 1, bar => "baz"
Binary => is no longer just a "fancy comma". It now constructs
a Pair object that can, among other things, be used to pass named
arguments to functions. It provides item context to both sides.
It does not actually do an assignment except in a notional sense;
however its precedence is now equivalent to assignment, and it is
also right associative. Note that, unlike in Perl 5, =>
binds tighter than comma.
Assignment operators
+= -= **= xx= .= etc.
prefix:<true>
true any(@args) eq '-v' | '-V'
prefix:<not>
not any(@args) eq '-v' | '-V'
While the preceding are parsed as prefix operators, operator adverbs are parsed as trailing unary operators at this precedence level, just tighter than comma. (They're not officially "postfix" operators because those require the absence of whitespace, and these allow whitespace. These adverbs insert themselves in the spot where the parser is expecting an infix operator, but the parser continues to look for an infix after parsing the adverb and applying it to the previous term.) Thus,
$a < 1 and $b == 2 :carefully
does the == carefully, while
$a < 1 && $b == 2 :carefully
does the && carefully because && is of
tighter precedence than "loose unary". Use
$a < 1 && ($b == 2 :carefully)
to apply the adverb to the == operator instead. We say that
== is the "topmost" operator in the sense that it is at the
top of the parse tree that the adverb could possibly apply to.
(It could not apply outside the parens.) If you are unsure
what the topmost operator is, just ask yourself which operator
would be applied last. For instance, in
+%hash{$key} :foo
the subscript happens first and the + operator happens last,
so :foo would apply to that. Use
+(%hash{$key} :foo)
to apply :foo to the subscripting operator instead. Likewise
$x = 1..10:by(2)
will apply the adverb to the item assignment (and fail), but since
@x = 1..10:by(2)
is a (looser) list assignment, the adverb applies to the range operator as expected. And in general note that adverbs will attach the way you want when you say things like
1 .. $x+2 :by(2)
The new internal testing syntax makes use of these precedence rules:
$x eqv $y+2 :ok<$x is equivalent to $y+2>;
Here the adverb is considered to be modifying the eqv operator.
infix:<,>, the argument separator
1, 2, 3, @many
Unlike in Perl 5, the comma operator never returns the last value. (In item context it returns a list instead.)
infix:«p5=>», the Perl 5 fatarrow
This operator, which behaves exactly like the Perl 5 fatarrow in being equivalent to a comma, is purely for easier migration from Perl 5 to Perl 6. It is not intended for use by programmers in fresh code; it is for use by the p5-to-p6 translator to preserve Perl 5 argument passing semantics without losing the intent of the notation.
This operator is purposefully ugly and easy to search for. Note that,
since the operator is equivalent to a comma, arguments come in as
positional pairs rather than named arguments. Hence, if you have
a Perl 5 sub that manually handles named argument processing by
assigning to a hash, it will continue to work. If, however, you edit
the p5=> operator in an argument list to Perl 6's =>
operator, it becomes a real named argument, so you must also change
the called sub to handle real named args, since the named pair will no
longer come in via @_. You can either name your formal parameters
explicitly if there is an explicit signature, or pull them out of %_
rather than @_ if there is no explicit signature.
List infixes all have list associativity, which means that identical infix operators work together in parallel rather than one after the other. Non-identical operators are considered non-associative and must be parenthesized for clarity.
infix:<Z>, the zip operator
1,2 Z 3,4 # (1,3),(2,4)
infix:<minmax>, the minmax operator
$min0, $max0 minmax $min1, $max1 # ($min0 min $min1, $max0 max $max1)
The minmax operator is for calculating both a minimum and maximum
in a single expression. Otherwise you'd have to write twice as
many expressions. Instead of
@a minmax @b
you'd have to say something like
($a[0] min $b[0], $a[1] max $b[1])
Note that there is no guarantee that the resulting minimum and maximum come from the same side. The two calculations are bundled but independent.
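The pairwise behavior can be modelled outside Perl 6. The following Python sketch is only an illustration: the function name `minmax` and the tuple representation of each (minimum, maximum) pair are assumptions, not part of this specification.

```python
def minmax(left, right):
    """Combine two (minimum, maximum) pairs into one."""
    min0, max0 = left
    min1, max1 = right
    # The two calculations are independent: the minimum and the maximum
    # may come from different sides.
    return (min(min0, min1), max(max0, max1))

print(minmax((2, 5), (1, 4)))  # (1, 5)
```

Here the minimum comes from the right pair and the maximum from the left one, which is the "bundled but independent" case described above.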
infix:<X>, the cross operator
1,2 X 3,4 # (1,3), (1,4), (2,3), (2,4)
In contrast to the zip operator, the X operator returns all possible
lists formed by taking one element from each of its list arguments. The
returned lists are ordered such that the rightmost elements vary most rapidly.
If there are just two lists, for instance, it forms all pairs
where one element is from the first list and the other one from
the second, with the second element varying most rapidly. Hence you may say:
<a b> X <1 2>
and you end up with
('a', '1'), ('a', '2'), ('b', '1'), ('b', '2')
This becomes a flat list in @ context and a list of arrays in @@ context:
say @(<a b> X <1 2>)
'a', '1', 'a', '2', 'b', '1', 'b', '2'
say @@(<a b> X <1 2>)
['a', '1'], ['a', '2'], ['b', '1'], ['b', '2']
The operator is list associative, so
1,2 X 3,4 X 5,6
produces
(1,3,5),(1,3,6),(1,4,5),(1,4,6),(2,3,5),(2,3,6),(2,4,5),(2,4,6)
On the other hand, if any of the lists is empty, you will end up with a null list.
Only the leftmost list may usefully be an infinite list. For instance
<a b> X 0..*
would produce
('a',0), ('a',1), ('a',2), ('a',3), ('a',4), ('a',5), ...
and you'd never get to 'b'.
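The ordering and laziness rules above can be approximated in Python. This is a sketch under stated assumptions: `cross` is a hypothetical helper, tuples stand in for the returned lists, and only the first argument is consumed lazily.

```python
from itertools import count, islice, product

def cross(first, *rest):
    """Lazily yield every combination, rightmost element varying fastest.

    Only `first` may be infinite; the remaining lists are copied so they
    can be replayed for each element of `first`.
    """
    rest = [list(r) for r in rest]
    for head in first:
        for tail in product(*rest):
            yield (head, *tail)

print(list(cross("ab", "12")))
# [('a', '1'), ('a', '2'), ('b', '1'), ('b', '2')]
print(len(list(cross([1, 2], [3, 4], [5, 6]))))   # 8
print(list(cross([1, 2], [])))                    # []
print(list(islice(cross(count(0), "ab"), 4)))
# [(0, 'a'), (0, 'b'), (1, 'a'), (1, 'b')]
```

In this model an infinite list in any position other than the first would never finish being copied, which mirrors the observation that only the leftmost list may usefully be infinite.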
Cross hyperoperators
@files X~X '.' X~X @extensions
1..10 X*X 1..10
@x XeqvX @y
etc.
See /Cross operators.
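As a rough Python model, and assuming that each combination produced by the cross is simply reduced with the embedded operator, the first two examples behave like this. The helper `cross_with` and the sample data are hypothetical.

```python
from functools import reduce
from itertools import product
import operator

def cross_with(op, *lists):
    """Apply `op` across each combination produced by the cross."""
    return [reduce(op, combo) for combo in product(*lists)]

files = ["a", "b"]            # hypothetical sample data
extensions = ["txt", "md"]
print(cross_with(operator.add, files, ["."], extensions))
# ['a.txt', 'a.md', 'b.txt', 'b.md']
print(cross_with(operator.mul, [1, 2], [3, 4]))   # [3, 4, 6, 8]
```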
infix:<...>, the series operator
This operator takes a concrete list on its left and a function to be iterated on its right when the list must be extended. Each time the function must be called, it extends the list with the values returned by the function (if any).
The value of the operator is the lazy list formed of the concrete list followed by the result of applying the function to the tail of the list as needed. The function indicates by its signature how many of the preceding values to pay attention to (and which the operator must track internally). Demonstration of this falls to the lot of the venerable Fibonacci sequence:
1, 1 ... { $^y + $^z } # 1,1,2,3,5,8...
1, 1 ... &infix:<+> # 1,1,2,3,5,8...
More typically the function is unary, in which case any extra values in the list may be construed as human-readable documentation:
0,2,4 ... { $_ + 2 } # same as 0..*:by(2)
<a b c> ... { .succ } # same as 'a'..*
The function need not be monotonic, of course:
1 ... { -$_ } # 1, -1, 1, -1, 1, -1...
False ... &prefix:<!> # False, True, False...
The function can be 0-ary as well:
() ... &rand # list of random numbers
The function may also be slurpy (*-ary), in which case all the preceding values are passed in (which means they must all be cached by the operator, so performance may suffer).
The arity of the function need not match the number of return values, but if they do match you may interleave unrelated sequences:
1,1 ... { $^a + 1, $^b * 2 } # 1,1,2,2,3,4,4,8,5,16,6,32...
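The extension rule can be modelled with a Python generator. This is a sketch rather than a definition: `series` is a hypothetical name, the arity is passed explicitly instead of being read from a signature, a tuple models a multi-value return, and an empty tuple ends the series.

```python
from itertools import islice

def series(seed, step, arity):
    """Yield the seed, then keep extending it with step(last `arity` values)."""
    values = list(seed)
    yield from values
    while True:
        args = values[len(values) - arity:]
        result = step(*args)
        new = list(result) if isinstance(result, tuple) else [result]
        if not new:          # an empty return value ends the series
            return
        values.extend(new)
        yield from new

fib = series([1, 1], lambda y, z: y + z, 2)
print(list(islice(fib, 8)))      # [1, 1, 2, 3, 5, 8, 13, 21]

mixed = series([1, 1], lambda a, b: (a + 1, b * 2), 2)
print(list(islice(mixed, 12)))   # [1, 1, 2, 2, 3, 4, 4, 8, 5, 16, 6, 32]
```

The model keeps every value it has produced, which is only strictly needed for the slurpy case; a real implementation would track just the last `arity` values.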
If the right operand is * (Whatever) and the sequence is obviously
arithmetic or geometric, the appropriate function is deduced:
1, 3, 5 ... * # odd numbers
1, 2, 4 ... * # powers of 2
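What counts as "obviously" arithmetic or geometric is not pinned down here. One plausible reading, shown as a Python sketch, inspects the last three seed values; the three-value rule and the names `deduce_step` and `extend` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import islice

def deduce_step(seed):
    """Guess a unary step function from the last three seed values."""
    a, b, c = seed[-3:]
    if b - a == c - b:                        # constant difference
        diff = c - b
        return lambda x: x + diff
    if a != 0 and b != 0 and b * b == a * c:  # constant ratio
        ratio = Fraction(b, a)
        return lambda x: x * ratio
    raise ValueError("not obviously arithmetic or geometric")

def extend(seed):
    """Yield the seed, then continue it with the deduced step."""
    step = deduce_step(seed)
    yield from seed
    last = seed[-1]
    while True:
        last = step(last)
        yield last

print(list(islice(extend([1, 3, 5]), 6)))              # [1, 3, 5, 7, 9, 11]
print([int(x) for x in islice(extend([1, 2, 4]), 6)])  # [1, 2, 4, 8, 16, 32]
```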
Conjecture: other such patterns may be recognized in the future,
depending on which unrealistic benchmarks we want to run faster. :)
Note: the series operator may only be used where an infix is expected,
whereas the yada operator is recognized only where a term is expected. If you
put a comma before the ... it will be taken as a yada list operator
expressing the desire to fail when the list reaches that point:
1..20, ... "I only know up to 20 so far mister"
If the yada operator finds a closure for its argument at compile time, it should probably whine about the fact that it's difficult to turn a closure into an error message. Alternately, we could treat an ellipsis as special when it follows a comma to better support traditional math notation.
The function may choose to terminate its list by returning ().
Since this operator is list associative, an inner function may be
followed by a ... and another function to continue the list,
and so on. Hence,
1 ... { $_ + 1 if $_ < 10 }
... { $_ + 10 if $_ < 100 }
... { $_ + 100 if $_ < 1000 }
produces
1,2,3,4,5,6,7,8,9,
10,20,30,40,50,60,70,80,90,
100,200,300,400,500,600,700,800,900,1000
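The hand-over between chained functions can be sketched in Python. Smaller thresholds are used to keep the output short; `chained_series` is a hypothetical helper, and `None` models a closure that returns ().

```python
def chained_series(seed, *steps):
    """Run unary step functions in turn; None hands over to the next step."""
    values = list(seed)
    yield from values
    for step in steps:
        while True:
            nxt = step(values[-1])
            if nxt is None:      # models returning () from the closure
                break
            values.append(nxt)
            yield nxt

result = list(chained_series(
    [1],
    lambda n: n + 1 if n < 3 else None,
    lambda n: n + 10 if n < 30 else None,
))
print(result)  # [1, 2, 3, 13, 23, 33]
```

Note that each stage still emits the value that first fails its own test (3 and 33 here), because the test is applied to the previous value, not to the one being produced.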
In slice context the function's return value is appended as a capture rather than as a flattened list of values, and the argument to each function call is the previous capture in the list.
Many of these operators return a list of Captures, which depending on context may or may not flatten them all out into one flat list. The default is to flatten, but see the contextualizers below.
infix:<=>, list assignment
@array = 1,2,3;
With compound targets, performs list assignment. The right side
is looser than comma. You might be wondering why we've classified
this as a prefix operator when its token name is infix:<=>.
That's because you can view the left side as a special syntax for a
prefix listop, much as if you'd said:
@array.assign: 1,2,3
However, the tokener classifies it as infix because it sees it when it's expecting an infix operator. Assignments in general are treated more like retroactive macros, since their meaning depends greatly on what is on the left, especially if what is on the left is a declarator of some sort. We even call some of them pseudo-assignments, but they're all a bit pseudo insofar as we have to figure out whether the left side is a list or a scalar destination.
In any case, list assignment is defined to be arbitrarily lazy, insofar as it basically does the obvious copying as long as there are scalar destinations on the left or already-computed values on the right. However, many list lvalues end with an array destination (where assignment directly to an array can be considered a degenerate case). When copying into an array destination, the list assignment continues to copy in known values immediately, but suspends when it hits an actively iterating iterator (but not one merely passed as an object within the list). The array location on the left is then set up as a self-extending array, with the remainder of the list on the right as the "specs" for its remaining values, to be reified on demand. Hence it is legal to say:
@natural = 0..*;
(Note that when we say that an iterator in list context suspends, it is not required to suspend immediately. When the scheduler is running an iterator, it may choose to precompute values in batches if it thinks that approach will increase throughput. This is likely to be the case on single-core architectures with heavy context switching, and may very well be the case even on manycore CPU architectures when there are more iterators than cores, such that cores may still have to do context switching. In any case, this is all more-or-less transparent to the user because in the abstract the list is all there, even if it hasn't been entirely computed yet.)
Though elements may be reified into an array on demand, they act like ordinary array elements both before and after reification, as far as the user is concerned. These elements may be written to if the underlying container type supports it:
@unnatural = 0..*;
@unnatural[42] = "Life, the Universe, and Everything";
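A self-extending array can be pictured as a list of known values plus an iterator for the rest. The Python class below is a minimal sketch of that idea; the class and method names are hypothetical, and batching, scheduling, and element counting are not modelled.

```python
from itertools import count

class SelfExtendingArray:
    """Array whose tail is reified from an iterator only on demand."""

    def __init__(self, source):
        self._known = []            # already-reified elements
        self._rest = iter(source)   # the "specs" for the remaining values

    def _reify_through(self, index):
        while len(self._known) <= index:
            self._known.append(next(self._rest))

    def __getitem__(self, index):
        self._reify_through(index)
        return self._known[index]

    def __setitem__(self, index, value):
        self._reify_through(index)
        self._known[index] = value

unnatural = SelfExtendingArray(count(0))   # models @unnatural = 0..*
unnatural[42] = "Life, the Universe, and Everything"
print(unnatural[41], unnatural[42], unnatural[43])
# 41 Life, the Universe, and Everything 43
```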
Note that, unlike assignment, binding replaces the container, so the following fails because a range object cannot be subscripted:
@natural := 0..*; # bind a Range object
@natural[42] = "Life, the Universe, and Everything"; # FAILS
but this succeeds:
@unnatural := [0..*]; # bind an Array object
@unnatural[42] = "Life, the Universe, and Everything"; # ok
It is erroneous to make use of any side effects of reification, such as movement of a file pointer, since different implementations may have different batch semantics, and in any case the unreified part of the list already "belongs" to the array.
When a self-extending array is asked for its count of elements, it
is allowed to return +Inf without blowing up if it can determine
by inspection that its unreified parts contain any infinite lists.
If it cannot determine this, it is allowed to use all your
memory, and then some. :)
Assignment to a hash is not lazy (probably).
infix:<:>, the invocant marker
say $*OUT: "howdy, world"
The colon operator turns method calls and contextualizers into list operators. It's not really a general operator; much like list assignment, it takes a special syntax on the left side and turns it into a list operator over the list on the right. See /Invocant marker.
Normal listops
print push say join split substr open etc.
Listop forms of junctional operators
any all one none
Exception generators
fail "Division by zero"
die System::Error(ENOSPC,"Drive $d seems to be full");
warn "Can't open file: $!"
Stubby exception generators
...
!!! "fill this in later, Dave"
??? "oops in $?CLASS"
The ... operator is the "yada, yada, yada" list operator, which
among other things is used as the body in function prototypes.
It complains bitterly (by calling fail) if it is ever executed.
Variant ??? calls warn, and !!! calls die. The argument
is optional, but if provided, is passed onto the fail, warn,
or die. Otherwise the system will make up a message for you based
on the context, indicating that you tried to execute something that
is stubbed out. (This message differs from what fail, warn, and
die would say by default, since the latter operators typically point
out bad data or programming rather than just an incomplete design.)
Reduce operators
[+] [*] [<] [\+] [\*] etc.
See Reduction operators below.
Sigils as contextualizer listops
Sigil Alpha variant
----- -------------
$ item
@ list
@@ slice
% hash
As listops, these look like terms from the left, but raise their precedence on the right sufficiently to govern list infix operators:
$ 1,2 Z 3,4 # [\(1,3),\(2,4)]
@ 1,2 Z 3,4 # 1,3,2,4
@@ 1,2 Z 3,4 # [1,3],[2,4]
% 1,2 Z 3,4 # { 1 => 3, 2 => 4 }
$ 1,2 X 3,4 # [\(1,3),\(1,4),\(2,3),\(2,4)]
@ 1,2 X 3,4 # 1,3,1,4,2,3,2,4
@@ 1,2 X 3,4 # [1,3],[1,4],[2,3],[2,4]
These can also influence the result of functions that return lists of captures:
$ map { $_, $_*2 }, ^4 # [\(0,0),\(1,2),\(2,4),\(3,6)]
@ map { $_, $_*2 }, ^4 # 0,0,1,2,2,4,3,6
@@ map { $_, $_*2 }, ^4 # [0,0],[1,2],[2,4],[3,6]
% map { $_, $_*2 }, ^4 # { 0 => 0, 1 => 2, 2 => 4, 3 => 6 }
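The list, slice, and hash rows of these tables can be reproduced with a small Python model in which each Capture is a tuple. The helper names are hypothetical, and the item contextualizer is not modelled.

```python
captures = list(zip([1, 2], [3, 4]))        # models 1,2 Z 3,4

def as_list(caps):
    """@ context: flatten every capture into one flat list."""
    return [x for cap in caps for x in cap]

def as_slice(caps):
    """@@ context: keep one inner list per capture."""
    return [list(cap) for cap in caps]

def as_hash(caps):
    """% context: flatten, then pair up keys and values."""
    flat = as_list(caps)
    return dict(zip(flat[0::2], flat[1::2]))

print(as_list(captures))    # [1, 3, 2, 4]
print(as_slice(captures))   # [[1, 3], [2, 4]]
print(as_hash(captures))    # {1: 3, 2: 4}
```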
The item contextualizer
item foo()
The new name for Perl 5's scalar contextualizer. Equivalent to $()
(except that empty $() means $($/), while empty item() yields Failure).
We still call the values scalars, and talk about "scalar operators", but
scalar operators are those that put their arguments into item context.
If given a list, this function makes an Array from it. The function
is agnostic about any Captures in such a list. (Use @ or @@
below to force that one way or the other).
Note that this is a list operator, not a unary prefix operator, since you'd generally want it for converting a list to an item. Single items don't need to be converted to items.
The list contextualizer
list foo()
Forces the subsequent expression to be evaluated in list context.
A list of Captures will be transformed into a flat list.
Equivalent to @() (except that empty @() means @($/), while
empty list() means an empty list).
The slice contextualizer
slice foo()
Forces the subsequent expression to be evaluated in slice context.
(Slices are considered to be potentially multidimensional in Perl 6.)
A list of Captures will be transformed into a list of lists.
Equivalent to @@() (except that empty @@() means @@($/), while
empty slice() means a null slice).
The hash contextualizer
hash foo()
Forces the subsequent expression to be evaluated in hash context.
The expression is evaluated in list context (flattening any Captures),
then a hash will be created from the list, taken as a list of Pairs.
(Any element in the list that is not a Pair will pretend to be a key
and grab the next value in the list as its value.) Equivalent to
%() (except that empty %() means %($/), while
empty hash() means an empty hash).
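The pairing rule for hash context can be sketched in Python as follows. `Pair` and `to_hash` are stand-ins invented for this example; what happens to a dangling key with no following value is not stated above, so the sketch simply lets that case raise an error.

```python
from collections import namedtuple

Pair = namedtuple("Pair", "key value")   # stand-in for a Perl 6 Pair

def to_hash(items):
    """Build a dict from a flat list of Pairs and bare key/value runs."""
    result = {}
    it = iter(items)
    for item in it:
        if isinstance(item, Pair):
            result[item.key] = item.value
        else:
            # A non-Pair acts as a key and grabs the next element as its value.
            result[item] = next(it)
    return result

print(to_hash([Pair("a", 1), "b", 2, Pair("c", 3)]))
# {'a': 1, 'b': 2, 'c': 3}
```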
infix:<and>, short-circuit and
$a and $b and $c ...
Returns the first argument that evaluates to false, otherwise
returns the result of the last argument. In list context forces
a false return to mean (). See && above for high-precedence
version.
infix:<andthen>, proceed on success
test1() andthen test2() andthen test3() ...
Returns the first argument whose evaluation indicates failure (that is, if the result is undefined). Otherwise it evaluates and returns the right argument.
If the right side is a block or pointy block, the result of the left
side is bound to any arguments of the block. If the right side is
not a block, a block scope is assumed around the right side, and the
result of the left side is implicitly bound to $_ for the scope
of the right side. That is,
test1() andthen test2()
is equivalent to
test1() andthen -> $_ { test2() }
There is no corresponding high-precedence version.
infix:<or>, short-circuit inclusive or
$a or $b or $c ...
Returns the first argument that evaluates to true, otherwise returns
the result of the last argument. In list context forces a false return
to mean (). See || above for high-precedence version.
infix:<xor>, exclusive or
$a xor $b xor $c ...
Returns the true argument if there is one (and only one). Returns
Bool::False if all arguments are false or if more than one argument is true.
In list context forces a false return to mean ().
See ^^ above for high-precedence version.
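Because the operator is list associative, all arguments are considered together rather than pairwise. A Python sketch of that rule (the function name `xor` is an assumption, and Python truthiness stands in for Perl 6 truth):

```python
def xor(*args):
    """Return the single true argument, or False if zero or several are true."""
    true_args = [a for a in args if a]
    return true_args[0] if len(true_args) == 1 else False

print(xor(0, "x", ""))    # x
print(xor(0, "x", "y"))   # False
print(xor(0, "", None))   # False
```

A pairwise, left-associative reading would give a different answer for three true arguments, since (true xor true) is false and false xor true is true; the list-associative form returns false.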
infix:<orelse>, proceed on failure
test1() orelse test2() orelse test3() ...
Returns the first argument that evaluates successfully (that is, if the result is defined). Otherwise returns the result of the right argument.
If the right side is a block or pointy block, the result of the left
side is bound to any arguments of the block. If the right side is
not a block, a block scope is assumed around the right side, and the
result of the left side is implicitly bound to $! for the scope
of the right side. That is,
test1() orelse test2()
is equivalent to
test1() orelse -> $! { test2() }
(The high-precedence // operator is similar, but does not set $! or
treat blocks specially.)
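The binding behavior of andthen and orelse can be modelled in Python by passing the left result to the right side as an argument. In this sketch the `Failure` class stands in for an undefined result, and all names are hypothetical.

```python
class Failure:
    """Stand-in for an undefined result that carries a reason."""
    def __init__(self, reason):
        self.reason = reason

def defined(value):
    return not isinstance(value, Failure)

def andthen(first, *rest):
    """Call each step with the previous result while results stay defined."""
    result = first()
    for step in rest:
        if not defined(result):
            return result          # first failure wins
        result = step(result)      # previous result plays the role of $_
    return result

def orelse(first, *rest):
    """Call each step with the previous failure until a result is defined."""
    result = first()
    for step in rest:
        if defined(result):
            return result          # first success wins
        result = step(result)      # previous failure plays the role of $!
    return result

print(andthen(lambda: 2, lambda x: x * 10))                          # 20
print(orelse(lambda: Failure("no cache"), lambda err: err.reason))   # no cache
```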
As with terms, terminators are not really a precedence level, but looser than the loosest precedence level. They all have the effect of terminating any operator precedence parsing and returning a complete expression to the main parser. They don't care what state the operator precedence parser is in. If the parser is currently expecting a term and the final operator in the expression can't deal with a nullterm, then it's a syntax error. (Notably, the comma operator and many prefix list operators can handle a nullterm.)
Semicolon: ;
$x = 1; $y = 2;
The context determines how the expressions terminated by semicolon
are interpreted. At statement level they are statements. Within a
bracketing construct they are interpreted as lists of Captures,
which in slice context will be treated as the multiple dimensions of a
multidimensional slice. (Other contexts may have other interpretations
or disallow semicolons entirely.)
Feed operators: <==, ==>, <<==, ==>>
source() ==> filter() ==> sink()
The forms with the double angle append rather than clobber the sink's
todo list. The ==>> form always looks ahead for an appropriate
target to append to, either the final sink in the chain, or the next
filter stage with an explicit @(*) or @@(*) target. This means
you can stack multiple feeds onto one filter command:
source1() ==>>
source2() ==>>
source3() ==>>
filter(@(*)) ==> sink()
Similar semantics apply to <<== except it looks backward for
an appropriate target to append to.
Control block: <ws>{...}
When a block occurs after whitespace where an infix is expected, it is interpreted as a control block for a statement control construct. (If there is no whitespace, it is a subscript, and if it is where a term is expected, it's just a bare closure.) If there is no statement looking for such a block currently, it is a syntax error.
Statement modifiers: if, unless, while, until, for
Statement modifiers terminate one expression and start another.
Any unexpected ), ], } at this level.
Calls into the operator precedence parser may be parameterized to recognize additional terminators, but right brackets of any sort (except angles) are automatically included in the set of terminators as tokens of length one. (An infix of longer length could conceivably start with one of these characters, and would be recognized under the longest-token rule and continue the expression, but this practice is discouraged. It would be better to use Unicode for your weird operator.) Angle brackets are exempted so that they can form hyperoperators (see /Hyper operators).
Several operators have been given new names to increase clarity and better Huffman-code the language, while others have changed precedence.
${...}, @{...}, %{...}, etc. dereferencing
forms are now $(...), @(...), %(...), etc. instead.
Listop-like forms use the bare sigil followed by whitespace.
Use of the Perl 5 curly forms will result in an error message
pointing the user to the new forms.
-> becomes ., like the rest of the world uses. There is
a pseudo postfix:{'->'} operator that produces a compile-time
error reminding Perl 5 users to use dot instead. (The "pointy block"
use of -> in Perl 6 requires preceding whitespace when the arrow
could be confused with a postfix, that is when an infix is expected.
Preceding whitespace is not required in term position.)
The string concatenation . becomes ~. Think of it as
"stitching" the two ends of its arguments together. String append
is likewise ~=.
The filetest operators are gone. We now use a Pair as a
pattern that calls an object's method:
if $filename ~~ :e { say "exists" }
is the same as
if $filename.e { say "exists" }
The first form actually translates to the latter form, so the object's
class decides how to dispatch methods. It just happens that
Str (filenames), IO (filehandles), and Statbuf (stat buffers)
default to the expected filetest semantics, but $regex.i might
tell you whether the regex is case insensitive, for instance.
Using the pattern form, multiple tests may be combined via junctions:
given $handle {
when :r & :w & :x {...}
when :!w | :!x {...}
when * {...}
}
When adverbial pairs are stacked into one term, it is assumed they are ANDed together, so
when :r :w :x
is equivalent to either of:
when :r & :w & :x
when all(:r,:w,:x)
The advantage of the method form is that it can be used in places that
require tighter precedence than ~~ provides:
sort { $^a.M <=> $^b.M }, @files
though that's a silly example since you could just write:
sort { .M }, @files
But that demonstrates the other advantage of the method form, which is that it allows the "unary dot" syntax to test the current topic.
Unlike in earlier versions of Perl 6, these filetests do not return
stat buffers, but simple scalars of type Bool, Int, or Num.
In general, the user need not worry about caching the stat buffer
when a filename is queried. The stat buffer will automatically be
reused if the same object has recently been queried, where "recently"
is defined as less than a second or so. If this is a concern, an
explicit stat() or lstat() may be used to return an explicit stat
buffer object that will not be subject to timeout, and may be tested
repeatedly just as a filename or handle can. A Statbuf object has
a .file method that can be queried for its filename (if known);
the .io method returns the handle (if known). If the Statbuf
object doesn't know its filename but does know its IO handle, then
.file attempts to return .io.file.
Note that :s still returns the filesize, but :!s is true
only if the file is of size 0, since it is smartmatched
against the implicit False argument. By the same token,
:s(0..1024) will be true only for files of size 1K or less.
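The translation from a pair pattern to a method call, including the negated and ranged forms, can be sketched in Python. `FakeFile` and `pair_match` are hypothetical names, and Python's `all` and `any` stand in for the & and | junctions.

```python
class FakeFile:
    """Hypothetical object exposing filetest-like methods."""

    def __init__(self, size, readable, writable):
        self._size, self._r, self._w = size, readable, writable

    def s(self):
        return self._size

    def r(self):
        return self._r

    def w(self):
        return self._w

def pair_match(obj, name, arg=True):
    """Model `obj ~~ :name(arg)`: call the method, then match its result."""
    value = getattr(obj, name)()
    if isinstance(arg, bool):
        return bool(value) == arg       # :name and :!name
    if isinstance(arg, range):
        return value in arg             # :s(0..1024)
    return value == arg

f = FakeFile(size=0, readable=True, writable=False)
print(pair_match(f, "s", False))                         # True: :!s on an empty file
print(pair_match(f, "s", range(0, 1025)))                # True: size within 0..1024
print(all(pair_match(f, t) for t in ("r", "w")))         # False: :r & :w
print(any(pair_match(f, t, False) for t in ("r", "w")))  # True: :!r | :!w
```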
(Inadvertent use of the Perl 5 forms will normally result in treatment as a negated postdeclared subroutine, which is likely to produce an error message at the end of compilation.)
All postfix operators that do not start with a dot also have
an alternate form that does. (The converse does not hold--just because
you can write x().foo doesn't mean you can write x()foo. Likewise
the ability to say $x.'foo' does not imply that $x'foo' will work.)
The postfix interpretation of an operator may be overridden by
use of a quoted method call, which calls the prefix form instead.
So x().! is always the postfix operator, but x().'!' will always
call !x(). In particular, you can say things like $array.'@'
and $fh.'=', which because of the quotes will not be confused
lexically with $fh.=new.
Unary ~ now imposes a string (Str) context on its
argument, and + imposes a numeric (Num) context (as opposed
to being a no-op in Perl 5). Along the same lines, ? imposes
a boolean (Bool) context, and the | unary operator imposes
a function-arguments (Capture) context on its argument.
Unary sigils impose the container context implied by their sigil.
As with Perl 5, however, $$foo[bar] parses as ( $($foo) )[bar],
so you need $($foo[bar]) to mean the other way.
Bitwise operators get a data type prefix: +, ~, or ?.
For example, Perl 5's | becomes either +| or ~| or ?|,
depending on whether the operands are to be treated as numbers,
strings, or boolean values. Perl 5's left shift << becomes
+< , and correspondingly with right shift. Perl 5's unary ~
(one's complement) becomes either +^ or ~^ or ?^, since a
bitwise NOT is like an exclusive-or against solid ones. Note that
?^ is functionally identical to !, but conceptually coerces to
boolean first and then flips the bit. Please use ! instead.
?| is a logical OR but differs from || in that ?| always
evaluates both sides and returns a standard boolean value. That is,
it's equivalent to ?$a + ?$b != 0. Another difference is that
it has the precedence of an additive operator.
?& is a logical AND but differs from && in that ?& always
evaluates both sides and returns a standard boolean value. That is,
it's equivalent to ?$a * ?$b != 0. Another difference is that
it has the precedence of a multiplicative operator.
Bitwise string operators (those starting with ~) may only be
applied to Buf types or similar compact integer arrays, and treat
the entire chunk of memory as a single huge integer. They differ from
the + operators in that the + operators would try to convert
the string to a number first on the assumption that the string was an
ASCII representation of a number.
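The three families of bitwise operators can be contrasted in a short Python sketch. The function names are hypothetical, and the treatment of buffers of unequal length is not described above, so the sketch rejects that case.

```python
def num_or(a, b):
    """+| : coerce both sides to integers, then OR the bits."""
    return int(a) | int(b)

def bool_or(a, b):
    """?| : both sides are evaluated; the result is a plain boolean."""
    return (bool(a) + bool(b)) != 0

def bool_and(a, b):
    """?& : both sides are evaluated; the result is a plain boolean."""
    return (bool(a) * bool(b)) != 0

def buf_or(a, b):
    """~| : treat two equal-length byte buffers as huge integers."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length buffers")
    merged = int.from_bytes(a, "big") | int.from_bytes(b, "big")
    return merged.to_bytes(len(a), "big")

print(num_or("12", 3))                      # 15
print(bool_or(0, "x"), bool_and(0, "x"))    # True False
print(buf_or(b"\x0f\x00", b"\xf0\x01"))     # b'\xff\x01'
```

The first call shows the numeric form converting the string "12" to a number before operating, which is exactly what the string form does not do.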
x splits into two operators: x (which concatenates repetitions
of a string to produce a single string), and xx (which creates a list of
repetitions of a list or item). "foo" xx * represents an arbitrary
number of copies, useful for initializing lists. The left side of
an xx is evaluated only once. (To call a block repeatedly, use a map
instead.)
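In Python terms the split looks roughly like this. The functions `x` and `xx` are illustrative stand-ins, only the single-item case of xx is modelled, and `times=None` plays the role of `xx *`.

```python
from itertools import islice, repeat

def x(string, times):
    """x: concatenate repetitions into a single string."""
    return string * times

def xx(item, times=None):
    """xx: repeat an item as list elements; times=None models `xx *`."""
    return repeat(item) if times is None else [item] * times

print(x("ab", 3))                     # ababab
print(xx("ab", 3))                    # ['ab', 'ab', 'ab']
print(list(islice(xx("foo"), 2)))     # ['foo', 'foo']
```

As in the text, the repeated item is evaluated once: every element of the result refers to the same value.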
The ? : conditional operator becomes ?? !!. A pseudo operator,
infix:<?>, catches migratory brainos at compile time.
qw{ ... } gets a synonym: < ... >, and an interpolating
variant, «...».
For those still living without the blessings of Unicode, that can also be
written: << ... >>.
Comma , now constructs a List object from its
operands. You have to use a [*-1] subscript to get the last one.
(Note the *. Negative subscripts no longer implicitly count from
the end; in fact, the compiler may complain if you use [-1] on an
object known at compile time not to have negative subscripts.)
The unary backslash operator captures its arguments; see the definition of
Capture in S02 for details. (No whitespace is allowed
after the backslash because that would instead start an "unspace", that is,
an escaped sequence of whitespace or comments. See S02 for details.
However, oddly enough, because of that unspace rule, saying \\ $foo
turns out to be equivalent to \$foo.)
The old .. flipflop operator is now done with the
ff operator. (.. now always produces a Range object
even in item context.) The ff operator may take a caret on
either end to exclude either the beginning or ending. There is
also a corresponding fff operator with Perl 5's ... semantics.
You may say
/foo/ ff *
to indicate a flipflop that never flops once flipped.
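For readers unfamiliar with flipflops, here is a Python sketch of the stateful test. It assumes the Perl 5 behavior of .. (the right side may be tested on the same item that turned the flipflop on) for ff, and that of ... (the right side is first tested on the next item) for fff. The class name is hypothetical and the caret forms are not modelled.

```python
class FlipFlop:
    """Stateful range test; same_item=True models ff, False models fff."""

    def __init__(self, start, stop, same_item=True):
        self.start, self.stop, self.same_item = start, stop, same_item
        self.on = False

    def __call__(self, item):
        if not self.on:
            if not self.start(item):
                return False
            # Flipped on; ff may flop again on the very same item.
            self.on = not (self.same_item and self.stop(item))
            return True
        if self.stop(item):
            self.on = False      # the closing item is still included
        return True

lines = ["a", "foo", "b", "bar", "c"]
ff = FlipFlop(lambda s: "foo" in s, lambda s: "bar" in s)
print([s for s in lines if ff(s)])            # ['foo', 'b', 'bar']

never_flops = FlipFlop(lambda s: "foo" in s, lambda s: False)
print([s for s in lines if never_flops(s)])   # ['foo', 'b', 'bar', 'c']
```

The second flipflop models `/foo/ ff *`: its right side never matches, so once flipped it stays on.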
The list assignment operator now parses on the right like any other list operator, so you don't need parens on the right side of:
@foo = 1, 2, 3;
You do still need them on the left for
($a, $b, $c) = 1, 2, 3;
since assignment operators are tighter than comma to their left.
"Don't care" positions may be indicated by assignment to the * token.
A final * throws away the rest of the list:
($a, *, $c) = 1, 2, 3; # throw away the 2
($a, $b, $c, *) = 1..42; # throw away 4..42
(Within signature syntax, a bare $ can ignore a single argument as well,
and a bare *@ can ignore the remaining arguments.)
List assignment offers the list on the right to each container on the
left in turn, and each container may take one or more elements from the
front of the list. If there are any elements left over, a warning is
issued unless the list on the left ends with * or the final iterator
on the right is defined in terms of *. Hence none of these warn:
($a, $b, $c, *) = 1..9999999;
($a, $b, $c) = 1..*;
($a, $b, $c) = 1 xx *;
($a, $b, $c) = 1, 2, *;
This, however, warns you of information loss:
($a, $b, $c) = 1, 2, 3, 4;
As in Perl 5, assignment to an array or hash slurps up all the remaining values, and can never produce such a warning. (It will, however, leave any subsequent lvalue containers with no elements, just as in Perl 5.)
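The placeholder and warning rules for scalar targets can be sketched in Python. Everything here is a model: `STAR`, `list_assign`, and the `open_ended` flag (standing for a right side defined in terms of *) are hypothetical, and array or hash targets that slurp the remaining values are not modelled.

```python
import warnings

STAR = object()      # stands in for the * placeholder
_END = object()      # sentinel marking an exhausted right side

def list_assign(targets, source, open_ended=False):
    """Assign successive values to named targets; return them as a dict."""
    values = iter(source)
    result = {}
    for position, target in enumerate(targets):
        if target is STAR and position == len(targets) - 1:
            return result                     # final * discards the rest
        value = next(values, None)
        if target is not STAR:
            result[target] = value
    if not open_ended and next(values, _END) is not _END:
        warnings.warn("list assignment discarded extra values")
    return result

print(list_assign(["a", STAR, "c"], [1, 2, 3]))          # {'a': 1, 'c': 3}
print(list_assign(["a", "b", "c", STAR], range(1, 43)))  # {'a': 1, 'b': 2, 'c': 3}
```

Calling `list_assign(["a", "b", "c"], [1, 2, 3, 4])` in this model emits the warning, matching the information-loss case above.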
The left side is evaluated completely for its sequence of containers before any assignment is done. Therefore this:
my $a = 0; my @b;
($a, @b[$a]) = 1, 2;
assigns 2 to @b[0], not @b[1].
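The evaluation order can be made explicit in a Python sketch by resolving the subscript before any assignment happens. The variable names mirror the example above; the contrast case shows Python's own tuple assignment, which behaves differently.

```python
a = 0
b = [None, None]

# Step 1: resolve every target container before assigning anything.
index = a                  # @b[$a] is located while $a is still 0
# Step 2: perform the assignments.
a, b[index] = 1, 2
print(a, b)                # 1 [2, None]

# For contrast, Python's own tuple assignment resolves targets one at a
# time, so the subscript sees the freshly assigned value of `a`.
a = 0
c = [None, None]
a, c[a] = 1, 2
print(c)                   # [None, 2]
```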
The item assignment operator expects a single expression with precedence tighter than comma, so
loop ($a = 1, $b = 2; ; $a++, $b++) {...}
works as a C programmer would expect. The term on the right of the
= is always evaluated in item context.
The syntactic distinction between item and list assignment is similar to the way Perl 5 defines it, but has to be a little different because we can no longer decide the nature of an inner subscript on the basis of the outer sigil. So instead, item assignment is restricted to lvalues that are simple scalar variables, and assignment to anything else is parsed as list assignment. The following forms are parsed as "simple lvalues", and imply item assignment to the scalar container:
$a = 1 # scalar variable
$foo::bar = 1 # scalar package variable
$(ANY) = 1 # scalar dereference (including $$a)
$::(ANY) = 1 # symbolic scalar dereference
$foo::(ANY) = 1 # symbolic scalar dereference
Such a scalar variable lvalue may be decorated with declarators, types, and traits, so these are also item assignments:
my $fido = 1
my Dog $fido = 1
my Dog $fido is trained is vicious = 1
However, anything more complicated than that (including parentheses and subscripted expressions) forces parsing as list assignment instead. Assignment to anything that is not a simple scalar container also forces parsing as list assignment. List assignment expects an expression that is looser than comma precedence. The right side is always evaluated in list context:
($x) = 1,2,3
$x[1] = 1,2,3
@$array = 1,2,3
my ($x, $y) = 1,2,3
our %map = :a<1>, :b<2>
The rules of list assignment apply, so all the assignments involving
$x above produce warnings for discarded values. A warning may be
issued at compile time if it is detected that a run-time warning is
inevitable.
The = in a default declaration within a signature is not really
assignment, and is always parsed as item assignment. (That is, to
assign a list as the default value you must use parentheses to hide
any commas in the list value.)
To assign a list to a scalar value, you cannot say:
$a = 1, 2, 3;
because the 2 and 3 will be seen as being in a void context, as if you'd said:
($a = 1), 2, 3;
Instead, you must do something to explicitly disable or subvert the item assignment interpretation:
$a = [1, 2, 3]; # force construction (probably best practice)
$a = (1, 2, 3); # force grouping as syntactic item
$a = list 1, 2, 3; # force grouping using listop precedence
$a = @ 1, 2, 3; # same thing
@$a = 1, 2, 3; # force list assignment
$a[] = 1, 2, 3; # same thing
If a function is context sensitive and you want it to return a scalar
value, you must use item (or $ or + or ~) to force item context for
either the subscript or the right side:
@a[foo()] = bar(); # foo() and bar() called in list context
@a[item foo()] = item bar(); # foo() and bar() called in item context
@a[$ foo()] = $ bar(); # same thing
@a[+foo()] = +bar(); # foo() and bar() called in numeric context
%a{~foo()} = ~bar(); # foo() and bar() called in string context
But note that the first form still works fine if foo() and bar()
are item-returning functions that are not context sensitive.
In general, this will all just do what the user expects most of the time. The rest of the time item or list behavior can be forced with minimal syntax.
List operators are all parsed consistently. As in Perl 5, to the left a list operator looks like a term, while to the right it looks like an operator that is looser than comma. Unlike in Perl 5, the difference between the list operator form and the function form is consistently indicated via whitespace between the list operator and the first argument. If there is whitespace, it is always a list operator, and the next token will be taken as the first term of the list (or if there are no terms, as the expression terminator). Any infix operator occurring where a term is expected will be misinterpreted as a term:
say + 2; # means say(+2);
If there is no whitespace, subsequent parsing depends on the syntactic category of the next item. Parentheses (with or without a dot) turn the list operator into a function call instead, and all the function's arguments must be passed inside the parentheses (except for postfix adverbs, which may follow the parentheses provided they would not attach to some other operator by the rules of precedence).
Other than various forms of parentheses, all other postfixes are disallowed immediately after a list operator, even if there are no arguments. To add a postfix to an argumentless list operator you must write it as a function call with empty parentheses:
foo.[] # ILLEGAL
foo++ # ILLEGAL
foo().[] # legal
foo()++ # legal (if foo() is rw)
After the parentheses any postfix operators are allowed, and apply to the result of the function call. (Also note that the postfix restriction applies only to list operators; it doesn't apply to methods. It is legal to say
$foo.bar<a b c>
to mean
$foo.bar().{'a','b','c'}
because methods never assume there are arguments unless followed by parentheses or a colon.)
If the next item after the list operator is either an infix operator or a term, a syntax error is reported. [Conjecture: this may be relaxed in non-strict mode.]
Examples:
say foo + 1;                say(foo(+1));
say foo $x;                 say(foo($x));
say foo$x;                  ILLEGAL, need space or parens
say foo+1;                  ILLEGAL, need space or parens
say foo++;                  ILLEGAL, need parens
say foo($bar+1),$baz        say(foo($bar+1), $baz);
say foo.($bar+1),$baz       say(foo($bar+1), $baz);
say foo ($bar+1),$baz       say(foo($bar+1, $baz));
say foo .($bar+1),$baz      say(foo($_.($bar+1), $baz));
say foo[$bar+1],$baz        ILLEGAL, need foo()[]
say foo.[$bar+1],$baz       ILLEGAL, need foo().[]
say foo [$bar+1],$baz       say(foo([$bar+1], $baz));
say foo .[$bar+1],$baz      say(foo($_.[$bar+1], $baz));
say foo{$bar+1},$baz        ILLEGAL, need foo(){}
say foo.{$bar+1},$baz       ILLEGAL, need foo().{}
say foo {$bar+1},$baz       say(foo({$bar+1}, $baz));
say foo .{$bar+1},$baz      say(foo($_.{$bar+1}, $baz));
say foo<$bar+1>,$baz        ILLEGAL, need foo()<>
say foo.<$bar+1>,$baz       ILLEGAL, need foo().<>
say foo <$bar+1>,$baz       say(foo(<$bar+1>, $baz));
say foo .<$bar+1>,$baz      say(foo($_.<$bar+1>, $baz));
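The whitespace rule behind these examples can be modeled as a decision on the text that immediately follows the list operator's name. The following is a rough Python sketch, not a Perl 6 parser; the function name classify is hypothetical, and the model only covers the forms shown in the examples (terminators such as ; are not handled).

```python
def classify(after_name: str) -> str:
    """Classify what follows a list operator name such as `foo`.

    `after_name` is the source text immediately after the name.
    Toy model of the rule in the text; terminators are not modeled.
    """
    if after_name == "" or after_name[0].isspace():
        # Whitespace (or nothing at all): list operator form.
        return "list operator"
    if after_name.startswith("(") or after_name.startswith(".("):
        # Parentheses, with or without a dot: function call form.
        return "function call"
    # Any other postfix, infix, or term jammed against the name.
    return "ILLEGAL"


print(classify(" + 1"))           # list operator
print(classify(".($bar+1),$baz")) # function call
print(classify("[$bar+1],$baz"))  # ILLEGAL
```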
Note that Perl 6 is making a consistent three-way distinction between
term vs postfix vs infix, and will interpret an overloaded character
like < accordingly:
any <a b c>         any('a','b','c')        # term
any()<a b c>        (any).{'a','b','c'}     # postfix
any() < $x          (any) < $x              # infix
any<a b c>          ILLEGAL                 # stealth postfix
This will seem unfamiliar and "undwimmy" to Perl 5 programmers, who are used to a grammar that sloppily hardwires a few postfix operators at the price of extensibility. Perl 6 chooses instead to mandate a whitespace dependency in order to gain a completely extensible class of postfix operators.
term:<foo>.
Such a term is never considered a list prefix operator, though it
allows an optional set of empty parentheses (because it represents a
Code object). Unlike functions and list operators with arguments
(see above), a 0-ary term does not require parentheses even if followed
immediately by a postfix.
prefix:<foo> to declare foo as a named unary in precedence;
it must still take a single positional parameter (though any number of
named parameters are allowed, which can be bound to adverbs).
All other subs with arguments parse as list operators.
The && and || operators are smarter about list context
and return () on failure in list context rather than Bool::False.
The operators still short-circuit, but if either operator would return
a false value, it is converted to the null list in list context so
that the false results are self-deleting. (If this self-deleting
behavior is not desired, put the expression into item context rather
than list context.) This self-deletion is a behavior of the operators
themselves, not a general property of boolean values in list context, so
@foo = true($a||$b);
is guaranteed to insert exactly one boolean value into @foo.
|, &, and ^ are no longer bitwise operators (see
/Changes to Perl 5 operators) but now serve a much higher cause:
they are now the junction constructors.
A junction is a single value that is equivalent to multiple values. Junctions thread through operations, returning another junction representing the result:
(1|2|3) + 4; # 5|6|7
(1|2) + (3&4); # (4|5) & (5|6)
Note how when two junctions are applied through an operator, the result is a junction representing the operator applied to each combination of values.
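The threading behavior can be imitated in a few lines of Python. This is only a sketch: the Junction class below is hypothetical, supports just + on "any" and "all" junctions, and assumes that the right-hand junction becomes the outer one when two junctions meet, which reproduces the example above but is not stated as a general rule in this section.

```python
class Junction:
    """Toy model of a junction; kind is "any" (|) or "all" (&)."""

    def __init__(self, kind, values):
        self.kind = kind
        self.values = tuple(values)

    def __add__(self, other):
        if isinstance(other, Junction):
            # Assumption: thread over the right-hand junction on the outside.
            return Junction(other.kind, [self + v for v in other.values])
        # Thread the operator through each value of this junction.
        return Junction(self.kind, [v + other for v in self.values])

    def __repr__(self):
        sep = "|" if self.kind == "any" else "&"
        return "(" + sep.join(repr(v) for v in self.values) + ")"


print(Junction("any", [1, 2, 3]) + 4)                    # (5|6|7)
print(Junction("any", [1, 2]) + Junction("all", [3, 4])) # ((4|5)&(5|6))
```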
Junctions come with the functional variants any, all, one, and none.
This opens doors for constructions like:
unless $roll == any(1..6) { print "Invalid roll" }
if $roll == 1|2|3 { print "Low roll" }
Junctions work through subscripting:
doit() if @foo[any(1,2,3)]
Junctions are specifically unordered. So if you say
foo() | bar() | baz() == 42
it indicates to the compiler that there is no coupling between the junctional arguments. They can be evaluated in any order or in parallel. They can short-circuit as soon as any of them returns 42, and not run the others at all. Or if running in parallel, the first successful thread may terminate the other threads abruptly. In general you probably want to avoid code with side effects in junctions.
Use of negative operators with syntactically recognizable junctions may produce a warning on code that works differently in English than in Perl. Instead of writing
if $a != 1 | 2 | 3 {...}
you need to write
if not $a == 1 | 2 | 3 {...}
However, this is only a syntactic warning, and
if $a != $b {...}
will not complain if $b happens to contain a junction at runtime.
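Why the two spellings differ becomes clear when the junction is expanded by hand. In this Python sketch (the function names are hypothetical), an "any" junction on the right is modeled with the built-in any():

```python
def ne_any(a, values):
    """Model of `$a != 1|2|3`: true if $a differs from ANY of the values."""
    return any(a != v for v in values)


def not_eq_any(a, values):
    """Model of `not $a == 1|2|3`: true if $a equals NONE of the values."""
    return not any(a == v for v in values)


print(ne_any(2, (1, 2, 3)))      # True: 2 differs from 1, so the test passes
print(not_eq_any(2, (1, 2, 3)))  # False: 2 equals one of the values
```

With two or more distinct values on the right, the first form is true for every $a, which is rarely what the English reading suggests.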
Junctive methods on arrays, lists, and sets work just like the
corresponding list operators. However, junctive methods on a hash
make a junction of only the hash's keys. Use the listop form (or an
explicit .pairs) to make a junction of pairs.
Binary === tests immutable type and value correspondence:
for two value types (that is, immutable types), tests whether
they are the same value (eg. 1 === 1); for two mutable types
(object types), checks whether they have the same identity value.
(For most such types the identity is simply the reference itself.)
It is not true that [1,2] === [1,2] because those are different
Array objects, but it is true that @a === @a because those are
the same Array object.
Any object type may pretend to be a value type by defining a .WHICH
method which returns a value type that can be recursively compared
using ===, or in cases where that is impractical, by overloading
=== such that the comparison treats values consistently with their
"eternal" identity. (Strings are defined as values this way despite
also being objects.)
Two values are never equivalent unless they are of exactly the same type. By
contrast, eq always coerces to string, while == always coerces to
numeric. In fact, $a eq $b really means ~$a === ~$b and $a == $b
really means +$a === +$b.
Note also that, while string-keyed hashes use eq semantics by default,
object-keyed hashes use === semantics, and general value-keyed hashes
use eqv semantics.
Binary eqv tests equality much like === does, but does
so with "snapshot" semantics rather than "eternal" semantics. For
top-level components of your value that are of immutable types, eqv
is identical in behavior to ===. For components of your value
that are mutable, however, rather than comparing object identity using
===, the eqv operator tests whether the canonical representation
of both subvalues would be identical if we took a snapshot of them
right now and compared those (now-immutable) snapshots using ===.
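The difference between "eternal" and "snapshot" semantics can be approximated in Python, where is compares object identity and == compares current contents. The helpers identical and eqv below are hypothetical stand-ins for === and eqv, and the list of immutable types is an assumption made for this sketch only.

```python
import copy

# Assumption for this sketch: these Python types play the role of value types.
IMMUTABLE = (int, float, str, bytes, tuple, frozenset, type(None))


def identical(a, b):
    """Rough model of ===: same type, then value (immutable) or identity (mutable)."""
    if type(a) is not type(b):
        return False
    if isinstance(a, IMMUTABLE):
        return a == b
    return a is b


def eqv(a, b):
    """Rough model of eqv: compare snapshots of the current contents."""
    if type(a) is not type(b):
        return False
    return copy.deepcopy(a) == copy.deepcopy(b)


xs = [1, 2]
print(identical([1, 2], [1, 2]))  # False: two different list objects
print(identical(xs, xs))          # True: the same list object
print(eqv([1, 2], [1, 2]))        # True: the snapshots are equal
```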
If that's not enough flexibility, there is also an eqv() function
that can be passed additional information specifying how you want
canonical values to be generated before comparison. This gives
eqv() the same kind of expressive power as a sort signature.
(And indeed, the cmp operator from Perl 5 also has a functional
analog, cmp(), that takes additional instructions on how to
do 3-way comparisons of the kind that a sorting algorithm wants.)
In particular, a signature passed to eqv() will be bound to the
two operands in question, and then the comparison will proceed
on the formal parameters according to the information contained
in the signature, so you can force numeric, string, natural, or
other comparisons with proper declarations of the parameter's type
and traits. If the signature doesn't match the operands, eqv()
reverts to standard eqv comparison. (Likewise for cmp().)
cmp is no longer the comparison operator that
forces stringification. Use the leg operator for the old Perl 5
cmp semantics. The cmp operator is just like the eqv operator above except that
instead of returning Bool::False or Bool::True values it always
returns Order::Increase, Order::Same, or Order::Decrease
(which numerify to -1, 0, or +1).
The leg operator (less than, equal, or greater) is defined
in terms of cmp, so $a leg $b is now defined as ~$a cmp ~$b.
The sort operator still defaults to cmp rather than leg. The
<=> operator's semantics are unchanged except that it returns
an Order value as described above. In other words, $a <=> $b
is now equivalent to +$a cmp +$b.
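The relationship between the three operators can be sketched in Python. The names Order, cmp, leg, and spaceship below are illustrative models rather than a real API, and str() and float() stand in for the ~ and + coercions.

```python
from enum import IntEnum


class Order(IntEnum):
    Increase = -1
    Same = 0
    Decrease = 1


def cmp(a, b):
    """Generic three-way comparison of two operands of one ordered type."""
    return Order((a > b) - (a < b))


def leg(a, b):
    """String comparison: `$a leg $b` modeled as `~$a cmp ~$b`."""
    return cmp(str(a), str(b))


def spaceship(a, b):
    """Numeric comparison: `$a <=> $b` modeled as `+$a cmp +$b`."""
    return cmp(float(a), float(b))


print(leg(10, 9).name)        # Increase: "10" sorts before "9" as a string
print(spaceship(10, 9).name)  # Decrease: 10 is numerically greater than 9
```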
For boolean comparison operators with cmp
semantics, use the generic before and after infix operators.
As ordinary infix operators these may be negated (!before and !after)
as well as reduced ([before] and [after]).
min and max may be used to select one or the other
of their arguments. Reducing listop forms [min] and [max] are
also available, as are the min= and max= assignment operators.
By default min and max use cmp semantics. As with all cmp-based
operators, this may be modified by an adverb specifying different semantics.
The .. range operator has variants with ^ on either end to
indicate exclusion of that endpoint from the range. It always
produces a Range object. Range objects are lazy iterators, and
can be interrogated for their current .from and .to values
(which change as they are iterated). The .minmax method returns
both as a two-element list representing the interval. Ranges are not
autoreversing: 2..1 is always a null range. Likewise, 1^..^2
produces no values when iterated, but does represent the interval from
1 to 2 excluding the endpoints when used as a pattern. To iterate
a range in reverse use:
2..1:by(-1)
reverse 1..2
(The reverse is preferred because it works for alphabetic ranges
as well.) Note that, while .minmax normally returns (.from,.to),
a negative :by causes the .minmax method to return (.to,.from)
instead. You may also use .min and .max to produce the individual
values of the .minmax pair, but again note that they are reversed
from .from and .to when the step is negative. Since a reversed
Range changes its direction, it swaps its .from and .to but
not its .min and .max.
Because Range objects are lazy, they do not automatically generate
a list. One result of this is that a reversed Range object is still lazy.
Another is that smart matching against a Range object smartmatches the
endpoints in the domain of the object being matched, so fractional
numbers are not truncated before comparison to integer ranges:
1.5 ~~ 1^..^2 # true, equivalent to 1 < 1.5 < 2
2.1 ~~ 1..2 # false, equivalent to 1 <= 2.1 <= 2
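A minimal Python model of this endpoint test follows. The Range class and its field names are hypothetical; the point is only that the comparison happens in the domain of the value being matched, with no truncation.

```python
from dataclasses import dataclass


@dataclass
class Range:
    lo: float
    hi: float
    excl_lo: bool = False  # True models a ^ on the left:  lo^..hi
    excl_hi: bool = False  # True models a ^ on the right: lo..^hi

    def accepts(self, x) -> bool:
        """Compare x against both endpoints without truncating it."""
        above = self.lo < x if self.excl_lo else self.lo <= x
        below = x < self.hi if self.excl_hi else x <= self.hi
        return above and below


print(Range(1, 2, excl_lo=True, excl_hi=True).accepts(1.5))  # True
print(Range(1, 2).accepts(2.1))                              # False
```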
If a * (see the "Whatever" type in S02) occurs on the right side
of a range, it is taken to mean "positive infinity" in whatever
typespace the range is operating, as inferred from the left operand.
A * on the left means "negative infinity" for types that support
negative values, and the first value in the typespace otherwise as
inferred from the right operand. (For signed infinities the signs
reverse for a negative step.) A star on both sides prevents any type
from being inferred other than the Ordered role.
0..* # 0 .. +Inf
'a'..* # 'a' .. 'zzzzzzzzzzzzzzzzzzzzzzzzzzzzz...'
*..0 # -Inf .. 0
*..* # "-Inf .. +Inf", really Ordered
1.2.3..* # Any version higher than 1.2.3.
May..* # May through December
Note: infinite lists are constructed lazily. And even though *..*
can't be constructed at all, it's still useful as a selector object.
Range objects may be iterated on either end as long as that end is not
infinite. (Iterating an infinite end does not fail but just produces a
lot of infinities.) Ordinary iteration iterates the .from value by
adding the step. Either prefix:<=> or the shift function
may be used to iterate the front of a range object. The pop
function iterates the .to end by subtracting the step. In either
case, the value returned is either the old value if the endpoint
was inclusive, or the next value if the endpoint was exclusive.
In the case of ranges that are not an integral multiple of the step,
no check is done to see that iterating the front would produce the
same list as iterating from the back and reversing. So we have
$range = 1..^42.5;
$front = $range.shift; # $front = 1, $range = 2..^42.5
$back = $range.pop; # $back = 41.5, $range = 2..^41.5
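The shift and pop rules can be traced with a small Python model. The class below is a hypothetical sketch that follows only the rule stated above (return the old value for an inclusive endpoint, the next value for an exclusive one); it does not model empty ranges, infinite ends, or Failure.

```python
class Range:
    """Toy model of a Range that can be consumed from either end."""

    def __init__(self, frm, to, excl_from=False, excl_to=False, step=1):
        self.frm, self.to = frm, to
        self.excl_from, self.excl_to = excl_from, excl_to
        self.step = step

    def shift(self):
        """Iterate the front by adding the step."""
        old = self.frm
        self.frm += self.step
        return self.frm if self.excl_from else old

    def pop(self):
        """Iterate the back by subtracting the step."""
        old = self.to
        self.to -= self.step
        return self.to if self.excl_to else old


r = Range(1, 42.5, excl_to=True)  # models 1..^42.5
front = r.shift()                 # 1; the range is now 2..^42.5
back = r.pop()                    # 41.5; the range is now 2..^41.5
print(front, back, r.frm, r.to)
```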
For any kind of zip or dwimmy hyper operator, any list ending with *
is assumed to be infinitely extensible by taking its final element
and replicating it:
@array, *
is short for something like:
@array[0..^@array], @array[*-1] xx *
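In Python terms, the same effect can be sketched with itertools, chaining the list with an endless repetition of its last element. The helper name extend_with_last is hypothetical, and the sketch assumes a non-empty list.

```python
from itertools import chain, repeat


def extend_with_last(items):
    """Yield the items, then repeat the final one forever (model of `@array, *`)."""
    items = list(items)  # assumed non-empty
    return chain(items, repeat(items[-1]))


short = [1, 2]
longer = [10, 20, 30, 40]
# zip stops when the finite list runs out, so the short side is padded as needed.
pairs = list(zip(extend_with_last(short), longer))
print(pairs)  # [(1, 10), (2, 20), (2, 30), (2, 40)]
```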
An empty Range cannot be iterated; it returns a Failure instead. An empty
range still has a defined min and max, but the min is greater than the max.
If a range is generated using a magical autoincrement, it stops if the magical increment would "carry" and make the next value longer than the "to" value, on the assumption that the sequence can never match the final value exactly. Hence, all of these produce 'A' .. 'Z':
'A' .. 'Z'
'A' .. 'z'
'A' .. '_'
'A' .. '~'
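The stopping rule can be sketched in Python. The succ function here is a deliberately simplified stand-in for the magical increment, handling strings of uppercase ASCII letters only, and magic_range is a hypothetical name; neither models empty or reversed ranges.

```python
def succ(s: str) -> str:
    """Simplified magical increment: 'A'->'B', 'AZ'->'BA', 'Z'->'AA'."""
    chars = list(s)
    i = len(chars) - 1
    while i >= 0:
        if chars[i] != "Z":
            chars[i] = chr(ord(chars[i]) + 1)
            return "".join(chars)
        chars[i] = "A"  # carry into the position to the left
        i -= 1
    return "A" + "".join(chars)  # carried off the front: the string grows


def magic_range(start: str, stop: str):
    cur = start
    while True:
        yield cur
        if cur == stop:
            return
        nxt = succ(cur)
        if len(nxt) > len(stop):
            return  # the increment carried, so `stop` can never be matched
        cur = nxt


for stop in ("Z", "z", "_", "~"):
    print("".join(magic_range("A", stop)))  # ABCDEFGHIJKLMNOPQRSTUVWXYZ each time
```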
The unary ^ operator generates a range from 0 up to
one less than its argument. So ^4 is short for 0..^4 or 0..3.
for ^4 { say $_ } # 0, 1, 2, 3
If applied to a list, it generates a multidimensional set of subscripts.
for ^(3,3) { ... } # (0,0)(0,1)(0,2)(1,0)(1,1)(1,2)(2,0)(2,1)(2,2)
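For readers who think in Python, the two cases above correspond roughly to range() and itertools.product(). The helper name upto is hypothetical.

```python
from itertools import product


def upto(*dims):
    """Model of prefix ^: upto(4) is 0..3; upto(3, 3) is every index pair."""
    if len(dims) == 1:
        return list(range(dims[0]))
    return list(product(*(range(d) for d in dims)))


print(upto(4))     # [0, 1, 2, 3]
print(upto(3, 3))  # [(0, 0), (0, 1), (0, 2), (1, 0), ..., (2, 2)]
```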
If applied to a type name, it indicates the metaclass instance instead,
so ^Moose is short for HOW(Moose) or Moose.HOW. It still kinda
means "what is this thing's domain" in an abstract sort of way.
Since use of Range objects in item context is usually
nonsensical, a Range object used as an operand for scalar operators
will generally attempt to distribute the operator to its endpoints and
return another suitably modified Range instead. (Notable exceptions
include infix:<~~>, which does smart matching, and prefix:<+>
which returns the length of the range.) Therefore if you wish to
write a slice using a length instead of an endpoint, you can say
@foo[ start() + ^$len ]
which is short for:
@foo[ start() + (0..^$len) ]
which is equivalent to something like:
@foo[ list do { my $tmp = start(); $tmp ..^ $tmp+$len } ]
In other words, operators of numeric and other ordered types are
generally overloaded to do something sensible on Range objects.
In particular, multiplicative operators not only multiply the endpoints
but also the "by" of the Range object:
(1..11:by(2)) * 5 # same as 5..55:by(10)
# that is: 5,15,25,35,45,55
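A short Python check of this scaling rule follows, using a hypothetical Range record with an inclusive endpoint and assuming a positive step and a positive multiplier. Multiplying by 5 scales both endpoints and the step, and the scaled range enumerates six values, from 5 to 55 in steps of 10.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    frm: int
    to: int      # inclusive endpoint
    by: int = 1  # step, assumed positive in this sketch

    def __mul__(self, k):
        # Scale the endpoints and the step together.
        return Range(self.frm * k, self.to * k, self.by * k)

    def values(self):
        return list(range(self.frm, self.to + 1, self.by))


scaled = Range(1, 11, 2) * 5
print(scaled)           # Range(frm=5, to=55, by=10)
print(scaled.values())  # [5, 15, 25, 35, 45, 55]
```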
Conjecture: non-linear functions might even produce non-uniform "by" values! Think of log scaling, for instance.
Perl 6 supports the natural extension to the comparison operators, allowing multiple operands:
if 1 < $a < 100 { say "Good, you picked a number *between* 1 and 100." }
if 3 < $roll <= 6 { print "High roll" }
if 1 <= $roll1 == $roll2 <= 6 { print "Doubles!" }
A chain of comparisons short-circuits if the first comparison fails:
1 > 2 > die("this is never reached");
Each argument in the chain will evaluate at most once:
1 > $x++ > 2 # $x increments exactly once
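Python happens to implement chained comparisons with the same two guarantees, so the behavior can be observed directly there. The helper functions in this sketch are hypothetical instrumentation, not part of any Perl 6 API.

```python
calls = []


def noisy(value):
    """Record each evaluation so we can count them."""
    calls.append(value)
    return value


def boom():
    raise RuntimeError("this is never reached")


# The middle operand takes part in two comparisons but is evaluated once.
result = 1 < noisy(50) < 100

# The first comparison fails, so the chain short-circuits and boom() is not called.
short = 1 > 2 > boom()

print(result, calls, short)  # True [50] False
```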
Note: any operator beginning with < must have whitespace
in front of it, or it will be interpreted as a hash subscript instead.
Here is the table of smart matches for standard Perl 6
(that is, the dialect of Perl in effect at the start of your
compilation unit). Smart matching is generally done on the current
"topic", that is, on $_. In the table below, $_ represents the
left side of the ~~ operator, or the argument to a given,
or to any other topicalizer. X represents the pattern to be
matched against on the right side of ~~, or after a when.
The first section contains privileged syntax; if a match can be done
via one of those entries, it will be. These special syntaxes are
dispatched by their form rather than their type. Otherwise the rest
of the table is used, and the match will be dispatched according to
the normal method dispatch rules. The optimizer is allowed to assume
that no additional match operators are defined after compile time,
so if the pattern types are evident at compile time, the jump table
can be optimized. However, the syntax of this part of the table
is still somewhat privileged, insofar as the ~~ operator is one
of the few operators in Perl that does not use multiple dispatch.
Instead, type-based smart matches singly dispatch to an underlying
method belonging to the X pattern object.
In other words, smart matches are dispatched first on the basis of the
pattern's form or type (the X below), and then that pattern itself
decides whether and how to pay attention to the type of the topic
($_). So the second column below is really the primary column.
The Any entries in the first column indicate a pattern that either
doesn't care about the type of the topic, or that picks that entry
as a default because the more specific types listed above it didn't match.
$_         X          Type of Match Implied    Match if (given $_)
======     =====      =====================    ===================
Any        Code:($)   item sub truth           X($_)
Any        Code:()    simple closure truth     X() (ignoring $_)
Any        undef      undefined                not .defined
Any        *          block signature match    block successfully binds to |$_
Any        .foo       method truth             ?X       i.e. ?.foo
Any        .foo(...)  method truth             ?X       i.e. ?.foo
Any        .(...)     sub call truth           ?X       i.e. ?.(...)
Any        .[...]     array value slice truth  ?all(X)  i.e. ?all(.[...])
Any        .{...}     hash value slice truth   ?all(X)  i.e. ?all(.{...})
Any        .<...>     hash value slice truth   ?all(X)  i.e. ?all(.<...>)
Any        Bool       simple truth             X
Any        Num        numeric equality         +$_ == X
Any        Str        string equality          ~$_ eq X
Hash       Pair       test hash mapping        $_{Xkey} ~~ Xval
Any        Pair       test object attribute    .Xkey ~~ Xval (e.g. filetests)
Set        Set        identical sets           $_ === X
Hash       Set        hash keys same set       $_.keys === X
Any        Set        force set comparison     Set($_) === X
Array      Array      arrays are comparable    $_ «===» X (dwims * wildcards!)
Set        Array      array equiv to set       $_ === Set(X)
Any        Array      lists are comparable     @$_ «===» X
Hash       Hash       hash keys same set       $_.keys === X.keys
Set        Hash       hash keys same set       $_ === X.keys
Array      Hash       hash slice existence     X.contains(any @$_)
Regex      Hash       hash key grep            any(X.keys).match($_)
Scalar     Hash       hash entry existence     X.contains($_)
Any        Hash       hash slice existence     X.contains(any @$_)
Str        Regex      string pattern match     .match(X)
Hash       Regex      hash key "boolean grep"  .any.match(X)
Array      Regex      array "boolean grep"     .any.match(X)
Any        Regex      pattern match            .match(X)
Num        Range      in numeric range         X.min <= $_ <= X.max (mod ^'s)
Str        Range      in string range          X.min le $_ le X.max (mod ^'s)
Any        Range      in generic range         [!after] X.min,$_,X.max (etc.)
Any        Type       type membership          $_.does(X)
Signature  Signature  sig compatibility        $_ is a subset of X      ???
Code       Signature  sig compatibility        $_.sig is a subset of X  ???
Capture    Signature  parameters bindable      $_ could bind to X (doesn't!)
Any        Signature  parameters bindable      |$_ could bind to X (doesn't!)
Signature  Capture    parameters bindable      X could bind to $_
Any        Any        scalars are identical    $_ === X
The final rule is applied only if no other pattern type claims X.
All smartmatch types are "itemized"; both ~~ and given/when
provide item contexts to their arguments, and autothread any
junctive matches so that the eventual dispatch to .ACCEPTS never
sees anything "plural". So both $_ and X above are potentially
container objects that are treated as scalars. (You may hyperize
~~ explicitly, though. In this case all smartmatching is done
using the type-based dispatch to .ACCEPTS, not the form-based
dispatch at the front of the table.)
The exact form of the underlying type-based method dispatch is:
X.ACCEPTS($_) # for ~~
X.REJECTS($_) # for !~~
As a single dispatch call this pays attention only to the type of
X initially. The ACCEPTS method interface is defined by the
Pattern role. Any class composing the Pattern role may choose
to provide a single ACCEPTS method to handle everything, which
corresponds to those pattern types that have only one entry with
an Any on the left above. Or the class may choose to provide
multiple ACCEPTS multi-methods within the class, and these
will then redispatch within the class based on the type of $_.
The class may also define one or more REJECTS methods; if it does
not, the default REJECTS method from the Pattern role defines
it in terms of a negated ACCEPTS method call. This generic method
may be less efficient than a custom REJECTS method would be, however.
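The single-dispatch arrangement can be sketched in Python, where a method call likewise dispatches on the receiver only. All class and function names below are hypothetical; the base class supplies the default rejects in terms of a negated accepts, as the Pattern role is described as doing for REJECTS.

```python
class Pattern:
    """Model of the Pattern role: accepts is required, rejects has a default."""

    def accepts(self, topic):
        raise NotImplementedError

    def rejects(self, topic):
        # Generic default: the negation of accepts.
        return not self.accepts(topic)


class NumPattern(Pattern):
    """Model of a Num pattern: numeric equality, +$_ == X."""

    def __init__(self, n):
        self.n = n

    def accepts(self, topic):
        return float(topic) == self.n


class TypePattern(Pattern):
    """Model of a Type pattern: type membership."""

    def __init__(self, cls):
        self.cls = cls

    def accepts(self, topic):
        return isinstance(topic, self.cls)


def smartmatch(topic, pattern):       # models  $_ ~~ X
    return pattern.accepts(topic)


def not_smartmatch(topic, pattern):   # models  $_ !~~ X
    return pattern.rejects(topic)


print(smartmatch("42", NumPattern(42)))     # True
print(not_smartmatch(7, NumPattern(42)))    # True
print(smartmatch("abc", TypePattern(int)))  # False
```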
The smartmatch table is primarily intended to reflect forms and types that are recognized at compile time. To avoid an explosion of entries, the table assumes the following types will behave similarly:
Actual type              Use entries for
===========              ===============
List Seq                 Array
KeySet KeyBag KeyHash    Hash
Class Enum Role          Type
Subst Grammar            Regex
Char Cat                 Str
Int UInt etc.            Num
Match                    Capture
Byte                     Str or Int
Buf                      Str or Array of Int
(Note, however, that these mappings can be overridden by explicit
definition of the appropriate ACCEPTS and REJECTS methods.
If the redefinition occurs at compile time prior to analysis of the
smart match then the information is also available to the optimizer.)
A Buf type containing any bytes or integers outside the ASCII
range may silently promote to a Str type for pattern matching if
and only if its relationship to Unicode is clearly declared or typed.
This type information might come from an input filehandle, or the
Buf role may be a parametric type that allows you to instantiate
buffers with various known encodings. In the absence of such typing
information, you may still do pattern matching against the buffer, but
(apart from assuming the lowest 7 bits represent ASCII) any attempt
to treat the buffer as other than a sequence of integers is erroneous,
and warnings may be generously issued.
Matching against a Grammar object will call the TOP method
defined in the grammar. The TOP method may either be a rule
itself, or may call the actual top rule automatically. How the
Grammar determines the top rule is up to the grammar, but normal
Perl 6 grammars will default to setting top to the first rule in the
original base grammar. Derived grammars then inherit this idea of
the top rule. This may be overridden in either the base grammar or a
derived grammar by explicitly naming a rule TOP, or defining your
own TOP method to call some other rule.
Matching against a Signature does not actually bind any variables,
but only tests to see if the signature could bind. To really bind
to a signature, use the * pattern to delegate binding to the when
statement's block instead. Matching against * is special in that
it takes its truth from whether the subsequent block is bound against
the topic, so you can do ordered signature matching:
given $capture {
when * -> Int $a, Str $b { ... }
when * -> Str $a, Int $b { ... }
when * -> $a, $b { ... }
when * { ... }
}
This can be useful when the unordered semantics of multiple dispatch
are insufficient for defining the "pecking order" of code. Note that
you can bind to either a bare block or a pointy block. Binding to a
bare block conveniently leaves the topic in $_, so the final form
above is equivalent to a default. (Placeholder parameters may
also be used in the bare block form, though of course their types
cannot be specified that way.)
There is no pattern matching defined for the Any pattern, so if you
find yourself in the situation of wanting a reversed smartmatch test
with an Any on the right, you can almost always get it by explicit
call to the underlying ACCEPTS method using $_ as the pattern.
For example:
$_      X    Type of Match Wanted   What to use on the right
======  ===  ====================   ========================
Code    Any  item sub truth         .ACCEPTS(X) or .(X)
Range   Any  in range               .ACCEPTS(X)
Type    Any  type membership        .ACCEPTS(X) or .does(X)
Regex   Any  pattern match          .ACCEPTS(X)
etc.
Similar tricks will allow you to bend the default matching rules for composite objects as long as you start with a dotted method on $_:
given $somethingordered {
when .values.'[<=]' { say "increasing" }
when .values.'[>=]' { say "decreasing" }
}
In a pinch you can define a macro to do the "reversed when":
my macro statement_control:<ACCEPTS> () { "when .ACCEPTS: " }
given $pattern {
ACCEPTS $a { ... }
ACCEPTS $b { ... }
ACCEPTS $c { ... }
}
Various proposed-but-deprecated smartmatch behaviors may be easily (and we hope, more readably) emulated as follows:
$_      X       Type of Match Wanted     What to use on the right
======  ===     ====================     ========================
Array   Num     array element truth      .[X]
Array   Num     array contains number    *,X,*
Array   Str     array contains string    *,X,*
Array   Seq     array begins with seq    X,*
Array   Seq     array contains seq       *,X,*
Array   Seq     array ends with seq      *,X
Hash    Str     hash element truth       .{X}
Hash    Str     hash key existence       .contains(X)
Hash    Num     hash element truth       .{X}
Hash    Num     hash key existence       .contains(X)
Buf     Int     buffer contains int      .match(X)
Str     Char    string contains char     .match(X)
Str     Str     string contains string   .match(X)
Array   Scalar  array contains item      .any === X
Str     Array   array contains string    X.any
Num     Array   array contains number    X.any
Scalar  Array   array contains object    X.any
Hash    Array   hash slice exists        .contains(X.all) .contains(X.any)
Set     Set     subset relation          .contains(X)
Set     Hash    subset relation          .contains(X)
Any     Set     subset relation          .Set.contains(X)
Any     Hash    subset relation          .Set.contains(X)
Any     Set     superset relation        X.contains($_)
Any     Hash    superset relation        X.contains($_)
Any     Set     sets intersect           .contains(X.any)
Set     Array   subset relation          X,*          # (conjectured)
Array   Regex   match array as string    .Cat.match(X)  cat(@$_).match(X)
(Note that the .cat method and the Cat type coercion both take a
single object, unlike the cat function which, as a list operator,
takes a syntactic list (or multilist) and flattens it. All of these
return a Cat object, however.)
Boolean expressions are those known to return a boolean value, such
as comparisons, or the unary ? operator. They may reference $_
explicitly or implicitly. If they don't reference $_ at all, that's
okay too--in that case you're just using the switch structure as a more
readable alternative to a string of elsifs. Note, however, that this means
you can't write:
given $boolean {
when True {...}
when False {...}
}
because it will always choose the True case. Instead use something like:
given $boolean {
when .true {...}
when .not {...}
}
Better, just use an if statement.
Note also that regex matching does not return a Bool, but merely
a Match object that can be used as a boolean value. Use an explicit
? or true to force a Bool value if desired.
The primary use of the ~~ operator is to return a boolean value in
a boolean context. However, for certain operands such as regular
expressions, use of the operator within item or list context transfers
the context to that operand, so that, for instance, a regular expression
can return a list of matched substrings, as in Perl 5. This is done
by returning an object that can return a list in list context, or that
can return a boolean in a boolean context. In the case of regex matching
the Match object is a kind of Capture, which has these capabilities.
For the purpose of smartmatching, all Set and Bag values are
considered to be of type KeyHash, that is, Hash containers
where the keys represent the unique objects and the values represent
the replication count of those unique keys. (Obviously, a Set can
have only 0 or 1 replication because of the guarantee of uniqueness).
The Cat type allows you to have an infinitely extensible string.
You can match an array or iterator by feeding it to a Cat,
which is essentially a Str interface over an iterator of some sort.
Then a Regex can be used against it as if it were an ordinary
string. The Regex engine can ask the string if it has more
characters, and the string will extend itself if possible from its
underlying iterator. (Note that such strings have an indefinite
number of characters, so if you use .* in your pattern, or if you
ask the string how many characters it has in it, or if you even print
the whole string, it may feel compelled to slurp in the rest of
the string, which may or may not be expeditious.)
The cat operator takes a (potentially lazy) list and returns a
Cat object. In string context this coerces each of its elements
to strings lazily, and behaves as a string of indeterminate length.
You can search a gather like this:
my $lazystr := cat gather for @foo { take .bar }
$lazystr ~~ /pattern/;
The Cat interface allows the regex to match element boundaries
with the <,> assertion, and the StrPos objects returned by
the match can be broken down into element index and position within
that list element. If the underlying data structure is a mutable
array, changes to the array (such as by shift or pop) are tracked
by the Cat so that the element numbers remain correct. Strings,
arrays, lists, sequences, captures, and tree nodes can all be pattern
matched by regexes or by signatures more or less interchangeably.
However, the structure searched is not guaranteed to maintain a .pos
unless you are searching a Str or Cat.
An appended : marks the invocant when using the indirect-object
syntax for Perl 6 method calls. The following two statements are
equivalent:
$hacker.feed('Pizza and cola');
feed $hacker: 'Pizza and cola';
A colon may also be used on an ordinary method call to indicate that it should be parsed as a list operator:
$hacker.feed: 'Pizza and cola';
This colon is a separate token. A colon prefixing an adverb is not a separate token. Therefore, under the longest-token rule,
$hacker.feed:xxx('Pizza and cola');
is tokenized as an adverb applying to the method as its "toplevel preceding operator":
$hacker.feed :xxx('Pizza and cola');
not as an xxx sub in the argument list of .feed:
$hacker.feed: xxx('Pizza and cola'); # wrong
If you want both meanings of colon in order to supply both an adverb and some positional arguments, you have to put the colon twice:
$hacker.feed: :xxx('Pizza and cola'), 1,2,3;
(For similar reasons it's required to put whitespace after the colon of a label.)
Note in particular that because of adverbial precedence:
1 + $hacker.feed :xxx('Pizza and cola');
will apply the :xxx adverb to the + operator, not the method call.
This is not likely to succeed.
The new operators ==> and <== are akin to UNIX pipes,
but work with functions or statements that accept and return lists.
Since these lists are composed of discrete objects and not liquids,
we call these feed operators rather than pipes. For example,
@result = map { floor($^x / 2) },
grep { /^ \d+ $/ },
@data;
can also now be written with rightward feeds as:
@data ==> grep { /^ \d+ $/ }
==> map { floor($^x / 2) }
==> @result;
or with leftward feeds as:
@result <== map { floor($^x / 2) }
<== grep { /^ \d+ $/ }
<== @data;
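The data flow of the rightward form maps closely onto a chain of lazy generators. The following Python sketch is an analogy only; the sample data is invented for illustration, and integer division stands in for floor($^x / 2) on non-negative whole numbers.

```python
import re

data = ["10", "x7", "3", "", "42"]  # hypothetical input

# @data ==> grep { /^ \d+ $/ }
digits_only = (s for s in data if re.fullmatch(r"\d+", s))

# ==> map { floor($^x / 2) }
halved = (int(s) // 2 for s in digits_only)

# ==> @result;
result = list(halved)
print(result)  # [5, 1, 21]
```

Each stage pulls items from the stage before it only as needed, which is one way to picture the list-at-a-time flow that the feed operators express.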
Either form more clearly indicates the flow of data. See S06 for more of the (less-than-obvious) details on these two operators.
Perl 6's operators have been greatly regularized, for instance, by consistently prefixing numeric, strin