Match Specifications in Erlang
A "match specification" (match_spec) is an Erlang term describing a small "program" that tries to match something. It can be used either to control tracing with erlang:trace_pattern/3 or to search for objects in an ETS table with, for example, ets:select/2. A match specification works in many ways like a small Erlang function, but the Erlang runtime system interprets or compiles it into something much more efficient than an Erlang function call. A match specification is also very limited compared to the expressiveness of real Erlang functions.
The most notable difference between a match specification and an Erlang fun is the syntax. Match specifications are Erlang terms, not Erlang code. Also, a match specification has a strange concept of exceptions:
- An exception (such as badarg) in the MatchCondition part, which resembles an Erlang guard, generates immediate failure.
- An exception in the MatchBody part, which resembles the body of an Erlang function, is implicitly caught and results in the single atom 'EXIT'.
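The two exception rules above can be sketched with ets:test_ms/2, which runs a match specification against a single tuple. The module name ms_exceptions and its function names are hypothetical, chosen only for this illustration; hd/1 is used because it raises badarg when given a tuple.

```erlang
-module(ms_exceptions).
-export([condition_failure/0, body_failure/0]).

%% hd/1 raises badarg on a tuple. Inside MatchConditions the
%% exception makes the whole match fail.
condition_failure() ->
    MS = [{'$1', [{'=:=', {hd, '$1'}, a}], ['$_']}],
    ets:test_ms({a, b}, MS).

%% The same bad call inside MatchBody is implicitly caught, and the
%% body expression evaluates to the atom 'EXIT'.
body_failure() ->
    MS = [{'$1', [], [{hd, '$1'}]}],
    ets:test_ms({a, b}, MS).
```

Under the rules stated above, condition_failure/0 should report a failed match as {ok, false}, while body_failure/0 should report a successful match whose result is 'EXIT'.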
Grammar
A match specification used in tracing can be described in the following informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' | [ MatchHeadPart, ... ]
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ...] | []
- MatchCondition ::= { GuardFunction } | { GuardFunction, ConditionExpression, ... }
- BoolFunction ::= is_atom | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_map | is_map_key | is_binary | is_bitstring | is_boolean | is_function | is_record | is_seq_trace | 'and' | 'or' | 'not' | 'xor' | 'andalso' | 'orelse'
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ConditionExpression, ...] | #{} | #{term() => ConditionExpression, ...} | NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple or map)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs | element | hd | length | map_get | map_size | max | min | node | float | round | floor | ceil | size | bit_size | byte_size | tuple_size | tl | trunc | binary_part | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self | get_tcw
- MatchBody ::= [ ActionTerm ]
- ActionTerm ::= ConditionExpression | ActionCall
- ActionCall ::= {ActionFunction} | {ActionFunction, ActionTerm, ...}
- ActionFunction ::= set_seq_token | get_seq_token | message | return_trace | exception_trace | process_dump | enable_trace | disable_trace | trace | display | caller | caller_line | current_stacktrace | set_tcw | silent
A match specification used in ets can be described in the following informal grammar:
- MatchExpression ::= [ MatchFunction, ... ]
- MatchFunction ::= { MatchHead, MatchConditions, MatchBody }
- MatchHead ::= MatchVariable | '_' | { MatchHeadPart, ... }
- MatchHeadPart ::= term() | MatchVariable | '_'
- MatchVariable ::= '$<number>'
- MatchConditions ::= [ MatchCondition, ...] | []
- MatchCondition ::= { GuardFunction } | { GuardFunction, ConditionExpression, ... }
- BoolFunction ::= is_atom | is_float | is_integer | is_list | is_number | is_pid | is_port | is_reference | is_tuple | is_map | is_map_key | is_binary | is_bitstring | is_boolean | is_function | is_record | 'and' | 'or' | 'not' | 'xor' | 'andalso' | 'orelse'
- ConditionExpression ::= ExprMatchVariable | { GuardFunction } | { GuardFunction, ConditionExpression, ... } | TermConstruct
- ExprMatchVariable ::= MatchVariable (bound in the MatchHead) | '$_' | '$$'
- TermConstruct ::= {{}} | {{ ConditionExpression, ... }} | [] | [ConditionExpression, ...] | #{} | #{term() => ConditionExpression, ...} | NonCompositeTerm | Constant
- NonCompositeTerm ::= term() (not list or tuple or map)
- Constant ::= {const, term()}
- GuardFunction ::= BoolFunction | abs | element | hd | length | map_get | map_size | max | min | node | float | round | floor | ceil | size | bit_size | byte_size | tuple_size | tl | trunc | binary_part | '+' | '-' | '*' | 'div' | 'rem' | 'band' | 'bor' | 'bxor' | 'bnot' | 'bsl' | 'bsr' | '>' | '>=' | '<' | '=<' | '=:=' | '==' | '=/=' | '/=' | self
- MatchBody ::= [ ConditionExpression, ... ]
Function Descriptions
Functions Allowed in All Types of Match Specifications
The functions allowed in match_spec work as follows:
is_atom, is_boolean, is_float, is_integer, is_list, is_number, is_pid, is_port, is_reference, is_tuple, is_map, is_binary, is_bitstring, is_function
- Same as the corresponding guard tests in Erlang, return true or false.
is_record
- Takes an additional parameter, which must be the result of record_info(size, <record_type>), like in {is_record, '$1', rectype, record_info(size, rectype)}.
'not'
- Negates its single argument (anything other than false gives false).
'and'
- Returns true if all its arguments (variable length argument list) evaluate to true, otherwise false. Evaluation order is undefined.
'or'
- Returns true if any of its arguments evaluates to true. Variable length argument list. Evaluation order is undefined.
'andalso'
- Works as 'and', but quits evaluating its arguments when one argument evaluates to something else than true. Arguments are evaluated left to right.
'orelse'
- Works as 'or', but quits evaluating as soon as one of its arguments evaluates to true. Arguments are evaluated left to right.
'xor'
- Only two arguments, of which one must be true and the other false to return true; otherwise 'xor' returns false.
abs, element, hd, length, map_get, map_size, max, min, node, round, ceil, floor, float, size, bit_size, byte_size, tuple_size, tl, trunc, binary_part, '+', '-', '*', 'div', 'rem', 'band', 'bor', 'bxor', 'bnot', 'bsl', 'bsr', '>', '>=', '<', '=<', '=:=', '==', '=/=', '/=', self
- Same as the corresponding Erlang BIFs (or operators). In case of bad arguments, the result depends on the context. In the MatchConditions part of the expression, the test fails immediately (like in an Erlang guard). In the MatchBody part, exceptions are implicitly caught and the call results in the atom 'EXIT'.
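As an illustration of how these functions combine, the following sketch uses a type test, 'andalso', a comparison, and an arithmetic operator in one ETS-style match specification. The module name ms_guards and the function classify/1 are hypothetical; ets:test_ms/2 is used so that no table is needed.

```erlang
-module(ms_guards).
-export([classify/1]).

%% Matches two-element tuples {X, Y} where both elements are integers
%% and X > 0, and returns X + Y as the match result.
classify(Tuple) ->
    MS = [{{'$1', '$2'},
           [{'andalso', {is_integer, '$1'}, {is_integer, '$2'}},
            {'>', '$1', 0}],
           [{'+', '$1', '$2'}]}],
    ets:test_ms(Tuple, MS).
```

For example, ms_guards:classify({1, 2}) should give {ok, 3}, while {a, 2} and {0, 2} should both give {ok, false} because a condition does not evaluate to true.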
Functions Allowed Only for Tracing
The functions allowed only for tracing work as follows:
is_seq_trace
- Returns true if a sequential trace token is set for the current process, otherwise false.
set_seq_token
- Works as seq_trace:set_token/2, but returns true on success, and 'EXIT' on error or bad argument. Only allowed in the MatchBody part and only allowed when tracing.
get_seq_token
- Same as seq_trace:get_token/0 and only allowed in the MatchBody part when tracing.
message
- Sets an additional message appended to the trace message sent. One can only set one additional message in the body. Later calls replace the appended message.
  As a special case, {message, false} disables sending of trace messages ('call' and 'return_to') for this function call, just like if the match specification had not matched. This can be useful if only the side effects of the MatchBody part are desired.
  Another special case is {message, true}, which sets the default behavior, as if the function had no match specification; trace message is sent with no extra information (if no other calls to message are placed before {message, true}, it is in fact a "noop").
  Takes one argument: the message. Returns true and can only be used in the MatchBody part and when tracing.
return_trace
- Causes a return_from trace message to be sent upon return from the current function. Takes no arguments, returns true and can only be used in the MatchBody part when tracing. If the process trace flag silent is active, the return_from trace message is inhibited.
  Warning: If the traced function is tail-recursive, this match specification function destroys that property. Hence, if a match specification executing this function is used on a perpetual server process, it can only be active for a limited period of time, or the emulator will eventually use all memory in the host machine and crash. If this match specification function is inhibited using process trace flag silent, tail-recursiveness still remains.
exception_trace
- Works as return_trace; in addition, if the traced function exits because of an exception, an exception_from trace message is generated, regardless of whether the exception is caught or not.
process_dump
- Returns some textual information about the current process as a binary. Takes no arguments and is only allowed in the MatchBody part when tracing.
enable_trace
- Enable a trace flag for a process.
  With one parameter this function turns on tracing like the Erlang call trace:process(S, self(), true, [P2]), where S is the current trace session and P2 is the parameter to enable_trace.
  With two parameters, the first parameter is to be either a process identifier or the registered name of a process. In this case tracing is turned on for the designated process in the same way as in the Erlang call trace:process(S, P1, true, [P2]), where P1 is the first and P2 is the second argument.
  P1 cannot be one of the atoms all, new or existing (unless they are registered names). P2 cannot be cpu_timestamp or tracer.
  Returns true and can only be used in the MatchBody part when tracing.
  If used by the legacy function erlang:trace_pattern/3, the process P1 gets its trace messages sent to the same tracer as the process executing the statement uses.
disable_trace
- Disable a trace flag for a process.
  With one parameter this function disables tracing like the Erlang call trace:process(S, self(), false, [P2]), where S is the current trace session and P2 is the parameter to disable_trace.
  With two parameters this function works as the Erlang call trace:process(S, P1, false, [P2]), where P1 can be either a process identifier or a registered name and is specified as the first argument to the match specification function.
  P2 cannot be cpu_timestamp or tracer.
  Returns true and can only be used in the MatchBody part when tracing.
trace
- Enable and/or disable trace flags for a process.
  With two parameters this function takes a list of trace flags to disable as first parameter and a list of trace flags to enable as second parameter. Logically, the disable list is applied first, but effectively all changes are applied atomically. The trace flags are the same as for trace:process/4, not including cpu_timestamp.
  With three parameters to this function, the first is either a process identifier or the registered name of a process to set trace flags on, the second is the disable list, and the third is the enable list.
  When used via the new trace API, trace flag tracer is not allowed and the receiving tracer is always the tracer of the current session.
  When used via the legacy function erlang:trace_pattern/3, trace flag tracer is allowed. If no tracer is specified, the same tracer as the process executing the match specification is used (not the meta tracer). If that process doesn't have a tracer either, then trace flags are ignored.
  When using a tracer module, the module must be loaded before the match specification is executed. If it is not loaded, the match fails.
  Returns true if any trace property was changed for the trace target process, otherwise false. Can only be used in the MatchBody part when tracing.
caller
- Returns the calling function as a tuple {Module, Function, Arity} or the atom undefined if the calling function cannot be determined. Can only be used in the MatchBody part when tracing.
  Notice that if a "technically built in function" (that is, a function not written in Erlang) is traced, the caller function sometimes returns the atom undefined. The calling Erlang function is not available during such calls.
caller_line
- Similar to caller but returns additional information about the source code location of the function call-site within the caller function. Returns the calling function as a tuple {Module, Function, Arity, {File, Line}}. File is the string file name while Line is source line number. If the File and Line cannot be determined, {Module, Function, Arity, undefined} is returned. If the calling function cannot be determined, the atom undefined is returned. Can only be used in the MatchBody part when tracing.
  Notice that if a "technically built in function" (that is, a function not written in Erlang) is traced, the caller_line function sometimes returns the atom undefined. The calling Erlang function is not available during such calls.
current_stacktrace
- Returns the current call stack back-trace (stacktrace) of the calling function. The stack has the same format as in the catch part of a try. See The call-stack back trace (stacktrace). The depth of the stacktrace is truncated according to the backtrace_depth system flag setting.
  Accepts a depth parameter. The depth value will be backtrace_depth if the argument is greater.
display
- For debugging purposes only. Displays the single argument as an Erlang term on stdout, which is seldom what is wanted. Returns true and can only be used in the MatchBody part when tracing.
get_tcw
- Takes no argument and returns the value of the node's trace control word. The same is done by erlang:system_info(trace_control_word).
  The trace control word is a 32-bit unsigned integer intended for generic trace control. The trace control word can be tested and set both from within trace match specifications and with BIFs. This call is only allowed when tracing.
set_tcw
- Takes one unsigned integer argument, sets the value of the node's trace control word to the value of the argument, and returns the previous value. The same is done by erlang:system_flag(trace_control_word, Value). It is only allowed to use set_tcw in the MatchBody part when tracing.
silent
- Takes one argument. If the argument is true, the call trace message mode for the current process is set to silent for this call and all later calls, that is, call trace messages are inhibited even if {message, true} is called in the MatchBody part for a traced function.
  This mode can also be activated with flag silent to erlang:trace/3.
  If the argument is false, the call trace message mode for the current process is set to normal (non-silent) for this call and all later calls.
  If the argument is not true or false, the call trace message mode is unaffected.
Note
All "function calls" must be tuples, even if they take no arguments. The value of self is the atom() self, but the value of {self} is the pid() of the current process.
Match target
Each execution of a match specification is done against a match target term. The format and content of the target term depends on the context in which the match is done. The match target for ETS is always a full table tuple. The match target for call trace is always a list of all function arguments. The match target for event trace depends on the event type, see table below.
Context | Type | Match target | Description |
---|---|---|---|
ETS | | {Key, Value1, Value2, ...} | A table object |
Trace | call | [Arg1, Arg2, ...] | Function arguments |
Trace | send | [Receiver, Message] | Receiving process/port and message term |
Trace | 'receive' | [Node, Sender, Message] | Sending node, process/port and message term |
Table: Match target depending on context
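The shape of the match target in the two main contexts can be sketched as follows. The module name ms_target is hypothetical. The ETS case uses ets:test_ms/2; the call trace case uses the BIF erlang:match_spec_test/3, which is not described in this text and is used here on the assumption that it returns a tuple {ok, Result, Flags, Warnings} for the trace type.

```erlang
-module(ms_target).
-export([ets_target/0, call_target/1]).

%% ETS context: the match target is the whole table object (a tuple),
%% so '$_' in the body returns that tuple.
ets_target() ->
    ets:test_ms({key, v1, v2}, [{{key, '_', '_'}, [], ['$_']}]).

%% Call trace context: the match target is the list of arguments,
%% so the MatchHead is written as a list.
call_target(Args) when is_list(Args) ->
    MS = [{[a, '$1'], [], []}],
    {ok, Result, _Flags, _Warnings} =
        erlang:match_spec_test(Args, MS, trace),
    Result.
```

With these definitions, ms_target:call_target([a, b]) is expected to report a match and ms_target:call_target([x, b]) is expected not to.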
Variables and Literals
Variables take the form '$<number>', where <number> is an integer between 0 and 100,000,000 (1e+8). The behavior if the number is outside these limits is undefined. In the MatchHead part, the special variable '_' matches anything, and never gets bound (like _ in Erlang).
- In the MatchCondition/MatchBody parts, no unbound variables are allowed, so '_' is interpreted as itself (an atom). Variables can only be bound in the MatchHead part.
- In the MatchBody and MatchCondition parts, only variables bound previously can be used.
- As a special case, the following apply in the MatchCondition/MatchBody parts:
  - The variable '$_' expands to the whole match target term.
  - The variable '$$' expands to a list of the values of all bound variables in order (that is, ['$1','$2', ...]).
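The variable rules above can be tried out with a short sketch based on ets:test_ms/2. The module name ms_vars and its function names are hypothetical.

```erlang
-module(ms_vars).
-export([whole_target/0, all_bound/0, underscore_atom/0]).

%% '$_' in the body expands to the whole match target term.
whole_target() ->
    ets:test_ms({a, b, c}, [{{'$1', '$2', '_'}, [], ['$_']}]).

%% '$$' expands to the list of bound variable values, in order.
%% The third element is matched by '_' and is therefore not bound.
all_bound() ->
    ets:test_ms({a, b, c}, [{{'$1', '$2', '_'}, [], ['$$']}]).

%% Outside the MatchHead, '_' is just the atom '_'.
underscore_atom() ->
    ets:test_ms({a, b, c}, [{{'$1', '$2', '_'}, [], ['_']}]).
```

These calls should return {ok, {a, b, c}}, {ok, [a, b]}, and {ok, '_'} respectively.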
In the MatchHead part, all literals (except the variables above) are interpreted "as is".
In the MatchCondition/MatchBody parts, the interpretation is in some ways different. Literals in these parts can either be written "as is", which works for all literals except tuples, or by using the special form {const, T}, where T is any Erlang term.
For tuple literals in the match specification, double tuple parentheses can also be used, that is, construct them as a tuple of arity one containing a single tuple, which is the one to be constructed. The "double tuple parenthesis" syntax is useful to construct tuples from already bound variables, like in {{'$1', [a,b,'$2']}}. Examples:
Expression | Variable Bindings | Result |
---|---|---|
{{'$1','$2'}} | '$1' = a, '$2' = b | {a,b} |
{const, {'$1', '$2'}} | Irrelevant | {'$1', '$2'} |
a | Irrelevant | a |
'$1' | '$1' = [] | [] |
[{{a}}] | Irrelevant | [{a}] |
['$1'] | '$1' = [] | [[]] |
42 | Irrelevant | 42 |
"hello" | Irrelevant | "hello" |
$1 | Irrelevant | 49 (the ASCII value for character '1') |
Table: Literals in MatchCondition/MatchBody Parts of a Match Specification
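Two rows of the table can be reproduced with ets:test_ms/2, as a quick check of the double tuple and {const, T} forms. The module name ms_literals and its function names are hypothetical.

```erlang
-module(ms_literals).
-export([build/0, constant/0]).

%% Double tuple parentheses build a tuple from bound variables.
build() ->
    ets:test_ms({a, b}, [{{'$1', '$2'}, [], [{{'$2', '$1'}}]}]).

%% {const, T} returns T untouched, so the variables are not expanded.
constant() ->
    ets:test_ms({a, b}, [{{'$1', '$2'}, [], [{const, {'$1', '$2'}}]}]).
```

With '$1' = a and '$2' = b, build/0 should return {ok, {b, a}} and constant/0 should return {ok, {'$1', '$2'}}.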
Execution of the Match
The execution of the match expression, when the runtime system decides whether a trace message is to be sent, is as follows:
For each tuple in the MatchExpression list and while no match has succeeded:
- Match the MatchHead part against the match target term, binding the '$<number>' variables (much like in ets:match/2). If the MatchHead part cannot match the arguments, the match fails.
- Evaluate each MatchCondition (where only '$<number>' variables previously bound in the MatchHead part can occur) and expect it to return the atom true. When a condition does not evaluate to true, the match fails. If any BIF call generates an exception, the match also fails.
- Two cases can occur:
  - If the match specification is executing when tracing: Evaluate each ActionTerm in the same way as the MatchConditions, but ignore the return values. Regardless of what happens in this part, the match has succeeded.
  - If the match specification is executed when selecting objects from an ETS table: Evaluate the expressions in order and return the value of the last expression (typically there is only one expression in this context).
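The execution order described above can be sketched for the ETS case with ets:test_ms/2. The module name ms_order and the function run/1 are hypothetical. The first clause only matches when its condition holds; the second clause has two body expressions, of which the last one is the result.

```erlang
-module(ms_order).
-export([run/1]).

%% Clauses are tried in order until one matches.
run(Object) ->
    MS = [%% Clause 1: two-element tuple whose first element is an atom.
          {{'$1', '_'}, [{is_atom, '$1'}], [atom_key]},
          %% Clause 2: any two-element tuple; the value of the last
          %% body expression is returned.
          {{'$1', '$2'}, [], ['$1', '$2']}],
    ets:test_ms(Object, MS).
```

For example, ms_order:run({a, 1}) should stop at the first clause and give {ok, atom_key}, ms_order:run({1, 2}) should fall through to the second clause and give {ok, 2}, and a three-element tuple should match neither clause.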
Differences between Match Specifications in ETS and Tracing
ETS match specifications produce a return value. Usually the MatchBody contains one single ConditionExpression that defines the return value without any side effects. Calls with side effects are not allowed in the ETS context.
When tracing there is no return value to produce; the match specification either matches or does not. The effect when the expression matches is a trace message rather than a returned term. The ActionTerms are executed as in an imperative language, that is, for their side effects. Functions with side effects are also allowed when tracing.
Tracing Examples
Match an argument list of three, where the first and third arguments are equal:
[{['$1', '_', '$1'],
[],
[]}]
Match an argument list of three, where the second argument is a number > 3:
[{['_', '$1', '_'],
[{ '>', '$1', 3}],
[]}]
Match an argument list of three, where the third argument is either a tuple containing argument one and two, or a list beginning with argument one and two (that is, [a,b,[a,b,c]] or [a,b,{a,b}]):
[{['$1', '$2', '$3'],
[{'orelse',
{'=:=', '$3', {{'$1','$2'}}},
{'and',
{'=:=', '$1', {hd, '$3'}},
{'=:=', '$2', {hd, {tl, '$3'}}}}}],
[]}]
The above problem can also be solved as follows:
[{['$1', '$2', {'$1', '$2'}], [], []},
{['$1', '$2', ['$1', '$2' | '_']], [], []}]
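A tracing match specification like the two-clause one above can be checked without enabling any tracing. The sketch below assumes the BIF erlang:match_spec_test/3, which is not covered in this text, and assumes that for the trace type it returns {ok, Result, Flags, Warnings} with Result being false when nothing matches. The module name ms_trace_check is hypothetical.

```erlang
-module(ms_trace_check).
-export([matches/1]).

%% Runs the two-clause match specification against an argument list
%% and returns the Result element reported by the runtime system.
matches(Args) when is_list(Args) ->
    MS = [{['$1', '$2', {'$1', '$2'}], [], []},
          {['$1', '$2', ['$1', '$2' | '_']], [], []}],
    {ok, Result, _Flags, _Warnings} =
        erlang:match_spec_test(Args, MS, trace),
    Result.
```

Under those assumptions, both [a, b, {a, b}] and [a, b, [a, b, c]] are expected to match, while [a, b, c] is not.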
Match two arguments, where the first is a tuple beginning with a list that in turn begins with the second argument times two (that is, [{[4,x],y},2] or [{[8], y, z},4]):
[{['$1', '$2'],[{'=:=', {'*', 2, '$2'}, {hd, {element, 1, '$1'}}}],
[]}]
Match three arguments. When all three are equal and are numbers, append the process dump to the trace message, otherwise let the trace message be "as is", but set the sequential trace token label to 4711:
[{['$1', '$1', '$1'],
[{is_number, '$1'}],
[{message, {process_dump}}]},
{'_', [], [{set_seq_token, label, 4711}]}]
As can be noted above, the parameter list can be matched against a single MatchVariable or an '_'. To replace the whole parameter list with a single variable is a special case. In all other cases the MatchHead must be a proper list.
Generate a trace message only if the trace control word is set to 1:
[{'_',
[{'==',{get_tcw},{const, 1}}],
[]}]
Generate a trace message only if there is a seq_trace token:
[{'_',
[{is_seq_trace}],
[]}]
Remove the 'silent' trace flag when the first argument is 'verbose', and add it when it is 'silent':
[{'$1',
[{'==',{hd, '$1'},verbose}],
[{trace, [silent],[]}]},
{'$1',
[{'==',{hd, '$1'},silent}],
[{trace, [],[silent]}]}]
Add a return_trace message if the function is of arity 3:
[{'$1',
[{'==',{length, '$1'},3}],
[{return_trace}]},
{'_',[],[]}]
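The effect of the arity check can be observed with erlang:match_spec_test/3. That BIF is not covered in this text; the sketch assumes that it returns {ok, Result, Flags, Warnings} for the trace type and that Flags contains return_trace when the {return_trace} action has run. The module name ms_return_trace is hypothetical.

```erlang
-module(ms_return_trace).
-export([flags_for/1]).

%% Returns the Flags element reported for a given argument list.
flags_for(Args) when is_list(Args) ->
    MS = [{'$1',
           [{'==', {length, '$1'}, 3}],
           [{return_trace}]},
          {'_', [], []}],
    {ok, _Result, Flags, _Warnings} =
        erlang:match_spec_test(Args, MS, trace),
    Flags.
```

With three arguments the first clause runs and return_trace is expected among the flags; with any other arity the catch-all clause matches and no flags are expected.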
Generate a trace message only if the function is of arity 3 and the first argument is 'trace':
[{['trace','$2','$3'],
[],
[]}]
ETS Examples
Match all objects in an ETS table, where the first element is the atom 'strider' and the tuple arity is 3, and return the whole object:
[{{strider,'_','_'},
[],
['$_']}]
Match all objects in an ETS table that have arity > 1 and whose first element is 'gandalf', and return element 2:
[{'$1',
[{'==', gandalf, {element, 1, '$1'}},{'>=',{size, '$1'},2}],
[{element,2,'$1'}]}]
In this example, if the first element had been the key, it would be much more efficient to match that key in the MatchHead part than in the MatchConditions part. The search space of the tables is restricted with regard to the MatchHead so that only objects with the matching key are searched.
Match tuples of three elements, where the second element is either 'merry' or 'pippin', and return the whole objects:
[{{'_',merry,'_'},
[],
['$_']},
{{'_',pippin,'_'},
[],
['$_']}]
Function ets:test_ms/2 can be useful for testing complicated ETS matches.
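As a closing illustration, the 'merry'/'pippin' match specification can be run against a real table with ets:select/2. The module name ms_ets_demo, the table name, and the table contents are made up for this sketch. An ordered_set table is used only so that the order of the result is predictable.

```erlang
-module(ms_ets_demo).
-export([run/0]).

%% Creates a small table, selects the three-element tuples whose second
%% element is merry or pippin, and deletes the table again.
run() ->
    Tab = ets:new(fellowship, [ordered_set]),
    true = ets:insert(Tab, [{1, merry, hobbit},
                            {2, gandalf, wizard},
                            {3, pippin, hobbit},
                            {4, merry}]),        % arity 2, not matched
    MS = [{{'_', merry, '_'}, [], ['$_']},
          {{'_', pippin, '_'}, [], ['$_']}],
    Result = ets:select(Tab, MS),
    true = ets:delete(Tab),
    Result.
```

Calling ms_ets_demo:run() should return the two hobbit tuples with keys 1 and 3; the tuple {4, merry} is skipped because the MatchHead requires arity 3.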