Data Types
Erlang provides a number of data types, which are listed in this section.
Note that Erlang has no user-defined types, only composite types (data structures) made of Erlang terms. This means that any function testing for a composite type, typically named is_type/1, might return true for a term that coincides with the chosen representation. The corresponding functions for built-in types do not suffer from this.
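The caveat above can be sketched with a small module. The module name point_demo, the {point, X, Y} representation, and the function is_point/1 are hypothetical names chosen for illustration; they are not part of Erlang/OTP.

```erlang
-module(point_demo).
-export([new/2, is_point/1]).

%% Hypothetical composite type: a point is represented as {point, X, Y}.
new(X, Y) when is_number(X), is_number(Y) ->
    {point, X, Y}.

%% Tests only the shape of the term, so it cannot tell a point built by
%% new/2 from any other tuple that happens to have the same representation.
is_point({point, X, Y}) when is_number(X), is_number(Y) -> true;
is_point(_) -> false.
```

Here point_demo:is_point({point, 1, 2}) returns true even though the tuple was written by hand and never passed through new/2. A built-in test such as is_tuple/1 has no such ambiguity.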
Terms
A piece of data of any data type is called a term.
Number
There are two types of numeric literals, integers and floats. Besides the conventional notation, there are two Erlang-specific notations:
- $char: ASCII value or Unicode code point of the character char.
- base#value: Integer with the base base, which must be an integer in the range 2 through 36.
Leading zeroes are ignored. Single underscore characters (_) can be inserted between digits as a visual separator.
Examples:
1> 42.
42
2> -1_234_567_890.
-1234567890
3> $A.
65
4> $\n.
10
5> 2#101.
5
6> 16#1f.
31
7> 16#4865_316F_774F_6C64.
5216630098191412324
8> 2.3.
2.3
9> 2.3e3.
2.3e3
10> 2.3e-3.
0.0023
11> 1_234.333_333.
1234.333333
12> 36#helloworld.
1767707668033969
Comparisons
Both integers and floats share the same linear order. That is, 1 compares less than 2.4, 3 compares greater than 2.99999, and 5 is equal to 5.0.
When comparing an integer with another integer, or a float with another float, it may be tempting to use the term equivalence operators (=:=, =/=) or pattern matching. This works for integers, which have a distinct representation for every number, but there is a surprising edge case for floating-point numbers: they have two representations for zero, which are considered different by the term equivalence operators and pattern matching.
If you wish to compare floating-point numbers numerically, use the regular comparison operators (such as ==) and add guards that require both arguments to be floating-point.
Note
Prior to OTP 27, the term equivalence operators had a bug where they considered 0.0 and -0.0 to be the same term. Legacy code that makes equality comparisons on floating-point zero should migrate to using the equal-to (==) operator with is_float/1 guards, and compiler warnings have been added to that effect. These can be silenced by writing +0.0 instead, which is the same as 0.0 but makes the compiler interpret the comparison as being purposely made against 0.0.
Note that this does not break compatibility with IEEE 754, which mandates that 0.0 and -0.0 should compare equal: they are equal when interpreted as numbers (==), and unequal when interpreted as opaque terms (=:=).
Examples:
1> 0.0 =:= +0.0.
true
2> 0.0 =:= -0.0.
false
3> +0.0 =:= -0.0.
false
4> +0.0 == -0.0.
true
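The advice to compare floats numerically under is_float/1 guards could look like the following minimal sketch. The module name float_cmp and its function names are assumptions made for this illustration.

```erlang
-module(float_cmp).
-export([same_number/2, same_term/2]).

%% Numeric equality: both arguments must be floats, and 0.0 == -0.0 holds.
same_number(A, B) when is_float(A), is_float(B) ->
    A == B.

%% Term equivalence: on OTP 27 and later, 0.0 and -0.0 are different terms.
same_term(A, B) when is_float(A), is_float(B) ->
    A =:= B.
```

float_cmp:same_number(+0.0, -0.0) returns true, while on OTP 27 and later float_cmp:same_term(+0.0, -0.0) returns false. Calling either function with an integer argument fails with a function_clause error, which keeps accidental integer/float mixing visible.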
Representation of Floating Point Numbers
When working with floats you may not see what you expect when printing or doing arithmetic operations. This is because floats are represented by a fixed number of bits in a base-2 system while printed floats are represented with a base-10 system. Erlang uses 64-bit floats. Here are examples of this phenomenon:
1> 0.1+0.2.
0.30000000000000004
The real numbers 0.1 and 0.2 cannot be represented exactly as floats.
1> {36028797018963968.0, 36028797018963968 == 36028797018963968.0,
36028797018963970.0, 36028797018963970 == 36028797018963970.0}.
{3.602879701896397e16, true,
3.602879701896397e16, false}
The value 36028797018963968 can be represented exactly as a float value, but Erlang's pretty printer rounds 36028797018963968.0 to 3.602879701896397e16 (=36028797018963970.0), as all values in the range [36028797018963966.0, 36028797018963972.0] are represented by 36028797018963968.0.
For more information about floats and issues with them see:
- What Every Programmer Should Know About Floating-Point Arithmetic
- 0.30000000000000004.com/
- Floating Point Arithmetic: Issues and Limitations
If you need to work with exact decimal fractions, for instance to represent money, it is recommended to use a library that handles that, or work in cents instead of dollars or euros so that decimal fractions are not needed.
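As a rough sketch of the "work in cents" approach, amounts can be kept as integers and only formatted as decimals for display. The module name money_demo and both function names are hypothetical.

```erlang
-module(money_demo).
-export([add/2, format/1]).

%% Amounts are integers counting cents, so addition is exact.
add(CentsA, CentsB) when is_integer(CentsA), is_integer(CentsB) ->
    CentsA + CentsB.

%% Render a non-negative amount in cents as a decimal string.
%% "~2..0b" prints an integer in a field of width 2, padded with zeros.
format(Cents) when is_integer(Cents), Cents >= 0 ->
    lists:flatten(io_lib:format("~b.~2..0b", [Cents div 100, Cents rem 100])).
```

With this representation, money_demo:format(money_demo:add(10, 20)) returns "0.30", whereas the float expression 0.1+0.2 shown earlier does not evaluate to exactly 0.3.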
Also note that Erlang's floats do not exactly match IEEE 754 floats, in that neither Inf nor NaN is supported in Erlang. Any operation that would result in NaN, +Inf, or -Inf will instead raise a badarith exception.
Examples:
1> 1.0 / 0.0.
** exception error: an error occurred when evaluating an arithmetic expression
     in operator '/'/2
        called as 1.0 / 0.0
2> 0.0 / 0.0.
** exception error: an error occurred when evaluating an arithmetic expression
     in operator '/'/2
        called as 0.0 / 0.0
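Code that must not crash on such operations can catch the badarith error explicitly. The following is one possible sketch; the module name safe_div and the {ok, _} | {error, badarith} return convention are assumptions for this example.

```erlang
-module(safe_div).
-export([divide/2]).

%% Returns {ok, Quotient}, or {error, badarith} in the cases where
%% IEEE 754 arithmetic would have produced NaN or an infinity.
divide(A, B) when is_number(A), is_number(B) ->
    try
        {ok, A / B}
    catch
        error:badarith ->
            {error, badarith}
    end.
```

safe_div:divide(1, 2) returns {ok, 0.5}, and both safe_div:divide(1.0, 0.0) and safe_div:divide(0.0, 0.0) return {error, badarith}.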
Atom
An atom is a literal, a constant with a name. An atom must be enclosed in single quotes (') if it does not begin with a lower-case letter or if it contains characters other than alphanumeric characters, underscore (_), or @.
Examples:
hello
phone_number
name@node
'Monday'
'phone number'
Bit Strings and Binaries
A bit string is used to store an area of untyped memory.
Bit strings are expressed using the bit syntax.
Bit strings that consist of a number of bits that are evenly divisible by eight are called binaries.
Examples:
1> <<10,20>>.
<<10,20>>
2> <<"ABC">>.
<<"ABC">>
3> <<1:1,0:1>>.
<<2:2>>
The is_bitstring/1 BIF tests whether a term is a bit string, and the is_binary/1 BIF tests whether a term is a binary.
Examples:
1> is_bitstring(<<1:1>>).
true
2> is_binary(<<1:1>>).
false
3> is_binary(<<42>>).
true
For more examples, see Programming Examples.
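The relationship between bit strings and binaries can also be explored with bit_size/1 and with pattern matching in the bit syntax. This is a small sketch; the module name bits_demo and its functions are hypothetical.

```erlang
-module(bits_demo).
-export([sizes/1, split_byte/1]).

%% Returns {SizeInBits, IsBinary} for any bit string. A bit string is a
%% binary exactly when its size in bits is evenly divisible by eight.
sizes(Bits) when is_bitstring(Bits) ->
    {bit_size(Bits), is_binary(Bits)}.

%% Splits one byte into its high and low 4-bit halves using the bit syntax.
split_byte(<<High:4, Low:4>>) ->
    {High, Low}.
```

bits_demo:sizes(<<1:1,0:1>>) returns {2,false}, bits_demo:sizes(<<"ABC">>) returns {24,true}, and bits_demo:split_byte(<<16#A5>>) returns {10,5}.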
Reference
A reference is a term that is unique among connected nodes. A reference is created by calling the make_ref/0 BIF. The is_reference/1 BIF tests whether a term is a reference.
Examples:
1> Ref = make_ref().
#Ref<0.76482849.3801088007.198204>
2> is_reference(Ref).
true
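Because references are unique, a common use is to tag a request so that the matching reply can be picked out of the mailbox. The sketch below illustrates the idea; the module name ref_demo, the message shapes, and the 1000 ms timeout are assumptions made for this example.

```erlang
-module(ref_demo).
-export([call/2, echo_server/0]).

%% Sends Request to Pid tagged with a fresh reference and waits for the
%% reply carrying the same reference, so unrelated messages are ignored.
call(Pid, Request) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Request},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        timeout
    end.

%% Minimal server loop that echoes each request back to its sender,
%% tagged with the reference it received.
echo_server() ->
    receive
        {From, Ref, Request} ->
            From ! {Ref, Request},
            echo_server()
    end.
```

After Pid = spawn(ref_demo, echo_server, []), the call ref_demo:call(Pid, hello) returns hello.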
Fun
A fun is a functional object. Funs make it possible to create an anonymous function and pass the function itself (not its name) as an argument to other functions.
Examples:
1> Fun1 = fun (X) -> X+1 end.
#Fun<erl_eval.6.39074546>
2> Fun1(2).
3
The is_function/1 and is_function/2 BIFs test whether a term is a fun.
Examples:
1> F = fun() -> ok end.
#Fun<erl_eval.43.105768164>
2> is_function(F).
true
3> is_function(F, 0).
true
4> is_function(F, 1).
false
Read more about funs in Fun Expressions. For more examples, see Programming Examples.
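To illustrate passing a fun as an argument and returning one as a result, here is a brief sketch. The module name fun_demo and its functions are hypothetical.

```erlang
-module(fun_demo).
-export([add_n/1, apply_twice/2]).

%% Returns a fun that captures N from the enclosing function.
add_n(N) ->
    fun(X) -> X + N end.

%% Takes a fun of arity 1 as an argument and applies it twice.
apply_twice(F, X) when is_function(F, 1) ->
    F(F(X)).
```

fun_demo:apply_twice(fun_demo:add_n(3), 10) returns 16, and lists:map(fun_demo:add_n(1), [1,2,3]) returns [2,3,4].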
Port Identifier
A port identifier identifies an Erlang port. open_port/2 returns a port identifier. The is_port/1 BIF tests whether a term is a port identifier.
Read more about ports in Ports and Port Drivers.
Pid
Pid is an abbreviation for process identifier. Each process has a Pid which identifies the process. Pids are unique among processes that are alive on connected nodes. However, a Pid of a terminated process may be reused as a Pid for a new process after a while.
The BIF self/0 returns the Pid of the calling process. When creating a new process, the parent process can get the Pid of the child process either via the return value, as is the case when calling the spawn/3 BIF, or via a message, which is the case when calling the spawn_request/5 BIF. A Pid is typically used when sending a process a signal. The is_pid/1 BIF tests whether a term is a Pid.
Example:
-module(m).
-export([loop/0]).

loop() ->
    receive
        who_are_you ->
            io:format("I am ~p~n", [self()]),
            loop()
    end.

1> P = spawn(m, loop, []).
<0.58.0>
2> P ! who_are_you.
I am <0.58.0>
who_are_you
Read more about processes in Processes.
Tuple
A tuple is a compound data type with a fixed number of terms:
{Term1,...,TermN}
Each term Term in the tuple is called an element. The number of elements is said to be the size of the tuple.
There are a number of BIFs to manipulate tuples.
Examples:
1> P = {adam,24,{july,29}}.
{adam,24,{july,29}}
2> element(1,P).
adam
3> element(3,P).
{july,29}
4> P2 = setelement(2,P,25).
{adam,25,{july,29}}
5> tuple_size(P).
3
6> tuple_size({}).
0
7> is_tuple({a,b,c}).
true
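Besides element/2, tuple elements are often extracted by pattern matching. A minimal sketch, where the module name tuple_demo and the function birth_month/1 are hypothetical and the tuple shape follows the example above:

```erlang
-module(tuple_demo).
-export([birth_month/1]).

%% Matches a {Name, Age, {Month, Day}} tuple and picks out the month.
%% Any term of a different shape fails with a function_clause error.
birth_month({_Name, _Age, {Month, _Day}}) ->
    Month.
```

tuple_demo:birth_month({adam,24,{july,29}}) returns july.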
Map
A map is a compound data type with a variable number of key-value associations:
#{Key1 => Value1, ..., KeyN => ValueN}
Each key-value association in the map is called an association pair. The key and value parts of the pair are called elements. The number of association pairs is said to be the size of the map.
There are a number of BIFs to manipulate maps.
Examples:
1> M1 = #{name => adam, age => 24, date => {july,29}}.
#{age => 24,date => {july,29},name => adam}
2> maps:get(name, M1).
adam
3> maps:get(date, M1).
{july,29}
4> M2 = maps:update(age, 25, M1).
#{age => 25,date => {july,29},name => adam}
5> map_size(M1).
3
6> map_size(#{}).
0
A collection of map processing functions is found in module maps in STDLIB.
Read more about maps in Map Expressions.
Change
Maps were introduced as an experimental feature in Erlang/OTP R17. Their functionality was extended and became fully supported in Erlang/OTP 18.
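Maps can also be matched and updated with the map syntax directly, as an alternative to the maps module calls in the examples above. The following sketch uses hypothetical names (map_demo, birthday/1, name_of/1).

```erlang
-module(map_demo).
-export([birthday/1, name_of/1]).

%% Matches only maps that already contain the key age, then updates that
%% key with the := syntax. Other terms fail with a function_clause error.
birthday(#{age := Age} = Person) ->
    Person#{age := Age + 1}.

%% Returns the value for the key name, or unknown for maps without it.
name_of(#{name := Name}) -> Name;
name_of(#{}) -> unknown.
```

map_demo:birthday(#{name => adam, age => 24}) returns a map in which age is 25 and name is unchanged.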
List
A list is a compound data type with a variable number of terms.
[Term1,...,TermN]
Each term Term in the list is called an element. The number of elements is said to be the length of the list.
Formally, a list is either the empty list [] or consists of a head (first element) and a tail (remainder of the list). The tail is also a list. The latter can be expressed as [H|T]. The notation [Term1,...,TermN] above is equivalent to the list [Term1|[...|[TermN|[]]]].
Example:
[] is a list, thus
[c|[]] is a list, thus
[b|[c|[]]] is a list, thus
[a|[b|[c|[]]]] is a list, or in short [a,b,c]
A list where the tail is a list is sometimes called a proper list. It is allowed to have a list where the tail is not a list, for example, [a|b]. However, this type of list is of little practical use.
Examples:
1> L1 = [a,2,{c,4}].
[a,2,{c,4}]
2> [H|T] = L1.
[a,2,{c,4}]
3> H.
a
4> T.
[2,{c,4}]
5> L2 = [d|T].
[d,2,{c,4}]
6> length(L1).
3
7> length([]).
0
A collection of list processing functions is found in module lists in STDLIB.
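The head-and-tail definition maps directly onto recursive functions. As an illustrative sketch (the module name list_demo and its functions are hypothetical), the first function distinguishes proper from improper lists and the second sums a proper list of numbers:

```erlang
-module(list_demo).
-export([is_proper/1, sum/1]).

%% Walks the tails until reaching [] (proper list) or some other
%% non-list term (improper list, or not a list at all).
is_proper([]) -> true;
is_proper([_ | T]) -> is_proper(T);
is_proper(_) -> false.

%% Classic recursion over head and tail; the base case is the empty list.
sum([]) -> 0;
sum([H | T]) when is_number(H) -> H + sum(T).
```

list_demo:is_proper([a,b,c]) returns true and list_demo:is_proper([a|b]) returns false, while list_demo:sum([1,2,3]) returns 6. Note that is_list([a|b]) returns true, since that BIF only checks that the term is the empty list or a list cell.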
String
Strings are enclosed in double quotes ("), but a string is not a data type in Erlang. Instead, a string "hello" is shorthand for the list [$h,$e,$l,$l,$o], that is, [104,101,108,108,111].
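Since a string is an ordinary list of integers, it can be compared with and processed like any other list. A small sketch, with the module name string_demo and its functions being hypothetical:

```erlang
-module(string_demo).
-export([is_hello/1, shout/1]).

%% The string literal "hello" and the list of character codes are the
%% same term, so this returns true for "hello".
is_hello(Term) ->
    Term =:= [$h, $e, $l, $l, $o].

%% A list comprehension over the characters; only ASCII lower-case
%% letters are changed.
shout(String) ->
    [upcase(C) || C <- String].

upcase(C) when C >= $a, C =< $z -> C - 32;
upcase(C) -> C.
```

string_demo:is_hello("hello") returns true, and string_demo:shout("hello") returns "HELLO".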
Two adjacent string literals are concatenated into one. This is done at compile time.
Example:
"string" "42"
is equivalent to
"string42"
Change
Starting with Erlang/OTP 27 two adjacent string literals have to be separated by white space, or otherwise it is a syntax error. This avoids possible confusion with triple-quoted strings.
Strings can also be written as triple-quoted strings, which can be indented over multiple lines to follow the indentation of the surrounding code. They are also verbatim, that is, they do not allow escape sequences, and thereby do not need double quote characters to be escaped.
Change
Triple-quoted strings were added in Erlang/OTP 27. Before that, 3 consecutive double quote characters had a different meaning. There was absolutely no good reason to write such a character sequence before triple-quoted strings existed, but there are some gotchas; see the Warning at the end of this description of triple-quoted strings.
Example, with verbatim double quote characters:
"""
Line "1"
Line "2"
"""
That is equivalent to the normal string delimited by single double quote characters (which also allows newlines):
"Line \"1\"
Line \"2\""
The opening and closing lines hold the delimiters: the """ characters. The lines between them are the content lines. The newline on the opening line is not regarded as string content, nor is the newline on the last content line.
The indentation is defined by the white space character sequence preceding the delimiter on the closing line. That character sequence is stripped from all content lines. There can only be white space before the delimiter on the closing line, or else it is regarded as a content line.
The opening line is not allowed to have any characters other than white space after the delimiter, and all content lines must start with the defined indentation character sequence, otherwise the string has a syntax error.
Here is a larger example:
X = """
      First line starting with two spaces
    Not escaped: "\t \r \xFF" and """

    """
That corresponds to the normal string:
X = "  First line starting with two spaces
Not escaped: \"\\t \\r \\xFF\" and \"\"\"
"
It is possible to write consecutive double quote characters on the beginning of a content line by using more double quote characters as delimiters. This is a string that contains exactly four double quote characters, using a delimiter with five double quote characters:
"""""
""""
"""""
These strings are all the empty string:
""
"""
"""
"""
"""
Warning
Before Erlang/OTP 27, when triple-quoted strings were added, the character sequence """ was interpreted as "" ", which means concatenating the empty string to the string that follows. All sequences of an odd number of double quote characters had this meaning.
Any even number of double quote characters was interpreted as a sequence of empty strings that were concatenated (to the empty string).
There was no reason to write such character sequences. But should that have happened, the meaning has probably changed with the introduction of triple-quoted strings.
The compiler preprocessor was patched in Erlang/OTP 26.1 to warn about 3 or more sequential double quote characters. In Erlang/OTP 26.2 this was improved to warn about adjacent string literals without intervening white space, which also covers the same problem at a string end.
If the compiler should emit such a warning, please change such double quote character sequences to have a whitespace after every second quote character, remove redundant empty strings, or write them as one string. This makes the code more readable, and means the same thing in all releases.
Sigil
A sigil is a prefix to a string literal. It is not a data type in Erlang, but a shorthand notation that indicates how to interpret the string literal. Sigils offer mainly two things: a compact way to create UTF-8 encoded binary strings, and a way to write verbatim strings (not having to escape \ characters), useful for regular expressions, for example.
A sigil starts with the tilde character (~) followed by a name defining the sigil type.
Immediately after follows the sigil content: a character sequence between content delimiters. The allowed delimiters are these start-end delimiter pairs: () [] {} <>, or these characters that are both start and end delimiters: / | ' " ` #. Triple-quote string delimiters may also be used.
The character escaping rules for the sigil content depend on the sigil type. When the sigil content is verbatim, there is no escape character. The sigil content simply ends when the end delimiter is found, so it is impossible to have the end delimiter character in the string content. The set of delimiters is fairly generous, and in most cases it is possible to choose an end delimiter that is not in the literal string content.
Triple-quote string delimiters allow choosing a larger number of quote characters in the end delimiter than whatever is in the string content, which thereby facilitates any content, even a sequence of " characters at the start of a line in a verbatim string.
The Sigils are:
- ~: The Vanilla (default) Sigil. Shorthand for a UTF-8 encoded binary(). This sigil does not affect the character escaping rules, so with triple-quoted string delimiters they are the same as for ~B, and for other string delimiters they are the same as for ~b.
- ~b: The Binary Sigil. Shorthand for a UTF-8 encoded binary(), as if calling unicode:characters_to_binary/1 on the sigil content. Character escaping rules are the same as for ~s.
- ~B: The Verbatim Binary Sigil. As ~b, but the sigil content is verbatim.
- ~s: The String Sigil. Shorthand for a string(), that is, a [char()], which is a list of Unicode code points. Character escaping rules are the same as for a normal string(). Using this sigil on a regular string does effectively nothing.
- ~S: The Verbatim String Sigil. As ~s, but the sigil content is verbatim. Using this sigil on a triple-quoted string does effectively nothing.
Examples
<<"\"\\µA\""/utf8>> = <<$",$\\,194,181,$A,$">> =
~b"""
"\\µA"
""" = ~b'"\\µA"' =
~B"""
"\µA"
""" = ~B<"\µA"> =
~"""
"\µA"
""" = ~"\"\\µA\"" = ~/"\\µA"/
[$",$\\,$µ,$A,$"] =
~s"""
"\\µA"
""" = ~s"\"\\µA\"" = ~s["\\µA"] =
~S"""
"\µA"
""" = ~S("\µA") =
"""
"\µA"
""" = "\"\\µA\""
Adjacent strings are concatenated at compile time, but that is not possible with sigils, since they are transformed into terms that in general may not be concatenated. So, "a" "b" is equivalent to "ab", but ~s"a" "b" or ~s"a" ~s"b" is a syntax error. ~s"a" ++ "b", however, evaluates to "ab", since both operands to the ++ operator are strings.
Change
Sigils were introduced in Erlang/OTP 27.
Record
A record is a data structure for storing a fixed number of elements. It has named fields and is similar to a struct in C. However, a record is not a true data type. Instead, record expressions are translated to tuple expressions during compilation. Therefore, record expressions are not understood by the shell unless special actions are taken. For details, see module shell in STDLIB.
Examples:
-module(person).
-export([new/2]).

-record(person, {name, age}).

new(Name, Age) ->
    #person{name=Name, age=Age}.

1> person:new(ernie, 44).
{person,ernie,44}
Read more about records in Records. More examples are found in Programming Examples.
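That a record is a tagged tuple after compilation can be observed from inside a module. The sketch below is an illustration only; the module name record_demo and the functions older/1 and tuple_view/1 are hypothetical.

```erlang
-module(record_demo).
-export([new/2, older/1, tuple_view/1]).

-record(person, {name, age}).

new(Name, Age) ->
    #person{name = Name, age = Age}.

%% Record syntax for matching a field and updating it.
older(#person{age = Age} = P) ->
    P#person{age = Age + 1}.

%% The same term viewed as a tuple: the first element is the record
%% name, followed by one element per field.
tuple_view(P) when is_record(P, person) ->
    {element(1, P), tuple_size(P)}.
```

record_demo:older(record_demo:new(ernie, 44)) returns {person,ernie,45}, and record_demo:tuple_view({person, ernie, 44}) returns {person,3} even though the argument was written as a plain tuple.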
Boolean
There is no Boolean data type in Erlang. Instead the atoms true and false are used to denote Boolean values. The is_boolean/1 BIF tests whether a term is a boolean.
Examples:
1> 2 =< 3.
true
2> true or false.
true
3> is_boolean(true).
true
4> is_boolean(false).
true
5> is_boolean(ok).
false
Escape Sequences
Within strings ("-delimited), quoted atoms, and the content of ~b and ~s sigils, the following escape sequences are recognized:
Sequence | Description |
---|---|
\b | Backspace (ASCII code 8) |
\d | Delete (ASCII code 127) |
\e | Escape (ASCII code 27) |
\f | Form Feed (ASCII code 12) |
\n | Line Feed/Newline (ASCII code 10) |
\r | Carriage Return (ASCII code 13) |
\s | Space (ASCII code 32) |
\t | (Horizontal) Tab (ASCII code 9) |
\v | Vertical Tab (ASCII code 11) |
\XYZ, \YZ, \Z | Character with octal representation XYZ, YZ or Z |
\xXY | Character with hexadecimal representation XY |
\x{X...} | Character with hexadecimal representation; X... is one or more hexadecimal characters |
\^a...\^z, \^A...\^Z | Control A to control Z |
\^@ | NUL (ASCII code 0) |
\^[ | Escape (ASCII code 27) |
\^\ | File Separator (ASCII code 28) |
\^] | Group Separator (ASCII code 29) |
\^^ | Record Separator (ASCII code 30) |
\^_ | Unit Separator (ASCII code 31) |
\^? | Delete (ASCII code 127) |
\' | Single quote |
\" | Double quote |
\\ | Backslash |
Table: Recognized Escape Sequences
Change
As of Erlang/OTP 26, the value of $\^? has been changed to be 127 (Delete), instead of 31. Previous releases would allow any character following $\^; as of Erlang/OTP 26, only the documented characters are allowed.
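A few of the escape sequences from the table can be checked with character literals and short strings. This is a minimal sketch; the module name escape_demo and its functions are hypothetical.

```erlang
-module(escape_demo).
-export([codes/0, same_char/0]).

%% Character literals using named escapes, returned as integer codes
%% for backspace, tab, newline, space, escape, and delete.
codes() ->
    [$\b, $\t, $\n, $\s, $\e, $\d].

%% Three spellings of the letter A (code 65): two-digit hexadecimal,
%% octal, and braced hexadecimal.
same_char() ->
    {"\x41", "\101", "\x{41}"}.
```

escape_demo:codes() returns [8,9,10,32,27,127], and escape_demo:same_char() returns {"A","A","A"}.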
Within triple-quoted strings, escape sequences are not recognized. The only text that cannot be written in a triple-quoted string is three consecutive double quote characters at the beginning of a line (preceded only by whitespace). This limitation can be worked around by using more double quote characters for the string delimiters than in the string. Any number three or above is allowed for the start delimiter and the end delimiter is the same as the start delimiter.
When triple-quote string delimiters are used with the ~, ~B, or ~S sigils the same applies, but for the ~b or ~s sigils the escape sequences for normal strings, above, are used.
Change
Triple-quoted strings and sigils were introduced in Erlang/OTP 27.
Type Conversions
There are a number of BIFs for type conversions.
Examples:
1> atom_to_list(hello).
"hello"
2> list_to_atom("hello").
hello
3> binary_to_list(<<"hello">>).
"hello"
4> binary_to_list(<<104,101,108,108,111>>).
"hello"
5> list_to_binary("hello").
<<104,101,108,108,111>>
6> float_to_list(7.0).
"7.00000000000000000000e+00"
7> list_to_float("7.000e+00").
7.0
8> integer_to_list(77).
"77"
9> list_to_integer("77").
77
10> tuple_to_list({a,b,c}).
[a,b,c]
11> list_to_tuple([a,b,c]).
{a,b,c}
12> term_to_binary({a,b,c}).
<<131,104,3,100,0,1,97,100,0,1,98,100,0,1,99>>
13> binary_to_term(<<131,104,3,100,0,1,97,100,0,1,98,100,0,1,99>>).
{a,b,c}
14> binary_to_integer(<<"77">>).
77
15> integer_to_binary(77).
<<"77">>
16> float_to_binary(7.0).
<<"7.00000000000000000000e+00">>
17> binary_to_float(<<"7.000e+00">>).
7.0