-
Notifications
You must be signed in to change notification settings - Fork 89
Nemerle for OOP Programmers Week 3
This week we will learn more about functional programming in Nemerle. We will find out about tuples, variants, and advanced pattern matching.
Tuples in Nemerle are similar to tuples in math, or in SQL for that matter. They are immutable, heterogeneous (a tuple can hold values of several types at once) finite sequences of objects. The most common examples of tuples is a pair or triple:
def our_pair = (1, 2);
def our_triple = (1, 12, -10);
As of version 0.9.3 of the Nemerle compiler, up to 19 elements can be held in a tuple (the syntax looks clumsy for 5 or 6 elements already, and you cannot use loops for iterating over tuples).
A tuple can hold values of several types:
def another_pair = ("foo", 3.0);
Tuples are useful for returning multiple values from functions or holding several things in a single cell in containers. One can always use record-like classes, however this requires additional hassle of a separate class definition.
The most common way of decomposing a tuple is by using a tuple pattern. It looks mostly like the tuple definition:
def divmod (x, y) {
def div = x / y;
def mod = x % y;
(div, mod)
}
match (divmod (142, 13)) {
| (d, m) =>
System.Console.WriteLine ($ "div=$d, mod=$m");
}
// Output: div=10, mod=12
Here we define a local function -- introduced in Week 2 -- called divmod
,
which returns the tuple (div, mod)
. The result is then matched
against a tuple pattern. As you can see, the pattern binds the first
element of the tuple to d
and the second to m
.
Single-case matching has a special, shorter syntax in Nemerle, as shown here:
def (d, m) = divmod (142, 13);
System.Console.WriteLine ($ "div=$d, mod=$m");
The divmod
function could be written more compactly as well:
def divmod (x, y) {
(x / y, x % y)
}
The arguments to the tuple constructor are just expressions, and
the names div
and mod
used for elements didn't
really matter.
For cases where pattern matching doesn't seem right (for example, you want just the first element of a tuple returned by a nested call), there is a special tuple indexer. Using it, the example above could be rewritten as:
def divmod (x, y) {
(x / y, x % y)
}
def r = divmod (142, 13);
System.Console.WriteLine ($ "div=$(r[0]), mod=$(r[1])");
Now, the tuple result gets assigned to r
, and the $
macro
expands it's elements by index. As with arrays, tuple indexing is 0-based
(the first element of a tuple has index 0).
In case tuples have started to look too much like arrays, here are some rules about tuple indexers to help clarify things.
Unlike an array indexer, it is not possible to supply anything beside a constant to the tuple indexer. For example, the following code will not work:
def pair = (2, "foo");
def idx = int.Parse (System.Console.ReadLine ());
def element = pair [idx]; // dynamic access is invalid
In this contrived example, it should be clear why this is prohibited:
depending on user input the type of element
would be string
or int
. The only solution would be to use a compile-time type
of object
for element
, however we felt that this
wouldn't be useful enough to mandate implementation.
Because tuples are immutable, you also cannot assign values to tuple elements, as in:
def p = (3, 4);
p [0] = 17; // incorrect assignment to tuple
The constructor of tuple types is the star (*
). For example
the tuple ("foo", 42, 3.1415)
has type string * int *
double
. This notion comes from math, where the times (×) symbol
is used, as in dimensions L × W × H.
A fully-typed standalone version of divmod
would look like:
static divmod (x : int, y : int) : int * int {
(x / y , x % y)
}
The tuple type constructor is also used in function types, which is not an accident. We consider a function taking more than one argument exactly the same as a function taking a single tuple argument of the corresponding type. For example the following code is perfectly valid:
def print_two_ints (x, y) {
System.Console.WriteLine ($ "first=$x, second=$y");
}
def divmod (x, y) {
(x / y, x % y)
}
print_two_ints (divmod (143, 77))
Nemerle supports multiple assignment. That is, one can use a pseudo-tuple of possible assignment targets. For example:
mutable x = 17, y = 32;
(x, y) = (y, x); // swap x and y
(x, y) = (y + 12, 2 * x); // or even change their values
All the source expressions are evaluated first and the assignment is done afterwards (this is why the swap works fine).
This section borrows some text from the Grokking Nemerle tutorial.
Variants (called data types or sum types in SML and OCaml) are forms of expressing data of several different kinds.
The simplest example of variants are enum types known from C (or C#).
<c>// C enum Color { Red, Yellow, Green } </c>
You can define C#-like enum
types in Nemerle too,
but we will talk about it next week. Now let us look at the simplest variant
type:
// Nemerle
variant Color {
| Red
| Yellow
| Green
}
However, the variant options might be more useful because they can carry some extra data with them:
variant Color {
| Red
| Yellow
| Green
| Different {
red : float;
green : float;
blue : float;
}
}
So if the color is neither red, yellow nor green, it can be represented
with RGB. You can create variant objects just like any other object,
by using its constructor. All variant options have an implicit
[Record]
macro invocation on them. We talked about this macro 2 weeks
ago, it adds a constructor assigning to each field of a class:
def blue = Color.Different (0, 0, 1);
def red = Color.Red ();
In the OO world, modeling variants with sub classing can sometimes be seen:
// C#
class Color {
class Red : Color { }
class Green : Color { }
class Yellow : Color { }
class Different : Color {
public float red;
public float green;
public float blue;
public Different (float red, float green, float blue) {
this.red = red;
this.green = green;
this.blue = blue;
}
}
}
Of course, you need to write a constructor, mark fields as public, and so on. When you're done -- using this kind of stuff can get quite involved -- you will need to use lots of runtime type checks.
On the other hand, Nemerle provides an easy and convenient method of dealing with variants -- pattern matching.
We already used pattern matching on lists, so you can imagine doing a switch over variant options. For example, a function returning string representation of a color could look like this:
variant Color {
| Red
| Yellow
| Green
| Different {
red : float;
green : float;
blue : float;
}
}
def string_of_color (color)
{
match (color) {
| Color.Red => "red"
| Color.Yellow => "yellow"
| Color.Green => "green"
| Color.Different (red = r, green = g, blue = b) =>
$ "rgb($r, $g, $b)"
}
}
System.Console.WriteLine (string_of_color (Color.Red ()));
System.Console.WriteLine (string_of_color (Color.Different (1, 1, 0)));
/* Output:
red
rgb(1, 1, 0)
*/
The first three patterns state we're not interested in any possible fields
in the case of Red
, Yellow
and Green
. The last
pattern, for the Different
case, binds values of the red
,
green
and blue
to r
, g
and b
respectively. You can omit matched fields at will, as well as change
the ordering.
It is also possible to use a shortcut here:
| Color.Different (r, g, b) =>
This is only available when you specify all the fields in the given object, and in the right order.
The example above, while simple, is not the best usage of variants. Variants are best at handling tree-like data structures. A common example of tree data structures are XML documents. However, we will deal with plain binary trees first.
The following example defines the type of trees of integers (representing sets).
variant Tree {
| Node {
left : Tree;
elem : int;
right : Tree;
}
| Null
public override ToString () : string
{
match (this) {
| Tree.Node (l, e, r) => $ "($l $e $r)"
| Tree.Null => "."
}
}
}
So a tree node is either an inside node with an element and two subtrees
or a null tree without any elements inside. We have also overridden the
ToString
method, so the $
macro and WriteLine
work properly on trees (the default implementation would yield
"Tree.Node"
or "Tree.Null"
only).
We can now define a method for inserting elements to the tree. It should
be defined inside the Tree
variant:
public Insert (e : int) : Tree
{
match (this) {
| Tree.Node (l, cur, r) =>
if (e < cur)
Tree.Node (l.Insert (e), cur, r)
else if (e > cur)
Tree.Node (l, cur, r.Insert (e))
else
this
| Tree.Null =>
Tree.Node (Tree.Null (), e, Tree.Null ())
}
}
The function checks if the element to insert is smaller than the element in the current node, and if so, inserts it in the left subtree. If it's bigger, it inserts the element in the right subtree. Otherwise, it has to be equal, so it doesn't insert it again: it just returns the unchanged tree.
If the function hits an empty subtree, it creates a new leaf in that place.
Having a Contains
check wouldn't be a bad idea, either:
public Contains (e : int) : bool
{
match (this) {
| Tree.Node (l, cur, _) when e < cur =>
l.Contains (e)
| Tree.Node (_, cur, r) when e > cur =>
r.Contains (e)
| Tree.Node => true
| Tree.Null => false
}
}
This function works much like insert -- it checks if the element could be found in the left or in the right subtree, and looks for it there. If not, either it has found the matching node, or null, and returns the appropriate result.
There is however one new feature used here -- when
guards
in matching. Any matching clause can have an additional condition
attached. The function first checks if the pattern matches the value,
and if so, the condition is evaluated. If that yields true, the given
branch is taken, otherwise it proceeds with further match clauses.
This example could also be written with regular if
s:
public Contains (e : int) : bool
{
match (this) {
| Tree.Node (l, cur, r) =>
if (e < cur)
l.Contains (e)
else if (e > cur)
r.Contains (e)
else
true
| Tree.Null => false
}
}
Finally we can test our example:
// we start with an empty tree
def t = Tree.Null ();
// add a few elements
def t = t.Insert (13).Insert (34).Insert (23);
// and display it
System.Console.WriteLine (t);
System.Console.WriteLine (t.Contains (13));
System.Console.WriteLine (t.Contains (42));
/* Output:
(. 13 ((. 23 .) 34 .))
True
False
*/
As you can see binary trees are not very interesting, so we will go on to XML. Whether XML is interesting remains a doubtful question, but at least it is somewhat more practical.
variant Node {
| Text {
value : string;
}
| Element {
name : string;
children : list [Node];
}
}
This variant defines a simplistic data structure to hold XML trees. An XML node is either a text node with some specified text inside, or an element node with a name and zero or more children. A sequence of children is represented as a Nemerle list.
For example the following tree:
<tree>
<branch>
<leaf/>
</branch>
<branch>
Foo
</branch>
</tree>
would be represented by:
Node.Element ("tree",
[Node.Element ("branch", [Node.Element ("leaf", [])]),
Node.Element ("branch", [Node.Text ("Foo")])])
Of course XML by itself is just a data format. Using data in the above form wouldn't be too easy. So we want some different internal representation of data, and use XML only to save it or send it over the network.
variant RefrigeratorContent
{
| Beer { name : string; volume : float; }
| Chips { weight : int; }
| Ketchup
public static FromXml (node : Node) : RefrigeratorContent
{
match (node) {
| Node.Element ("ketchup", []) =>
RefrigeratorContent.Ketchup ()
| Node.Element ("beer",
[Node.Element ("name", [Node.Text (name)]),
Node.Element ("volume", [Node.Text (volume)])]) =>
RefrigeratorContent.Beer (name, float.Parse (volume))
| Node.Element ("chips",
[Node.Element ("weight", [Node.Text (weight)])]) =>
RefrigeratorContent.Chips (int.Parse (weight))
| _ =>
throw System.ArgumentException ()
}
}
}
The most interesting thing here are the nested patterns. Until now we have always used very shallow patterns -- we just checked the top-level object and possibly its bound fields. However it is possible to look very deeply in the structure of object. Instead of binding values of fields to variables, this function checks if they match given patterns. This is all that is happening here -- nested patterns.
Probably the most annoying thing about the example above is the amount
of times you have to say RefrigeratorContent
and Node
.
Fortunately both can be skipped, however for quite different reasons.
RefrigeratorContent
used for object construction can be
skipped, because we are in the body of the RefrigeratorContent
variant definition. If we wanted to skip it elsewhere, we would have
to say using RefrigeratorContent;
first.
On the other hand, Node
in matching can be skipped, because
we provided the type of the node
parameter in the match
statement, therefore the compiler will just look up the appropriate
variant options in this type.
So the shorter example would be:
variant RefrigeratorContent
{
| Beer { name : string; volume : float; }
| Chips { weight : int; }
| Ketchup
public static FromXml (node : Node) : RefrigeratorContent
{
match (node) {
| Element ("ketchup", []) => Ketchup ()
| Element ("beer", [Element ("name", [Text (name)]),
Element ("volume", [Text (volume)])]) =>
Beer (name, float.Parse (volume))
| Element ("chips", [Element ("weight", [Text (weight)])]) =>
Chips (int.Parse (weight))
| _ =>
throw System.ArgumentException ()
}
}
}
Once we have something to put into a refrigerator, we can now build the refrigerator.
[Record]
class Refrigerator
{
public minimal_temperature : float;
public content : list [RefrigeratorContent];
public static FromXml (node : Node) : Refrigerator
{
match (node) {
| Element ("refrigerator",
Element ("minimal-temperature", [Text (min_temp)]) :: content) =>
def parse_content (lst) {
match (lst) {
| x :: xs =>
RefrigeratorContent.FromXml (x) :: parse_content (xs)
| [] => []
}
}
Refrigerator (float.Parse (min_temp), parse_content (content)); // (*)
| _ =>
throw System.ArgumentException ("node")
}
}
}
The reader will easily note that: a) the XML deserialization code looks a bit like a junk b) it can be generated automatically, and c) without matching it would be even worse. Later we will learn how to write macros to generate this kind of code automatically.
We can however still make it somewhat better, without resorting to
macros. First off, if you wrote the map
example from the
previous week, then the parse_content
function should
familiar to you. In fact it can be replaced with map
altogether, like this:
Refrigerator (float.Parse (min_temp), map (content, RefrigeratorContent.FromXml));
We have passed a static global function as a functional value. It works exactly the same as with local functions.
Because the map
function is generally useful, it is included
in the standard library as a member function of the list
data structure. So we remove parse_content
, and replace line
marked by (*)
with:
Refrigerator (float.Parse (min_temp), content.Map (RefrigeratorContent.FromXml));
Because it is quite a common pattern to do matching directly on
arguments, Nemerle provides a way to skip match
in such
cases. If your function starts with |
it is implicitly
surrounded with match (single_parm) { ... }
, or in the
case there is more than one parameter, with match (parm1, parm2, ..., parmN) { ... }
(so you can use tuple patterns to get to a particular parameter).
So we can further shorten the above example by two more lines:
public static FromXml (node : Node) : Refrigerator
{
| Element ("refrigerator",
Element ("minimal-temperature", [Text (min_temp)]) :: content) =>
Refrigerator (float.Parse (min_temp), content.Map (RefrigeratorContent.FromXml));
| _ =>
throw System.ArgumentException ("node")
}
Add an Iter (f : int -> void) : void
method to the Tree
variant, implementing inorder traversal (that is you first traverse the left
subtree, then call f on current element, and traverse the right subtree).
Write a function that reads XML from specified
files and puts it into the Node
variant defined above. Then write a
function to dump your data in a lispy format, something like:
(tree
(branch
(leaf)
)
(branch
($text "Foo")
)
)
Then you can implement indentation of output.
(tree
(branch
(leaf))
(branch
($text "Foo")))
Then copy the refridgerator variants from the above, fix any errors you find in them and try to parse the following file:
<?xml version='1.0' encoding='utf-8' ?>
<refrigerator>
<minimal-temperature>-3.0</minimal-temperature>
<beer>
<name>Hyneken</name>
<volume>0.6</volume>
</beer>
<beer>
<name>Bydweisser</name>
<volume>0.5</volume>
</beer>
<beer>
<name>Plsner</name>
<volume>0.5</volume>
</beer>
<chips>
<weight>500</weight>
</chips>
<ketchup/>
</refrigerator>
Warning: you do not need to write the XML parser. You should not do
it, actually. Use the System.Xml namespace from the .NET Framework. In
order to link with the System.Xml library you need to compile with the
-r:System.Xml
option. For example:
ncc -r System.Xml myprogram.n
should do.