(page 1) Built-in Types: Constants, Basic Functions, Expressions. -------------------------------------------------------- * everything has a (unchangeable) type. Constants of different types are syntactically distinct. e.g. 1 is an integer, not a real (1.0 is the real "one"). Other than the few built-in functions that are overloaded (length + - ~ * = i/o) all functions also have a unique type. e.g. / is division for reals, not integers (div is for integers). This allows ML to infer the type of almost all expressions/definitions. * the atomic types are: int, real, bool (true, false), string, unit * then there are built-in "constructors" which compose types. (,,,) makes tuples, e.g. (1,2,3) is a tuple of integers. A tuple's elements may be of different types, e.g. (1,(2,"!"),3.14). The type of a tuple is the product of the types of its elements: (1,2,3) has type int*int*int. (1,(2,"!"),3.14) has type int*(int*string)*real. :: (called cons) makes lists, e.g. 1::2::3::[] is a list of integers. [] is the empty list (nil is another name for []). [,,,] is a short-hand for multiple-cons's, e.g. [1,2,3]. A list's elements must be the same type. The type of a list is the Kleene-star of the type of its elements: [1,2,3] has type int list. [] has no elements - its type is 'a list, where 'a means "any type". {,,,} makes records, which are like tuples except that elements are identified by a label rather than by position. {name="Fred",age=28,married=false} is a record with three elements (a string, an integer, and a bool) identified by the labels name, age, and married (resp.). The type of a record specifies the labels and their types, e.g. {name="Fred",age=28,married=false} has type {age:int,married:bool,name:string}. * All functions take one argument (although certain syntactic patterns may lead you to think otherwise). The type of a function states the type of its argument and the type of its result, e.g. square-root (sqrt) has type real -> real. The argument of binary functions is a 2-tuple, e.g. / has type (real*real) -> real. It is possible to instruct ML that a particular binary function will be written in infix, and in fact, all built-in binary functions are infix. So, we write (1.0 / 3.0) rather than the usual (F X) form (/ (1.0,3.0)). Constructors are a special sort of function. What is the type of cons ? answer : ('a * 'a list) -> ('a list) i.e. cons takes something, of any type, paired with a list of things of that type and returns a list of things of the same type ('a may be any type, but all occurrences of 'a must be bound to the same type). Cons is polymorphic, which is entirely different than being overloaded. One of ML's main features is that functions are treated just like any other compound type (except for equality testing). Thus, there are function-constants (lambda expressions), and functions may be the arguments or results of other (so-called "higher order") functions. A very familiar higher-order function is the function composition operator o. Recall (from your undergrad days) that ((FoG) x) = (F (G x)). o has type (('a -> 'b) * ('c -> 'a)) -> ('c -> 'b) ('a 'b 'c may be bound to any type -- o is polymorphic) (page 2) * The basic unit of compilation is an expression followed by a semicolon. Simple expressions consist of a constants or a function applied to an expression. The type of an expression is the type of the value it evaluates to. Examples: 1; (* a constant *) ~1; (* the unary-minus function applied to 1 *) (2 * ~4) + (7 div (floor 2.9)); explode "abc"; (* has value ["a","b","c"] *) implode (explode "abc"); (* has value "abc". *) (implode o explode) "abc"; (* has value "abc". *) implode o explode; (* the value of this expression is a function. What is its type ? *) Compound expressions are built up using the usual constructs -- if-then-else, case, lambda (called fn in ML). if Exp1 then Exp2 else Exp3 -- as usual. e.g. if 2=3 then "uh oh" else "ok" The expressions in "then" and "else" must be the same type. case Exp0 of Pat1 => Exp1 | Pat2 => Exp2 | ... PatN => ExpN The value of Exp0 is matched against pattern Pat1 -- if the match succeeds Exp1 is evaluated, otherwise ML tries the next pattern, etc. ML will warn you if the patterns do not exhaustively cover the possible values of Exp0. Exp1...ExpN must be the same type. Exp0 and all the patterns must be the same type. We'll discuss patterns later. fn X => Exp is the ML syntax for (LAMBDA X . Exp) If X has type 'a and Exp has type 'b, then this lambda expression has type 'a -> 'b which means you could apply it to an argument of type 'a. e.g. (fn x => x+1) 33; (* has value 34 *) (fn x => x+1); (* the value of this expression is the "increment" function *) Value identifiers (ML-style "variables") and Definitions -------------------------------------------------------- * ML's "variables" are different than the variables of logic programming or imperative (ordinary) programming. In ML a variable is a symbolic name for a specific value (thus, "value identifier" is a very appropriate term). To assign name N to value V (V may be an expression of course) val N = V; e.g. val X = 22; gives the name X to the value 22. X and 22 are now perfectly interchangeable. val F = fn x => x+1; gives the name F to the increment function. This is how functions are defined. Being a very common thing, a more familiar syntactic form has been provided: fun F x = x+1; These "definition" expressions have the side-effect of updating the environment in which subsequent expressions will be evaluated. Thus the value of the expression X + 1; depends on the environment in which it is evaluated. However, X is not a free variable -- this expression can only be evaluated in environments in which X is bound to a value. * Scope rules -- standard lexical scoping. Formal parameters are local variables of course. For example, val X = 22; fun F X = X+1; (* the X's here are local, not names for 22 *) fun G Y = X+1; (* X is taken from the current environment. This definition, in this environment is identical to (fun G Y = 23) *) X + 9; (* has value 31 *) val X = 77; (* doesn't affect the definition of F or G *) X + 9; (* has value 86 *) val F = fn W => X + (F (W-1)); (* this is NOT a recursive definition *) val rec F = fn W => X + (F (W-1)); (* this IS a recursive definition *) fun F W = X + (F (W-1)); (* this is the same recursive definition. *) (page 3) * There are two mechanisms for introducing local variables. The differences between them are somewhat obscure -- for most purposes either one will do. local let DEFINITIONS DEFINITIONS in in DEFINITION EXPRESSION end; end; Example: defining a function G using local variables X and F fun G W = local let val X = 1 val X = 1 fun F Y = Y + 2 fun F Y = Y + 2 in in fun G W = X + (F W) X + (F W) end; end; * ML cannot infer the type of an expression involving only variables and overloaded operators, e.g. fun F (x,y) = x + y; You must explicitly supply enough type information to permit the type of the entire expression to be inferred. Any of the following is sufficient: fun F (x:int,y) = x + y; (* specify the type of x *) fun F (x,y):int = x + y; (* specify the type of the value returned by F *) fun F ((x,y):int*int) = x + y; (* specify the type of F's argument *) Patterns -------- * Patterns consist of constants, variables, constructors, and the underscore symbol (_) which matches anything. All variables in patterns are local. * Syntactically, patterns are virtually identical to functionless expressions -- only syntactic context distinguishes them. Patterns mostly occur in definitions of compound values, case expressions, and function definitions. * Patterns are used to decompose compound values (ones that are defined using constructors). Patterns used for this purpose should always match, so they shouldn't contain constants. In the following examples, patterns are on the left of the equals sign, and expressions are on the right. Patterns for tuples: val X = (1,2); (* constructs a 2-tuple and names it X *) val (A,B) = X; (* gives the name A to the first component of X and the name B to the second component of X *) val (C,_) = X; (* gives the name C to the first component of X *) val (C,C) = X; (* is an error in POLY ML *) Patterns for lists: val X = [1,2,3]; (* constructs a list and names it X *) val A::B = X; (* gives the name A to the head of X and the name B to the tail of X *) val C::_ = X; (* gives the name C to the head of X *) Patterns for records: val X = {LA=1,LB=2}; (* constructs a record and names it X *) val {LA=A,LB=B} = X; (* gives the name A to entry labelled LA, and the name B to entry labelled LB *) val {LB=C,...} = X; (* gives the name C to entry labelled LB. "..." is a special subpattern which matches all the unnamed entries in a record *) (page 4) * case expressions SYNTAX -- given above Example: fun list_length X = case X of [] => 0 | head::tail => 1 + (list_length tail); * function definitions SYNTAX val F = fn Pat1 => Exp1 | Pat2 => Exp2 | ... | PatN => ExpN fun F Pat1 = Exp1 | F Pat2 = Exp2 | ... | F PatN = ExpN (Note that we have to say F in each of the cases) Examples: val rec list_length = fn [] => 0 | head::tail => 1 + (list_length tail); fun list_length [] = 0 | list_length (head::tail) = 1 + (list_length tail); ============= END OF PART 1 ================================================== You should now be able to write all sorts of small, but interesting programs. I suggest you try the following, in the order shown. 1. The end of the Harper handout lists some of the built-in functions. Figure out what some of these do (e.g. @, ^) by typing in simple expressions. 2. It's important for you to be comfortable with higher-order functions. Try writing the following. Ssolutions are on the web: http://www.site.uottawa.ca/~holte/ML/Programs FIRST_ARG (F,X) -- F is a binary function (its argument is a pair) X is suitable as the first component of F's argument FIRST_ARG (F,X) returns a unary function G such that (G Y) = F (X,Y) e.g. Given: fun F (X:int,Y) = X + Y; FIRST_ARG (F,1) would be the "increment" function Hint: you must have FUN FIRST_ARG (F,X) = a lambda expression FILTER (P,L) -- P is a predicate (i.e. a function that returns true or false) L is a list FILTER (P,L) returns the elements in L for which P is true e.g. Given: fun P X = ((X mod 2) = 0); FILTER (P,L) would return the even numbers in L e.g. FILTER (P,[1,2,3,4,5]) = [2,4] Using FIRST_ARG and FILTER write a function THRESHOLD that takes an integer X and returns a function that takes a list L and returns the elements of L that are less than or equal to X. e.g. (THRESHOLD 3) [6,1,4,5,2,3] should return [1,2,3] Hint #1: use a local function, KEEP (A,B), which is true when B is less than or equal to A. Hint #2: (2) consider using FIRST_ARG twice. Hint #3: if you're still stuck, try this warm-up exercise: using FIRST_ARG, FILTER, and the function P above, write a function, EVENS that takes a list of integers and returns all of its even elements. 3. Later we're going to define an abstract type for (finite) sets. Choose a way of representing sets (e.g. using lists) and write your own set-processing functions (e.g. is_member, union, add_member, cardinality). Hint: you may have to restrict yourself to sets with elements of a particular type (e.g. sets of integers). 4. Browse the other programs in http://www.site.uottawa.ca/~holte/ML/Programs