## Parsing words

If you’ve been paying close attention so far, you realize I’ve lied to you. I
said each word acts on the stack in order, but there a few words like `[`

, `]`

,
`:`

and `;`

that don’t seem to follow this rule.

These are **parsing words** and they behave differently from simpler words like
`5`

, `[1,b]`

or `drop`

. We will cover these in more detail when we talk about
metaprogramming, but for now it is enough to know that parsing words are special.

They are not defined using the `:`

word, but with the word `SYNTAX:`

instead.
When a parsing words is encountered, it can interact with the parser using a
well-defined API to influence how successive words are parsed. For instance `:`

asks for the next tokens from the parsers until `;`

is found and tries to
compile that stream of tokens into a word definition.

A common use of parsing words is to define literals. For instance `{`

is a
parsing word that starts an array definition and is terminated by `}`

.
Everything in-between is part of the array. An example of array that we have
seen before is `{ 1 2 3 4 5 6 7 8 9 10 }`

.

There are also literals for hashmaps,
`H{ { "Perl" "Larry Wall" } { "Factor" "Slava Pestov" } { "Scala" "Martin Odersky" } }`

,
and byte arrays, `B{ 1 14 18 23 }`

.

Other uses of parsing word include the module system, the object-oriented
features of Factor, enums, memoized functions, privacy modifiers and more. In
theory, even `SYNTAX:`

can be defined in terms of itself, although of course
the system has to be bootstrapped somehow.

## Stack shuffling

Now that you know the basics of Factor, you may want to start assembling more complex words. This may sometimes require you to use variables that are not on top of the stack, or to use variables more than once. There are a few words that can be used to help with this. I mention them now since you need to be aware of them, but I warn you that using too many of these words to manipulate the stack will cause your code to quickly become harder to read and write. Stack shuffling requires mentally simulating moving values on a stack, which is not a natural way to program. In the next section we’ll see a much more effective way to handle most needs.

Here is a list of the most common shuffling words together with their effect on the stack. Try them in the listener to get a feel for how they manipulate the stack, and explore the online help to find out more.

```
dup ( x -- x x )
drop ( x -- )
swap ( x y -- y x )
over ( x y -- x y x )
dupd ( x y -- x x y )
swapd ( x y z -- y x z )
nip ( x y -- y )
rot ( x y z -- y z x )
-rot ( x y z -- z x y )
2dup ( x y -- x y x y )
```

## Combinators

Although the words mentioned in the previous paragraph are occasionally useful
(especially the simpler `dup`

, `drop`

and `swap`

), you should write code that
does as little stack shuffling as possible. This requires practice getting the
function arguments in the right order. Nevertheless, there are certain common
patterns of needed stack manipulation that are better abstracted away into their
own words.

Suppose we want to define a word to determine whether a given number `n`

is
prime. A simple algorithm is to test each number from `2`

to the square root of
`n`

and see whether it is a divisor of `n`

. In this case, `n`

is used in two
places: as an upper bound for the sequence, and as the number to test for
divisibility.

The word `bi`

applies two different quotations to the single element on the
stack above them, and this is precisely what we need. For instance
`5 [ 2 * ] [ 3 + ] bi`

yields

```
10
8
```

`bi`

applies the quotation `[ 2 * ]`

to the value `5`

and then the quotation
`[ 3 + ]`

to the value `5`

leaving us with `10`

and then `8`

on the stack.
Without `bi`

, we would have to first `dup`

`5`

, then multiply, and then `swap`

the result of the multiplication with the second `5`

, so we could do the addition

```
5 dup 2 * swap 3 +
```

You can see that `bi`

replaces a common pattern of `dup`

, then calculate,
then `swap`

and calculate again.

To continue our prime example, we need a way to make a range starting from `2`

.
We can define our own word for this `[2,b]`

, using the `[a,b]`

range word we
discussed earlier

```
: [2,b] ( n -- {2,...,n} ) 2 swap [a,b] ; inline
```

What’s up with that `inline`

word? This is one of the modifiers we can use after
defining a word, another one being `recursive`

. This will allow us to have the
definition of a short word inlined wherever it is used, rather than incurring
a function call.

Try our new `[2,b]`

word and see that it works

```
6 [2,b] >array .
```

Using `[2,b]`

to produce the range of numbers from `2`

to the square root of a
number `n`

that is already on the stack is easy: `sqrt floor [2,b]`

(technically
`floor`

isn’t necessary here, as `[a,b]`

works for non-integer bounds). Let’s
try that out

```
16 sqrt [2,b] >array .
```

Now, we need a word to test for divisibility. A quick search in the online help
shows that `divisor?`

is the word we want. It will help to have the arguments
for testing divisibility in the other direction, so we define `multiple?`

```
: multiple? ( a b -- ? ) swap divisor? ; inline
```

Both of these return `t`

```
9 3 divisor? .
3 9 multiple? .
```

If we’re going to use `bi`

in our `prime`

definition, as we implied above, we
need a second quotation. Our second quotation needs to test for a value in the
range being a divisor of `n`

- in other words we need to partially apply the
word `multiple?`

. This can be done with the word `curry`

, like this:
`[ multiple? ] curry`

.

Finally, once we have the range of potential divisors and the test function on
the stack, we can test whether any element satisfied divisibility with `any?`

and then negate that answer with `not`

. Our full definition of `prime`

looks like

```
: prime? ( n -- ? ) [ sqrt [2,b] ] [ [ multiple? ] curry ] bi any? not ;
```

Altough the definition of `prime`

is complicated, the stack shuffling is minimal
and is only used in the small helper functions, which are simpler to reason
about than `prime?`

.

Notice that `prime?`

uses two levels of quotation nesting since `bi`

operates on
two quotations, and our second quotation contains the word `curry`

, which also
operates on a quotation. In general, Factor words tend to be rather shallow,
using one level of nesting for each higher-order function, unlike Lisps or more
generally languages based on the lambda calculus, which use one level of nesting
for each function, higher-order or not.

Many more combinators exists other than `bi`

(and its relative `tri`

), and you
should become acquainted at least with `bi`

, `tri`

, `bi*`

and `bi@`

by reading
about them in the online help and trying them out in the listener.

In the next post, we will see how to organize our words in modules and add unit tests.

Until then!