<PowerForms Tutorial>
Note: This tutorial assumes basic knowledge on:
This PowerForms tutorial explains how to automatically ensure that all input fields are appropriately filled in by the client before a page is submitted. Two related applications of the same mechanism, string matching and file scanning, are also presented. As the examples below show, the solution is based on regular expressions:
Formats: Client-side input validation
service {
  session EnterDigit() {
    format Digit = range('0', '9');
    /* Format declaration */
    int n;
    html H = <html>
      Enter your favorite digit:
      <input type="text" name="d"
             format="Digit">
    </html>;
    show H receive [n = d];
    ...;
  }
}
Formats are declared using the keyword format followed by an identifier that will henceforth refer to the format. A format is really a regular expression (regexp) specified in an elaborate syntax. The format Digit in the example is declared to match the ten characters in the range from '0' to '9' (both end points included). The format is subsequently bound to the text input field named "d" in the document H through the format="Digit" attribute. When run, the EnterDigit session asks the client for a favorite digit and incrementally verifies that the text entered in the input field matches the regular expression (i.e. that the client enters a digit). The document can only be submitted when the input is matched by the regexp.
The validation status is shown in the browser's status bar and by status icons placed next to the input field. By default these icons are traffic lights colored red, yellow, or green, signalling respectively that the input is invalid (not in the language defined by the regexp and not a prefix of any string in it), on its way to becoming valid (a prefix of some string in the language), or valid (in the language).
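The three-state status described above can be modelled with a small deterministic automaton. The following Python sketch is only an illustration of the idea, not the PowerForms implementation; the `DFA` class, the `status` function, and the hand-built automaton for Digit are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    """A tiny deterministic automaton; every name here is illustrative."""
    start: int
    accepting: set
    transitions: dict  # state -> list of (predicate, next_state) pairs

    def step(self, state, ch):
        """Return the next state, or None if no edge accepts ch (dead end)."""
        for accepts, nxt in self.transitions.get(state, []):
            if accepts(ch):
                return nxt
        return None

    def live_states(self):
        """States from which some accepting state can still be reached."""
        live = set(self.accepting)
        changed = True
        while changed:
            changed = False
            for state, edges in self.transitions.items():
                if state not in live and any(nxt in live for _, nxt in edges):
                    live.add(state)
                    changed = True
        return live


def status(dfa, text):
    """Classify text as 'green' (valid), 'yellow' (valid prefix) or 'red'."""
    state = dfa.start
    for ch in text:
        state = dfa.step(state, ch)
        if state is None:
            return "red"
    if state in dfa.accepting:
        return "green"
    return "yellow" if state in dfa.live_states() else "red"


def is_digit(ch):
    return "0" <= ch <= "9"


# Hand-built automaton for the Digit format: exactly one character in 0..9.
DIGIT = DFA(start=0, accepting={1}, transitions={0: [(is_digit, 1)]})

for text in ["", "7", "77", "x"]:
    print(repr(text), status(DIGIT, text))
```

In this model the empty input is yellow (it can still become a digit), "7" is green, and both "77" and "x" are red.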
service {
  session EnterDigit() {
    format Digit = regexp("[0-9]");
    /* Perl style format declaration */
    int n;
    html H = <html>
      Enter your favorite digit:
      <input type="text" name="d"
             format="Digit">
    </html>;
    show H receive [n = d];
    ...;
  }
}
This service is equivalent to the one above, but with Perl style regexps instead.
service {
  session EnterAge() {
    int n;
    format Digit = range('0', '9');
    format Age = concat(Digit, star(Digit));
    html H = <html>
      Enter your favorite age:
      <input type="text" name="a"
             format="Age">
    </html>;
    show H receive [n = a];
    ...;
  }
}
Not surprisingly, formats can be defined in terms of each other. Here the Digit format from the previous example is used in the definition of another format, Age, that matches any non-zero number of digits. concat takes any number of regexps and matches their concatenation. star (a.k.a. the Kleene star) takes one regexp and matches any number of repetitions of it (including zero). The Age format is bound to the text input field named "a" in the document H. When this document is shown, the client must enter a valid age, that is, a non-empty string of digits (a non-negative integer, possibly with leading zeros), and cannot submit anything else.
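To make the meaning of range, concat, and star concrete, here is a minimal Python sketch that translates them into patterns for Python's standard `re` module. The helper names (`range_`, `concat`, `star`, `matches`) are assumptions for illustration and say nothing about how <bigwig> compiles formats.

```python
import re


def range_(lo, hi):
    """One character between lo and hi, both end points included."""
    return "[" + re.escape(lo) + "-" + re.escape(hi) + "]"


def concat(*parts):
    """Concatenation of any number of regexps."""
    return "".join("(?:" + part + ")" for part in parts)


def star(part):
    """Zero or more repetitions of a regexp (Kleene star)."""
    return "(?:" + part + ")*"


def matches(fmt, text):
    """True if the whole text is in the language of fmt."""
    return re.fullmatch(fmt, text) is not None


DIGIT = range_("0", "9")
AGE = concat(DIGIT, star(DIGIT))

for text in ["42", "007", "", "4a"]:
    print(repr(text), matches(AGE, text))
```

The sketch accepts "42" and "007" and rejects the empty string and "4a", which mirrors the "one digit followed by any number of digits" reading of Age.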
Perl style
service {
  session EnterAge() {
    int n;
    format Digit = regexp("[0-9]");
    format Age = regexp("{Digit}{Digit}*");
    html H = <html>
      Enter your favorite age:
      <input type="text" name="a"
             format="Age">
    </html>;
    show H receive [n = a];
    ...;
  }
}
This service is equivalent to the one above, but with Perl style regexps instead. The syntax {id} designates another format named id.
No circularly defined formats
service {
session Enter() {
format RecF = concat(star("foo"), |
Note: This is an illegal service!
As with all other top-level declarations, formats have two-pass scope rules (they are visible on the same scope level even before their lexical definition point). However, formats cannot be circularly or recursively defined.
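One way to picture the "no circular definitions" rule is as a cycle check on the graph of which format refers to which. The Python sketch below is an assumption about how such a check could be written, not a description of the <bigwig> compiler; the format names in the dictionaries, including the self-referencing `Loop`, are hypothetical.

```python
def find_cycle(deps):
    """deps maps a format name to the names it refers to.

    Returns one cycle as a list of names (first name repeated at the end),
    or None if the definitions are acyclic.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {name: WHITE for name in deps}

    def visit(name, path):
        colour[name] = GREY
        path.append(name)
        for ref in deps.get(name, ()):
            if colour.get(ref, WHITE) == GREY:
                # ref is already on the current path: a cycle was found.
                return path[path.index(ref):] + [ref]
            if ref in deps and colour[ref] == WHITE:
                found = visit(ref, path)
                if found:
                    return found
        path.pop()
        colour[name] = BLACK
        return None

    for name in deps:
        if colour[name] == WHITE:
            cycle = visit(name, [])
            if cycle:
                return cycle
    return None


# Forward references are fine (two-pass scope), cycles are not.
legal = {"Age": ["Digit"], "Digit": []}
illegal = {"Loop": ["Loop"]}

print(find_cycle(legal))
print(find_cycle(illegal))
```

The first call reports no cycle even though Age is listed before Digit, while the second reports the self-reference.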
require <std.wigmac>
service {
  session EnterEmail() {
    string s;
    format Digit = range('0', '9');
    format Alpha = range('a', 'z');
    format Word = plus(union(Digit, Alpha));
    format Email = concat(Word, "@", Word,
                          star(concat(".", Word)));
    html H = <html>
      Enter your email:
      <input type="text" name="e"
             format="Email">
    </html>;
    show H receive [s = e];
    ...;
  }
}
This service defines four formats. The first is the Digit format we have already seen. The second, Alpha, matches any lower-case letter. The third, Word, matches any non-zero number of repetitions of either a digit or a lower-case letter. The fourth, Email, matches a Word, followed by "@", another Word, and then any number of groups each consisting of "." followed by a Word. The plus construct is really a regexp macro being invoked. The macro takes one regexp argument and expands to the concatenation of the argument with star of the argument. We shall see in the macro tutorial how this macro is defined.
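The expansion of plus into concat and star can be imitated in Python. This is a hedged sketch using the standard `re` module; the helper functions (`lit`, `range_`, `union`, `concat`, `star`, `plus`) are assumed names for this illustration, and the real macro definition is the one given in the macro tutorial.

```python
import re


def lit(text):
    """A literal string, with regexp metacharacters escaped."""
    return re.escape(text)


def range_(lo, hi):
    return "[" + re.escape(lo) + "-" + re.escape(hi) + "]"


def union(*parts):
    return "(?:" + "|".join("(?:" + part + ")" for part in parts) + ")"


def concat(*parts):
    return "".join("(?:" + part + ")" for part in parts)


def star(part):
    return "(?:" + part + ")*"


def plus(part):
    """Modelled as a macro: the argument followed by star of the argument."""
    return concat(part, star(part))


DIGIT = range_("0", "9")
ALPHA = range_("a", "z")
WORD = plus(union(DIGIT, ALPHA))
EMAIL = concat(WORD, lit("@"), WORD, star(concat(lit("."), WORD)))


def is_email(text):
    return re.fullmatch(EMAIL, text) is not None


for text in ["bigwig@brics.dk", "a@b", "bigwig@", "Big@brics.dk"]:
    print(repr(text), is_email(text))
```

Because Alpha only covers lower-case letters, an address containing an upper-case letter is rejected in this sketch, just as the format definitions above imply.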
Escaping validation: ignoreformats
service {
  session EnterEmail() {
    string s;
    format Email = ...;
    html H = <html>
      Enter your email:
      <input type="text" name="e"
             format="Email">
      <input type="submit" value="Cancel"
             ignoreformats="yes">
    </html>;
    show H receive [s = e];
    ...;
  }
}
Normally, a page cannot be submitted until all input fields are correctly filled in. Sometimes, however, it is useful to disable this check, which is exactly what the attribute ignoreformats does when it has the value "yes". The attribute is applicable to all input fields that cause the document to be submitted (that is, submit, continue, and image fields). The example shows a document prompting the client for an email address, while still letting the client press the Cancel button even if the email field is not correctly filled in.
Customizing errors and warnings
service {
  session CustomErrors() {
    int n;
    format Digit = range('0', '9');
    format Number = plus(Digit);
    html H = <html>
      Enter a number:
      <input type="text" name="n"
             format="Number"
             red="Not a number!"
             yellow="Enter a valid number">
    </html>;
    show H receive [n = n];
    ...;
  }
}
As previously explained, the incremental validation status is shown in the browser's status bar while the client is entering data. By default the status bar shows standard messages corresponding to the three states red, yellow, and green (described in the first example). The red and yellow messages can be redefined by assigning the corresponding attributes "red" and "yellow" on the input field. These messages are also the ones shown when the client attempts to submit a document containing data that violates a format.
Changing the status icons
service {
  session EnterDigit() {
    int n;
    format Digit = range('0', '9');
    html H = <html>
      Enter your favorite digit:
      <input type="text" name="d"
             format="Digit">
    </html>;
    show H receive [n = d];
    ...;
  }
}
You can change the status icons to your own images reflecting the particular style and look-and-feel you want your service to have. This is done outside the service by placing four images "red.gif", "yellow.gif", "green.gif", and "na.gif" in your RGYIMAGEDIR (set in your ".bigwig" configuration file). If the changes are only relevant to the service at hand, you may want to consider having a local ".bigwig" configuration file in the service's directory (in which you compile). If you do not want any status icons, (for now) you have to make 1-pixel transparent images [sorry].
Formats: String matching
service {
  session MatchString() {
    int n;
    string s;
    format Digit = range('0', '9');
    format Number = plus(Digit);
    s = ...;
    if (match(s,Number)[]) {
      /* if `s' matched `Number' */
      n = (int) s;
      ...;
    } else {
      n = -1; // `s' was not a number
      ...;
    }
  }
}
As in Perl, strings in <bigwig> can (at runtime) be matched against regexp formats. In the example, the string s is matched against the format Number (the empty square brackets are explained below). The match construction returns a boolean stating whether or not the string is in the language defined by the regular expression.
String recording
service {
  session RecordString() {
    string d;
    string e = "bigwig@brics.dk";
    format Word = ...;
    format Email = concat(Word, "@",
      [domain = // format recording
        concat(Word,star(concat(".", Word)))]
    );
    if (match(e,Email)[d = domain]) {
      /* Here, `d' is "brics.dk". */
      ...;
    }
  }
}
The format Email in this example contains a "recording" regexp named "domain". When used in a match construction, a recording regexp records the substring that its regexp argument matches. Since the string e is matched by the format Email, match evaluates to true and d is assigned the value "brics.dk". If a string is not matchable (match returns false), all recordings are assigned initial values (0 for integers, "" for strings, etc.). This format recording mechanism is reminiscent of capturing parentheses in Perl regexps.
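For readers more familiar with capture groups, the recording above behaves much like a named group. The following Python sketch shows the analogy with the standard `re` module. Since the definition of Word is elided in the example, the pattern used for `WORD` here (digits and lower-case letters, as in the earlier email service) is an assumption, and `match_email` is a hypothetical helper.

```python
import re

# Assumption: Word is one or more digits or lower-case letters.
WORD = r"[0-9a-z]+"

# The named group plays the role of the recording [domain = ...].
EMAIL = re.compile(WORD + "@" + "(?P<domain>" + WORD + r"(?:\." + WORD + ")*)")


def match_email(text):
    """Return (matched, domain); domain falls back to "" when nothing matches."""
    m = EMAIL.fullmatch(text)
    if m is None:
        return False, ""
    return True, m.group("domain")


print(match_email("bigwig@brics.dk"))
print(match_email("not an email"))
```

The fallback to the empty string mirrors the rule that recordings receive initial values when the match fails.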
service {
  session FileScan() {
    int n;
    file f;
    format Digit = range('0', '9');
    format Number = plus(Digit);
    f = open("hello.txt", "r");
    n = scan(f, Number);
    close(f);
    ...;
  }
}
A final application of regexp formats is file scanning. The scan construction takes a file handle followed by a regexp format. Starting at the current file position, it scans as much of the file as lies in the regular language defined by the format and advances the file pointer accordingly. Note that the file must be opened in read mode.
Caution: Not all formats are suitable for use with this construction. Imagine applying the format concat("a",anything,"b") to a very large file. Due to the greedy nature of the scan construct, it will not know when to quit until it has read (into memory!) the entire file (maybe the very last character is a "b").
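The scanning behavior, including the memory cost mentioned in the caution, can be sketched in Python. This is an illustrative model only: the `scan` function below is an assumption, it works on an in-memory text stream instead of a real file, and it returns the matched text as a string instead of converting it to an integer as the service above does.

```python
import io
import re


def scan(f, pattern):
    """Consume the longest prefix at the current position that matches pattern.

    Reads the rest of the stream into memory (the cost the caution warns
    about), then moves the position to just after the matched text.
    """
    start = f.tell()
    rest = f.read()
    m = re.match(pattern, rest)
    text = m.group(0) if m else ""
    f.seek(start + len(text))
    return text


f = io.StringIO("2001 odyssey")
number = scan(f, r"[0-9]+")
remainder = f.read()
print(repr(number), repr(remainder))
```

A greedy pattern such as `[0-9]+` stops at the first non-digit, whereas a pattern ending in "anything followed by b" would force the whole remainder to be examined before the scan could finish.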
service {
  session EnterPassword() {
    string s;
    format Alpha = union(range('a', 'z'),
                         range('A', 'Z'));
    format Char3x = concat(anychar,anychar,anychar);
    format AtLeast3x = concat(Char3x, anything);
    format HasNonAlpha = complement(star(Alpha));
    format PW = intersection(AtLeast3x, HasNonAlpha);
    html H = <html>
      Enter your password:
      <input type="password" name="p" format="PW">
    </html>;
    show H receive [s = p];
    ...;
  }
}
This service has five formats, Alpha, Char3x, AtLeast3x, HasNonAlpha, and PW, whose goal is to define a valid (and restrictive) password. Valid passwords must be at least three characters long and contain at least one non-alphabetic character. The definitions almost speak for themselves. Alpha matches any lower- or upper-case letter; Char3x is the concatenation of any three characters; AtLeast3x matches any string of at least three characters; HasNonAlpha, the complement of star(Alpha), matches any string containing at least one non-alphabetic character; and finally PW, their intersection, matches any string at least three characters long that contains a non-alphabetic character. This format is subsequently bound to a password input field, causing the usual incremental validation behavior.
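Python's `re` module has no complement or intersection operators, so the sketch below models them at the level of membership tests: complement becomes "does not match" and intersection becomes "matches both". The function names are assumptions for illustration, and the sketch further assumes that anychar accepts every character, including a newline.

```python
import re

ALPHA = "[a-zA-Z]"
AT_LEAST_3X = re.compile(r".{3}.*", re.DOTALL)   # Char3x followed by anything
ALL_ALPHA = re.compile("(?:" + ALPHA + ")*")     # star(Alpha)


def at_least_3x(text):
    return AT_LEAST_3X.fullmatch(text) is not None


def has_non_alpha(text):
    """complement(star(Alpha)): everything that is NOT made only of letters."""
    return ALL_ALPHA.fullmatch(text) is None


def is_valid_password(text):
    """intersection(AtLeast3x, HasNonAlpha): both tests must succeed."""
    return at_least_3x(text) and has_non_alpha(text)


for text in ["ab1", "abcd!", "abc", "a1"]:
    print(repr(text), is_valid_password(text))
```

Here "ab1" and "abcd!" pass, "abc" fails because it consists only of letters, and "a1" fails because it is too short. Note that a digit counts as a non-alphabetic character.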
Match: Strange special cases...
service {
  session S() {
    string r;
    string s;
    string w = "whatever";
    format Strange = concat([R = anything],
                            [S = anything]);
    if (match(w, Strange)[r = R, s = S]) {
      /* Both `r' and `s' are "whatever". */
      ...;
    }
  }
}
Due to an overlap in the two formats in the concatenation (that is, the regexp concat(anything,anything) is equivalent to anything) and the fact that we minimize the deterministic finite-state automata (DFAs) produced from the regular expressions, both R and S match the string "whatever" in the example. Consequently, both r and s hold the value "whatever" in the then-branch of the if. Also, recordings inside complement constructions may have unpredictable outcomes. [Technically, this is caused by the merging of states in the compositionally produced automata (annotated with alphabet symbols and sets of recording symbols)].
bigwig@brics.dk Last updated: November 2, 2001