<Dynamic Documents Tutorial>

[ Basics | Intermediate | Advanced ]


Basics

Below are several different ways of writing the ubiquitous Hello-World service. These examples should provide basic insight into how html documents are constructed, combined, and shown to the client.


Shortest Hello World

service {
  session Hello() {
    exit <html>Hello World!</html>;
  }
}

Result:

Hello World!

This is the shortest possible hello-world service. It has one session named Hello that, when run, simply exits an html document constant containing only the text "Hello World!" to the client's browser, after which the session terminates.


Introducing an `html' variable

service {
  session Hello() {
    html D;

    D = <html>Hello World!</html>;
    exit D;
  }
}

Result:

Hello World!

Here, a variable D of type html is introduced. The first statement assigns to this variable an html document constant. The next (and last) statement of the session exits this document.


Initialization of `html' variables

service {
  session Hello() {
    html D = <html>Hello World!</html>;

    exit D;
  }
}

Result:

Hello World!

The same as in the previous example, except the variable D is initialized to contain the hello-world document.


Gaps and String plugging

service {
  session Hello() {
    html D, H;

    D = <html>Hello <[gap]>!</html>;
    /* string `World' is plugged into D
       and assigned to H. */
    H = D <[gap = "World"];
    exit H;
  }
}

Result:

Hello World!

In the first statement, we assign to D an html document containing a gap called "gap". The second statement plugs the constant string "World" into the document D, producing an entirely new document containing, not surprisingly, the text "Hello World!", which is subsequently assigned to H. Note that the expression D <[gap = "World"] does not side-effect D, but simply evaluates to a document value. The final statement exits H to the client.
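
The point that plugging yields a new value and leaves the original untouched can be illustrated with a small toy model in Python. This is only a sketch: the class name Doc and its plug method are hypothetical and are not part of <bigwig>; a document is modelled as an immutable string with <[name]> markers.

```python
# Toy model only: Doc and plug are hypothetical names, not <bigwig> constructs.
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    text: str

    def plug(self, gap: str, value: str) -> "Doc":
        """Return a new Doc with the named gap filled; self is left unchanged."""
        marker = f"<[{gap}]>"
        if marker not in self.text:
            raise KeyError(f"no gap named {gap!r}")
        return Doc(self.text.replace(marker, value, 1))


D = Doc("Hello <[gap]>!")
H = D.plug("gap", "World")  # models: H = D <[gap = "World"];
print(H.text)               # the plugged document
print(D.text)               # D still contains its gap
```

Because Doc is frozen, plug cannot modify D even by accident, which mirrors the value semantics described above.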


Gaps and Document plugging

service {
  session Hello() {
    html D = <html>Hello <[gap]>!</html>;
    html H;
    html W = <html><b>World</b></html>;

    /* html document W is plugged into D 
       and assigned to H. */
    H = D <[gap = W];
    exit H;
  }
}

Result:

Hello World!

Instead of plugging a constant string "World", another document containing the text "World" in bold face is plugged into the document D. In this way, documents can be composed gradually and in a highly dynamic fashion (whence the name "Dynamic Documents" or "DynDoc"). Also, the programmer is no longer forced to construct a document linearly from the first <html> to the last </html> tag.


Plug evaluates to a document value

service {
  session Hello() {
    html D = <html>Hello <[gap]>!</html>;
    html W = <html><b>World</b></html>;

    /* the expression D <[gap = W] evaluates
       to a document that is exited to the 
       client */
    exit D <[gap = W];
  }
}

Result:

Hello World!

This example emphasizes the fact that D <[gap = W] is (just) an expression that evaluates to a document value (which is exited to the client).


Plug shorthand

service {
  session Hello() {
    html D = <html>Hello <[g]>!</html>;
    html g = <html><b>World</b></html>;

    exit D <[g];
  }
}

Result:

Hello World!

The expression D <[g] is shorthand for D <[g = g].


Plug shorthand II

service {
  session Hello() {
    html D = <html>s = <[s]>, n = <[n]></html>;

    exit D <[tuple {n = 42, s = "hello"}];
  }
}

Result:

s = hello, n = 42

The tuple expression, tuple {n = 42, s = "hello"}, is evaluated to a tuple value whose attributes (here n and s) are plugged into the document D.


Plug: Implicit coercion

service {
  session Hello() {
    int n = 42;
    html D = <html>Hello <[gap]>!</html>;

    exit D <[gap = n];
  }
}

Result:

Hello 42!

The plug expression will in fact implicitly coerce any type to `html' documents. For instance, the integer value 42 is converted to a string "42" immediately before being plugged. The same thing goes for bool, floats, chars, and time.


Plug: Character escaping

service {
  session Hello() {
    string s = "<b>World</b>";
    html D = <html>Hello <[gap]>!</html>;

    exit D <[gap = s];
  }
}

Result:

Hello <b>World</b>!

All strings plugged into `html' documents are properly escaped (the markup characters "<" and ">" are escaped to "&lt;" and "&gt;"). The reason for this is two-fold: to guarantee the safety of the analysis, and for security. Imagine what would happen if a malicious client's input (say, "<script>location.replace("http://.../")</script>") were propagated unescaped onto a document that is shown.
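
As a rough illustration of why escaping on plug matters, here is a toy Python model. The function name plug_string is hypothetical. It uses Python's standard html.escape, which also escapes "&"; whether <bigwig> does the same is not stated here, so treat that detail as an assumption.

```python
import html


def plug_string(doc: str, gap: str, value: str) -> str:
    """Fill the gap <[gap]> in doc with value, escaping markup characters."""
    marker = f"<[{gap}]>"
    if marker not in doc:
        raise KeyError(f"no gap named {gap!r}")
    # quote=False escapes only &, < and >, leaving quotation marks alone.
    return doc.replace(marker, html.escape(value, quote=False), 1)


D = "Hello <[gap]>!"
safe = plug_string(D, "gap", "<b>World</b>")
attack = plug_string(D, "gap", '<script>location.replace("http://.../")</script>')
print(safe)
print(attack)
```

In both cases the browser receives text rather than markup, so the script element from the malicious input is displayed, not executed.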


Plug: Bypassing character escaping

service {
  session Hello() {
    string s = "<b>World</b>";
    html D = <html>Hello <[gap]>!</html>;

    exit D <[gap = rawhtml(s)]; // caution!
  }
}

Result:

Hello World!

The predefined function rawhtml converts strings to html documents verbatim, leaving markup characters unchanged. Be cautious when using this function, as it may introduce security breaches. Also, the analysis can no longer give all its static guarantees. When using this function, it is the programmer's responsibility to make sure no unintended character sequences are introduced. Clearly, gaps cannot be dynamically plugged into documents this way, since the analysis is performed at compile-time.


Attribute gaps (vs. html gaps)

service {
  session Hello() {
    html D = <html>
      <font color=[col]>Hello World!</font>
    </html>;

    /* the `col' gap cannot be plugged 
       with documents */
    D = D <[col = "blue"]; 
    exit D;
  }
}

Result:

Hello World!

All the gaps we have seen so far have been so-called ``html gaps'', written <[id]> (where `id' is the name of the gap). The col gap is an example of another kind of gap, an ``attribute gap'': it is written as the value of an attribute of an html tag (here the color attribute of the font tag). Unlike ``html gaps'', attribute gaps cannot be plugged with documents, only with strings (or ints, floats, ...).


Show: Local state preserved

service {
  session Hello() {
    html D = <html>Hello <[gap]>!</html>;
    html H = <html><b>World</b></html>;

    D = D <[gap = H];
    show <html>Click continue</html>;
    /* execution continues here after show
       with entire local state as it was
       (`D' has a bold-faced `World' in it!) */
    exit D;
  }
}

Result:

Click continue
;
Hello World!

Instead of exiting a document to the client, one can show it. The page is shown, and execution resumes after the show statement (with the local state preserved) when the client submits the page. If the page does not contain a submit button, a default continue button is provided automatically. Note that this default continue button is not shown in the "result" above. Thus, the document D exited in the final statement of the session is the one constructed in the first statement.


Input fields and Show-receive

service {
  session EnterName() {
    string s;
    html Input = <html>
      Name?: <input type="text" name="name">
    </html>;
    html Output = <html>Hi <b><[name]></b>!</html>;

    show Input receive [s = name];
    exit Output <[name = s];
  }
}

Result:

Name?:
;
Hi foo!

...assuming "foo" was entered.

Here the document Input contains an input field of type "text". When such a document is shown, its values must be received. Conversely, when fields are received, they are required to be in the document shown. The identifier immediately following the equal character (`='), names an input field in the document shown. The identifier immediately preceding the equal character (`='), names an lvalue in the service. This lvalue can be of any type (the text entered by the client will be appropriately coerced).


Receive shorthand

service {
  session EnterName() {
    string name;
    html Input = <html>
      Name?: <input type="text" name="name">
    </html>;
    html Output = <html>Hi <b><[name]></b>!</html>;

    show Input receive [name];
    exit Output <[name];
  }
}

Result: Not shown!

The statement show Input receive [name]; is shorthand for show Input receive [name = name];.


Receive shorthand II

service {
  session EnterPerson() {
    schema Person {
      int age;
      string name;
    }
    tuple Person p;
    html Input = <html>
      Name?: <input type="text" name="name">
      Age?: <input type="text" name="age">
    </html>;

    show Input receive [p];
  }
}

Result: Not shown!

A document prompting the client for name and age is shown. When the document is submitted the name and age values entered are collated into a tuple value that is assigned to the tuple variable p. Notice the one-to-one correspondence between the names of the input fields and the attributes in the tuple with schema Person.


Checkboxes

service {
  session Checkbox() {
    vector string v;
    html Input = <html>
      x? <input type="checkbox" name="c" value="x">
      /
      y? <input type="checkbox" name="c" value="y">
    </html>;

    show Input receive [v = c];
    ...;
  }
}

Result:

x? / y?
; ...

When a document contains more than one checkbox (as inferred by the analysis), such an input field group must be received into a vector (or a relation). A single checkbox can be received in a basic type. The same thing goes for select (multiple) fields. There are a number of requirements on the different field kinds, but most of them behave roughly as the "text" field in the previous example. See the form input table for more information.


Shorthand: ``Plug, then assign''

service {
  session Hello() {
    html D = <html>Hello <[gap]>!</html>;
    html H = <html><b>World</b></html>;

    D =<[gap = H]; // plug, then assign (to D)
    exit D;
  }
}

Result:

Hello World!

The assignment expression D = D <[gap = H], can be abbreviated to D =<[gap = H], using the side-effecting ``plug, then assign''-operator "=<[". Note the three characters "=", "<", and "[" form one lexical token (that is, they cannot be separated by whitespaces). The syntax is reminiscent of the `+=' (add, then assign) operator from C/Java/<bigwig>.


Iteration and `html' documents

service {
  session IterateList() {
    int i;
    html H = <html><[more]></html>;
    html Item = <html><li><[no]><[more]></html>;

    for (i=3; i>0; i--) {
      H =<[more = Item <[no = i]];
    }
    exit H;
  }
}

Result:

  • 3
  • 2
  • 1

Documents can easily be iteratively constructed. This will often happen when presenting data from a vector to the client. Notice that initially the document H contains a gap "more". This is important since all documents (expressions) in the program must have the same set of gaps for all program flows [see the example on "implicit closing of gaps" below for the exact details]. The program would be rejected by the compiler if H did not initially (in the for loop) contain the gap "more".
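
To see why H keeps exactly one "more" gap on every iteration, the loop can be traced with a toy Python model. This is a sketch only: the helper plug is hypothetical, documents are plain strings, and escaping is ignored.

```python
def plug(doc: str, gap: str, value: str) -> str:
    """Fill the single occurrence of <[gap]> in doc with value."""
    marker = f"<[{gap}]>"
    if doc.count(marker) != 1:
        raise ValueError(f"expected exactly one gap named {gap!r}")
    return doc.replace(marker, value)


H = "<[more]>"
ITEM = "<li><[no]><[more]>"
trace = []
for i in range(3, 0, -1):
    # models: H =<[more = Item <[no = i]];
    H = plug(H, "more", plug(ITEM, "no", str(i)))
    trace.append(H)

for step in trace:
    print(step)
```

Each iteration consumes the "more" gap of H and reintroduces it through Item, so the set of gaps of H is the same before and after the loop body.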


    Code Gaps: Code Expressions

    service {
      int x = 21;
      session CodeExp() {
        html H = <html>res = <[(x*2)]></html>;
    
        exit H;
      }
    }

    Result:

    res = 42

    Just before document H is exited, the code expression (x*2) in the code gap <[(x*2)]> is evaluated, yielding 42, which is implicitly coerced to the string "42" and inserted into the document sent to the client.


    Code Gaps: Code Statements

    service {
      int x = 5;
      session CodeExp() {
        html H = <html>
          res = <[{int r = x+1; r*7;}]>
        </html>;
    
        exit H;
      }
    }

    Result:

    res = 42

    The same thing happens here, the only difference being that the code gap is not a code expression but a code statement. The last statement in a code statement is required to be a statement-expression (here "r*7;") whose result is the result of the code statements.


    Code Gaps: Scope Restrictions

    service {
      html D;
    
      session CodeExp() {
        int x = 21;
        html H = <html>res = <[(x*2)]></html>;
    
        D = H; // `x' passed ``out of scope'' to D. 
        exit H;
      }
    
      session SomeOtherSession() {
        ...;
        exit D; // `x' ``out of scope''!
      }
    }

    Result: N/A, Illegal service.

    The scope for variables in code expressions and code statements is the toplevel scope outside the sessions. Otherwise it would be possible to pass variables out of scope as happens with the x usage in the code gap in the example.


    Separating designer/programmer tasks by lexical inclusion

    service {
      session Hello() {
        html H = #include "../docs/hello.html";
    
        exit H;
      }
    }

    Result:

    HELLO WORLD

    In the earlier examples, the designer's and the programmer's tasks are already separated in the code. This division is made even more explicit here by lexically including the document required by the service.

    Now, the designer is free to design the html page in, for instance, FrontPage or some other html design tool. The document is automatically included in the service by the lexical analyser during service compilation.


    Intermediate

    Rapid prototyping

    service {
      session Hello() {
        html P =  <html>
          Name: <input type="text" name="N"><br>
          Age: <input type="text" name="A">
          <[more]>
        </html> @ "../docs/person.html";
    
        exit P;
      }
    }

    Result: ...if the file "../docs/person.html" does not (yet) exist.

    Name:
    Age:

    When the file designated in the <html>...</html> @ "URL" construction does not exist, the constant in-lined document is used.

    ---../docs/person.html---
    <html> Please enter the following information: <table> <tr> <td><em>Name:</em></td> <td><input type="text" name="N"></td> </tr> <tr> <td><em>Age:</em></td> <td><input type="text" name="A"></td> </tr> </table> <[more]> </html>

    Result: ...if the file "../docs/person.html" does exist.

    Please enter the following information:
    Name:
    Age:

    When the file does exist (and is non-empty), it is the one used. The two html documents are required to have the same flow types (i.e. the same gaps and fields). This is verified by the compiler at compile-time.

    The idea behind this construct is to provide a means for rapid prototyping and to aid collaboration between the programmer and the designer of a Web service. The programmer rapidly makes some prototype html documents, with focus on the functionality (the fields and gaps) and not on the graphical layout of the document. Then, as the designer finishes the ``real'' documents, they replace the prototype ones.


    Auto-wrapped tags

    service {
      session Hello() {
        html H = <html>
          <head><title>foo</title></head>
          <body bgcolor=[col]>
            Hello World!
          </body>
        </html>;
    
        exit H <[col = "yellow"];
      }
    }

    Result:

    Hello World!

    The tags "<html>" and "</html>" are in fact just delimiters saying that whatever comes in between is an html document template. They are not themselves part of the html document template. However, when an html document template...

    <html>...</html>

    ...is shown, it is wrapped with an html, a head, a (default) title, a body, and a form tag with an appropriate continue "action" url, yielding:

    <html>
      <head>
        <title><bigwig> service: hello</title>
      </head>
      <body>
      <form action="continue url">
        ...
      </form>
      </body>
    </html>

    Note that the <form> tag pair is not added when a page is flashed or exited.

    To allow information to be placed before the <form> tag, <bigwig> intercepts an optional <body> tag and uses it in place of the default one (placing it immediately before the autogenerated <form> tag). As the example shows, such a <body> tag can carry attributes (here, bgcolor). Everything written outside this <body> tag is placed outside the autogenerated <form> tag. Thus, the title of the document shown will be "foo". Being able to insert information outside the form (and body) is particularly useful when a page is to contain JavaScript code. Note that <body> tags, along with the information preceding them, are discarded in documents that are plugged into other documents.


    Gaps are unordered

    service {
      session Address() {
        int n = 42;
        string s = "Somewhere street";
        html H;
        html UK = <html><[number]>, <[street]></html>;
        html DK = <html><[street]> <[number]></html>;
    
        if (...) H = UK;
        else H = DK;
        exit H <[number = n, street = s];
      }
    }

    Result:

    42, Somewhere street    (when `...' evaluates to true)
    Somewhere street 42     (when `...' evaluates to false)

    The plug operator can perform multiple pluggings in one go, as is evident in the plug expression H <[number = n, street = s] above. Each document in <bigwig> has a set of unordered gaps (i.e. not a sequence of gaps). The fact that the order of gaps is inconsequential can be exploited to customize information presentation. For instance, in Denmark addresses are written as the street name followed by the house number, whereas in the United Kingdom they are written as the house number followed by a comma and the name of the street. The plug operation will plug n and s into the gaps number and street regardless of their respective placements in the document H in which we plug.
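
Plugging by name rather than by position can be mimicked with a toy Python model. This is a sketch under simplifying assumptions: plug_all is a hypothetical helper, documents are plain strings, and escaping is ignored.

```python
def plug_all(doc: str, **values: object) -> str:
    """Fill each named gap with its value; gap order in doc is irrelevant."""
    for name, value in values.items():
        marker = f"<[{name}]>"
        if marker not in doc:
            raise KeyError(f"no gap named {name!r}")
        doc = doc.replace(marker, str(value), 1)
    return doc


UK = "<[number]>, <[street]>"
DK = "<[street]> <[number]>"

n = 42
s = "Somewhere street"
uk_result = plug_all(UK, number=n, street=s)
dk_result = plug_all(DK, number=n, street=s)
print(uk_result)
print(dk_result)
```

The same call works on both templates because the lookup is keyed on the gap name only.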


    Implicit closing of gaps

    service {
      session ImplicitClose() {
        html H = <html>Hello <[what]>!</html>;
    
        if (...) H = H <[what = "World"];
        /* Here, H no longer has the what gap */
        exit H;
      }
    }

    Result:

    Hello World!

    After the if-statement, document H no longer has the what gap. The reason is that its presence is not guaranteed (if, for instance, the expression `...' evaluates to true). Consequently, the what gap is implicitly closed in the ``else branch'' of the if. Implicit closing happens automatically so that the service will obey the flow join requirements dictated by the dynamic document analysis. Attempts to plug in H's what gap after the if-statement will be denied by the compiler, yielding a compile-time error.
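
One way to picture the flow join is as an intersection of gap sets. The Python sketch below models only that idea; the function join and the variable names are hypothetical, and the real analysis also tracks gap kinds and fields.

```python
from functools import reduce


def join(*flows: frozenset) -> frozenset:
    """Gaps guaranteed to exist after a join: those present on every flow."""
    return reduce(lambda a, b: a & b, flows)


before = frozenset({"what"})
then_branch = before - {"what"}  # models: H = H <[what = "World"];
else_branch = before             # empty else-branch: H is unchanged
after = join(then_branch, else_branch)

# Gaps still open on some flow but absent after the join get closed there.
implicitly_closed = else_branch - after
print(sorted(after), sorted(implicitly_closed))
```

Under this reading, a later plug into "what" is rejected simply because "what" is no longer in the gap set of H.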


    A gap may not occur twice in the same document

    service {
      session NoMultGaps() {
        html B;
        html D = <html>Hello <[what]><[g]></html>;
        html H = <html>World <[what]></html>;
        
        B = D <[g = H]; // Illegal document!
        exit B;
      }
    }

    Result: N/A, Illegal service.

    Our analysis and implementation prohibit a gap from being present twice in a document. Consequently, the document produced by the plug operation above (with two gaps by the name of what) is illegal and the service will be rejected by the compiler, yielding a compile-time error.
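
The rule can be sketched as a check on gap sets at each plug. The Python model below is an illustration only; plug_gaps is a hypothetical name, and gap kinds and fields are left out.

```python
def plug_gaps(outer: frozenset, gap: str, inner: frozenset) -> frozenset:
    """Gap set of `outer <[gap = inner]`, rejecting duplicate gap names."""
    if gap not in outer:
        raise KeyError(f"no gap named {gap!r}")
    remaining = outer - {gap}
    clash = remaining & inner
    if clash:
        raise ValueError(f"gap(s) {sorted(clash)} would occur twice")
    return remaining | inner


# Legal: the plugged gap may reappear in the plugged-in document.
legal = plug_gaps(frozenset({"more"}), "more", frozenset({"more"}))

# Illegal: models B = D <[g = H]; where both D and H contain `what'.
try:
    plug_gaps(frozenset({"what", "g"}), "g", frozenset({"what"}))
    rejected = False
except ValueError:
    rejected = True
print(sorted(legal), rejected)
```

Note that the gap being plugged is removed before the clash check, which is why a document may reintroduce the very gap it fills.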


    Analysis inference (track)

    service {
      session OrderDrink() {
        string order;
        html D;
        html H = <html>
          Name? <input type="text" name="name">
          <br>
          <[choices]>
        </html>;
        html Coffee = <html>Coffee?
          <input type="radio" name="drink"
                 value="coffee">
          <font color=[col]><[alts]></font>
        </html>;
        html Tea = <html><br>Tea? 
          <input type="radio" name="drink"
                 value="tea">
          <[alts]>
        </html>;
    
        H = H <[choices = Coffee <[alts = Tea]]; 
        /* Here, track(H) would report:
           gaps = { ("col", attrGap),
                    ("alts", htmlGap) }
           fields = { ("name", textField),
                      ("drink", radioField) } */
        D = H <[col = "blue"];
        show D receive [order = drink];
      }
    }

    Result:

    Name?
    Coffee?
    Tea?

    Clearly, the programmer needs to be aware of how gaps and fields propagate in the service. To help the programmer, we have added a keyword "track" which is a predefined function that is the identity on expressions of type html except for one fact: It will report, at compile-time, the flow-type of its html argument (as inferred by the flow analysis). A document's flow-type is its set of gaps and their kinds (attribute or html) and its set of fields and their kinds (text, radio, checkbox, ...). The function is really not part of the <bigwig> language, but is considered a helpful compiler feature.


    Functions and `html' documents

    service {
      session Hello() {
        html H = <html>Hello <[gap]>!</html>;
    
        html f(bool in_bold, string s) {
          html B = <html><b><[content]></b></html>;
    
          if (in_bold) return B <[content = s];
          else return (html) s;
        }
    
        exit H <[gap = f(true, "World")];
      }
    }

    Result:

    Hello World!

    Here we have defined a function f that takes a boolean in_bold and a string s as arguments and produces an html document. The document produced is s as an html document, and with the text in bold face if the first argument is true. The function contains a type-conversion expression (html) s (that casts s to an html document).


    Recursive functions and `html' documents

    service {
      session RecList() {
        html list(int n) {
          html Item = <html>
            <li><[no]><[more]>
          </html>;
    
          if (n==0) return <html></html>;
          return Item <[no = n, more = list(n-1)];
        }
    
        exit list(3);
      }
    }

    Result:

      • 3
      • 2
      • 1

    Needless to say, html documents work with recursion. This example is equivalent to the iterative example previously mentioned.


    Matching `html' documents

    service {
      session Dilbert() {
        html OutDoc = <html>
          <h1>Today's Dilbert</h1>
          <img src=[src] alt="today's Dilbert comic">
        </html>;
        string data = get("http://www.dilbert.com/");
        string src_str;
        match(data,<html><[]>
          <img src=[src] alt="today's Dilbert comic">
        <[]></html>)[src_str = src];
        exit OutDoc <[src=
            "http://www.dilbert.com"+src_str];
      }
    }

    Result: Today's Dilbert image without any ads.

    For now, the second html argument to match must be a constant html document. The constant document contains two unnamed gaps "<[]>". Whatever such gaps match in a match operation is discarded. Note that in order to work properly, the return character and the spaces after the first and before the second unnamed gap should be removed.


    Advanced

    Unfortunately, due to the undecidable nature of the analysis, there are many services that are unfairly rejected by the compiler. Such examples can be found below:


    Attribute and html gaps (continued)

    service {
      session Hello() {
        html D;
        html A = <html>
          <font color=[gap]>Hello World!</font>
        </html>;
        html H = <html>Hello <[gap]>!</html>;
    
        if (...) D = A;
        else D = H;
        /* gap "gap" in D is an attribute gap
           (i.e. D can no longer be plugged with
           an html document). */
        exit D <[gap = "blue"];
      }
    }

    Result:

    Hello World!    (when `...' evaluates to true)
    Hello blue!     (when `...' evaluates to false)

    After the if-else statement, when the flows of the statements from the if-branch and the else-branch meet, the gap named "gap" in document D becomes an attribute gap regardless of which of the two branches is taken at runtime. Thus, D can no longer be plugged with documents (otherwise there could be problems if the expression ... evaluated to true). [In formal analysis terminology: least-upper-bound of an attribute gap and an html gap is an attribute gap].
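
The least-upper-bound rule can be written down as a tiny join on gap kinds. The Python sketch below is a model only; join_kind and join_gaps are hypothetical names, while the kind labels attrGap and htmlGap are borrowed from the track output shown earlier.

```python
ATTR, HTML = "attrGap", "htmlGap"


def join_kind(a: str, b: str) -> str:
    """Least upper bound of two gap kinds: attrGap wins over htmlGap."""
    return ATTR if ATTR in (a, b) else HTML


def join_gaps(left: dict, right: dict) -> dict:
    """Join two flows: keep gaps present in both, joining their kinds."""
    return {name: join_kind(left[name], right[name])
            for name in left.keys() & right.keys()}


A = {"gap": ATTR}  # <font color=[gap]>Hello World!</font>
H = {"gap": HTML}  # Hello <[gap]>!
D = join_gaps(A, H)

may_plug_document = D["gap"] == HTML
print(D, may_plug_document)
```

Plugging a string such as "blue" stays legal for either kind, so exit D <[gap = "blue"] is accepted while plugging a document would not be.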


    Tuple fields

    service {
      schema PersonSchema {
        bool married;
        string name;
      }
    
      html f(int n) {
        int i;
        html T = <html><tuple name=person>
          Name?: <input type="text" name="name">
          <input type="checkbox" name="married"
                 value="true">
        </tuple><[more]></html>;
        html H = T;
    
        for (i=0; i<n-1; i++) {
          H =<[more = T];
        }
        return H;
      }
    
      session Tuples() {
        vector PersonSchema p;
        html D;
    
        D = f(random(10));
        show D receive [p = person];
        ...;
      }
    }

    Result: Not shown!

    Note: This example assumes knowledge of schemas.
    The fact that one has to explicitly receive every single field is incompatible with creating a dynamic number of input fields for the client to fill in. This is what the <tuple> tag was added for. The <tuple> tag is internal to <bigwig> and never actually shown to the client.


    Analysis shortcomings (undecidability)

    service {
      session Checkboxes() {
        int n;
        string choice;
        html D = <html><[gap1]><[gap2]></html>;
        html H = <html>
          <input type="checkbox" name="c"
                 value="...">
        </html>;
    
        n = ...;
        if (n%2==0) D =<[gap1 = H];
        /* Here, it is assumed that D has a
           checkbox group `c' with one 
           element. */
        if (n%2==1) D =<[gap2 = H];
        /* Here, it is assumed that D has a
           checkbox group `c' with more than
           one element. */
        show D receive [choice = c];
      }
    }

    Result: N/A, Illegal program!

    After the first if-statement, the two possible flows (the then-branch and the non-existent or empty else-branch) cause the analysis to assume that there is a checkbox group named c with one checkbox. This is a fair assumption because a non-checked and a non-existent checkbox submit the same (non-existent) information upon submission. After the second if-statement the same thing happens again, forcing the analysis to safely assume that there could possibly be several checkboxes in the group named c. Consequently, when the document D is shown, it is assumed to potentially return several (more-than-one) units of information and is thus required to be received in a vector (or relation). This is why the above program is rejected. It can be deduced that, independent of the value of n, D could never hold two checkboxes. However, this deduction is due to a very complex inter-relationship between the two expressions n%2==0 and n%2==1 that a compiler could never hope to generally uncover. Precisely this example could be made detectable, but due to the undecidable nature of the problem itself, there would still be infinitely many similar undecidable problems, regardless of how sophisticated we make our analysis.
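
The reasoning above can be replayed with a three-point abstraction of "how many checkboxes are in group c". The Python sketch below is one possible reading of the description, not the actual <bigwig> analysis; the names NONE, ONE, MANY, plug_checkbox and join are hypothetical.

```python
NONE, ONE, MANY = 0, 1, 2  # abstract sizes of the checkbox group `c'


def plug_checkbox(count: int) -> int:
    """Plugging in a document with one checkbox adds one, capped at MANY."""
    return min(count + 1, MANY)


def join(a: int, b: int) -> int:
    """At a flow join, assume the larger of the two possible group sizes."""
    return max(a, b)


c = NONE
c = join(plug_checkbox(c), c)  # if (n%2==0) D =<[gap1 = H];
after_first_if = c
c = join(plug_checkbox(c), c)  # if (n%2==1) D =<[gap2 = H];
after_second_if = c

needs_vector = after_second_if == MANY
print(after_first_if, after_second_if, needs_vector)
```

The model never looks at the conditions n%2==0 and n%2==1, which is exactly why it cannot see that the two branches exclude each other.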


    Analysis shortcomings (monovariance)

    service {
      session Id() {
        html H = <html>Hello <[gap]>!</html>;
    
        html id(html H) {
          return H;
        }
    
        H = id(H);
        H = H <[gap = "World"];
        H = id(H); // Illegal call!
        exit H;
      }
    }

    Result: N/A, Illegal program!

    Since the interprocedural data-flow analysis on dynamic documents is monovariant (as opposed to poly-variant), there are some restrictions regarding the use of functions. It is required that the set of gaps and their kinds (attribute/html) and the set of fields and their kinds of all html arguments are the same for all calls to a function. The same thing goes for any html values returned. The first call to the id function dictates that from now on the argument (and result) given to id must be a document that has exactly one gap named "gap" of kind `html'. Consequently, the second call to id with a document without any gaps is denied by the compiler, yielding a compile-time error. This example would benefit from making the analysis poly-variant which it may be in the future.
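
A monovariant check can be pictured as recording one flow type per function and comparing every later call against it. The Python sketch below illustrates that idea only; MonovariantChecker is a hypothetical name, and a flow type is reduced to a set of (gap name, gap kind) pairs.

```python
class MonovariantChecker:
    """Toy model: every call to a function must use the same flow type."""

    def __init__(self) -> None:
        self.signatures = {}

    def call(self, func: str, arg_flow: set) -> frozenset:
        flow = frozenset(arg_flow)
        # The first call fixes the flow type; later calls must match it.
        expected = self.signatures.setdefault(func, flow)
        if expected != flow:
            raise TypeError(f"{func}: expected {set(expected)}, got {set(flow)}")
        return flow  # the id function returns its argument unchanged


checker = MonovariantChecker()
first = checker.call("id", {("gap", "htmlGap")})  # H = id(H);
try:
    checker.call("id", set())  # H = id(H); after the gap was plugged
    second_call_ok = True
except TypeError:
    second_call_ok = False
print(sorted(first), second_call_ok)
```

A polyvariant analysis would instead keep a separate entry per call site or per argument flow type, so both calls would be accepted.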


    bigwig@brics.dk
    Last updated: November 2, 2001