VISUAL LANGUAGES

Shi-Kuo Chang

Visual Computer Laboratory
Department of Computer Science
University of Pittsburgh
Pittsburgh, PA 15260 USA
Email: chang@cs.pitt.edu

and

Knowledge Systems Institute
3420 Main Street, Skokie
IL 60076 USA


(To appear in Encyclopedia of Electrical and Electronics Engineering, John Wiley and Sons, 1998)

Languages that let users create custom icons and iconic/visual sentences are receiving increased attention as multimedia applications become more prevalent. Visual language systems let the user introduce new icons and create iconic/visual sentences that carry different meanings and can exhibit dynamic behavior. Furthermore, visual programming systems support problem solving and software development through the composition of basic software components using spatial operators such as "connect port #1 of component A to port #2 of component B".

We will first introduce the elements of visual languages, then describe how visual languages can be extended to deal with multimedia. We will discuss visual programming languages both for general-purpose problem solving and for the special application of database querying. Finally, we provide on-line bibliographies for further reference and some thoughts concerning the future of visual languages and visual programming languages.

1. ELEMENTS OF VISUAL LANGUAGES

A visual language is a pictorial representation of conceptual entities and operations and is essentially a tool through which users compose iconic, or visual, sentences (Ref 1). The icons generally refer to the physical image of an object. Compilers for visual languages must interpret visual sentences and translate them into a form that leads to the execution of the intended task (Ref 2). This process is not straightforward. The compiler cannot determine the meaning of the visual sentence simply by looking at the icons. It must also consider the context of the sentence, that is, how the objects relate to one another. Keeping the user's intent and the machine's interpretation consistent is one of the most important tasks of a visual language (Ref 3).
A visual sentence is a spatial arrangement of object icons and/or operation icons that usually describes a complex conceptual entity or a sequence of operations. Object icons represent conceptual entities or groups of object icons that are arranged in a particular way. Operation icons, also called process icons, denote operations and are usually context-dependent. Figure 1(a) illustrates a visual sentence that consists of horizontally arranged icons, with a dialog box overlaid on it. This particular location-sensitive visual sentence changes meaning when the locations of icons change (see Figure 1(b)), and can be used to specify to-do items for TimeMan, a time-management personal digital assistant.

Figure 2 illustrates a content-sensitive visual sentence for TimeMan. The fish in the tank are object icons, each of which represents a to-do item, and the cat is an operation icon that appears when there are too many fish in the tank (the to-do list is too long). Figure 3 illustrates a time-sensitive visual sentence that changes its meaning with time. The icons (circles and vertical bars) in this visual sentence are connected by arcs. Thus this visual sentence is the visual representation of a directed graph, specifically a Petri net. When tokens flow in this directed graph, this visual sentence changes its meaning.
Icons are combined using operators. The general form of a binary operation is x1 op x2 = x3, where the two icons x1 and x2 are combined into x3 using the operator op. The operator op = (opm, opp), where opm is the logical operator and opp is the physical operator. Using this expanded notation, we can write (xm1, xp1) op (xm2, xp2) = ((xm1 opm xm2), (xp1 opp xp2)). In other words, the meaning parts xm1 and xm2 are combined using the logical operator opm, and the physical parts xp1 and xp2 are combined using the physical operator opp.
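
As a minimal sketch of this algebra (the Python class and helper names below are our own illustration, not part of any published system), a generalized icon and a binary operator can be modeled as follows:

   # A generalized icon is a pair x = (xm, xp): a meaning part and a
   # physical part. An operator op = (opm, opp) combines both parts.
   class Icon:
       def __init__(self, meaning, physical):
           self.meaning = meaning      # xm, e.g. a frame
           self.physical = physical    # xp, e.g. an image

   def combine(x1, x2, opm, opp):
       """x1 op x2 = x3, where op = (opm, opp)."""
       return Icon(opm(x1.meaning, x2.meaning),    # combine meanings
                   opp(x1.physical, x2.physical))  # combine appearances

   # Example: hor conjoins the meanings and places images side by side.
   children = Icon({"WHO": "children"}, "children.img")
   school = Icon({"DO": "study"}, "school_house.img")
   vs = combine(children, school,
                opm=lambda m1, m2: {**m1, **m2},
                opp=lambda p1, p2: (p1, "hor", p2))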

Operators can be visible or invisible. Most system-defined spatial/temporal operators are invisible, whereas all user-defined operators are visible for the convenience of the user. For example, excluding the dialog box, the visual sentence in Figure 1(a) is the horizontal combination of three icons. Therefore, it can be expressed as:

( CHILDREN hor SCHOOL_HOUSE ) hor SUNRISE

where hor is an invisible operator denoting a horizontal combination. In Figure 2, by contrast, the cat is a visible operator denoting a process to be applied to the fish in the fish tank. An operation icon can thus be regarded as a visible operator.

The four most useful domain-independent spatial icon operators are ver, for vertical composition; hor, for horizontal composition; ovl, for overlay; and con, for connect. The operators ver, hor and ovl are usually invisible (see Figure 1 for an example, where the hor operator is invisible). On the other hand, the operator con is usually visible as a connecting line (see Figure 3 for an example, where the connecting lines among the icons called places and transitions are visible). The operator con is very useful in composing visual programs (see Section 3).

A visual language has a grammar, G, which a compiler uses to generate sentences belonging to this visual language:

G = (N, X, OP, s, R)

where N is the set of nonterminals, X is the set of terminals (icons), OP is the set of spatial relational operators, s is the start symbol, and R is the set of production rules whose right side must be an expression involving relational operators.
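
For concreteness, the following is a toy grammar of this form for location-sensitive sentences like the one in Figure 1(a); the nonterminals and production rules are our own illustration, not taken from TimeMan, written here as Python data for readability:

   # G = (N, X, OP, s, R) for a tiny location-sensitive language.
   N = {"S", "Who", "Do", "When"}                    # nonterminals
   X = {"CHILDREN", "SCHOOL_HOUSE", "SUNRISE"}       # terminals (icons)
   OP = {"hor"}                                      # spatial operators
   s = "S"                                           # start symbol
   R = [                                             # production rules:
       ("S", ["Who", "hor", "Do", "hor", "When"]),   # right sides are
       ("Who", ["CHILDREN"]),                        # expressions over
       ("Do", ["SCHOOL_HOUSE"]),                     # the relational
       ("When", ["SUNRISE"]),                        # operators in OP
   ]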

Informally, a visual language is a set of visual sentences, each of which is the spatial composition of icons from the set X, using spatial relational operators from the set OP. To represent the meaning of an icon, we use either a frame or a conceptual graph, depending on the underlying semantic model of the application system being developed. Both are appropriate representations of meaning, and can be transformed into one another. For example, the SCHOOL_HOUSE icon in Figure 1(a) can be represented by the following frame:
   Icon SCHOOL_HOUSE
   WHO:       nil
   DO:        study
   WHERE:     school
   WHEN:      nil
In other words, the SCHOOL_HOUSE icon has the meaning "study" if it is in the DO location, or the meaning "school" in the WHERE location. Its meaning is "nil" if it is in the WHO or WHEN location. An equivalent linearized conceptual graph is as follows:
   [Icon = SCHOOL_HOUSE]
     --(sub)-->   [WHO = nil]
     --(verb)-->  [DO = study]
     --(loc)-->   [WHERE = school]
     --(time)-->  [WHEN = nil]
The meaning of a composite icon can be derived from the constituent icons, if we have the appropriate inference rules to combine the meanings of the constituent icons. Conceptual dependency theory can be applied to develop inference rules to combine frames (Ref 4). Conceptual operators can be used to combine conceptual graphs (Ref 5). As a simple example, the merging of the frames for the icons in the visual sentence shown in Figure 1(a) will yield the frame:
   Visual_Sentence vs1
   WHO:   children
   DO:    study
   WHERE: nil
   WHEN:  morning
We can derive this frame by merging the frames of the constituent icons using a simple slot-filling rule: the first slot, with slot_name WHO, gets the value "children" from the corresponding slot of the first icon CHILDREN; the second slot, with slot_name DO, gets the value "study" from the corresponding slot of the second icon SCHOOL_HOUSE; and so on.
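
A minimal sketch of this slot-filling rule (the frame encoding and function name are ours; None plays the role of nil):

   # Each slot of the merged frame takes its value from the icon
   # placed at the corresponding location of the visual sentence.
   SLOTS = ["WHO", "DO", "WHERE", "WHEN"]

   children = {"WHO": "children", "DO": None, "WHERE": None, "WHEN": None}
   school = {"WHO": None, "DO": "study", "WHERE": "school", "WHEN": None}
   sunrise = {"WHO": None, "DO": None, "WHERE": None, "WHEN": "morning"}

   def merge(icons_at):
       """icons_at maps a slot name to the frame of the icon placed
       there, if any; empty locations contribute the value None."""
       merged = {}
       for slot in SLOTS:
           icon = icons_at.get(slot)
           merged[slot] = icon[slot] if icon else None
       return merged

   vs1 = merge({"WHO": children, "DO": school, "WHEN": sunrise})
   # vs1 == {'WHO': 'children', 'DO': 'study', 'WHERE': None,
   #         'WHEN': 'morning'}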

For visual sentences that are directed graphs, the syntax and semantics can be specified using various kinds of graph grammars. Graph grammars can be used to define the concrete and the abstract syntax of visual languages, but the problem of efficient parsing of visual sentences based upon graph grammars still requires the continued effort of researchers, because most graph parsers work in exponential time. As a starting place for further study, (Ref 6) presents a layered graph grammar and its parsing algorithm, and also surveys various graph parsing algorithms.

2. EXTENDING VISUAL LANGUAGES FOR MULTIMEDIA

Visual languages, which let users customize iconic sentences, can be extended to accommodate multimedia objects, letting users access media dynamically. Teleaction objects, or multimedia objects with knowledge structures, can be designed using visual languages to respond automatically to events and perform tasks such as finding related books in the virtual library BookMan.

At the University of Pittsburgh and the Knowledge Systems Institute, we have developed a formal framework for visual language semantics that is based on the notion of an icon algebra and have designed several visual languages for the speech impaired. We have since extended the framework to include the design of multidimensional languages - visual languages that capture the dynamic nature of multimedia objects through icons, earcons (sound), micons (motion icons), and vicons (video icons). The user can create a multidimensional language by combining these icons and have direct access to multimedia information, including animation.

We have successfully implemented this framework in developing BookMan, a virtual library used by the students and faculty of the Knowledge Systems Institute. As part of this work, we extended the visual language concepts to develop teleaction objects, objects that automatically respond to some events or messages to perform certain tasks (Ref 7). We applied this approach to emergency management, where the information system must react to flood warnings, fire warnings, and so on, to present multimedia information and to take actions (Ref 8). An Active Medical Information System was also developed based upon this approach (Ref 9).

Figure 4 shows the search and query options available with BookMan. Users can perform a range of tasks, including finding related books, finding books containing documents similar to documents contained in the current book, receiving alert messages when related books or books containing similar documents have been prefetched by BookMan, and finding other users with similar interests or receiving alert messages about such users (this last function requires mutual consent among the users). Much of this power stems from the use of Teleaction Objects (TAOs).

To create a TAO, we attached knowledge about events to the structure of each multimedia object - a complex object that comprises some combination of text, image, graphics, video, and audio objects. TAOs are valuable because they greatly improve the selective access and presentation of relevant multimedia information. In BookMan, for example, each book or multimedia document is a TAO because the user can not only access the book, browse its table of contents, read its abstract, and decide whether to check it out, but also be informed about related books, or find out who has a similar interest in this subject. The user can indicate an intention by incrementally modifying the physical appearance of the book, usually with just a few clicks of the mouse.

TAOs can accommodate a wide range of functions. For example, when the user clicks on a particular book, the TAO can automatically access information about related books and create a multimedia presentation from all the books. The drawback of TAOs is that they are complex objects and therefore the end user cannot easily manipulate them with traditional define, insert, delete, modify, and update commands. Instead, TAOs require direct manipulation, which we provided through a multidimensional language.

The physical appearance of a TAO is described by a multidimensional sentence. The syntactic structure derived from this multidimensional sentence controls its dynamic multimedia presentation. The TAO also has a knowledge structure called the active index that controls its event-driven or message-driven behavior. The multidimensional sentence may be location-sensitive, time-sensitive or content-sensitive. Thus, an incremental change in the TAO’s external appearance is an event that causes the active index to react. As we will describe later, the active index itself can be designed using a visual-language approach.

The multidimensional language consists of generalized icons and operators, and each sentence has a syntactic structure that controls the dynamics of a multimedia presentation.

Section 1 described the icons and operators in a visual (not multidimensional) language. In a multidimensional language, we need not only icons that represent objects by images, but also icons that represent the different types of media. We call such primitives generalized icons and define them as x = (xm, xp) where xm is the meaning and xp is the physical appearance. To represent TAOs, we replace the xp with other expressions that depend on the media type:

o Icon: (xm, xi) where xi is an image
o Earcon: (xm, xe) where xe is sound
o Micon: (xm, xs) where xs is a sequence of icon images (motion icon)
o Ticon: (xm, xt) where xt is text (ticon can be regarded as a subtype of icon)
o Vicon: (xm, xv) where xv is a video clip (video icon)

The combination of an icon and an earcon/micon/ticon/vicon is a multidimensional sentence.

For multimedia TAOs, we define operators as

o Icon operator op = (opm, opi), such as ver (vertical composition), hor (horizontal composition), ovl (overlay), con (connect), surround, edge_to_edge, etc.
o Earcon operator op = (opm, ope), such as fade_in, fade_out, etc.
o Micon operator op = (opm, ops), such as zoom_in, zoom_out, etc.
o Ticon operator op = (opm, opt), such as text_merge, text_collate, etc.
o Vicon operator op = (opm, opv), such as montage, cut, etc.

Two classes of operators are possible in constructing a multimedia object. As we described in Section 1, spatial operators are operators that involve spatial relations among image, text or other spatial objects. A multimedia object can also be constructed using operators that consider the passage of time. Temporal operators, which apply to earcons, micons, and vicons, make it possible to define the temporal relations (Ref 10) among generalized icons. For example, if one wants to watch a video clip and at the same time listen to the audio, one can request that the video co_start with the audio. Temporal operators for earcons, micons, ticons and vicons include co_start, co_end, overlap, equal, before, meet, and during, and are usually treated as invisible operators because they are not visible in the multidimensional sentence.

When temporal operators are used to combine generalized icons, their types may change. For example, a micon followed in time by another icon is still a micon, but the temporal composition of micon and earcon yields a vicon. Media type changes are useful in adaptive multimedia so that one type of media may be replaced/combined/augmented by another type of media (or a mixture of media) for people with different sensory capabilities.
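
The two type-change rules just mentioned can be written down as a small table; this sketch (the table form is ours) is easily extended with further rules:

   # Result types of temporal composition; only the two rules stated
   # in the text above are encoded here.
   TYPE_RULES = {
       ("micon", "icon"): "micon",    # motion followed by a still image
       ("micon", "earcon"): "vicon",  # motion plus sound yields video
   }

   def composed_type(t1, t2):
       # Pairs not listed are left open in this sketch.
       return TYPE_RULES.get((t1, t2))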

We can add still more restrictions to create subsets of rules for icons, earcons, micons and vicons that involve special operators:

o For earcons, special operators include fade_in and fade_out;
o For micons, special operators include zoom_in and zoom_out;
o For ticons, special operators include text_collate and text_merge;
o For vicons, special operators include montage and cut.

These special operators support the combination of various types of generalized icons so that the multidimensional language can fully reflect all multimedia types.

Multidimensional languages can handle temporal as well as spatial operators. As we described in Section 1, a visual language has a grammar G = (N, X, OP, s, R).

To describe multidimensional languages, we extended the X and OP elements of G: X is still the set of terminals but now includes earcons, micons, ticons, and vicons as well as icons, and the OP set now includes temporal as well as spatial relational operators.

Figure 1(b) without the dialog box illustrates a simple visual sentence, which describes a to-do item for TimeMan. With the dialog box, the figure becomes a multidimensional sentence used by TimeMan to generate "The children drive to school in the morning" in synthesized speech. The multidimensional sentence has the syntactic structure:

(DIALOG_BOX co_start SPEECH) ver (((CHILDREN hor CAR) hor SCHOOL_HOUSE) hor SUNRISE)

Figure 5 is a hypergraph of the syntactic structure. The syntactic structure is essentially a tree, but it has additional temporal operators (such as co_start) and spatial operators (such as hor and ver) indicated by dotted lines. Some operators may have more than two operands (for example, the co_start of audio, image, and text), which is why the structure is called a hypergraph. The syntactic structure controls the multimedia presentation of the TAO.

For World Wide Web applications, the HTML language can be extended to TAOML (Teleaction Object Markup Language) so that teleaction objects can be specified using HTML enhanced by a multidimensional language and realized as Web pages. For example, TAOML pages can serve as the interface to an Active Medical Information System (Ref 9).

Multidimensional languages must also account for multimedia dynamics because many media types vary with time. This means that a dynamic multidimensional sentence changes over time. Transformation rules for spatial and temporal operators can be defined to transform the hypergraph in Figure 5 to a Petri net that controls the multimedia presentation. Figure 3 represents the Petri net of the sentence in Figure 1(b). As such, it is also a representation of the dynamics of the multidimensional sentence in Figure 1(b). The multimedia presentation manager can execute this Petri net dynamically to create a multimedia presentation (Ref 11). For example, the presentation manager will produce the visual sentence in Figure 1(b) as well as the synthesized speech.
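
As a rough sketch of such an executor (ours; the actual presentation manager of Ref 11 is far more elaborate), a marked Petri net can be fired as follows, with each transition triggering a presentation action such as starting the synthesized speech:

   # Places hold tokens; a transition (inputs, outputs, action) fires
   # when every input place has at least one token.
   def run_petri_net(transitions, marking):
       fired = True
       while fired:
           fired = False
           for inputs, outputs, action in transitions:
               if all(marking[p] > 0 for p in inputs):
                   for p in inputs:
                       marking[p] -= 1
                   for p in outputs:
                       marking[p] += 1
                   action()          # e.g. display an icon, play speech
                   fired = True

   # Example: one transition that co_starts two media objects.
   net = [(["ready"], ["showing", "speaking"],
           lambda: print("start dialog box and speech together"))]
   run_petri_net(net, {"ready": 1, "showing": 0, "speaking": 0})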

3. VISUAL PROGRAMMING LANGUAGES

Visual programming is programming by visual means. Typically, a programmer or an end user employs a visual programming tool to define and/or construct basic software components such as cells, circuits, blocks, etc., and then puts these components together to compose a visual program. The constructed visual program is then interpreted and executed by a visual programming system.

The basic software components can be defined by the programmer/user or obtained from a predefined software component library. Each software component has a visual representation for ease of comprehension by the user. Therefore software components are generalized icons, and a visual program is a visual sentence composed from generalized icons that are software components. Since the software components are connected together to form a visual program, a visual program can be represented by a graph in which the basic components may have multiple attachment points. Examples of commercially available visual programming systems include Prograph, an object-oriented programming language with dataflow diagrams as its visualization (Ref 12), and LabVIEW, which supports the interconnection of boxes representing software/hardware components (Ref 13).

Visual programming is appealing because the programmer or end user can easily manipulate the basic software components and interactively compose visual programs with the help of visual programming tools. Some would claim that visual programming is more intuitive and therefore simpler than traditional programming. Some would further claim that even untrained people can learn visual programming with little effort. However, such claims remain to be proven, especially for large-scale software development (Ref 14).

As described in the previous two sections, visual languages and multidimensional languages are useful in specifying the syntactic structure, knowledge structure and dynamic behavior of complex multimedia objects such as TAOs (teleaction objects). We can also construct visual programs using active index cells, which are the key elements of TAOs (Ref 15). Without the active index cell, a TAO would not be able to react to events or messages, and the dynamic visual language would lose its power. As an example of visual programming, we can specify index cells using a visual programming tool described later in this section. The index cells can thus be connected together as a visual program to accomplish a given task.

An index cell accepts input messages, performs some action, and posts an output message to a group of output index cells. Depending on its internal state and the input messages, the index cell can post different messages to different groups of output index cells. Therefore the connection between an index cell and its output cells is dynamic. For example, if a BookMan user wants to know about new books on nuclear winter, the user modifies the visual sentence, causing the TAO to send a message that activates a new index cell to collect information on nuclear winter.

An index cell can be either live or dead, depending on its internal state. The cell is live if the internal state is anything but the dead state. If the internal state is the dead state, the cell is dead. The entire collection of index cells, either live or dead, forms the index cell base. The set of live cells in the index cell base forms the active index.

Each cell has a built-in timer that tells it to wait a certain time before deactivating (dead internal state). The timer is reinitialized each time the cell receives a new message and once again becomes active (live). When an index cell posts an output message to a group of output index cells, the output index cells become active. If an output index cell is in a dead state, the posting of the message will change it to the initial state, making it a live cell, and will initialize its timer. On the other hand, if the output index cell is already a live cell, the posting of the message will not affect its current state but will only reinitialize its timer.

Active output index cells may or may not accept the posted message. The first output index cell that accepts the output message will remove this message from the output list of the current cell. (In a race, the outcome is nondeterministic.) If no output index cell accepts the posted output message, the message will stay indefinitely in the output list of the current cell. For example, if no index cells can provide the BookMan user with information about nuclear winter, the requesting message from the nuclear winter index cell will remain in that cell's output list indefinitely.

After its computation, the index cell may remain active (live) or deactivate (die). An index cell may also die if no other index cells (including itself) post messages to it. Thus the nuclear winter index cell in BookMan will die if not used for a long time, but will be reinitialized if someone actually wants such information and sends a message to it.
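
A minimal sketch of this life cycle (the class and method names are ours, not the actual implementation of Ref 15):

   import time

   DEAD = -1  # the distinguished dead internal state

   class IndexCell:
       def __init__(self, timeout):
           self.state = 0                   # initial (live) state
           self.timeout = timeout           # seconds to live unrefreshed
           self.deadline = time.time() + timeout
           self.output_list = []            # messages awaiting acceptance

       def is_live(self):
           return self.state != DEAD and time.time() < self.deadline

       def receive(self, message):
           """A message revives a dead cell (back to the initial state);
           a live cell keeps its state, but its timer is reinitialized."""
           if not self.is_live():
               self.state = 0
           self.deadline = time.time() + self.timeout
           # ...compute, possibly change state, then post outputs:
           self.output_list.append(("output_msg", message))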

Occasionally many index cells may be similar. For example, a user may want to attach an index cell to a document that, upon detecting a certain feature, sends a message to another index cell to prefetch other documents. If there are 10,000 such documents, there can be 10,000 similar index cells. The user can group these cells into an index cell type, with the individual cells as instances of that type. Therefore, although many index cells may be created, only a few index cell types need to be designed for a given application, thus simplifying the application designer's task.

To aid multimedia application designers in constructing index cells, we developed a visual programming tool, IC Builder, and used it to construct the index cells for BookMan. Figure 6 shows a prefetch index cell being built. Prefetch is used with two other index cell types to retrieve documents (Ref 15). If the user selects the prefetch mode of BookMan, the active index will activate the links to access information about related books. Prefetch is responsible for scheduling prefetching, initiating (issuing) a prefetching process to prefetch multimedia objects, and killing the prefetching process when necessary.

Figure 6(a) shows the construction of the state-transition diagram. The prefetch index cell has two states: state 0, the initial and live state, and state -1, the dead state. The designer draws the state-transition diagram by clicking on the appropriate icons. In this example, the designer has clicked on the fourth vertical icon (zigzag line) to draw a transition from state 0 to state 0. Although the figure shows only two transition lines, the designer can specify as many transitions as necessary from state 0 to state 0. Each transition could generate a different output message and invoke different actions. For example, the designer can represent different prefetching priority levels in BookMan by drawing different transitions.

The designer wants to specify details about transition 2 and so has highlighted it. Figure 6(b) shows the result of clicking on the input message icon (the top icon to the right of the State Transition Specification Dialog box). IC Builder brings up the Input Message Specification Dialog box so that the designer can specify the input messages. Here the designer specifies input message 1 (start_prefetch). The designer could also specify a predicate, in which case the input message is accepted only if the predicate evaluates to true. Here there is no predicate, so the input message is always accepted.

Figure 6(c) shows what happens if the designer clicks on the output message icon in Figure 6(a) (bottom icon to the right of the State Transition Specification Dialog box). IC Builder brings up the Output Message Specification Dialog box so that the designer can specify actions, output messages, and output index cells. In this example, the designer has specified three actions: compute_schedule (determine the priority of prefetching information), issue_prefetch_proc (initiate a prefetch process), and store_pid (once a prefetch process is issued, its process id or pid is saved so that the process can be killed later if necessary). In the figure there is no output message, but both input and output messages can have parameters. The index cell derives the output parameters from the input parameters.
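
Putting the pieces of Figure 6 together, the specified transition amounts to the following data (the dictionary layout is our own; the state, message, and action names follow the figure):

   # Transition 2 of the prefetch index cell, as specified in Figure 6.
   prefetch_transition = {
       "from_state": 0,                    # state 0: initial and live
       "to_state": 0,                      # a self-transition
       "input_message": "start_prefetch",  # message 1 in Figure 6(b)
       "predicate": None,                  # none: always accepted
       "actions": [
           "compute_schedule",       # decide the prefetching priority
           "issue_prefetch_proc",    # initiate a prefetch process
           "store_pid",              # save the pid to kill it later
       ],
       "output_messages": [],              # none for this transition
   }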

The construction of an active index from index cells is an example of visual programming for general-purpose problem solving - with appropriate customization the active index can do almost anything. In the following, we describe a special application of visual programming to database querying.

When the user makes incremental changes to a multidimensional sentence, certain events occur and messages are sent to the active index. For example, suppose the user clicks on a book TAO to change the color attribute of the book. This is a select event, and the message select is sent to the active index. If the user creates a new related_info operation icon, this is a related_info event, and a message prefetch_related_info is sent to the active index. The incremental changes to a multidimensional sentence can be either:

o Location-sensitive. The location attribute of a generalized icon is changed.
o Time-sensitive. The time attribute of a generalized icon is changed.
o Content-sensitive. An attribute of a generalized icon other than a location or time attribute is changed or a generalized icon is added or deleted, or an operator is added or deleted.
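
A small sketch of this classification, and of the two example events above (the function and dictionary are ours):

   # Classify an incremental change by the attribute it touches.
   def classify(changed_attribute):
       if changed_attribute == "location":
           return "location-sensitive"
       if changed_attribute == "time":
           return "time-sensitive"
       return "content-sensitive"   # any other attribute, or an icon
                                    # or operator added or deleted

   # The two events from the text and the messages they send.
   EVENT_TO_MESSAGE = {
       "select": "select",
       "related_info": "prefetch_related_info",
   }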

A visual sentence or multidimensional sentence can also be either location-sensitive, time-sensitive, or content-sensitive. In Section 1 we gave examples of different types of visual sentences. The resulting language is a dynamic visual language or dynamic multidimensional language.

A dynamic visual language for virtual reality (VR) serves as a new paradigm in a querying system with multiple paradigms (form-based queries, diagram-based queries and so on) because it lets the user freely switch paradigms (Ref 16). When the user initially browses the virtual library, the VR query may be more natural; but when the user wants to find out more details, the form-based query may be more suitable. This freedom to switch back and forth among query paradigms gives the user the best of all worlds, and dynamic querying can be accomplished with greater flexibility.

From the viewpoint of dynamic languages, a VR query is a location-sensitive multidimensional sentence. As Figure 4(b) shows, BookMan indicates the physical locations of books by marked icons in a graphical presentation of the book stacks of the library. What users see is similar (with some simplification) to what they would experience in a real library: the user selects a book by picking it from the shelf, inspects its contents, and browses adjacent books on the shelf.

In Figure 4(a), initially the user is given the choice of query paradigms: search by title, author, ISBN, or keyword(s). If the user selects the virtual library search, the user can then navigate in the virtual library, and as shown in Figure 4(b), the result is a marked object. If the user switches to a form-based representation by clicking the DetailedRecord button, the result is a form as shown in Figure 4(c). The user can now use the form to find books of interest, and switch back to the VR query paradigm by clicking the VL location button in Figure 4(c).

Essentially, the figure illustrates how the user can switch between a VR paradigm (such as the virtual library) and a logical paradigm (such as the form). There are certain admissibility conditions for this switch. For a query in the logical paradigm to be admissible to the VR paradigm, the retrieval target object should also be an object in VR. For example, the virtual reality in the BookMan library is stacks of books, and an admissible query would be a query about books, because the result of that query can be indicated by marked book icons in the virtual library.

Conversely, for a query in the VR paradigm to be admissible to the logical paradigm, there should be a single marked VR object that is also a database object, where the marking is achieved by an operation icon such as similar_to (find objects similar to this object), near (find objects near this object), above (find objects above this object), below (find objects below this object), or another spatial operator. For example, in the VR for the virtual library, a book marked by the operation icon similar_to is admissible and can be translated into the logical query "find all books similar to this book".
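
A sketch of these two admissibility checks (the function names and data shapes are ours):

   MARKERS = {"similar_to", "near", "above", "below"}  # operation icons

   def logical_to_vr_admissible(target, vr_objects):
       """The retrieval target must also be an object in VR."""
       return target in vr_objects

   def vr_to_logical_admissible(marked, db_objects):
       """Exactly one marked VR object that is also a database object,
       marked by an operation icon such as similar_to."""
       return (len(marked) == 1
               and marked[0]["object"] in db_objects
               and marked[0]["marker"] in MARKERS)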

Visual query systems for multimedia databases, like BookMan, are under active investigation at many universities as well as industrial laboratories (Ref 17). These systems are very flexible. For example, a user can easily and quickly ask for any engineering drawing that contains a part that looks like the part in another drawing and that has a signature in the lower right corner that looks like John Doe’s signature. In BookMan we have a mechanism that lets users create similarity retrieval requests that prompt BookMan to look for books similar to the book being selected, and then perform searches on the World Wide Web using a Web browser enhanced with an active index (Ref 18).

4. CONCLUDING REMARKS

Visual languages and visual programming languages are progressing at a rapid pace. Several on-line bibliographies are now available (Ref. 19, 20, 21). As far as programming is concerned, visual programming languages may not be appropriate for every situation. An important question is whether visual programming languages can scale up to handle large scale applications (Ref 22). Moreover, empirical, systematic evaluation of visual programming languages needs to be done (Ref 23).

The average programmer and end user are used to a hybrid mode of human-computer interaction, involving text, graphics, sound and the like. Thus, "pure" visual programming languages are sometimes hard to justify. On the other hand, languages allowing hybrid modes of interaction are already unavoidable due to the explosion of multimedia computing and network computing. As multimedia applications become even more widespread, we expect to see more special-purpose or general-purpose visual language systems and visual programming systems in which visual and multidimensional languages play an important role, both as a theoretical foundation and as a means to explore new applications.

Acknowledgments:

This research was supported in part by the National Science Foundation under grant IRI-9224563.

BIBLIOGRAPHY

1. Chang, S. K., G. Costagliola, G. Pacini, M. Tucci, G. Tortora, B. Yu, and J. S. Yu, "Visual Language System for User Interfaces," IEEE Software, pp. 33-44, March 1995.

2. Chang, S. K., "A Visual Language Compiler for Information Retrieval by Visual Reasoning," IEEE Transactions on Software Engineering, pp. 1136-1149, 1990.

3. Crimi, C., A. Guercio, G. Pacini, G. Tortora, and M. Tucci, "Automating Visual Language Generation," IEEE Transactions on Software Engineering, vol. 16, no. 10, pp. 1122-1135, October 1990.

4. Chang, S. K., S. Orefice, M. Tucci, and G. Polese, "A Methodology and Interactive Environment for Iconic Language Design," International Journal of Human-Computer Studies, vol. 41, pp. 683-716, 1994.

5. Chang, S. K., M. J. Tauber, B. Yu, and J. S. Yu, "A Visual Language Compiler," IEEE Transactions on Software Engineering, vol. 15, no. 5, pp. 506-525, 1989.

6. Rekers, J. and Schuerr, A., "Defining and Parsing Visual Languages with Layered Graph Grammars", Journal of Visual Languages and Computing, Vol. 8, No. 1, 1997, 27-55.

7. Chang, H., T. Hou, A. Hsu, and S. K. Chang, "Management and Applications of Tele-Action Objects," ACM Multimedia Systems Journal, vol. 3, no. 5-6, pp. 204-216, Springer Verlag, 1995.

8. Khalifa, Y., S. K. Chang, and L. Comfort, "A Prototype Spatial-Temporal Reasoning System for Emergency Management," Proc. of International Conference on Visual Information Systems VISUAL96, pp. 469-478, Melbourne, Australia, February 5-7, 1996.

9. Chang, S. K., Graupe, D., Hasegawa, K. and Kordylewski, H., "An Active Medical Information System for Information Retrieval, Discovery and Fusion", to appear in IJSEKE, March 1998 (for a demo see www.cs.pitt.edu/~jung/AMIS2).

10. Allen, J. F., "Maintaining Knowledge about Temporal Intervals," Communications of the ACM, vol. 26, no. 11, pp. 832-843, November 1983.

11. Lin, C. C., J. X. Xiang, and S. K. Chang, "Transformation and Exchange of Multimedia Objects in Distributed Multimedia Systems," ACM Multimedia Systems Journal, vol. 4, no. 1, pp. 2-29, Springer Verlag, 1996.

12. Prograph CPX User's Guide, Pictorius Incorporated, 1993.

13. Baroth, E. and Hartsough, C., "Visual Programming in the Real World", in Visual Object-Oriented Programming Concepts and Environments (M. Burnett, A. Goldberg and T. Lewis, eds), Manning Publications Co., Greenwich, CT, 1995, 21-42.

14. Whitley, K. N., "Visual Programming Languages and the Empirical Evidence For and Against", Journal of Visual Languages and Computing, Vol. 8, No. 1, 1997, 109-142.

15. Chang, S. K., "Towards a Theory of Active Index," Journal of Visual Languages and Computing, vol. 6, no. 1, pp. 101-118, 1995.

16. Chang, S. K., M. F. Costabile, and S. Levialdi, "Reality Bites - Progressive Querying and Result Visualization in Logical and VR Spaces," Proc. of IEEE Symposium on Visual Languages, pp. 100-109, St. Louis, October 1994.

17. Catarci, T., Costabile, M.F., Levialdi, S. and Batini, C., "Visual Query Systems for Databases: A Survey", Journal of Visual Languages and Computing, Vol. 8, No. 2, 1997, 215-260.

18. Catarci, T., Chang, S. K., Dong, L. B. and Santucci, G., "A Prototype Web-At-a-Glance System for Intelligent Information Retrieval", Proc. of SEKE'97, Madrid, Spain, June 18-20, 1997, 440-449 (for a demo see www.cs.pitt.edu/~jung/WAG).

19. Burnett, M., http://www.cs.orst.edu/~burnett/vpl.html

20. Korfhage, R., www.pitt.edu/~korfhage/vlrefs.html

21. Schiffer, S., http://www.swe.uni-linz.ac.at/schiffer/buch/literatur.htm

22. Burnett, M., Baker, M., Bohus, C., Carlson, P., Yang, S. and Zee, P., "Scaling Up Visual Programming Languages", Computer 28(3), IEEE CS Press, pp. 45-54, March 1995.

23. Kiper, J.D., Howard, E. and Ames, C., "Criteria for Evaluation of Visual Programming Languages", Journal of Visual Languages and Computing, Vol. 8, No. 2, 1997, 175-192.

Figures:

Figure 1. A visual sentence whose meaning changes when the icons change their positions is called a location-sensitive visual sentence. The visual sentence (a) has the meaning "The children study in the morning", and (b) has the meaning "The children drive to school in the morning". Comparing the two, this example shows how the placement of the "school" icon changes the meaning. Such visual sentences can be used to specify to-do items for the time management personal digital assistant TimeMan.

Figure 1(a)


Figure 1(b)

Figure 2. Content-Sensitive visual sentences (a) and (b) show the fish tank and cat metaphor for the time management personal digital assistant TimeMan. Each fish represents a to-do item. When the to-do list grows too long, the fish tank is overpopulated and the cat appears. The fish tank icon and cat operation icon have corresponding index cells receiving messages from these icons when they are changed by the user.

Figure 2(a)


Figure 2(b)

Figure 3. A time-sensitive visual sentence for the Petri net controlling the presentation of the visual sentence shown in Figure 1(b).

Figure 4. The virtual library BookMan lets the user (a) select different search modes, (b) browse the virtual library and select a desired book for further inspection, and (c) switch to a traditional form-based query mode.

Figure 4(a)


Figure 4(b)

Figure 4(c)

Figure 5. The syntactic structure of the multidimensional sentence shown in Figure 1(b). This structure is a hypergraph because some relational operators may correspond to lines with more than two end points.

Figure 6. The visual specification for an active index cell of the virtual library BookMan: (a) the state transitions, (b) input message, (c) output message and actions.

Figure 6(a)


Figure 6(b)


Figure 6(c)