

Draft Version 0.5, November 10, 2003
Gregory Swanson


  1. Fundamentals
  2. Language Families
  3. Metalanguage
  4. Elements of Programming Languages
  5. References

1. Fundamentals

Programming languages today are built on the same concepts that underlie spoken languages. This was not the case with early programming languages, which were created ad hoc. Languages improved once designers applied formal language theory and formal semantics.

Programming language syntax is usually defined in a formal notation such as Extended Backus-Naur Form (EBNF) or syntax diagrams. Semantics are much harder to define and are usually specified with the aid of an 'abstract machine' that models the computer hardware.
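To make the idea of an abstract machine concrete, here is a minimal sketch in Python. The three instructions (PUSH, ADD, MUL) and the function name run are assumptions made for this illustration and are not taken from any particular language definition. The meaning of an expression is given by the steps the machine takes to compute it.

```python
# Hypothetical abstract stack machine. The opcode names PUSH, ADD and MUL
# are assumptions made for this sketch.
def run(program):
    """Execute a list of (opcode, operand) pairs and return the final stack."""
    stack = []
    for opcode, operand in program:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


# The meaning of the expression 2 + 3 * 4, written as machine steps.
program = [("PUSH", 2), ("PUSH", 3), ("PUSH", 4), ("MUL", None), ("ADD", None)]
result = run(program)
```

A real semantic definition would cover every construct in the language, but the principle is the same: each construct is explained by its effect on the state of the machine.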

Most programming languages belong to a class called context-free languages, which are easy to parse using a stack. Context-free means that the language's syntax rules do not depend on the context in which syntax elements appear. This makes it possible to generate the lexer and parser (parts of the compiler) automatically from the lexical and syntax specifications. The programs that do this are called, respectively, a lexer generator and a parser generator.
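Nested brackets are a familiar context-free construct, and they show why a stack is enough. The following Python sketch checks whether the brackets in a piece of text are properly nested. The names PAIRS and is_balanced are assumptions made for this illustration; a generated parser is far more elaborate, but it relies on the same push and pop discipline.

```python
# Map each closing bracket to the opening bracket it must match.
PAIRS = {")": "(", "]": "[", "}": "{"}


def is_balanced(text):
    """Return True if every bracket in text is closed in the right order."""
    stack = []
    for ch in text:
        if ch in PAIRS.values():
            # Opening bracket: remember it until its partner arrives.
            stack.append(ch)
        elif ch in PAIRS:
            # Closing bracket: it must match the most recent opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    # Anything left on the stack was opened but never closed.
    return not stack
```

For example, is_balanced("f(a[1], {b})") is true, while is_balanced("(]") and is_balanced("((") are false.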

Syntax and Semantics

Open a book about a programming language and scan the contents. You find chapter headings like:

Kernighan and Ritchie, 1988

Schwartz, Olson, and Christiansen, 1997

Flanagan, 1998

The similarities are apparent, but so are the differences. At the highest level, a programming language consists of:

(The above modified from Fischer and Grodzinsky)

High-level Classification

If syntax and semantics were all we cared about, there might be only one programming language. However, there are many, and they exist because programs are used in different kinds of work. These kinds of work fall into two categories that form our highest-level idea of what a program does (ibid.):

These two concepts indicate very different fundamental requirements in a language, and therefore two major categories of languages. I will refer to these as systems languages (first in the above list) and non-systems languages. Requirements for languages in these categories differ in terms of level of support for abstraction, capability for implicit and explicit communication, capacity for alternate expressions of a process or model, and the way data is mapped onto the machine.

After writing a few programs in a particular language and getting to know its capabilities, a programmer may remark that the language lacks support for certain tasks, or that it requires a lot of code to accomplish them. The language lacks the ability to represent in a sufficiently abstract way the ideas the programmer needs to communicate.
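As a small illustration of this difference, the sketch below counts the words in a piece of text twice. Both versions are in Python only to keep the examples in one language; the first restricts itself to the explicit bookkeeping a lower-level language would require, while the second uses a ready-made abstraction (collections.Counter from the standard library). The function names are assumptions made for this illustration.

```python
from collections import Counter


def count_words_low_level(text):
    """Count words with explicit bookkeeping for every case."""
    counts = {}
    for word in text.split():
        if word in counts:
            counts[word] = counts[word] + 1
        else:
            counts[word] = 1
    return counts


def count_words_high_level(text):
    """Count words by naming the idea directly."""
    return Counter(text.split())
```

Both functions produce the same counts. The second states what is wanted and leaves how it is done to the language's library.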

Capacity to abstract ideas using a language is indicated by three qualities of data representation. Systems languages have less capacity in these qualities than do non-systems languages (ibid).

Using the Formal Language to Create a Program

Source code is read by a program called a compiler, of which the parser is one part. There are several stages in the process:
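The following Python sketch walks a small arithmetic language through three such stages: lexical analysis, syntax analysis, and evaluation. The choice of stages, the grammar, and all function names are assumptions made for this illustration; a production compiler has more stages and generates code instead of evaluating directly.

```python
import operator
import re

# Grammar assumed for this sketch, in EBNF:
#   expr   = term   { ("+" | "-") term } ;
#   term   = factor { ("*" | "/") factor } ;
#   factor = NUMBER | "(" expr ")" ;

TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\S))")
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def tokenize(source):
    """Stage 1, lexical analysis: characters become tokens."""
    tokens = []
    for number, symbol in TOKEN_RE.findall(source):
        if number:
            tokens.append(("NUM", int(number)))
        elif symbol in "+-*/()":
            tokens.append((symbol, None))
        else:
            raise SyntaxError(f"unexpected character: {symbol!r}")
    return tokens


def parse(tokens):
    """Stage 2, syntax analysis: tokens become a tree of nested tuples."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        if peek() == "NUM":
            return advance()[1]
        if peek() == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError("expected a number or '('")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()[0]
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()[0]
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError("unexpected trailing input")
    return tree


def evaluate(tree):
    """Stage 3, evaluation: walk the tree and compute its value."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))
```

Calling evaluate(parse(tokenize("2 + 3 * (4 - 1)"))) runs all three stages in order. Each stage consumes only the output of the stage before it, which is what lets the lexer and parser be generated separately from their specifications.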

Language Quality

Besides semantic clarity (the quality whereby the semantic intent of program text is easy to determine), language quality is characterized by the quality and availability of the language's features (ibid.):

2. Language Families

One must keep in mind that languages are more alike than different, and often belong to more than one category. These family classifications are intended to extend the reader's vocabulary more than to indicate differences among languages. This is only a summary of the better-known language categories and is not meant to be thorough (ibid.).

3. Metalanguage

Metalanguage describes the parts of a programming language used to manipulate the language rather than data in a program. This section describes the elements usually found in a metalanguage, and comments on the tradeoffs due to the way these features are implemented (ibid).

4. Elements of Programming Languages



5. References

Fischer, Alice E. and Grodzinsky, Frances S., 1993, The Anatomy of Programming Languages: Prentice Hall. ISBN 0-13-035155-5

Flanagan, David, 1998, JavaScript: The Definitive Guide, Third Edition: O'Reilly & Associates. ISBN 1-56592-392-8

Kernighan, Brian W. and Ritchie, Dennis M., 1988, The C Programming Language, Second Edition: Prentice Hall. ISBN 0-13-110370-9

Schwartz, Randal L., Olson, Erik and Christiansen, Tom, 1997, Learning Perl on Win32 Systems: O'Reilly & Associates. ISBN 1-56592-324-3
