Backus–Naur Form (BNF)
Backus–Naur Form (BNF) is a formal metalanguage used to define the syntax of programming languages. Developed in the late 1950s by American computer scientist John Backus, BNF is a structured notation for stating the rules that valid code in a language must follow, much as a grammar book does for a natural language. The notation uses a small set of symbols, such as angle brackets that mark named categories, to describe language constructs. These symbols are called metasymbols because they belong to the notation itself rather than to the language being described.
Peter Naur, a colleague of Backus, refined the notation while editing the report that defined the ALGOL 60 language, and that report brought the notation into wide use. Over time, BNF was extended into extended Backus-Naur form (EBNF), which offers additional flexibility. Although BNF is a powerful tool for building compilers, which are programs that translate code from one language into another, it has limitations, such as the difficulty of limiting the length of a list or describing a range of values. Despite these drawbacks, BNF remains a foundational method in computer science for defining programming language syntax, and the compilers built with it support applications such as simulations and data analysis in complex systems.
Backus-Naur form (BNF) is a metalanguage used to specify the syntax of programming languages. Sometimes called Backus Normal Form, the notation originated in the 1950s and was the first notation designed to help define the syntax of computer languages. Decades later, Backus-Naur form remains one of the most popular ways for programmers to define new computer languages.
![Backus-Naur Form for address. By Benjamin Doe (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/rssalemscience-20160829-16-144029.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)

Background
Programmable computers originated in the 1940s. During the 1950s, the scientists working on computer design and programming began writing languages to encode the programs that ran the computers, including FORTRAN and ALGOL—two of the most popular languages. American computer scientist John Backus worked on the teams that designed both languages. Backus is considered the originator of FORTRAN.
Based on his experience with FORTRAN and ALGOL, in 1959 Backus devised a way to define the syntax of a language, meaning the rules used to write or analyze it, that could be applied to these computer languages or any others. This collection of symbols and characters, called a notation, helped programmers determine whether a program was consistent with the rules of its language. It functions much as a book of grammar rules does for a writer.
Peter Naur, one of Backus's colleagues, was editing the report on the ALGOL language and noticed some discrepancies in Backus's notation. He adjusted these and used the revised notation to define what was at the time a new version of the language, ALGOL 60, in a report published in 1960 and revised in 1963. This increased awareness of the notation, and Naur's version became the one used almost universally by programmers. As a result, the notation is named after both Backus and Naur. Eventually it was extended further, and the extended notation became known as extended Backus-Naur form, or EBNF. Sometimes there are small differences in how Backus-Naur form is written, similar to the regional differences in many spoken languages. However, these differences are usually minor and do not interfere with understanding the overall form.
Overview
When an English-speaking person is learning another language, such as Spanish, the person might use a textbook written in English that explains the grammar and rules of Spanish. In this case, English is a metalanguage, or a language that explains and defines another language. In the same way, Backus-Naur form is a metalanguage for computer languages. Using the symbols and characters of Backus-Naur form, a programmer can write down and share the rules of a new language. These rules are laid out in a syntax chart, a list of definitions that shows how the language is put together.
For example, in Backus-Naur form, anything enclosed in angle brackets (< >) is understood to be a category, and ::= is understood to mean "is defined as." So, if a programmer wanted to state that the category "number" consists of the digits 0 through 9, it might be written like this: <number> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9. The vertical line is a notation for "or." In Backus-Naur form, < >, ::=, and | are considered metasymbols, or characters that belong to the notation itself rather than to the language being defined.
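The way the three metasymbols divide up a rule can be illustrated with a short Python sketch. It is only an illustration, not a full BNF reader: it assumes a rule whose right-hand side contains nothing but literal alternatives, and the names parse_rule, NUMBER_RULE, and is_number are hypothetical.

```python
import re

def parse_rule(rule: str) -> tuple[str, list[str]]:
    """Split one BNF rule of the form '<name> ::= a | b | c' into its parts.

    Assumption: every alternative on the right-hand side is a literal,
    not a reference to another <category>.
    """
    # '::=' ("is defined as") separates the category from its definition.
    left, right = rule.split("::=", 1)
    # Angle brackets mark the category being defined.
    match = re.fullmatch(r"\s*<([^<>]+)>\s*", left)
    if match is None:
        raise ValueError(f"left-hand side is not a <category>: {left!r}")
    # '|' ("or") separates the alternatives.
    alternatives = [alt.strip() for alt in right.split("|")]
    return match.group(1), alternatives

NUMBER_RULE = "<number> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9"
name, alternatives = parse_rule(NUMBER_RULE)

def is_number(text: str) -> bool:
    """Return True if text is one of the alternatives of <number>."""
    return text in alternatives
```

Under this rule, is_number("7") is true, while is_number("10") is false, because the rule as written defines a single digit only.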
Backus-Naur form is important to programmers creating compilers, or programs that translate code written in one computer language into another. For example, a compiler can translate between the language used by a program that collects data and the language used by a computer that combines that data with information from other sources to generate a new kind of data. Compilers are used in many computerized functions, such as system dynamics, in which information is gathered from a variety of sources to generate a simulation or model of a situation. For example, a town that is considering adding a new intermodal transportation center that will integrate bus, taxi, car, and train stops might gather data on population, transportation usage, the businesses in the area, and the road and rail systems connected to the area. Backus-Naur form will enable the analysts working on the project to specify the languages involved and create a compiler that will let them design a simulation to show how the proposed center will affect the area and what value it will bring. Without a way to explain the languages used by the various computer systems holding the different data, this process would be much more difficult.
Despite its usefulness, Backus-Naur form does have some flaws. Programmers do not have a convenient way to limit the length of a list, such as a list of variables. It is also difficult to describe ranges; for example, if the definition of <number> is the numbers 1 through 100, the most direct way to write the rule is to list each number with the "or" symbol (|) between each pair of numbers (1 | 2 | 3 | 4 | … | 98 | 99 | 100).
Backus-Naur form also does not provide a way to include a value that is not yet known, such as a variable, in a rule. All alternatives must be determined before they can be written into the notation. However, despite these limitations, Backus-Naur form and its derivations remain among the most widely accepted methods for defining the grammar and syntax of a computer language. A grammar written in the form allows for greater accuracy in using the language and gives programmers a quick and efficient way to verify that code follows its rules.
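One job of a compiler is to check that its input follows the grammar of the source language, and a grammar written in Backus-Naur form can be turned into checking code fairly directly. The Python sketch below is a minimal illustration under stated assumptions: the three-rule grammar in the comments is hypothetical and was invented for this example, as were the function names, and the code only recognizes valid input rather than translating it.

```python
# Hypothetical grammar, written in Backus-Naur form:
#   <expr>  ::= <term> | <term> + <expr>
#   <term>  ::= <digit> | ( <expr> )
#   <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

DIGITS = "0123456789"

def parse_expr(text: str, pos: int = 0) -> int:
    """Match <expr> starting at pos and return the position just after it."""
    pos = parse_term(text, pos)
    # Second alternative of <expr>: a '+' followed by another <expr>.
    if pos < len(text) and text[pos] == "+":
        return parse_expr(text, pos + 1)
    return pos

def parse_term(text: str, pos: int) -> int:
    """Match <term> starting at pos and return the position just after it."""
    # First alternative of <term>: a single <digit>.
    if pos < len(text) and text[pos] in DIGITS:
        return pos + 1
    # Second alternative of <term>: an <expr> in parentheses.
    if pos < len(text) and text[pos] == "(":
        pos = parse_expr(text, pos + 1)
        if pos < len(text) and text[pos] == ")":
            return pos + 1
        raise SyntaxError(f"expected ')' at position {pos}")
    raise SyntaxError(f"expected a digit or '(' at position {pos}")

def is_valid(text: str) -> bool:
    """Return True if the whole of text matches <expr>."""
    try:
        return parse_expr(text) == len(text)
    except SyntaxError:
        return False
```

Each function corresponds to one rule of the grammar, and each branch to one alternative, which is why this style of checker is often written by hand straight from the rules. With this grammar, is_valid("(1+2)+3") is true and is_valid("1+") is false.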
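The cost of spelling out a range can be seen by generating such a rule mechanically. The Python sketch below is illustrative only; the helper name enumerate_range_rule is hypothetical, and it produces the enumerated form described above rather than any more compact alternative.

```python
def enumerate_range_rule(name: str, low: int, high: int) -> str:
    """Build a BNF rule that lists every integer from low to high inclusive."""
    alternatives = " | ".join(str(n) for n in range(low, high + 1))
    return f"<{name}> ::= {alternatives}"

rule = enumerate_range_rule("number", 1, 100)
# Each '|' separates two alternatives, so the count is one more than the bars.
alternative_count = rule.count("|") + 1
```

The resulting rule contains 100 separate alternatives, and a wider range grows in proportion, which is why the notation is awkward for ranges.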