Abstraction (computer science)
Abstraction in computer science is a fundamental concept used to manage the complexities of computer systems by simplifying user instructions. It allows users to interact with systems without needing to understand the intricate details of how they operate. For instance, while computers process information in binary, users often input data in more familiar decimal formats, thanks to layers of abstraction that convert these inputs seamlessly. Various forms of abstraction exist, including data abstraction, which organizes data meaningfully, and control abstraction, which streamlines programming through control flows.
Programming languages exemplify abstraction levels, categorized from low-level machine languages to high-level languages that enhance usability and adaptability across different systems. Abstraction can also lead to challenges, such as "abstraction inversion," where users may struggle to access obscured functions within a program. Additionally, the concept of "leaky abstraction" highlights that no abstraction is perfect; users may still encounter the underlying complexities. Within object-oriented programming, abstraction plays a crucial role in defining objects and enabling polymorphism, allowing a single interface to interact with various data types effectively. Overall, abstraction serves as a vital tool in making computer systems more accessible and manageable for users.
In computer science, abstraction is a strategy for managing the complex details of computer systems. Broadly speaking, it involves simplifying the instructions that a user gives to a computer system in such a way that different systems, provided they have the proper underlying programming, can "fill in the blanks" by supplying the levels of complexity that are missing from the instructions. For example, most modern cultures use a decimal (base 10) positional numeral system, while digital computers read numerals in binary (base 2) format. Rather than requiring users to input binary numbers, in most cases a computer system will have a layer of abstraction that allows it to translate decimal numbers into binary format.
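As a minimal sketch of such a translation layer, the Python snippet below converts a decimal numeral typed by a user into binary digits by repeated division by two. The function name `to_binary` is hypothetical and chosen only for illustration; real systems perform this conversion inside hardware and system software, not in user code.

```python
def to_binary(decimal_text: str) -> str:
    """Convert a non-negative decimal numeral (given as text) to binary digits."""
    value = int(decimal_text)  # Python itself parses the base-10 digits here
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if value == 0:
        return "0"
    bits = []
    while value > 0:
        bits.append(str(value % 2))  # the remainder is the next binary digit
        value //= 2
    return "".join(reversed(bits))   # digits were produced lowest-first


print(to_binary("10"))   # 1010
print(to_binary("255"))  # 11111111
```

The user only ever sees the decimal input; the binary form stays behind the interface.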

![Data abstraction levels of a database system. By Doug Bell (author assumed, based on copyright claims). Public domain, via Wikimedia Commons.](https://imageserver.ebscohost.com/img/embimages/ers/sp/embedded/113931282-115578.jpg?ephost1=dGJyMNHX8kSepq84xNvgOLCmsE2epq5Srqa4SK6WxWXS)
There are several different types of abstraction in computer science. Data abstraction is applied to data structures in order to manipulate pieces of data manageably and meaningfully. Control abstraction is similarly applied to actions via control flows and subprograms. Language abstraction develops separate classes of languages for different purposes: modeling languages for planning assistance, for instance, or programming languages for writing software, which themselves exist at many different levels of abstraction. It is one of the fundamental examples of abstraction in modern computer science.
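The first two types can be sketched briefly in Python. The names `Stack` and `apply_to_each` are hypothetical, introduced only for this illustration: the class hides how its data is stored (data abstraction), and the function packages a loop as a reusable subprogram (control abstraction).

```python
class Stack:
    """Data abstraction: callers see push/pop, not the underlying list."""

    def __init__(self):
        self._items = []  # hidden representation; could be replaced later

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


def apply_to_each(action, values):
    """Control abstraction: the loop is written once and reused as a subprogram."""
    return [action(v) for v in values]


stack = Stack()
for n in apply_to_each(lambda x: x * x, [1, 2, 3]):
    stack.push(n)

print(stack.pop())  # 9
```

Code that uses `Stack` would keep working if the internal list were replaced by some other structure, which is the practical benefit of hiding the representation.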
The core idea of abstraction is that it conceals the complex details of the underlying system, much as the desktop of a computer or the graphical menu of a smartphone conceals the complexity involved in organizing and accessing the many programs and files contained therein. Even the simplest controls of a car (the brakes, gas pedal, and steering wheel) are, in a sense, abstractions, because they hide the complex elements that convert the mechanical energy applied to them into the electrical signals and mechanical actions that govern the car's motions.
Background
Even before the modern computing age, calculating devices such as abacuses and slide rules abstracted, to some degree, the workings of basic and advanced mathematical calculations. Language abstraction has developed alongside computer science as a whole; it has been a necessary part of the field from the beginning, as the essence of computer programming involves translating natural-language commands such as "add two quantities" into a series of computer operations. Any involvement of software in this process inherently indicates some degree of abstraction.
The levels of abstraction involved in computer programming are best demonstrated by an exploration of programming languages, which are grouped into generations according to degree of abstraction. First-generation languages are machine languages, so called because their instructions, written in binary numerical code, can be executed directly by a computer's central processing unit (CPU). Originally, machine-language instructions were entered into computers directly by setting switches on the machine. Second-generation languages are called assembly languages; they were designed as shorthand that abstracts machine-language instructions into mnemonics in order to make coding and debugging easier.
Third-generation languages, also called high-level programming languages, were first designed in the 1950s. This category includes older languages such as COBOL and FORTRAN, which are less widely used today, as well as newer, more commonplace languages such as C++ and Java. While different assembly languages are specific to different types of computers, high-level languages were designed to be machine independent, so that a program would not need to be rewritten for every type of computer on the market.
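The gap between a high-level statement and the lower-level instructions beneath it can be glimpsed with Python's standard `dis` module. One assumption should be stated plainly: what it lists is bytecode for the Python virtual machine, not native machine code or assembly, so this is only an analogy for the layering described above. The function `add` is a hypothetical example, and the exact instruction names vary between Python versions.

```python
import dis


def add(a, b):
    """One high-level statement: add two quantities."""
    return a + b


# Collect the names of the lower-level instructions the interpreter runs.
# This is Python VM bytecode, NOT native machine code (an analogy only).
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

for name in opnames:
    print(name)
```

A single line of source expands into several instructions that load the operands, combine them, and return the result, which is the same kind of expansion a compiler performs when it targets a real CPU.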
In the late 1970s, the idea of a fourth generation of languages, further abstracted from the machine itself, was advanced. Some people classify Python and Ruby as fourth-generation rather than third-generation languages. However, third-generation languages have themselves become extremely diverse, blurring this distinction. The category encompasses not just general-purpose programming languages, such as C++, but also domain-specific and scripting languages.
Computer languages are also used for purposes beyond programming. Modeling languages are used in computing, not to write software but for planning and design purposes. Object-role modeling, for instance, is an approach to data modeling that combines text and graphical symbols in diagrams that model semantics; it is commonly used in data warehouses, the design of web forms, requirements engineering, and the modeling of business rules. A simpler and more universally familiar form of modeling language is the flowchart, a diagram that abstracts an algorithm or process.
Overview
The idea of the algorithm is key to computer science and computer programming. An algorithm is a set of operations, with every step defined in sequence. A cake recipe that defines the specific quantities of ingredients required, the order in which the ingredients are to be mixed, and how long and at what temperature the combined ingredients must be baked is essentially an algorithm for making cake. Algorithms had been discussed in mathematics and logic long before the advent of computer science, and they provide its formal backbone.
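To make the definition concrete, here is a sketch of a classic algorithm from mathematics, Euclid's method for finding the greatest common divisor of two integers, written in Python. It is chosen purely as an illustration; the article itself does not single out this algorithm.

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: every step is fully defined and runs in sequence."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b  # replace (a, b) with (b, a mod b)
    return a


print(gcd(48, 18))  # 6
```

Like the cake recipe, the procedure specifies what to do at each step and when to stop, leaving nothing to interpretation.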
One of the problems with abstraction arises when users need a function that exists inside a program or some other construct but is hidden by its interface, a dilemma known as "abstraction inversion." The only solution for the user is to recreate the function out of the functions that the interface does expose. In many cases, the re-implemented function is clunkier, less efficient, and potentially more error-prone than the hidden function would be, especially if the user is not familiar enough with the underlying design of the program or construct to know the best implementation to use. A related concept is that of "leaky abstraction," a term coined by software engineer Joel Spolsky, who argued that all non-trivial abstractions are leaky to some degree. An abstraction is considered "leaky" when details or limitations of the underlying complexity that it was meant to hide remain visible to its users. Abstraction inversion is one symptom of such leakiness, but it is not the only one.
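A familiar leak, offered here as an illustration and not as an example taken from Spolsky's writing, appears in the decimal-over-binary abstraction itself. Python lets a programmer write decimal literals but stores them as binary floating-point values, and the binary representation shows through in ordinary arithmetic:

```python
from decimal import Decimal

total = 0.1 + 0.2

# The decimal literals are stored as binary fractions, so the sum is not
# exactly 0.3: the underlying representation leaks through the abstraction.
print(total == 0.3)  # False
print(total)         # 0.30000000000000004

# Exact decimal arithmetic from the standard library avoids this leak.
exact = Decimal("0.1") + Decimal("0.2")
print(exact == Decimal("0.3"))  # True
```

A user who thinks purely in decimal terms has no way to explain the first result without learning something about the layer that the abstraction was supposed to conceal.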
The opposite of abstraction, or abstractness, in computer science is concreteness. A concrete program, by extension, is one that can be executed directly by the computer. Such programs are more commonly called low-level executable programs. The process of taking abstractions, whether they be programs or data, and making them concrete is called refinement.
Within object-oriented programming (OOP), a paradigm supported by high-level programming languages including C++ and Common Lisp, "abstraction" also refers to a feature offered by many languages. The objects in OOP are a further enhancement of an earlier concept known as abstract data types; an object is an entity defined in a program as an instance of a class. For example, "OOP" could be defined as an object that is an instance of a class called "abbreviations." Objects are handled very similarly to variables, but they are significantly more complex in their structure (for one, they can contain other objects) and in the way they are handled in compiling.
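The abbreviation example can be sketched in Python as follows. The class names `Abbreviation` and `Glossary`, and their attributes, are hypothetical and exist only to illustrate an object as an instance of a class and an object that contains other objects.

```python
class Abbreviation:
    """A class; each instance is one object."""

    def __init__(self, short, expansion):
        self.short = short
        self.expansion = expansion

    def describe(self):
        return f"{self.short} stands for {self.expansion}"


class Glossary:
    """An object that contains other objects."""

    def __init__(self):
        self.entries = []

    def add(self, abbreviation):
        self.entries.append(abbreviation)


oop = Abbreviation("OOP", "object-oriented programming")
glossary = Glossary()
glossary.add(oop)

print(isinstance(oop, Abbreviation))   # True
print(glossary.entries[0].describe())  # OOP stands for object-oriented programming
```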
Another common implementation of abstraction is polymorphism, which is found in both functional programming and OOP. Polymorphism is the ability of a single interface to interact with different types of entities in a program or other construct. In OOP, this is accomplished through either parametric polymorphism, in which code is written so that it can work on an object irrespective of class, or subtype polymorphism, in which code is written to work on objects that are members of any class belonging to a designated superclass.
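Both forms can be sketched in Python, with the caveat that Python checks types at run time, so the type hints below document the intent and are not enforced by the interpreter. All names (`first`, `Shape`, `Square`, `Rectangle`, `total_area`) are hypothetical.

```python
from typing import List, TypeVar

T = TypeVar("T")


def first(items: List[T]) -> T:
    """Parametric polymorphism: works for a list of any element type."""
    return items[0]


class Shape:
    """Designated superclass for the subtype example."""

    def area(self) -> float:
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side: float):
        self.side = side

    def area(self) -> float:
        return self.side ** 2


class Rectangle(Shape):
    def __init__(self, width: float, height: float):
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: List[Shape]) -> float:
    """Subtype polymorphism: accepts objects of any subclass of Shape."""
    return sum(shape.area() for shape in shapes)


print(first([3, 4, 5]))                            # 3
print(first(["a", "b"]))                           # a
print(total_area([Square(2), Rectangle(2, 3)]))    # 10
```

In both cases the caller relies on a single interface, `first` or `area`, and the code behind it handles whichever type of entity it is given.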