Algorithms

  • Type of physical science: Computers, Computation
  • Field of study: Computers

An algorithm is a precisely defined sequence of instructions for carrying out a particular task. When that task is performed by a computer, the algorithm must be expressed according to the strict rules of a programming language that is "understood" by the computer.

Overview

The word "algorithm" is derived from the name of a ninth century Arabic mathematician, Abu Ja'far Mohammed ibn-Musa al-Khowarizmi, who wrote a book about solving equations (the title of his book, Al-jabr wa'l muqabalah, also provided the English language with the word "algebra"). In his book, al-Khowarizmi discussed the steps that one should follow to find solutions to algebraic equations, particularly second-degree equations. The steps that are followed are those that are taught today in a first course in algebra at the junior-high or high-school level. Beginning students often struggle to memorize these steps, known as the "quadratic formula."

The quadratic formula is an example of an algorithm. The most common nontechnical word that is used as a synonym for "algorithm" is "recipe." Although most people are comfortable with the notion of a recipe in everyday life, the world of scientific computing requires a much more precise definition. For example, cooking recipes often leave a wide range of interpretation—"add a dash of salt" or "bring to a mild boil," for example. Words such as "dash" and "mild" are certainly not precisely defined. Another property of everyday-life algorithms that is not suitable for the computer world is that instructions might not be subdivided finely enough into detailed component parts. For example, the procedure for changing a flat tire might have as its first instruction "loosen the lug nuts." Of course, if the lug nuts are covered by a hubcap, the step "remove hubcap" has been omitted. The assumption is that the individual changing the tire can figure out how to do a step in the procedure even if that step requires several substeps. This is an assumption that cannot be made about a computer. A third requirement for an algorithm is that the process must terminate after a finite amount of time. While most recipes have this property, they sometimes are not expressed this way. The most familiar example is the set of instructions for shampooing one's hair contained on many shampoo bottles: "lather, rinse, repeat." A person carefully following those instructions would never finish the job because repeating that collection of instructions would put the shampooer into an "infinite loop."
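
As a concrete illustration of this precision, the quadratic formula itself can be written as a short program. The following minimal Python sketch (the function name and structure are illustrative, not from any standard source) finds the real roots of ax^2 + bx + c = 0:

    import math

    def solve_quadratic(a, b, c):
        # Quadratic formula: x = (-b +/- sqrt(b^2 - 4ac)) / (2a)
        discriminant = b * b - 4 * a * c
        if discriminant < 0:
            return []                                # no real roots
        root = math.sqrt(discriminant)
        return [(-b + root) / (2 * a), (-b - root) / (2 * a)]

    print(solve_quadratic(1, -5, 6))                 # x^2 - 5x + 6 = 0 has roots 3 and 2

Every step is unambiguous, every substep is spelled out, and the process always terminates: exactly the three requirements described above.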

An algorithm is thus a sequence of precisely defined instructions that terminates in a finite amount of time. This notion is of fundamental importance because algorithms are what computers do. Whenever a computer is running, it is executing an algorithm, even if it seems that the computer is simply sitting there with a blank screen and a flashing cursor. In this case, there is an algorithm instructing the computer when and where to flash the cursor until someone pushes a button, presses a key on a keyboard, or does something else to interrupt the computer—effectively terminating this algorithm and giving the computer another algorithm to execute.

Although it is the computer that has focused attention on the importance of algorithms, this precise notion of algorithmic thinking has been understood for a long time. An example of an ancient algorithm is Euclid's algorithm for finding the greatest common divisor ("gcd") of two positive integers. The gcd of two positive integers m and n is defined to be the largest positive integer that divides evenly into both m and n. For example, the gcd of 15 and 21 is 3, the gcd of 10 and 20 is 10, and the gcd of 14 and 15 is 1. How might one find the gcd of 154 and 429? The algorithmic solution, as given by Euclid in approximately 300 BCE, is as follows:

  • Divide 429 by 154, leaving a remainder of 121.
  • Divide 154 by 121, leaving a remainder of 33.
  • Divide 121 by 33, leaving a remainder of 22.
  • Divide 33 by 22, leaving a remainder of 11.
  • Divide 22 by 11, leaving a remainder of 0.
  • Stop because the remainder is 0.
  • The greatest common divisor is 11, the last nonzero remainder.
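
Expressed as a program, the same procedure takes only a few lines. Below is a minimal Python sketch of Euclid's algorithm (the names are illustrative):

    def gcd(m, n):
        # Repeatedly replace (m, n) with (n, remainder of m divided by n)
        # until the remainder is zero.
        while n != 0:
            m, n = n, m % n
        return m                     # the last nonzero remainder

    print(gcd(429, 154))             # prints 11, as in the worked example above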

Note how this process meets the requirements of an algorithm. Each individual step is precisely and unambiguously specified. Moreover, since the remainder in a division problem is always smaller than the divisor, the sequence of remainders produced by the Euclidean Algorithm is guaranteed to reach zero. There is, however, one important detail glossed over in this discussion. The above sequence of instructions certainly constitutes an algorithm, but how does one know that the result produced by the algorithm is, in fact, the gcd of the two original numbers? This is something that Euclid proved mathematically, and while the proof is not complicated, it is not given here. The point is that there is no question that the Euclidean Algorithm works perfectly every time.

An inherent characteristic of an algorithm that is important to mathematicians and computer scientists is its computational complexity. The complexity of an algorithm is a measure of how long the algorithm takes to complete its job. Clearly, it would be of little help to know that a problem can be solved on a computer if the amount of time required to solve it is too long. Returning to the Euclidean Algorithm, for example, an important question is this: To compute the gcd of 1,254,472 and 853,227, how many division operations are required? Recall that one first divides 1,254,472 by 853,227, then divides the remainder of that division into 853,227, and repeats this process with each new remainder until the remainders work their way down to zero. It seems possible at the outset that if the remainders approach zero quite slowly, one might need to perform close to 800,000 divisions.

In fact, it can be proved that with the above two numbers, the Euclidean Algorithm requires fewer than forty divisions. This is because the Euclidean Algorithm is known to be of logarithmic complexity. While this may sound somewhat complicated, the idea is quite simple. For these purposes, the logarithm of a positive integer is defined simply as the number of times the original integer can be divided in half until it is less than or equal to one. For example, the logarithm of 50 is approximately 6 because the sequence 50, 25, 12.5, 6.25, 3.125, 1.5625, 0.78125 requires six halving operations to result in a number smaller than one. What has been proved about the Euclidean Algorithm is that the number of divisions required before the algorithm terminates is less than twice the logarithm of the smaller number. In the above example, it takes twenty halving operations before 853,227 is smaller than 1, so the logarithm of 853,227 is approximately 20. Since twice 20 is 40, this is an upper bound on how much "time" the algorithm requires. Logarithmic algorithms tend to be very efficient because the logarithm of even a large number tends to be small. Thus, the algorithm executes quickly even when the problem is large.
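
This bound is easy to check directly. The following Python sketch (an illustrative aside) counts the divisions performed on the two numbers above and compares the count with twice the logarithm of the smaller number:

    import math

    def gcd_with_count(m, n):
        # Euclid's algorithm, counting the division steps.
        divisions = 0
        while n != 0:
            m, n = n, m % n
            divisions += 1
        return m, divisions

    g, steps = gcd_with_count(1254472, 853227)
    bound = 2 * math.log2(853227)        # roughly 39.4, the proven upper bound
    print(g, steps, bound)               # the actual count comes out far below the bound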

While some important real-world problems are of logarithmic complexity, there are many real-world problems that belong to the other extreme, exponential complexity. Again, for these purposes, the exponential of a number n can be defined as the number 2^n. Compared to the logarithm, the exponential of a number is quite large. For example, the logarithm of 16 is 4, while the exponential of 16 is 2^16, or 65,536. It should thus come as no surprise to learn that exponential algorithms execute very slowly; the amount of time required is proportional to the exponential of the numbers involved instead of the logarithm. The logarithmic and exponential classes are not the only two classes of algorithms. There are, in fact, many complexity classes between these two extreme classes. It is important to know an algorithm's complexity class so that a determination can be made as to whether it is practical to try to execute that algorithm on a computer.
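
The gulf between the two growth rates is easy to see numerically, as in this brief Python aside:

    import math

    for n in (4, 8, 16, 32):
        # The logarithm grows very slowly; the exponential explodes.
        print(n, math.ceil(math.log2(n)), 2 ** n)
    # For n = 16: the logarithm is 4, while the exponential is 65,536.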

Applications

The importance of the design and analysis of algorithms has increased tremendously because of the impact of the computer. Contrary to what many people believe, the computer cannot think. The computer is simply a machine that can follow a fairly small set of precisely defined instructions, although it performs those instructions with blinding speed and remarkable accuracy. Still, the only thing a computer can do is execute algorithms. A computer can play chess, draw pictures, or sort a list of names into alphabetical order only because someone has written a computer program that performs those tasks. It is known that there are certain problems for which algorithmic solutions cannot exist, which means that there are some problems the computer cannot solve. Fortunately, many important real-world problems do have algorithmic solutions: devising a schedule for an airline, a manufacturing process, or a major sports league; designing a network of communications sites; and routing telephone calls efficiently from one city to another, for example.

The importance of the computational complexity of algorithms is demonstrated by considering the application of cryptography. Cryptography deals with encoding messages in such a way that an unauthorized eavesdropper who intercepts a message will be unable to make sense of the message. There was a time when cryptography was of interest primarily to the military and diplomatic communities. With the proliferation of communications via the computer, however, massive amounts of information of a personal, corporate, or financial nature (including financial transactions) are transmitted over nonsecure data lines for anyone to see. Thus, cryptography has become important to a much larger and more general population.

What role have algorithms played in the development of cryptography? In the mid-1970s, mathematicians began experimenting with the notion of a public-key cryptosystem. This is a scheme whereby certain components of a cryptography scheme can be made public while other components are kept secret. The need for certain public information arises in computer communications because, unlike in military and diplomatic contexts, in which individuals typically know with whom they are communicating, in other contexts individuals might need to communicate with people they have never met. Of course, not everything can be public, or there would be no ability to hide the content of the message from an intruder.

The most common public-key cryptosystem, using the RSA Algorithm, is based on some simple ideas from elementary arithmetic: prime numbers and factoring. (A prime number is a positive integer greater than one whose only divisors are itself and one. Factoring an integer simply means writing that integer as a product of primes; for example, 42 factors into the primes 2 × 3 × 7.) The numbers used in these public-key cryptosystems are extremely large, between one hundred and two hundred digits long. The secret part of the key is developed by using the Euclidean Algorithm, along with some other numeric algorithms, all of which are of logarithmic complexity. This means that setting up the secret part of the cryptosystem does not require very much computer time. With the public information available, an unauthorized intruder actually has enough information to break the system; for the RSA Algorithm, breaking the system requires the ability to factor large numbers. Factoring is a problem with an algorithmic solution, so it can be performed on a computer. Unfortunately for the intruder (and fortunately for the sender), the best-known factoring algorithms are of nearly exponential complexity. This means that trying to use the computer (or any other means) to break the system is hopeless: the system can be broken in principle, but doing so would require too much time (on the order of millions of centuries). Because setting up the system requires algorithms of logarithmic complexity while breaking it requires algorithms of nearly exponential complexity, the RSA Algorithm provides a secure means of communication in a vast computer network.
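
A toy version of the scheme shows how the pieces fit together. The Python sketch below uses tiny primes purely for illustration; real systems use primes hundreds of digits long, which is exactly what makes the factoring attack hopeless:

    # Toy RSA with tiny primes; real keys use primes hundreds of digits long.
    p, q = 61, 53                      # the secret primes
    n = p * q                          # the public modulus; breaking the system means factoring n
    phi = (p - 1) * (q - 1)            # kept secret; easy to compute only if p and q are known
    e = 17                             # the public exponent, chosen so that gcd(e, phi) = 1
    d = pow(e, -1, phi)                # the secret exponent, via the extended Euclidean Algorithm (Python 3.8+)

    message = 42
    ciphertext = pow(message, e, n)    # anyone who knows (e, n) can encrypt
    recovered = pow(ciphertext, d, n)  # only the holder of d can decrypt
    print(ciphertext, recovered)       # recovered equals the original 42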

In addition to complexity, algorithms are often classified by the type of problem-solving strategy used. For example, a "greedy" algorithm, as its name implies, tries to solve a problem in the most efficient manner by doing what appears to be most efficient at each step. While this may seem like a completely logical approach, greedy algorithms do not always work.

To see this, consider the "change-making" problem. Suppose a customer buys an item for less than one dollar and pays with a one-dollar bill. Moreover, suppose the goal is to give change to the customer using as few coins as possible. This is certainly a reasonable goal; most customers would not want to receive their change entirely in pennies, for example, and most cashiers would prefer to spend as little time as possible handing out change. A greedy algorithm for accomplishing this would be:

  • Determine change due as $1 - purchase price
  • Give the customer the highest-valued coin possible (the highest value that is less than or equal to the amount of change due)
  • Repeat until the purchase price plus the change received totals $1

For example, if the amount of a purchase is 38 cents, the algorithm would compute the change due as 62 cents and then return two quarters, one dime, and two pennies to the customer.
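
In Python, the greedy procedure is a short loop (a minimal sketch; the coin values are US denominations in cents):

    def greedy_change(price_cents):
        # Give the largest coin that does not exceed the change still due.
        due = 100 - price_cents
        coins = []
        for value in (25, 10, 5, 1):         # quarter, dime, nickel, penny
            while due >= value:
                coins.append(value)
                due -= value
        return coins

    print(greedy_change(38))                 # [25, 25, 10, 1, 1]: two quarters, one dime, two pennies

Inserting a hypothetical 12-cent coin into the tuple, as (25, 12, 10, 5, 1), reproduces the failure described next: greedy_change(85) returns [12, 1, 1, 1], four coins instead of the optimal two.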

This algorithm is greedy because at each stage the intent is to decrease the amount of change owed by the largest amount possible. Does this algorithm work? In the US monetary system, it does: the customer always gets the fewest coins possible. The algorithm might not work in other systems, though. Suppose, for example, that the US monetary system also had a coin worth 12 cents. The greedy algorithm would make change for an 85-cent purchase by returning a 12-cent piece and three pennies (a total of four coins), while the optimal solution, which the greedy algorithm never discovers, would be to return a dime and a nickel (only two coins). While a greedy strategy might be simple to implement, it might not produce the overall best result. Algorithms used for solving practical problems thus require careful analysis.

This kind of characterization of algorithms is important because the kinds of algorithmic processes that scientists develop correspond to important problem-solving techniques. If the problem is one of determining the best locations for emergency services given the network and traffic patterns of a street system for a large city, for example, it is helpful if the problem-solver can relate this to a particular type of algorithmic process and then use the techniques developed for these kinds of algorithms to solve the given problem. In the example just cited, two algorithmic techniques that often occur in artificial intelligence, branch-and-bound and backtracking, can be employed to solve the problem.

Context

Many problems that mathematicians would like to solve do not have algorithmic solutions. For example, in a first geometry course, students are often asked to prove that two triangles are congruent. One reason many students find these proofs difficult is that there is no recipe to follow that works in all cases. An algorithmic solution to a problem, on the other hand, is a very powerful one: once an algorithm is available, the strategy it prescribes works for all instances of that problem. Thus, the quadratic formula (equivalent solution procedures were known to the Babylonians as early as 1000 BCE) completely settles the problem of solving quadratic equations. That is, there does not exist a quadratic equation that the quadratic formula cannot handle.

With the invention of the digital computer in the 1940s, algorithmic thinking became even more important. Problems whose algorithmic solutions were known but had been thought to require too much time to solve could finally be attacked. An important example is the Linear Programming Problem, which deals with how to distribute resources for a manufacturing process in such a way as to maximize profit. Its solution, the Simplex Algorithm, was developed in 1947 by George Dantzig and is the most famous algorithm of the early computer era.
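
Descendants of the Simplex Algorithm are now available in standard software libraries. As a small illustration (assuming the SciPy library is installed; the problem data here are invented), a two-product profit-maximization problem can be stated and solved in a few lines of Python:

    from scipy.optimize import linprog

    # Maximize profit 3x + 5y subject to resource limits:
    #   2x +  y <= 100   (machine hours)
    #    x + 3y <=  90   (labor hours)
    # linprog minimizes, so the profit coefficients are negated.
    result = linprog(c=[-3, -5],
                     A_ub=[[2, 1], [1, 3]],
                     b_ub=[100, 90],
                     bounds=[(0, None), (0, None)])
    print(result.x, -result.fun)     # the optimal production plan and the maximum profit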

The speed and accuracy of the computer also led to another important breakthrough in problem-solving: using algorithms to solve problems that normally would not be solved in an algorithmic way. An example of this occurs in the area of differential equations, a field of mathematics that is used to describe the behavior of many natural systems that undergo changes. Differential equations can help explain such diverse things as why a banjo sounds different than a guitar; how long it will take for a snowplow to traverse a system of streets if the intensity of the snowfall is known; and why an earthquake might destroy two buildings and leave standing a similarly constructed building in between. Prior to the computer, most differential equations were solved by learning numerous techniques (and tricks) and applying one of these techniques if it applied to the problem at hand. Unfortunately, for many applications, the techniques do not apply.

In response, mathematicians and computer scientists developed numerical solutions to differential equations, meaning that most differential equations can be solved on a computer using algorithmic techniques. The solution the computer produces is not exact but only an approximation. Because the computer is so fast, however, scientists can make the approximation as accurate as needed; in effect, the computer produces a solution that is as good as an exact one.
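
The simplest such technique, Euler's method, already conveys the idea: replace the continuous change described by the differential equation with many small discrete steps. The Python sketch below (an illustrative aside, not from the original discussion) approximates the solution of dy/dt = -y with y(0) = 1, whose exact value at t = 1 is e^(-1):

    import math

    def euler(num_steps, t_end):
        # Step forward in small increments, following the slope y' = -y.
        step = t_end / num_steps
        y = 1.0
        for _ in range(num_steps):
            y += step * (-y)
        return y

    # Smaller steps give a better approximation of the exact value.
    for n in (10, 100, 1000):
        print(n, euler(n, 1.0), math.exp(-1.0))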

During the twenty-first century, the idea of algorithms has expanded along with the development of social media and ever-more complex levels of computing, such as generative artificial intelligence (AI) like ChatGPT. Indeed, the word "algorithm" has entered the English lexicon and become pervasive in modern life—no longer simply the domain of computer scientists. People regularly refer to the algorithms on social media platforms like Facebook and TikTok that determine what shows up in a user's daily "feed," and terms like "algorithmic trading" and "algorithmic bias" have become common. It has even been suggested that humans now live in the "age of the algorithm." As algorithms have grown increasingly sophisticated, it has become harder for scientists to explain the ways in which they work and how they reach outputs. This has particularly been the case with AI algorithms—experts cannot always discern why a particular input results in a specific output. Even so, algorithms are behind the scenes in many of people's daily interactions, from Google searches to automatically generated recommendations received while online shopping.

Algorithms and algorithmic thinking have played an important role in the development of solutions to important problems. Yet, it has been the development of the computer that has elevated the significance of algorithmic thinking to the top of the list among problem-solving techniques. Problems that were once thought to be too difficult because they required too many calculations, such as sending images back to Earth from photographic satellites orbiting distant planets, are now solved on a routine basis. As new and difficult problems arise for society to solve, certainly one of the first questions to be asked will be this: Can this problem be solved on a computer? It should be apparent that what this question really asks is this: Is there an efficient algorithm to solve this problem?

Principal Terms

complexity: the cost (usually measured in time) of executing an algorithm, relative to the size of the problem

computer program: an algorithm expressed in a formal language designed for conveying sequences of instructions to a computer

cryptography: a method of encoding messages so that the contents of a message are unintelligible to an unauthorized listener

greedy method: an algorithmic technique in which each decision is based on maximizing profit for that decision, not for the long run

RSA Algorithm: an algorithm developed by the Massachusetts Institute of Technology researchers Ronald Rivest, Adi Shamir, and Leonard Adleman that is the basis for a public-key cryptography system

By Robert L. Holliday

Bibliography

ACM Turing Award Lectures: The First Twenty Years, 1966-1985. ACM Press Anthology Series, 1987.

Chayka, Kyle. "The Age of Algorithmic Anxiety." The New Yorker, 25 July 2022, www.newyorker.com/culture/infinite-scroll/the-age-of-algorithmic-anxiety. Accessed 10 Oct. 2024.

Dewdney, A. K. The Armchair Universe. W. H. Freeman, 1988.

Dewdney, A. K. The Turing Omnibus. Computer Science Press, 1989.

Engel, Arthur. Elementary Mathematics from an Algorithmic Standpoint. Keele Mathematical Education Publications, 1984.

Gardner, Martin. "A New Kind of Cipher That Would Take Millions of Years to Break." Scientific American, vol. 237, Aug. 1977, pp. 120-124.

Gonzalez, Kevin Meneses. "The Algorithm Revolution in the 21st Century. Use Cases Used in Technological Giants." Medium, 18 Apr. 2024, medium.com/@kevinmenesesgonzalez/the-algorithm-revolution-in-the-21st-century-use-cases-used-in-technological-giants-d095e2c6913e. Accessed 10 Oct. 2024.

Kowalkiewicz, Marek. "How Did We Get Here? The Story of Algorithms." Medium, 10 Oct. 2019, towardsdatascience.com/how-did-we-get-here-the-story-of-algorithms-9ee186ba2a07. Accessed 10 Oct. 2024.

Pattis, Richard E. Karel the Robot. John Wiley & Sons, 1981.