Predictive mathematical models

SUMMARY: Predictive mathematical models could be used to attempt to foresee and counter various types of attacks. 

An increasing area of interest in mathematics was the use of algorithms and computer models to predict attacks—military attacks, terrorist attacks, and even attacks on Web servers. As with meteorology, a model was a probabilistic statement: the future could not be predicted with absolute certainty, but probable causes, patterns, and outcomes could be quantified and mathematically modeled to extrapolate the likelihood of new events. Humankind had been trying to predict attacks, using some combination of observation and subjective judgment, ever since one group first fought another. However, formal prediction of attacks using mathematical methods appeared to have originated only within the last two centuries and had escalated with advances in technology and data gathering. 

In the first half of the twentieth century, mathematician Lewis Richardson made contributions to many areas within and outside mathematics, such as numerical weather prediction. The Richardson iteration was one method for solving systems of linear equations, while the Richardson effect referred to the apparently infinite limit of coastline lengths as the unit of measure decreases, a precursor to the modern study of fractals. Richardson spent many years analyzing data on wars from the early nineteenth century onward, using mathematical methods such as probability theory and differential equations, often quantifying psychological variables, such as mood. He identified several patterns in war, as well as some variables likely to prevent conflict. He was often credited with first introducing the notion of power laws to relate conflict size, frequency, and death toll. At the start of the twenty-first century, models had grown in complexity. In 2009, a University of Maryland team developed a model that used 150 variables and data accumulated from the activity of 100 insurgent groups in the Middle East to model their reactions to Israeli activities. Other models were developed to attempt to predict violence and attacks in Iraq and continued to be refined. Statistical methods like data mining and power law functions were prevalent in modern predictive modeling. 
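As a rough illustration of the kind of power-law relationship Richardson described, the sketch below fits a slope relating event size to frequency by linear regression in log-log space. The synthetic death tolls, bin choices, and variable names are assumptions made for the example, not Richardson's data.

```python
# Illustrative sketch only: fit a power law relating conflict "size"
# (death toll) to frequency using log-log linear regression on synthetic data.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical death tolls drawn from a heavy-tailed (Pareto) distribution.
death_tolls = (rng.pareto(a=1.5, size=5000) + 1.0) * 1000

# Count conflicts in logarithmically spaced size bins.
bins = np.logspace(3, 7, 20)
counts, edges = np.histogram(death_tolls, bins=bins)
centers = np.sqrt(edges[:-1] * edges[1:])
mask = counts > 0

# A power law appears as a straight line in log-log space; the slope of that
# line estimates (minus) the exponent of the binned frequency counts.
slope, _ = np.polyfit(np.log(centers[mask]), np.log(counts[mask]), 1)
print(f"estimated power-law slope: {slope:.2f}")
```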

Data Mining

Data mining was the process of extracting patterns from large to enormous bodies of data. Isaac Asimov’s Foundation stories, the first of which was published in 1942, depicted a future where “psychohistory” was the study of the future using the body of history as data from which to extrapolate coming events. Modern data mining was quite similar to Asimov’s vision and could be accomplished by many mathematical methods. For example, many used artificial neural networks, which were computational models that mimicked neuron behavior. Genetic algorithms, credited to scientist John Holland, were search heuristics inspired by the processes of gene recombination and evolution. Decision trees could be used to determine conditional probabilities. In the 1980s, support vector machines (SVMs) were developed to analyze data to find patterns for statistical classification. All of these developments greatly advanced the state and potential of machine learning and facilitated rapid processing of increasingly larger and frequently interlinked databases from sources such as credit card companies, telecommunications businesses, and government intelligence agencies. Within the U.S. government, the Department of Defense began using data mining in the late 1990s in its Able Danger program, which gathered counterterrorism data, including data about the Al Qaeda terrorist group. Some asserted that the program uncovered the names of four of the alleged September 11, 2001, hijackers a year before the attacks. In February 2002, the U.S. Office of Science and Technology Policy convened a panel of government and industry leaders to discuss data mining as a counterterrorism tool. Although data mining became widely used, some criticized it because the sparsity of some information and the relative infrequency of terrorist attacks made it difficult to identify statistically significant patterns, which were critical to finding the anomalies that signal an attack, without unacceptable levels of error. 
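A minimal sketch of one of the methods mentioned above, statistical classification with a support vector machine, appears below. The features, class balance, and parameters are synthetic stand-ins, not real transaction or counterterrorism data.

```python
# Minimal sketch: train an SVM to separate a small "anomalous" class from
# ordinary records, standing in for the classification step of data mining.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Hypothetical feature vectors (e.g., normalized transaction statistics).
X_normal = rng.normal(loc=0.0, scale=1.0, size=(950, 4))
X_anomalous = rng.normal(loc=2.5, scale=1.0, size=(50, 4))
X = np.vstack([X_normal, X_anomalous])
y = np.array([0] * 950 + [1] * 50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A radial-basis-function kernel lets the SVM learn a nonlinear boundary;
# class_weight="balanced" compensates for the rarity of the anomalous class.
clf = SVC(kernel="rbf", class_weight="balanced")
clf.fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")
```

The rarity of positive examples in this toy setup mirrors the difficulty critics raised: with very few true events, apparently strong patterns can simply be statistical noise.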

Cyber Security

Mathematicians, computer scientists, and others were continually working on new methods to predict and counter attacks on Web servers, e-mail, and digital records of all kinds. The Internet was filled with malicious activity, from phishing and identity theft to distributed denial of service attacks. Electronic attacks were facilitated by the same computer technology that was used to predict attacks. The traditional safeguard had been to block a source of malice after an attack, for example by marking e-mail as spam or by blocking an IP address after harmful activity originated from it. These methods were commonly known as blacklists and were widely compiled and shared. However, they were, by definition, reactive measures to attacks. Just as e-mail spam filters had become preemptive, marking mail as “spam” automatically based on a number of factors, IP blocking could also be conducted preemptively. 
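The difference between reactive blacklisting and preemptive blocking can be sketched in a few lines; the addresses, signals, and threshold below are invented for illustration.

```python
# Sketch contrasting a reactive blacklist with a simple preemptive score.
# All addresses, signal names, and thresholds here are made up.
observed_attackers = {"203.0.113.7", "198.51.100.23"}  # compiled after attacks

def reactive_block(ip: str) -> bool:
    # Block only addresses that have already been seen misbehaving.
    return ip in observed_attackers

def preemptive_block(neighbor_reports: int, recent_probes: int) -> bool:
    # Block ahead of an attack if enough independent warning signals
    # accumulate, analogous to a spam filter scoring mail before delivery.
    return (2 * neighbor_reports + recent_probes) >= 5

print(reactive_block("192.0.2.10"))                           # False: never seen
print(preemptive_block(neighbor_reports=2, recent_probes=3))  # True: blocked early
```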

The method of predictive blacklisting used shared attack logs as the basis for a predictive system, much like the customer recommendation systems employed by Amazon or Netflix. Computer scientists Fabio Soldo, Anh Le, and Athina Markopoulou developed what was known as an “implicit recommendation system”—implicit because ratings were inferred rather than given directly by the subjects of the model. Their multilevel prediction model used mathematical methods, such as time series analysis and neighborhood models, adjusted specifically for attack forecasting. Inputs to the model included factors such as attacker-victim history and interactions between pairs or groups of attackers and victims. Similar models—using different types of data—could be built to predict terrorist attacks and the behavior of enemy forces, and such models were included in the standard order of battle intelligence reports used by the U.S. Army. 
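The general neighborhood-model idea can be sketched as follows: victims play the role of "users" and attacker sources the role of "items," so a victim's likely future attackers are scored from the logs of similar victims. The matrix and weighting below are illustrative assumptions, not the published Soldo, Le, and Markopoulou system.

```python
# Hedged sketch of a neighborhood model for predictive blacklisting: each
# victim's future attackers are scored from the logs of similar victims.
import numpy as np

# Rows = victim networks, columns = attacker sources; 1 = attack observed.
logs = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
], dtype=float)

# Cosine similarity between victims' attack histories.
unit = logs / np.linalg.norm(logs, axis=1, keepdims=True)
sim = unit @ unit.T
np.fill_diagonal(sim, 0.0)

# Predicted attack scores: similarity-weighted average of neighbors' logs.
scores = sim @ logs / sim.sum(axis=1, keepdims=True)

victim = 3
ranking = np.argsort(-scores[victim])
print("attacker sources victim 3 should watch first:", ranking)
```

In a deployed system the inputs would also weight how recent and how frequent each attacker-victim interaction was, which is what the time series component of such models addresses.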

The data needed to predict attacks were not restricted to private databases. Information was widely available from the Internet or the scrolling news banners of 24-hour news networks. Physicist Neil Johnson used a variety of sources to investigate insurgent wars, employing some of the same mathematical techniques as Richardson in his analyses and modeling. After gathering and analyzing data for almost 60,000 insurgent attacks occurring in multiple conflicts around the world, he and his collaborators discovered similarities between the frequency and intensity of attacks across all of the conflicts. Further, they found that the statistical distribution of insurgency attacks differed significantly from the distribution of attacks in traditional war (a comparison sketched, in simplified form, after the quotation below). The model quantified connections between insurgency, global terrorism, and ecology, and countered the common theory of rigid hierarchies and networks in insurgencies. Johnson noted: 

Despite the many different discussions of various wars, different historical features, tribes, geography and cause, the way humans fought modern (present and probably future) wars was the same. This was similar to how traffic patterns in Tokyo, London, and Miami were similar.
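One way to make such a distributional comparison concrete is to estimate a power-law exponent for each set of attack sizes with the standard maximum-likelihood formula. The two samples below are synthetic and serve only to show the mechanics, not to reproduce Johnson's data or results.

```python
# Illustrative comparison of two attack-size distributions via the
# maximum-likelihood power-law exponent: alpha = 1 + n / sum(ln(x / x_min)).
import numpy as np

def powerlaw_exponent(sizes: np.ndarray, x_min: float = 1.0) -> float:
    x = sizes[sizes >= x_min]
    return 1.0 + len(x) / np.sum(np.log(x / x_min))

rng = np.random.default_rng(2)
# Synthetic samples with deliberately different tails.
insurgency_like = rng.pareto(a=1.5, size=10_000) + 1.0    # heavier tail
conventional_like = rng.pareto(a=2.5, size=10_000) + 1.0  # lighter tail

print("exponent, insurgency-like sample:  ", round(powerlaw_exponent(insurgency_like), 2))
print("exponent, conventional-like sample:", round(powerlaw_exponent(conventional_like), 2))
```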

Artificial Intelligence

In the 2020s, the advent of artificial intelligence (AI) technologies had the potential to revolutionize the ways many processes were carried out, including predictive analysis of terrorist attacks. Prior to the US withdrawal from Afghanistan in 2021, the US Army employed a model called Raven Sentry that drew only on unclassified data sources to predict future attacks on Afghan district and provincial centers. Raven Sentry reportedly showed how AI could help military analysts work through large volumes of sensor data. These analysts were embedded with special forces units and identified recurring patterns of activity in attacks by insurgent groups. A neural network was trained to associate insurgent violence with various conditions, most of which could be found in non-classified and openly accessible sources such as social media posts, commercial imagery, and weather information. By October 2020, the model had achieved 70 percent accuracy, meaning that when an attack was graded with the highest probability of occurrence (80 to 90 percent), violence subsequently occurred about 70 percent of the time. Human analysts could achieve a similar level of performance; however, Raven Sentry operated much faster.
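The sketch below shows, in very reduced form, the kind of pattern association described: a small neural-network classifier trained on synthetic open-source-style indicators and read out as graded probabilities. The feature names, data, and model size are assumptions for the example; this is not the actual Raven Sentry system or its data.

```python
# Illustrative sketch only: a small neural network associates synthetic
# open-source-style indicators with attack/no-attack labels and outputs
# graded probabilities. Features, data, and thresholds are invented.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)
n = 2000

# Hypothetical district-day indicators.
social_media_volume = rng.normal(size=n)
imagery_activity = rng.normal(size=n)
clear_weather = rng.integers(0, 2, size=n)

# Synthetic labels: attacks more likely when all three indicators are high.
risk = 0.8 * social_media_volume + 0.6 * imagery_activity + 0.7 * clear_weather
attack = (risk + rng.normal(scale=0.8, size=n) > 1.0).astype(int)

X = np.column_stack([social_media_volume, imagery_activity, clear_weather])
X_train, X_test, y_train, y_test = train_test_split(X, attack, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)

# Warnings can be binned by predicted probability (e.g., 80 to 90 percent) and
# then checked against how often violence actually followed.
probs = model.predict_proba(X_test)[:, 1]
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
print("top warning probabilities:", np.sort(probs)[-5:].round(2))
```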

Bibliography

"How America Built an AI Tool to Predict Taliban Attacks." The Economist, 31 July 2024, /www.economist.com/science-and-technology/2024/07/31/how-america-built-an-ai-tool-to-predict-taliban-attacks. Accessed 3 Oct. 2024.

Jakobsson, Markus, and Zulfikar Ramzan. Crimeware: Understanding New Attacks and Defenses. Addison-Wesley, 2008.

Memon, Nasrullah, et al. Mathematical Methods in Counterterrorism. Springer, 2009.

Spahr, Thomas. "Raven Sentry: Employing AI for Indications and Warnings in Afghanistan." US Army War College Parameters, 29 May 2024, Accessed 3 Oct. 2024.