Actuarial Statistics
Actuarial statistics is a specialized field that focuses on assessing uncertainty and risk, particularly within the realms of insurance, pensions, and investments. Actuaries utilize statistical methods and probability theory to evaluate financial risks, helping organizations make informed decisions despite the inherent uncertainties of future events. By analyzing historical data, actuaries can identify patterns and relationships that inform their risk models, which are critical for setting premium rates and forecasting future claims.
The discipline involves the use of random variables, which allow actuaries to estimate outcomes and create predictive models that aid in managing the financial stability of insurance companies. With the rise of data mining and predictive analytics, actuaries are increasingly able to analyze vast datasets to uncover new correlations and improve risk assessments. This continuous feedback loop of data evaluation helps actuaries refine their models and assumptions over time, enhancing the accuracy of their predictions. Overall, actuarial statistics play a crucial role in enabling financial institutions to navigate the complexities of uncertainty and risk management effectively.
Actuarial science, and specifically probability and statistics, deals with the concepts of uncertainty and risk. Many of the decisions that we make every day involve uncertainty; actuaries use actuarial statistics to assess the financial risk that is inherent in insurance, pensions, or investment plans. Actuaries also use statistics to determine the best way to collect and analyze data. As actuaries analyze data, they study outcomes that can reveal patterns related to risk and human behavior. Actuaries are experts at modeling risk and applying probabilistic decision-making. It is critical for actuaries to understand the conditions and processes under which historical data was obtained, and to evaluate and assess the quality of the available data. This article discusses the random variables that actuaries apply when building risk models. A discussion of data mining, predictive analytics and the identification of new data correlations follows as a means of understanding the constantly changing methods used by actuaries in applying statistical methods to risk analysis.
Keywords Correlation-Statistics; Credit Rating Insurance; Insurance Statistics; Predictive Analysis; Random Variables; Statistics and Probability
Overview
The world is an uncertain place, and human beings have always been uncomfortable with the realization that they must live with risk. Risk, like death and taxes, is a foregone conclusion and cannot be avoided. Since humans are a risk-averse species, it is easy to see why insurance is seen as a necessary evil. Uncertainty is the reason that insurance companies exist. People and organizations pay premiums to insurers because the insurer will assume the majority of financial risk for the insured. Insurance companies are able to assume risks by pooling large numbers of policy holders together and estimating that only a small number will make a claim.
“Insurance works through the magic of the Law of Large Numbers. This law assures that when a large number of people face a low-probability event, the proportion experiencing the event will be close to the expected proportion. For instance, with a pool of 100,000 people who each face a 1 percent risk, the law of large numbers dictates that 1,100 people or more will have losses only one time in 1,000” (Zeckhauser, 2003, ¶6).
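The figure in the quotation can be checked numerically. The sketch below assumes that each person's loss is independent of the others, so that the number of losses in the pool follows a binomial distribution; the function name binomial_upper_tail is hypothetical and chosen only for this illustration.

```python
import math

def binomial_upper_tail(n: int, p: float, k_min: int) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p).

    Each term is computed in log space so that the huge binomial
    coefficients do not overflow a float.
    """
    log_p = math.log(p)
    log_q = math.log1p(-p)
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * log_p + (n - k) * log_q
        )
        total += math.exp(log_pmf)
    return total

# Pool from the quotation: 100,000 people, each facing a 1 percent risk.
pool_size = 100_000
loss_prob = 0.01
expected_losses = pool_size * loss_prob
tail = binomial_upper_tail(pool_size, loss_prob, 1_100)

print(f"expected number of losses: {expected_losses:.0f}")
print(f"P(1,100 or more losses):   {tail:.5f}")
```

Under the independence assumption, the tail probability comes out on the order of one in a thousand, which agrees with the quoted statement. If losses were correlated (for example, many policy holders exposed to the same storm), the binomial model would understate the tail.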
When determining ratings, insurance companies must manage a large number of unknowns, including:
- Potential number of claims to be paid to insured (future claims payments).
- Uncertainty about how much to charge for premiums (premium rating projections).
- Predictive modeling of future events and trends (risk projections).
- Volatility in financial markets that affect the assets of an insurance company.
The Need for Actuaries
Insurers rely on the analysis of historical data to project the likelihood of future events; the data that is analyzed by actuaries is compiled by insurers, government agencies and private companies. Today, data mining has become commonplace because electronic data is widely available from a variety of sources. The more data points that can be collected about a policy holder, the greater the number of variables an insurer or actuary has to describe an entity (an individual policy holder or a fund).
An additional challenge for actuaries is that they must determine premium rates that will cover an insurer's operating costs without knowing how many claims will be paid out in the future. Competition in today's insurance market also requires insurers to keep rates low, which makes the actuary's job more challenging still. Actuaries balance the asset reserves of insurance companies while at the same time hedging the risks associated with setting rates. "Insurance companies pay claims out of premiums that were set some time before the claim arose and that were based on information drawn from an even earlier period. Obviously, in determining the premium it should charge, the insurer must forecast its expected claims and expenses." In short, an insurance company cannot be certain of its future assets, profits or its solvency (Hart, Buchanan, & Howe, 1996); all this uncertainty makes being an insurer a risky business in its own right.
A “fundamental task of the actuary is to use historical observations to draw conclusions about future outcomes. This is similar to the work of the statistician; it is the context that defines the work of the actuary. Therefore, it is appropriate that the initial principles be taken from probability and statistics” (“Principles,” 1999, p.5).
An insurer that charges premiums merely equal to its expected costs faces a risk of ruin or insolvency, because actual claims will exceed the expected amount in some periods. An insurer can reduce the risk of insolvency through the following means (Hart, Buchanan, & Howe, 1996):
- Increase its capital.
- Increase its profit margin.
- Reduce exposure on risk (re-insurance).
- Increase the numbers of risks (risk pool).
- Reduce correlations between risks.
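The effect of some of these measures can be illustrated with a one-period model. The following is a minimal sketch, not a method taken from the cited source: it assumes independent policies that each produce at most one claim of a fixed size, ignores expenses and investment income, and uses a normal approximation to total claims. The function ruin_probability and all figures are hypothetical.

```python
import math

def ruin_probability(n_policies, claim_prob, claim_size, margin, capital):
    """Approximate P(total claims > premiums + capital) over one period.

    Total claims are approximated by a normal distribution with the
    mean and standard deviation of a binomial count times claim_size.
    """
    expected = n_policies * claim_prob * claim_size
    std_dev = claim_size * math.sqrt(n_policies * claim_prob * (1 - claim_prob))
    premiums = expected * (1 + margin)  # premium = expected claims plus a margin
    z = (premiums + capital - expected) / std_dev
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal

# Premiums equal to expected claims and no capital: ruin about half the time.
base = ruin_probability(10_000, 0.01, 5_000, margin=0.0, capital=0)
# A 10 percent profit margin.
with_margin = ruin_probability(10_000, 0.01, 5_000, margin=0.10, capital=0)
# No margin, but 100,000 of capital.
with_capital = ruin_probability(10_000, 0.01, 5_000, margin=0.0, capital=100_000)
# The same 10 percent margin spread over a pool four times as large.
bigger_pool = ruin_probability(40_000, 0.01, 5_000, margin=0.10, capital=0)

for label, value in [("base", base), ("margin", with_margin),
                     ("capital", with_capital), ("bigger pool", bigger_pool)]:
    print(f"{label:12s} {value:.4f}")
```

In this simplified setting, adding capital, adding a margin, and enlarging the pool each lower the approximate ruin probability. Reinsurance and reduced correlation between risks are not modeled here; both would require a richer model than this sketch provides.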
"Claims are not settled as soon as they occur and may take many years to be finalized. The capital requirements of an insurer, particularly one writing long tail classes of business, are quite complex as care is required not just for the period of policy exposure but until the last claim has been settled" (Hart, Buchanan, & Howe, 1996).
Variables
"Insurance relies on pooling to reduce relative variability. If all of the risks in a pool are identical, then each should contribute an equal amount to the pool. In practice, however, the risks in the pool are often different and these differences need to be taken into account. The purpose of risk classification is to find variables, called rating variables, that can be used to distinguish between different levels of risk and to quantify the differences on the basis of these variables" (Hart, Buchanan, & Howe, 1996).
There are "usually a large number of potential rating variables, each of which could reasonably be expected to affect the risk. It is seldom practical to use all possible rating variables. Some, such as ethnic origin, are barred under anti-discrimination legislation or are socially unacceptable. Some, such as mileage, are difficult to collect, unreliable, or both. Some have only a small effect. Too many rating variables would create an unwieldy rating structure. The idea is to find a small number of variables that explain as much of the variation between risks as possible" (Hart, Buchanan, & Howe, 1996).
Probability & Statistics
The theory of probability has its foundations in the mathematics of the seventeenth and eighteenth centuries, which introduced the study of random variables. Early study of probability emphasized games of chance where the number of possible outcomes was finite. The physical characteristics of cards or dice give strong clues to the underlying probabilities; there are a finite number of faces on a die and of cards in a deck. Probability theory later introduced the continuous random variable. For continuous random variables, probabilities need to be obtained empirically (via experimentation or observation) (Trowbridge, 1989).
The use of random variables leads only to estimates of outcomes. For example, if the heights of 100 men are sampled and the average of the selected group is calculated, the average is considered an estimate only. If an average is calculated for a group of 100 different men, the number will likely be different. In this example, a sample of 100 subjects may not be large enough to estimate the true average precisely. It is also critical that the sample is selected truly at random and that its members do not share some inherent characteristic that would bias the result. The theory of probability is critical to actuaries in determining averages when using random variables; a good example is the building of actuarial mortality tables that estimate life expectancy (Trowbridge, 1989).
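The height example can be reproduced with a small simulation. This sketch assumes a hypothetical population of heights drawn from a normal distribution with a mean of 175 cm and a standard deviation of 7 cm; those figures are invented for illustration and are not taken from the source.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 100,000 heights in centimetres.
population = [rng.gauss(175.0, 7.0) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# Two independent random samples of 100 men each.
sample_a = rng.sample(population, 100)
sample_b = rng.sample(population, 100)
mean_a = statistics.fmean(sample_a)
mean_b = statistics.fmean(sample_b)

# Standard error of the mean: the typical gap between a sample average
# and the true average. It shrinks as the sample grows.
std_error_100 = statistics.stdev(sample_a) / 100 ** 0.5
large_sample = rng.sample(population, 10_000)
std_error_10000 = statistics.stdev(large_sample) / 10_000 ** 0.5

print(f"true mean       {true_mean:.2f}")
print(f"sample A mean   {mean_a:.2f}")
print(f"sample B mean   {mean_b:.2f}")
print(f"std error n=100     {std_error_100:.3f}")
print(f"std error n=10,000  {std_error_10000:.3f}")
```

The two sample averages differ from each other and from the population average, and the standard error falls roughly tenfold when the sample is a hundred times larger, which is why larger samples give more dependable estimates.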
Random variables are applied daily by actuaries to prepare actuarial tables that define distributions. There are a number of actuarial variables that are commonly used; one in particular that is relevant today due to its effect on pension and retirement plans is called time until termination. A random variable associated with length of time typically falls into one of the following categories:
- Length of remaining years of life.
- Length of period of disability or employment.
- Time between a claim event and the payment of the claim.
Determining and using appropriate variables such as time until termination is a complex task, and it matters greatly to companies designing employee pension plans. Length of time is often studied by transforming it into another variable that is relevant when used with the original random variable. For example, when determining how much money a company needs to put away to fund its employee pension program, age is not the only variable that needs to be considered. Nor is mortality the only variable that will affect pension benefits; employees who leave a company, retire or become disabled also affect the outcomes of actuarial models. Actuaries need to take a number of variables into consideration when advising clients about how those variables will affect their businesses.
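One way to see why mortality alone is not enough is to combine several causes of leaving service in a single calculation. The following sketch assumes hypothetical annual rates for death, withdrawal and disability and simply adds them within each year of age; the function prob_reach_retirement and every rate shown are invented for illustration.

```python
def prob_reach_retirement(current_age, retirement_age, rates_by_cause):
    """Probability that an employee is still in service at retirement_age.

    rates_by_cause maps each cause of leaving to a function that takes an
    age and returns the annual probability of leaving for that cause.
    """
    in_service = 1.0
    for age in range(current_age, retirement_age):
        total_decrement = sum(rate(age) for rate in rates_by_cause.values())
        in_service *= 1.0 - total_decrement
    return in_service

# Hypothetical annual rates, for illustration only.
rates = {
    "death": lambda age: 0.001 + 0.0002 * max(age - 40, 0),
    "withdrawal": lambda age: 0.08 if age < 35 else 0.03,
    "disability": lambda age: 0.002,
}

all_causes = prob_reach_retirement(30, 65, rates)
mortality_only = prob_reach_retirement(30, 65, {"death": rates["death"]})

print(f"in service at 65, all causes:     {all_causes:.3f}")
print(f"in service at 65, mortality only: {mortality_only:.3f}")
```

With these made-up rates, a model that considers mortality alone overstates the share of 30-year-old employees who remain in service until age 65, which would in turn distort the estimated cost of the plan.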
Statistical analysis forms the foundation of the actuary's analysis of data, which in turn becomes an integral part of the predictive modeling process. Two of the major factors influencing the impact of actuarial statistical analysis are access to large digital data repositories and sophisticated computer applications. Access to more data doesn't necessarily mean that the data used in the analysis is always accurate (as will be illustrated later in this article). By its nature, actuarial analysis derives estimated values that allow for errors in the raw data. Actuarial analysis is a cyclical process that involves a never-ending feedback loop. An actuary defines a set of assumptions or variables to be run through the predictive modeling process, and the outcomes of the process are then used to verify the reliability of the raw data. The feedback loop serves as a mechanism to constantly improve the data quality and therefore the predictive models. This essay describes some of the most common variables that actuaries use in their analysis. It also discusses how actuaries are using new variables in the data mining process, which adds to the reader's understanding of the value that actuaries provide as statisticians.
Applications
Variables & Assumptions
(Note: the term assumption will be used interchangeably with variable in the remainder of this article.)
The study of random variables, also known as statistics and probability, helps humans make sense of uncertainty. Actuaries deal with assessing uncertainty and risk; actuarial science projects the possible outcomes of situations where uncertainty is inherent (Trowbridge, 1989).
Risk is a consequence of uncertainty which can be analyzed in terms of statistical distributions. Tools for actuarial analysis include (Committee of Professional Responsibility, 2006):
- Techniques for analyzing statistical distributions.
- Forecasting techniques that determine required premium rates or estimate future cash flow for financial purposes or planning.
“The choice of assumption or the availability of valid data upon which to base model assumptions is a critical element in the modeling process. It is possible that comprehensive and accurate data is not available to the actuary, or that the actuary has been requested to use a specific pre-defined set of assumptions in the modeling process” (Committee of Professional Responsibility, 2006, p.3).
Actuarial Assumptions
Actuarial assumptions (also known as variables) are used by actuaries in evaluating many kinds of data. There are a number of common variables that actuaries use when building data tables or actuarial models. Assumptions help the actuary project future needs and requirements for clients by inputting the variables into actuarial models, running scenarios and evaluating the outcomes. Actuaries apply assumptions in many settings; a well-known one is the company-sponsored retirement plan. Actuaries are instrumental in applying assumptions when developing models for pre-funded retirement plans. Contributions made to pre-funded retirement plans are invested before they are needed to pay benefits and expenses. A plan contributor or future retiree wants to know how much money he or she will need to put away to deliver a given level of promised benefits at retirement. An actuary conducts statistical analysis of a number of variables and advises fund managers and participants regarding how much money will be needed. The actuary will use different variables in the analysis, including those that calculate fund growth by number of years and projected changes in interest rates.
The types of assumptions used to determine pre-funded contributions fall into the following two categories (Franklin, 2006):
- Demographic assumptions, such as the probability and time frame to receive benefits.
- Economic assumptions, such as calculating the value of future benefits in terms of today's dollars.
There are many assumptions that are critical to building reliable actuarial models; one of the most critical is the interest rate assumption. Interest rate assumptions reflect the time value of money. To return to the example of the pre-funded retirement plan, contributions will continue to grow until the fund holder begins to draw benefits. Because invested money earns a return, the time value of money means that less money is needed today to provide a given benefit in the future. This reinforces the advice that many financial planners give to young workers: investments made earlier in a career benefit greatly from the additional years they spend in the fund. Actuarial assumptions involve complex analyses of asset allocations for such funds. A given fund is likely to have a very diversified portfolio, including stocks, bonds and real estate. Each of these asset classes will have a different rate of growth, risk and overall investment outlook.
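The time value of money can be made concrete with a present value calculation. The sketch below assumes a single fixed annual interest rate compounded once a year, which is a simplification of how a real fund with mixed assets would grow; the function name and the figures are hypothetical.

```python
def present_value(future_amount, annual_rate, years):
    """Amount needed today to grow to future_amount after the given years."""
    return future_amount / (1.0 + annual_rate) ** years

benefit = 100_000.0  # hypothetical benefit payable at retirement

pv_40_years = present_value(benefit, 0.05, 40)      # invested early in a career
pv_10_years = present_value(benefit, 0.05, 10)      # invested late in a career
pv_40_low_rate = present_value(benefit, 0.03, 40)   # lower interest assumption

print(f"needed today, 40 years at 5%: {pv_40_years:,.0f}")
print(f"needed today, 10 years at 5%: {pv_10_years:,.0f}")
print(f"needed today, 40 years at 3%: {pv_40_low_rate:,.0f}")
```

Under these assumptions, funding the same benefit 40 years ahead takes roughly 14,200 today, far less than funding it 10 years ahead, and a lower assumed interest rate raises the amount needed. That sensitivity is why the interest rate assumption carries so much weight in the model.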
Assumptions are not static variables, and actuaries typically re-evaluate them every three years. Assumptions are compared to fund experience, that is, to the outcomes actually observed for the fund. Over a three-year period, some variables are likely to change and assumptions will need to be re-verified. In the case of retirement plans, assumptions regarding employee turnover, retirement, and disability are likely to have an impact on the outcome of actuarial models and therefore need to be revisited.
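A common way to compare an assumption with experience is an actual-to-expected ratio. The sketch below is one possible form of that comparison, not a procedure described in the source; the head counts, the assumed turnover rate and the 10 percent tolerance are all hypothetical.

```python
def actual_to_expected(actual_counts, exposures, assumed_rates):
    """Ratio of observed events to the number the assumption predicted."""
    expected = sum(n * q for n, q in zip(exposures, assumed_rates))
    return sum(actual_counts) / expected

# Hypothetical three-year experience for an employee turnover assumption.
exposures = [1_200, 1_150, 1_100]       # employees at the start of each year
assumed_turnover = [0.05, 0.05, 0.05]   # assumed annual probability of leaving
actual_leavers = [78, 81, 74]           # employees who actually left

ratio = actual_to_expected(actual_leavers, exposures, assumed_turnover)
needs_review = abs(ratio - 1.0) > 0.10  # hypothetical tolerance

print(f"actual-to-expected ratio: {ratio:.2f}")
print(f"assumption needs review:  {needs_review}")
```

A ratio well above 1 suggests that more employees left than the assumption predicted, so the turnover assumption would be a candidate for revision at the next review.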
Issues
Predictive Analysis as a Competitive Tool
Predictive analytics is a commonly used term that refers to the practice of data mining. Data mining is really just another term for using actuarial statistics to find averages, data spreads and, most importantly, correlations between data fields. Increasing competition for business is one of the key drivers for insurance companies to enhance their data mining capabilities. Insurance companies have access to unprecedented amounts of data on potential policy holders, and predictive analytics and the power of computer applications are changing the way insurers screen potential policy holders and keep rates competitive.
Credit-Based Analysis
Credit-based insurance allows actuaries to make a correlation between an individual's credit score and the likelihood of that individual filing a claim. Insurers are leveraging credit ratings as a variable that not only determines ratings but also serves as a means to reject customers as high risk. The practice of using credit rating as a variable is controversial and is under scrutiny from many consumer watchdog groups. One challenge for all data mining activities is access to current and accurate data. Critics of credit-based insurance warn that the credit data from which credit scores are derived is suspected of being inaccurate and out of date. Actuarial outcomes from data mining are only as accurate as the underlying data that is being mined ("Caution!," 2006). Insurers are hesitant to reveal just how the credit rating variable is being used to determine policy rates. Insurers are quick to point out that the application of the credit rating variable, like that of many other actuarial assumptions, is proprietary and is considered a highly valued strategic tool for insurance companies.
Actuaries report that in the case of auto insurance, there's a clear "statistical correlation between scores and claims, and scoring shifts costs from drivers who file fewer claims to those who file more" ("Caution!," 2006). Credit scoring allows insurance companies to sort customers into hundreds of tiers by analysis of credit scores, whereas more traditional risk variables allowed for sorting customers into relatively few tiers. Statistics have long been the key to determining rates, and most consumers are familiar with the variables that serve as ratings criteria. Typical criteria (variables) in determining auto insurance rates, for example, are age, gender, marital status, zip code, and driving history.
In its infancy, credit-based statistical analysis was conducted on nearly 1 million records of archived data — each record contained approximately 100 fields of information. Of the 100 or so fields that were stored in the credit rating database, approximately 30 fields were found to have a correlation with claims payouts. These findings led to the creation of credit-based auto and homeowners insurance scores ("Caution!," 2006).
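The screening of fields for correlation with claims can be sketched in a few lines. The example below uses the Pearson correlation coefficient and a tiny set of made-up records; the field names, values and the 0.5 cut-off are hypothetical and are not drawn from the study described above, whose actual method is not given in the source.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up credit fields for eight policy holders, for illustration only.
records = {
    "late_payments": [0, 1, 0, 3, 5, 2, 0, 4],
    "open_accounts": [4, 6, 5, 5, 4, 6, 5, 5],
}
claims_paid = [0, 1, 0, 2, 3, 1, 0, 2]  # number of claims per policy holder

threshold = 0.5  # hypothetical cut-off for keeping a field
correlations = {
    field: pearson(values, claims_paid) for field, values in records.items()
}
selected = [field for field, r in correlations.items() if abs(r) >= threshold]

for field, r in correlations.items():
    print(f"{field:15s} r = {r:+.3f}")
print("fields kept:", selected)
```

In this toy data set only one of the two fields clears the cut-off. A strong correlation shows that a field moves together with claims; it does not by itself explain why, and it is only as reliable as the underlying records.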
Predictive analytics is a powerful tool, and actuaries and insurers have welcomed the introduction of new variables that they see as offering a real competitive advantage. Not everyone is so sure that insurers have the best interest of policy holders at heart when using credit scores to set premiums. One insurance regulator sought to clarify just what data is being correlated when using credit-based variables. "What they are really looking to see with insurance scores is who is most likely to file a claim, not who's most likely to have an accident" ("Caution!," 2006).
Demographic Data Mining
Predictive analytics and data mining are revealing an ever-increasing array of variables that actuaries can use to determine risk and ratings. One area of rapid change will involve data mining of demographic data, which in particular will allow age to be applied as a variable in non-traditional ways. Insurance statistics project that by 2050, the population of the United States will include 87 million persons (21% of the entire population) who are age 65 or older (Cavanaugh, 2007). Actuaries are already beginning research that evaluates the correlation between age and liability and property claims. As people live longer and healthier lives, they will live independently for more years, owning homes and driving cars. Actuaries are now looking at the effects of age as a variable in ways that were not typically analyzed previously. "Predictive analysis is a huge hot topic in insurance that assumes the age of the policyholder and all sorts of factors. Actuaries are working with data mining companies on predictive analysis on what characteristics or behaviors correlate with losses and in what combination" (Cavanaugh, 2007).
Terms & Concepts
Actuarial Assumption: Any of the variables that actuaries use to conduct statistical analysis; variables are input into actuarial models to run a scenario.
Assumption Variables: See actuarial assumption.
Correlation: A measure of the degree to which variables are related; the amount of change in one random variable occurring simultaneously with the amount of change in another.
Data Mining: See Predictive analytics.
Degree of Uncertainty: A measure of the possible variation of values acquired by the random variable separate from its expected value.
Experiment: An observation of a given occurrence under specifically controlled conditions.
Expected Value: The probability-weighted average of the numerical values belonging to a random variable. If the average exists, it is referred to as the expected value of the random variable.
Exposure Measure: The scaling factor that correlates the predicted value of one or more random variables over a collected set of phenomena.
Event: A set of one or more possible outcomes.
Law of Large Numbers: In statistics, the rule stating that as the number of identically distributed, random variables increases, their sample average will draw closer to their theoretical average.
Outcome: The result of an experiment or the running of an actuarial model.
Probability: A measure valued between zero and one that provides insight into the likelihood of a certain event occurring.
Phenomena: Occurrences that can be observed.
Predictive Analysis: Database applications that search for subtle patterns in sets of data that can be used to predict future behavior. True data mining software is not used to just change the layout of the data; rather it is designed to actually discover previously hidden relationships between the data.
Random Variables: Also referred to in this article as actuarial assumptions; rules that assign a numerical value to every possible outcome.
Bibliography
Caution! The secret score behind your auto insurance. (2006). Consumer Reports, 71(8), 43-48. Retrieved January 10, 2008, from EBSCO Online Database Academic Search Premier. http://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=21876419&site=ehost-live
Cavanaugh, B. (2007). Living longer, driving longer. Best's Review, 107(10), 34-38. Retrieved January 10, 2008, from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=23942456&site=ehost-live
Committee of Professional Responsibility. (2006). The roles of the actuary in the selection and application of actuarial models, no. 7. Retrieved January 11, 2008, from http://www.actuary.org/pdf/prof/models%5fjune06.pdf
Franklin, C. (2006). Multiemployer defined benefit plan funding: Actuarial assumptions. Benefits & Compensation Digest, 43(6), 36-42. Retrieved January 9, 2008, from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=21148972&site=ehost-live
Gomez-Deniz, E., & Calderin-Ojeda, E. (2010). A study of Bayesian local robustness with applications in actuarial statistics. Journal of Applied Statistics, 37(9), 1537-1546. Retrieved November 15, 2013, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=53539533&site=ehost-live
Hart, D., Buchanan, R. & Howe, B. (1996). Nature and operation. In Actuarial practice of general insurance (7th ed.). Retrieved January 6, 2007, from http://www.actuaries.asn.au/NR/rdonlyres/8D07821C-0ED1-4578-9521-1B81E1E7253D/3178/1NatureandOperation16doc.pdf
Lin, X., Rongming, W., & Dingjun, Y. (2012). Joint distributions of some actuarial random vectors for the Cox risk model. Applied Stochastic Models in Business & Industry, 28(5), 420-429. Retrieved November 15, 2013, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=82370441&site=ehost-live
Neuwirth, P. (2003). Changes in actuarial assumptions/plan designs after FASB 106. CPA online. Retrieved January 8, 2008, from http://www.nysscpa.org/cpajournal/old/14465869.htm
“Principles underlying actuarial science.” (1999). Casualty Actuarial Society. Retrieved August 10, 2010 from http://casact.org/research/
Trowbridge, C. (1989). Fundamental concepts of actuarial science. Actuarial Education and Research Fund. Retrieved January 11, 2008, from http://www.actuarialfoundation.org/research%5fedu/fundamental.pdf
Tsai, C., & Chung, S. (2013). Actuarial applications of the linear hazard transform in mortality immunization. Insurance: Mathematics & Economics, 53(1), 48-63. Retrieved November 15, 2013, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=bth&AN=89139271&site=ehost-live
Walsh, M. (2004, June 2). Actuaries under scrutiny on pension fund pacts. New York Times. Retrieved January 9, 2008, from http://query.nytimes.com/gst/fullpage.html?res=9902E7DF1431F931A35755C0A9629C8B63
Zeckhauser, R. (2003). Insurance. Concise encyclopedia of economics. Retrieved January 10, 2008, from: http://www.econlib.org/library/Enc/Insurance.html
Suggested Reading
Auerbach, M. (2007). Tea leaves difficult to read for emerging professional liability risks. National Underwriter / Property & Casualty Risk & Benefits Management, 111(41), 40-46. Retrieved January 7, 2008, from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=27512893&site=ehost-live
Connolly, J. (2007). NAIC ponders data agent for preferred mortality tables. National Underwriter / Life & Health Financial Services, 111(15), 65-65. Retrieved January 7, 2008, from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=24819329&site=ehost-live
Crawford, G. (2003). Putting a new spin on actuaries. Pensions & Investments, 31(24), 8-8. Retrieved January 3, 2008, from EBSCO Online Database Business Source Premier. http://search.ebscohost.com/login.aspx?direct=true&db=buh&AN=11580700&site=ehost-live