RESEARCH STARTER

Engineering Statistics

Engineering Statistics is a vital field that applies statistical methods to solve practical engineering problems by analyzing data and understanding variability. It encompasses two main classes of statistics: descriptive and inferential. Descriptive statistics help summarize and present large datasets in a comprehensible manner, utilizing techniques such as measures of central tendency and variability. On the other hand, inferential statistics allow engineers to test hypotheses and draw conclusions about populations based on sample data, determining the likelihood that observed results are due to chance.

Engineers often follow a systematic problem-solving approach that begins with defining a problem clearly, identifying relevant factors, and developing models to represent the situation. These models are then tested through experimental research to refine solutions and validate their effectiveness. Statistical techniques are widely applied in various areas, including product design, process development, and quality control, facilitating better decision-making in engineering projects.

To effectively address complex engineering challenges, researchers must design experiments that control extraneous variables and accurately emulate real-world conditions. This ensures that results are meaningful and applicable. Overall, Engineering Statistics plays a crucial role in enhancing the quality, efficiency, and innovation within engineering disciplines.

Authored By: Wienclaw, Ruth A.
Published In: 2019
Related Articles:
Correlation adjusted debiased Lasso: debiasing the Lasso with inaccurate covariate model.; Discussion of 'Statistical inference for streamed longitudinal data'.; Hypothesis tests in ordinal predictive models with optimal accuracy.; Statistical inference and machine learning for big data.; Statistical Inference Using Partially Accelerated Life Test Model for the Weibull Population Mean Unified Hybrid Censored Data.

Full Article

Engineers are vital to the success and effectiveness of many businesses in the 21st century. The successful solution of engineering problems must be based on an understanding of variability and how to apply the principles of mathematical statistics to real-world problems. Descriptive statistics are used to reduce large amounts of data and describe them in ways that are easily comprehensible. Inferential statistics are used to test hypotheses to determine if the results of a study occur at a rate that is unlikely to be due to chance. Statistics offers a wide range of methods to test hypotheses, each of which is appropriate to a different type of experimental design. A good research design depends in part on two factors: controlling the situation so that the research is only measuring what it is supposed to measure and including as many of the relevant factors as possible so that the research fairly emulates the real-world experience. Statistical techniques can be applied to the gamut of engineering problems including new product design, process development, and quality control.

Engineers are vital to the success and effectiveness of many businesses in the 21st century. Most of the products used by society and the processes used to produce them are designed by engineers. Engineers are also involved in developing solutions to many problems faced by modern society, such as climate change, and are widely valued for their skill at problem solving.

The Steps of Problem Solving

As shown in Figure 1, the engineering approach to problem solving comprises multiple steps.

First, the engineer must develop a clear and concise description of the problem. Engineering is an applied scientific discipline, and engineering problems are typically quantified so that they can be better analyzed. A clear description of the problem helps in this endeavor.
After the problem has been clearly defined, the engineer next identifies the important factors that bind the problem or play a role in the solution. This is often a tentative list and is revisited and revised as new data are compiled.
After the important factors have been tentatively identified, the engineer next proposes a model based on scientific or engineering knowledge of the problem. This model is a representation of a situation, system, or subsystem. At this point in the process, a conceptual model that describes the situation or system under investigation is articulated. The conceptual model may also be used in the development of a later mathematical or computer model that mathematically represents the system or situation being studied. As part of the model-building process, the engineer articulates the assumptions used in building the model and any limits within which it applies.
After the model is developed, experimental research is designed and data is collected to test how well the model reflects the real-world situation. The model is then refined on the basis of this data and manipulated to better assist in developing a solution to the problem. Further empirical research is then conducted to confirm that the proposed solution to the problem is both effective and efficient.
Based on the results of this study, the engineer draws conclusions and makes recommendations on the best way to proceed in order to solve the problem.

The successful solution of engineering problems must be based on an understanding of variability and how to apply the principles of mathematical statistics to real-world problems. Mathematical statistics is a branch of mathematics that deals with the analysis and interpretation of data. Mathematical statistics provides the theoretical underpinnings for various applied statistical disciplines, including engineering statistics, in which data is analyzed to find answers to quantifiable questions. Engineering statistics is the application of these tools and techniques to the analysis of real-world problems. The discipline of engineering statistics is concerned with the collection, presentation, analysis, and use of data in order to make practical decisions. Statistical methods are useful in helping the engineer understand the underlying variability that can be observed in systems and phenomena. For example, in manufacturing, some percentage of products always has defects, no matter how standardized or efficient the process. Statistics can help quality-control engineers better understand why this occurs and design processes or equipment that will help reduce the number of defective products produced.

Classes of Statistics

There are two general classes of statistics: descriptive and inferential statistics.

Descriptive Statistics

Descriptive statistics are used to describe and summarize large amounts of data in ways that are easily comprehensible. Descriptive statistics include various graphing techniques, measures of central tendency, and measures of variability.

Graphing techniques are used to help the engineer aggregate and visually portray data so that it can be better understood. Some of the graphing techniques used by engineers include histograms, frequency distributions, stem-and-leaf plots, and time-series plots.
Measures of central tendency, sometimes referred to less accurately as "averages," are used to estimate the midpoint of a distribution. The three types of measures of central tendency are the median (the number in the middle of the distribution), the mode (the number occurring most often in the distribution), and the mean (a mathematically derived measure in which the sum of all data in the distribution is divided by the number of data points in the distribution).
Measures of variability show how widely dispersed the values are over the distribution. The standard deviation is a derived index of the degree to which scores differ from the mean of the distribution.
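
The measures described above can be computed with a few lines of code. The following is a minimal sketch in Python using only the standard library; the data set (failure loads for ten test trusses) and the variable names are hypothetical and serve only as an illustration.

```python
import statistics
from collections import Counter

# Hypothetical sample: loads (in kN) at which ten test trusses failed.
loads_kn = [48, 52, 50, 47, 52, 55, 49, 52, 51, 54]

mean_load = statistics.mean(loads_kn)      # sum of the values / number of values
median_load = statistics.median(loads_kn)  # middle of the sorted values
mode_load = statistics.mode(loads_kn)      # value occurring most often
stdev_load = statistics.stdev(loads_kn)    # sample standard deviation (n - 1 denominator)

# A frequency distribution: how many times each value occurs.
frequency = Counter(loads_kn)

print(f"mean={mean_load}, median={median_load}, mode={mode_load}")
print(f"standard deviation={stdev_load:.2f}")

# A crude text histogram, one row per observed value.
for value in sorted(frequency):
    print(f"{value:>3} | {'*' * frequency[value]}")
```

Note that `statistics.stdev` treats the data as a sample; `statistics.pstdev` would be the choice if the ten values were the entire population.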

Inferential Statistics

Although descriptive statistics can be useful in describing data, they do not allow engineers to draw conclusions or inferences from the data. Inferential statistics are used to test hypotheses to determine if the results of a study have statistical significance, meaning that they occur at a rate that is unlikely to be due to chance. A hypothesis is an empirically testable declarative statement about the relationship between the independent and dependent variables and their corresponding measures. The independent variable is the variable that is being manipulated by the researcher. For example, an engineer might want to know if the design for the new bridge truss is stronger than the old design. The independent variable is the type of truss. The dependent variable, so called because its value depends on the value of the independent variable, is the amount of weight the bridge can bear. In this example, the value of the dependent variable depends on which truss design is used.

Testing Hypotheses

In order to test a hypothesis, it must be stated in two ways. The null hypothesis (H0) is the statement that there is no statistical difference between the status quo and the experimental condition -- in other words, that the independent variable being studied makes no difference to the end result. For example, a null hypothesis about the strength of the two truss designs might be that there is no difference in the amount of weight that can be carried by the new design as opposed to the old design. The alternative hypothesis (H1) states that there is a relationship between the two variables -- for example, that the new design can support more weight.

After the engineer formulates the null hypothesis, an experimental design is developed that allows it to be empirically tested. The engineer then collects data to determine whether or not the experimental condition had any effect on the outcome. After the data has been collected, it is statistically analyzed to determine whether the null hypothesis should be accepted (i.e., there is no evidence of a difference between the strength of the two designs) or rejected (i.e., there is a difference between the two designs). As shown in Figure 2, accepting the null hypothesis means that if the data in the population is normally distributed, the results probably fall in the unshaded portion of the distribution and thus are probably due to chance. However, if the results lie in the shaded portion of the graph, the null hypothesis needs to be rejected and the alternative hypothesis accepted. In that case the result is statistically significant: any difference observed between the strength of the two designs is probably due not to chance but to a real underlying difference in how much weight they can hold.
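
The logic of rejecting or accepting the null hypothesis can be illustrated with a small computation. The sketch below uses a permutation test, a technique not named in this article, chosen here only because it needs no distribution tables: if the truss design truly makes no difference, every way of splitting the pooled measurements into two groups is equally likely, so the share of splits that look at least as favorable to the new design as the observed one estimates the probability that the result is due to chance. The data, the variable names, and the 0.05 significance level are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical failure loads (kN) for five trusses of each design.
old_design = [48, 50, 47, 51, 49]
new_design = [53, 55, 52, 50, 54]

observed_diff = mean(new_design) - mean(old_design)

# Under the null hypothesis the design label is irrelevant, so any split of
# the ten pooled measurements into two groups of five is equally likely.
pooled = old_design + new_design
n_new = len(new_design)
n_old = len(pooled) - n_new
total = sum(pooled)

count_extreme = 0  # splits at least as favorable to "new" as what was observed
count_all = 0      # all possible splits
for indices in combinations(range(len(pooled)), n_new):
    group_sum = sum(pooled[i] for i in indices)
    diff = group_sum / n_new - (total - group_sum) / n_old
    count_all += 1
    if diff >= observed_diff - 1e-12:  # small tolerance for float rounding
        count_extreme += 1

# One-sided p-value for the alternative "the new design supports more weight".
p_value = count_extreme / count_all
alpha = 0.05  # assumed significance level
reject_null = p_value < alpha

print(f"observed difference = {observed_diff:.1f} kN, p = {p_value:.4f}")
print("reject H0" if reject_null else "do not reject H0")
```

With samples this small the enumeration is exact (252 splits); for larger samples a random subset of splits is normally used instead.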

Analysis of Hypothesis Data

Developing a model and collecting data alone are insufficient to determine whether to accept or reject the null hypothesis. The engineer needs to determine how the data will be analyzed. Statistics offers a wide range of methods to test hypotheses, each of which is appropriate to a different type of experimental design. T-tests are used to analyze the mean of a population or compare the means of two different populations. If the samples are large or the population standard deviations are known, however, a z statistic may be used instead. Another statistical technique that is frequently used to analyze data in applied settings is analysis of variance (ANOVA). This is a family of techniques that enables the engineer to analyze the joint and separate effects of multiple independent variables on a single dependent variable and to determine the statistical significance of the effect. For example, ANOVA might be used if one wishes to determine the relative strength of three different truss designs. Multivariate analysis of variance (MANOVA) is an extension of this set of techniques that allows the engineers to test hypotheses on more complex problems involving the simultaneous effects of multiple independent variables on multiple dependent variables.
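
As an illustration of the three-truss example, the following sketch computes the F statistic for a one-way ANOVA (a single independent variable with three levels), which is the simplest member of the ANOVA family. The measurements and names are hypothetical. The code stops at the F statistic; judging significance would still require comparing it with the critical value of the F distribution for (k - 1, n - k) degrees of freedom, taken from a table or a statistics package.

```python
from statistics import mean

# Hypothetical failure loads (kN) for five trusses of each of three designs.
groups = {
    "design_a": [48, 50, 47, 51, 49],
    "design_b": [53, 55, 52, 50, 54],
    "design_c": [49, 52, 50, 51, 48],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)
k = len(groups)      # number of groups (truss designs)
n = len(all_values)  # total number of observations

# Variability of the group means around the grand mean.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Variability of individual observations around their own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

ms_between = ss_between / (k - 1)  # mean square between groups
ms_within = ss_within / (n - k)    # mean square within groups
f_statistic = ms_between / ms_within

print(f"F({k - 1}, {n - k}) = {f_statistic:.2f}")
```

A large F means the designs differ from one another by more than the scatter within each design would suggest.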

Correlation Coefficients

Other types of applied statistics allow engineers to predict one variable from the knowledge of another variable. For example, if one were designing a new user interface, it might be helpful to know the demographics of the people who will use it. Younger people might prefer a user interface that is primarily graphical in nature, similar to the interfaces that they are used to using with other application software. Older people or those who are not used to the graphical interface, on the other hand, might want more text support, pull-down menus, pop-up explanations, or even a hard-copy user's manual. One way to answer such questions is by determining the relationship between the two variables, in this case the age of the user and the attitude toward the user interface. Correlation coefficients allow engineers to determine whether the two variables are positively correlated (i.e., the older people become, the more they like the graphical user interface), negatively correlated (i.e., the older people become, the less they like the graphical user interface), or not correlated at all.
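
A correlation coefficient of this kind can be sketched as follows. The example computes the Pearson correlation coefficient, the most common such measure, which ranges from -1 (perfect negative correlation) through 0 (no linear correlation) to +1 (perfect positive correlation). The survey data and the function name `pearson_r` are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical survey: user age and rating of a graphical interface (1 to 10).
ages = [22, 28, 35, 41, 50, 58, 63]
ratings = [9, 8, 8, 6, 5, 4, 3]

r = pearson_r(ages, ratings)
print(f"r = {r:.3f}")
```

In this made-up sample the coefficient is strongly negative, which corresponds to the case in which older users like the graphical interface less. A correlation alone does not show that age causes the difference in attitude.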

However, the real world is complex, and the problems facing engineers do not always have easy answers involving only two variables. For example, consumers' attitudes toward the user interface may depend not only on their age but on other factors as well, including how familiar they are with this type of interface, how often they use computers, what other application software they use, and how frequently they use it. One way to analyze this type of complex situation is through multiple regression analysis. This is a family of statistical techniques that allow one to predict the value of the dependent variable when given the values of one or more independent variables. Multiple regression analyzes the effects of multiple predictors on outcomes so that the engineer has a better understanding of their relative contributions as well as the factors that make up a user's preference or other question of interest.
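
A minimal sketch of multiple regression is shown below. It fits a linear model, rating = b0 + b1 * age + b2 * hours, by ordinary least squares, solving the normal equations with a small hand-written linear solver so that the example needs no external libraries. The data, the choice of predictors, and the function names `solve_linear_system` and `fit_multiple_regression` are all hypothetical; numerical libraries generally use more stable methods than the normal equations.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so that row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(predictors, outcomes):
    """Least-squares coefficients [b0, b1, ..., bp] for y = b0 + b1*x1 + ... + bp*xp."""
    rows = [[1.0] + list(obs) for obs in predictors]  # leading 1.0 is the intercept term
    p = len(rows[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcomes)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: (age in years, hours of computer use per week) and rating.
predictors = [(22, 30), (28, 12), (35, 25), (41, 8), (50, 20), (58, 5), (63, 15)]
ratings = [9.0, 7.0, 8.0, 5.5, 6.0, 3.5, 4.0]

intercept, b_age, b_hours = fit_multiple_regression(predictors, ratings)

# Predicted rating for a hypothetical 45-year-old who uses a computer 10 hours a week.
predicted = intercept + b_age * 45 + b_hours * 10
print(f"rating = {intercept:.2f} + {b_age:.3f}*age + {b_hours:.3f}*hours")
print(f"predicted rating: {predicted:.2f}")
```

Each fitted coefficient estimates how much the predicted rating changes when that predictor increases by one unit while the other is held fixed, which is what allows the relative contributions of the predictors to be compared.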

These are only a few of the statistical techniques that are used every day by engineers. Statistical techniques can be applied to the gamut of engineering problems, including new product design, process development, and quality control.

Applications

Research Design

Engineering statistics is an applied field used to help make practical decisions about real-world problems. To assist the engineer in this task, a variety of research studies can be conducted. In general, the goal of research is to describe, explain, and predict behavior. Therefore, a good research design depends in part on two factors: controlling the situation so that the research is only measuring what it is supposed to measure and including as many of the relevant factors as possible so that the research fairly emulates the real-world experience.

Variables Important to Research

In the simplest research design, a stimulus (e.g., a text-based user interface vs. a graphical user interface) is presented to the research subjects (e.g., potential users), and a response is observed and recorded (e.g., which user interface allowed the subjects to complete the task more quickly). Three types of variables are important in research: independent variables, dependent variables, and extraneous variables.

The independent variable is the stimulus or experimental condition that is hypothesized to affect behavior.
The dependent variable is the observed effect on behavior caused by the independent variable.
Extraneous variables are variables that affect the outcome of the experiment (e.g., the ease with which the subjects navigate the user interface) but have nothing to do with the independent variable (i.e., the experimental condition being tested).

Extraneous variables, though irrelevant to the research question being asked, can affect the outcome of the experiment in various ways. For example, if the subject is tired after a long day at work and looking forward to going home and relaxing, any user interface will seem too much trouble to learn if it keeps the subject from the goal of going home on time. Such variables need to be controlled as much as possible. For example, the engineer could test all subjects during the middle of the day, when they are refreshed and can concentrate on the task of learning the new interface. It is, of course, impossible to control literally every possible extraneous variable. However, the more extraneous variables that are accounted for and controlled in the experimental design, the more meaningful the results will be.

Research Techniques

Engineering tasks tend to be real-world oriented and demand practical applications. Therefore, engineering research studies often need to be made under real-world conditions. There are a number of common research techniques that can be used to investigate engineering problems. Laboratory experiments allow engineers the most control over extraneous variables. For example, the test of the user interface could always be held at the same time of day in a room with no distractions. However, this situation is far removed from the reality of how most people learn to use a new piece of software. Another approach to engineering research is to use a simulation. This can allow the engineer to bring in more real-world conditions but still control many of the extraneous variables. For example, the test of a new engineering design could be done in conditions that approximate the real world or even using a computer model that mathematically attempts to predict the outcome given the change in various variables. Engineering research can also be conducted as a field experiment in which people are asked to learn a new software package at work or at home under the actual conditions in which they normally would do this task. Although this approach has the advantage of being more realistic, it also has the disadvantage of giving the engineer less control over extraneous variables.

Conclusion

Another approach to engineering research is field study, which is an examination of how people behave in the real world. For example, to develop a more efficient process for producing widgets, the engineer could observe how production workers currently approach the task and what the results of each approach are. This data could be statistically analyzed to determine which approach or steps within the approach are most efficient.

Terms & Concepts

Analysis of Variance (ANOVA): A family of statistical techniques that analyze the joint and separate effects of multiple independent variables on a single dependent variable and determine the statistical significance of the effect.

Dependent Variable: The outcome variable or resulting behavior that changes depending on whether the subject receives the control or experimental condition.

Descriptive Statistics: A subset of mathematical statistics that describes and summarizes data.

Hypothesis: An empirically testable declaration that certain variables and their corresponding measures are related in a specific way proposed by a theory.

Independent Variable: The variable in an experiment or research study that is intentionally manipulated in order to determine its effect on the dependent variable.

Inferential Statistics: A subset of mathematical statistics used to test hypotheses and draw conclusions about populations based on sample data.

Mathematical Statistics: A branch of mathematics that deals with the analysis and interpretation of data. Mathematical statistics provides the theoretical underpinnings for various applied statistical disciplines, including engineering statistics, in which data is analyzed to find answers to quantifiable questions.

Null Hypothesis (H0): The statement that the findings of the experiment will show no statistical difference between the control condition and the experimental condition.

Population: The entire group of subjects belonging to a certain category, such as all women between the ages of 18 and 27, all O-rings, or all engineering students.

Standard Deviation: A measure of variability that describes how far the typical score in a distribution is from the mean of the distribution. The larger the standard deviation, the farther away the typical score is from the midpoint of the distribution.

Statistical Significance: The degree to which an observed outcome is unlikely to have occurred due to chance.

Variable: An object in a research study that can have more than one value.
