Hypothesis Construction

Abstract

Based on their observations of real-world phenomena, sociologists develop theories to explain and predict the behavior of humans within society. One of the first steps in testing the validity of a theory is to develop a hypothesis. A hypothesis is an empirically verifiable declaration that certain variables and their corresponding measures are related in a specific way proposed by a theory. To be of use in testing the validity of a theory, hypotheses must be stated so that they can be tested with the tools of inferential statistics. To this end, hypotheses express the relationship between the independent and dependent variables proposed by a theory in a way that permits researchers to determine the statistical likelihood that the observed results are due to chance rather than to an underlying factor.

Overview

No matter where we are or what we are doing, we are constantly bombarded with all sorts of information. Sometimes this information is relevant to our current or future activities, and sometimes it is not. For example, as I sit in my office dictating this article to my computer, I am primarily aware of the process of trying to transform my thoughts into words. However, if I pay attention, I am aware of other experiences as well. Certainly I hear my voice as I dictate, but I can also hear my headset amplifying my voice, telling me that my voice recognition software is receiving the data it needs to transcribe my words. In addition, I receive other sense experiences that I am choosing to ignore at this time: the heat of the halogen lamp sitting on my desk, the sunlight streaming through my office windows, the noise of the printer as it spits out a draft copy of the article, and the warm air softly blowing from a heating vent. If I am quiet and listen carefully, I can also hear my heart beating as well as noises coming from outside my office.

Obviously, I do not care about all this information, nor can I process it all at the same time. Unless I am in danger of touching my lamp, the heat it puts off is irrelevant. My heartbeat is not important either, unless it develops an arrhythmia or other aberration.

Even the sound of my voice is irrelevant as long as I hear the words in my head and they get correctly transcribed onto the computer screen. I simply cannot maintain a high degree of attention to all these sense experiences at the same time, so I ignore most of them and focus merely on the ones that are important to the task at hand.

Just as I need to pay attention to or ignore the various inputs I receive as I sit in my office, so, too, must we pay attention to or ignore the various inputs we receive as we interact with others. For example, if I am having difficulty downloading an article from a database, there are many potential reasons for the problem: I may have entered an incorrect access code; I may no longer have access authorization; my computer hardware or software may be malfunctioning; the database may be experiencing a technical problem; or the host server or my internet service provider may be experiencing a technical problem. If I am unable to troubleshoot the problem on my own, I may contact technical support to gather additional data so that I can narrow down the source of the problem. Technical support may be able to give me additional data, or point out data that I am ignoring so that, between us, we can solve the problem.

As we work together, we develop and test hypotheses. For example, our initial hypothesis might be that I have entered an incorrect access code. The technical support person could then look up my account, confirm my access code, and ask me to enter it again. If this does not work, we might formulate a new hypothesis: that I no longer have access authorization. The technical support person could then contact the department that sets up authorizations and see if I have lost mine.

As complex as this process may be, however, troubleshooting a computer problem is a relatively simple task compared to interpreting human behavior. Sociologists task themselves with this work as they constantly formulate and reformulate hypotheses based on their observations in order to describe and predict the behavior of people within society.

Applications

What is a Hypothesis? Hypotheses are developed from the observations of a researcher or research team. For example, based on my observation that I am much more likely to receive prompt and courteous service when I go to a department store while I am wearing my business clothes than when I am wearing my old gardening clothes, I may develop the hypothesis that clerks in retail stores give differential service depending on the perceived socioeconomic status and social capital of the person they are serving.

In scientific terms, a hypothesis is more than a question. For example, I may wonder aloud whether there is any relationship between the way that I dress and the way that I am treated by a sales clerk. To be useful from a scientific point of view, however, I need to operationally define my terms so that I can get a testable answer to my question. An operational definition is a definition that is stated in terms that can be observed and measured. For example, "the way that I dress" is open to many interpretations. To turn my question into a hypothesis, I need to operationally define all the terms in my question. So, I might operationally define "well-dressed" to mean being clean, well groomed, and wearing business attire, and "poorly dressed" to mean being dirty, poorly groomed, and wearing old, dirty clothes. Notice that by defining the terms in this manner I have left out a number of other possible scenarios, such as wearing old but clean and mended clothes, wearing formal wear, and wearing business clothes but not being well groomed. Similarly, I need to operationally define the meaning of "good service." To do this, I might develop a series of rating scales or criteria that measure the various components of service (e.g., the number of minutes it takes for the sales clerk to respond to the customer standing at the counter, how much eye contact the sales clerk makes with the customer, how long the sales clerk listens before making a suggestion). Although these operational definitions (and their concomitant simplification of the original question) may not answer all the nuances of the original question, they do allow me to develop a hypothesis that I can actually test in the field.
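
One way to see what an operational definition buys me is to write down exactly what an observer would record for each interaction. The short Python sketch below is an illustration only: the field names, the rating ranges, and the weights used to combine the components are assumptions introduced here, not part of any established measure of service quality.

```python
from dataclasses import dataclass


@dataclass
class ServiceObservation:
    """One observed clerk-customer interaction (all fields are illustrative)."""
    attire: str                # "business" or "old_clothes": the operationalized dress condition
    response_minutes: float    # minutes before the clerk responds to the waiting customer
    eye_contact_rating: int    # observer rating from 1 (none) to 5 (sustained)
    listening_seconds: float   # seconds the clerk listens before making a suggestion


def service_score(obs: ServiceObservation) -> float:
    """Combine the measured components into one number; higher means better service.

    The weighting is an arbitrary assumption chosen only for illustration.
    """
    promptness = max(0.0, 5.0 - obs.response_minutes)    # faster response scores higher, never below 0
    listening = min(obs.listening_seconds / 12.0, 5.0)   # 60 seconds or more earns the maximum of 5
    return promptness + obs.eye_contact_rating + listening


example = ServiceObservation("business", response_minutes=1.0,
                             eye_contact_rating=4, listening_seconds=30.0)
print(service_score(example))  # 10.5
```

Whatever scoring rule is chosen, the point is that two observers watching the same interaction should arrive at the same numbers.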

Scientifically speaking, a hypothesis is an empirically verifiable declaration describing the relationship between and corresponding measures of the independent and dependent variables as proposed by a theory. The independent variable is the variable that is manipulated by the researcher. In the example above, the independent variable is the manner in which a person is dressed (e.g., in business attire or in dirty old clothes). The dependent variable, so called because its value depends upon the degree of the independent variable to which the subject is exposed, is the subject's response to the independent variable (e.g., the level of service the sales clerk offers).

Null & Alternative Hypotheses. For the purposes of empirical research, a hypothesis is stated in two ways. The null hypothesis (H0) is the statement that there is no statistical difference between the status quo and the experimental condition (i.e., the treatment being studied made no difference to the end result). For example, a null hypothesis about sales clerks' responses to the way customers dress would state that there is no difference between the way sales clerks treat customers dressed in business attire and the way they treat customers dressed in dirty, old clothes. In effect, this null hypothesis states that there is no relationship between the independent variable of how people dress and the dependent variable of the level of service offered.

The alternative hypothesis (H1), on the other hand, states that there is a relationship between the two variables (e.g., that sales clerks give better service to customers wearing business attire).
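
The pair of hypotheses can also be written in symbols. As a sketch, assume that the level of service is summarized by a mean score for each dress condition; the symbols for these two means are notation introduced here for illustration and do not appear in the theory itself.

```latex
H_0:\ \mu_{\text{business}} = \mu_{\text{old}}
\qquad
H_1:\ \mu_{\text{business}} > \mu_{\text{old}}
```

This is the directional form that matches the example of better service for customers in business attire. A non-directional alternative would simply state that the two means differ.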

As shown in Figure 1, hypothesis construction and research design start with a theory that is based on real-world observation.

[Figure 1]

To find out if this hypothesis is true, the researcher next needs to operationally define the various terms (i.e., constructs) in the hypothesis. The researcher would then run an experiment to test the hypothesis.

In the simplest research design, a stimulus (e.g., a "customer" wearing business attire or dirty old clothes) is presented to the research subjects (e.g., sales clerks). The response of the subjects to the stimulus is observed and recorded (e.g., what level of service they gave the "customer"). Three types of variables are important in research: independent, dependent, and extraneous. As discussed above, the variables of most concern in the design of a research study are the independent and the dependent variables. However, as shown in Figure 2, these are not necessarily the only variables that need to be controlled during a study. Extraneous variables, or variables that have nothing to do with the independent variable itself, can also affect the outcome of the experiment (e.g., the level of service given to the customer). To make my experiment valid, I need to define and control these variables.

For example, if a sales clerk has just dealt with a difficult customer or had a negative interaction with her boss, it is likely that the negative attitude created by that previous interaction will carry over to the next interaction. This transfer would be particularly likely if the person with whom the clerk just had a negative encounter was wearing clothes similar to those worn by the research confederate. Any number of extraneous variables can affect the outcome of the research and lead to an erroneous interpretation of the results. Therefore, as much as possible, these variables need to be controlled. For example, the experiment could be set up so that the confederate would be the clerk's first customer of the day, or could only approach the clerk after he or she had been free for ten minutes. Although it is impossible to control every possible extraneous variable—for instance, being the clerk's first customer does not rule out negative interactions the clerk may have had at home or while driving to work—the more of these variables that are accounted for and controlled in the experimental design, the more meaningful the experiment's results will be.
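
The two example controls just described can be stated as an explicit rule for the confederate to follow. The sketch below is only an illustration of that idea; the function name and its parameters are hypothetical, and a real protocol would likely include further conditions.

```python
def may_approach(customers_served_today: int, minutes_since_last_customer: float) -> bool:
    """Return True if the confederate may approach the clerk.

    Encodes the two example controls from the text: the confederate is the
    clerk's first customer of the day, or the clerk has been free for at
    least ten minutes. (Names and structure are illustrative assumptions.)
    """
    first_customer_of_day = customers_served_today == 0
    free_long_enough = minutes_since_last_customer >= 10
    return first_customer_of_day or free_long_enough


print(may_approach(0, 0.0))   # True: first customer of the day
print(may_approach(3, 4.0))   # False: the clerk finished with someone four minutes ago
```

Writing the rule down in this way makes it easier to apply the same control consistently across every trial.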

[Figure 2]

One of the reasons that researchers use hypotheses with operationally-defined variables rather than just asking general questions is so that they can statistically determine if the results they observe are due to some underlying factor or just chance. Used correctly, statistical analysis can help researchers determine if there is a relationship between the independent and dependent variables not only within the relatively restricted sample on which the research was based, but also, and more importantly, within the larger population of which the sample is assumed to be representative.
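
To make the idea of results "due to chance" concrete, the following sketch uses a permutation test, which is my choice for illustration and only one of many inferential techniques. It repeatedly shuffles the attire labels and counts how often chance alone produces a difference in mean service scores at least as large as the one observed. The scores are hypothetical numbers made up for this example, not real data, and the function name is likewise an assumption.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Estimate how often chance alone yields a mean difference at least as
    large as the observed one (one-sided: group_a higher than group_b)."""
    rng = random.Random(seed)             # fixed seed so the sketch is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)               # reassign the attire labels at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles


# Hypothetical service scores (made up for illustration, not real data).
business_attire = [8.5, 9.0, 7.5, 8.0, 9.5, 8.0, 7.0, 9.0]
old_clothes = [6.0, 7.5, 5.5, 6.5, 7.0, 5.0, 6.5, 6.0]

observed_diff, p_value = permutation_p_value(business_attire, old_clothes)
print(f"observed difference: {observed_diff:.4f}, estimated p = {p_value:.4f}")
```

A small estimated p-value means that a difference this large would rarely arise if dress made no difference, which is the basis for rejecting the null hypothesis. It says nothing, however, about whether the sample represents the larger population; that depends on how the sample was drawn.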

Analyzing Data. Statistical tools make certain assumptions about the nature of the data and their underlying distribution. As a result, not every statistical technique is appropriate for use with every set of data. Some researchers argue, for example, that conclusions about hypotheses, particularly in fields such as addiction research, should be supported by the calculation of Bayes factors to better substantiate the degree of effect or lack of effect (Beard, Dienes, Muirhead, & West, 2016). Further, as discussed at the beginning of this article, the world is a complex place and the relationship between an observed result (behavior) and the stimulus or stimuli that caused it can be complex. Although multivariate statistical tools can be used in some complex situations, they, too, are limited in what they can do. Therefore, designing a good research study depends in part on two factors: controlling the situation so that the research is only measuring what it is supposed to measure, and including as many of the relevant factors as possible so that the research scenario accurately emulates the real-world experience.

Conclusion

A hypothesis is an empirically verifiable declaration describing the relationship between and corresponding measures of the independent and dependent variables as proposed by a theory. In sociology, hypotheses are used to transform questions about the behavior of people in groups or societies into testable research designs that can be statistically analyzed to determine the probability of the observed results being due to an underlying factor or to chance. Hypotheses rely on operational definitions, which are stated in terms that can be observed and measured. For purposes of scientific research, hypotheses are stated in two ways. The null hypothesis is the formal statement that the findings of an experiment will show no statistical difference between the current condition, or control condition, and the experimental condition. The alternative hypothesis is the formal statement that there is a statistical difference between the two conditions. The development of a good research hypothesis must take into consideration not only the independent and dependent variables that are of interest, but also any extraneous variables that may affect the resultant behavior but are not directly related to the research question. In addition, a hypothesis must be stated in such a way that it can be mathematically analyzed to determine the statistical significance of the observed results.

Terms & Concepts

Confederate: A person who assists a researcher by pretending to be part of the experimental situation while actually only playing a rehearsed part meant to stimulate a response from the research subject.

Data: (sing. datum) In statistics, data are quantifiable observations or measurements that are used as the basis of scientific research.

Dependent Variable: The outcome variable or resulting behavior that changes depending on whether the subject receives the control or experimental condition (e.g. a consumer's reaction to a new cereal).

Distribution: A set of numbers collected from data and their associated frequencies.

Empirical: Theories or evidence that are derived from or based on observation or experiment.

Hypothesis: An empirically verifiable declaration describing the relationship between and corresponding measures of the independent and dependent variables as proposed by a theory.

Independent Variable: The variable in an experiment or research study that is intentionally manipulated in order to determine its effect on the dependent variable (e.g. the independent variable of type of cereal might affect the dependent variable of the consumer's reaction to it).

Inferential Statistics: A subset of mathematical statistics used in the analysis and interpretation of data. Inferential statistics are used to make inferences, such as drawing conclusions about a population from a sample. Inferential statistics can also be used in decision-making.

Null Hypothesis (H0): The statement that the findings of an experiment will show no statistical difference between the current condition, or control condition, and the experimental condition.

Operational Definition: A definition that is stated in terms that can be observed and measured.

Population: The entire group of subjects belonging to a certain category (e.g. all women between the ages of 18 and 27; all dry cleaning businesses; all college students).

Sample: A subset of a population. A random sample is a sample that is chosen at random from the larger population with the assumption that such samples tend to reflect the characteristics of the larger population.

Social Capital: The resources or benefits that people gain from the connections within and between their social networks.

Socioeconomic Status (SES): The position of an individual or group on the two vectors of social and economic status and their combination. Factors contributing to socioeconomic status include (but are not limited to) income, type and prestige of occupation, place of residence, and educational attainment.

Variable: An object in a research study that can have more than one value. Independent variables are stimuli that are manipulated in order to determine their effect on the dependent variables, or response variables. Extraneous variables are variables that affect the dependent variables but that are not related to the question under investigation in the study.

Bibliography

Beard, E., Dienes, Z., Muirhead, C., & West, R. (2016). Using Bayes factors for testing hypotheses about intervention effectiveness in addictions research. Addiction, 111(12), 2230–2247. Retrieved October 19, 2018, from EBSCO Online Database Sociology Source Ultimate. http://search.ebscohost.com/login.aspx?direct=true&db=sxi&AN=119309842&site=ehost-live&scope=site

Black, K. (2006). Business statistics for contemporary decision making (4th ed.). New York: John Wiley & Sons.

Calderwood, K. A. (2012). Teaching inferential statistics to social work students: A decision-making flow chart. Journal of Teaching in Social Work, 32, 133–147. Retrieved November 4, 2013, from EBSCO Online Database Sociology Source Ultimate. http://search.ebscohost.com/login.aspx?direct=true&db=sxi&AN=74639228&site=ehost-live&scope=site

Carlson, C. A., Pleasant, W. E., Weatherford, D. R., Carlson, M. A., & Bednarz, J. E. (2016). The weapon focus effect: Testing an extension of the unusualness hypothesis. Applied Psychology in Criminal Justice, 12(2), 87–100. Retrieved October 19, 2018, from EBSCO Online Database Sociology Source Ultimate. http://search.ebscohost.com/login.aspx?direct=true&db=sxi&AN=120395032&site=ehost-live&scope=site

Cohen, E. H., & Tresser, C. (2011a). Matrix assisted structural hypothesis construction. BMS: Bulletin de Methodologie Sociologique (Sage Publications Ltd.), 5–19. Retrieved November 4, 2013, from EBSCO Online Database SocINDEX with Full Text. http://search.ebscohost.com/login.aspx?direct=true&db=sih&AN=61825831&site=ehost-live

Cohen, E. H., & Tresser, C. (2011b). Matrix assisted structural hypothesis construction: Further explorations. BMS: Bulletin de Methodologie Sociologique (Sage Publications Ltd.), 63–70. Retrieved November 4, 2013, from EBSCO Online Database SocINDEX with Full Text. http://search.ebscohost.com/login.aspx?direct=true&db=sih&AN=67690797&site=ehost-live

Knuttila, K. M., & Magnan, A. Introducing sociology: A critical approach (5th ed.). Oxford, UK: Oxford University Press.

Witte, R. S. (1980). Statistics. New York: Holt, Rinehart and Winston.

Suggested Reading

Anderson, M. L. & Taylor, H. F. (2002). Sociology: Understanding a diverse society (2nd ed.). Belmont, CA: Wadsworth/Thomson Learning.

Feld, S. L. (1997, Mar). Mathematics in thinking about sociology. Sociological Forum, 12, 3–9. Retrieved March 13, 2008, from EBSCO Online Database Academic Search Alumni Edition. http://search.ebscohost.com/login.aspx?direct=true&db=a2h&AN=11302788&site=ehost-live&scope=site

Gravetter, F. J. & Wallnau, L. B. (2006). Statistics for the behavioral sciences. Belmont, CA: Wadsworth/Thomson Learning.

Saetnan, A. R., Lomell, H. M., & Hamer, S. (Eds.) (2011). The mutual construction of statistics and society. New York, NY: Routledge.

Schaefer, R. T. (2002). Sociology: A brief introduction (4th ed.). Boston: McGraw-Hill.

Stockard, J. (2000). Sociology: Discovering society (2nd ed.). Belmont, CA: Wadsworth/Thomson Learning.

Weakliem, D. L. (2016). Hypothesis testing and model selection in the social sciences. New York, NY: The Guilford Press.

Welkowitz, J., Cohen, B. H., & Lea, R. B. (2012). Introductory statistics for the behavioral sciences (7th ed.). Hoboken, NJ: John Wiley & Sons.

Young, R. K. & Veldman, D. J. (1977). Introductory statistics for the behavioral sciences (3rd ed.). New York: Holt, Rinehart and Winston.

Essay by Ruth A. Wienclaw, Ph.D.

Dr. Ruth A. Wienclaw holds a doctorate in industrial/organizational psychology with a specialization in organization development from the University of Memphis. She is the owner of a small business that works with organizations in both the public and private sectors, consulting on matters of strategic planning, training, and human/systems integration.