Data analytics (DA)

Data analytics (often shortened to "analytics") is the examination, exploration, and evaluation of information (data) with the goal of discovering patterns. "Data analytics" is sometimes used interchangeably with, or confused with, "data analysis." Both rely heavily on mathematics and statistics, but data analysis is usually defined more broadly to include any examination of a body of data, using a wide array of possible tools, for the purpose of discovering information in that data or supporting a conclusion. There is considerable overlap between the methodologies and concerns of the two disciplines, both of which may draw on data mining, modeling, and data visualization. The important difference is analytics' overall focus on patterns. Analytics is used in numerous industries and academic disciplines, from finance to law enforcement and marketing.


Background

Some of the key concepts involved in analytics include machine learning, neural networks, and data mining. Machine learning is the branch of computer science that grew out of computational statistics. It focuses on the development of algorithms that, through modeling, learn from the data sets on which they operate and produce outputs in the form of data-driven decisions or predictions. A familiar example of machine learning is email spam filtering, which monitors the emails that users mark as spam, examines those emails for common characteristics, derives a profile from this examination, and uses that profile to predict the probability that a new email is spam. As artificial intelligence (AI) software and machine learning became more advanced and accessible in the 2020s, their use in data analytics continued to evolve, with many organizations and industries considering them crucial for more efficient, comprehensive strategic decision-making. Neural networks are used in machine learning to model functions that, like biological neural networks, have a large number of possible inputs. They are often used in specialized decision-making and pattern-recognition applications, such as the software underlying handwriting or speech recognition, vehicle control, radar systems, and the AI in computer games.
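The spam-filtering example can be sketched as a simple word-frequency classifier in the naive Bayes style. The code below is a minimal illustration under assumed conditions, not any real mail provider's implementation; the training messages, vocabulary, and smoothing choices are all hypothetical.

```python
# Minimal sketch of learning a spam "profile" from labeled emails and scoring
# a new message. Training data here is invented purely for illustration.
from collections import Counter

spam = ["win a free prize now", "free money claim your prize"]
ham = ["meeting moved to noon", "lunch tomorrow with the team"]

def word_counts(messages):
    counts = Counter()
    for msg in messages:
        counts.update(msg.lower().split())
    return counts

spam_counts, ham_counts = word_counts(spam), word_counts(ham)
spam_total, ham_total = sum(spam_counts.values()), sum(ham_counts.values())
vocab = set(spam_counts) | set(ham_counts)

def spam_score(message):
    """Compare word likelihoods under each class (naive Bayes style)."""
    p_spam = p_ham = 1.0
    for word in message.lower().split():
        # Laplace smoothing so unseen words do not zero out the product.
        p_spam *= (spam_counts[word] + 1) / (spam_total + len(vocab))
        p_ham *= (ham_counts[word] + 1) / (ham_total + len(vocab))
    return p_spam / (p_spam + p_ham)

print(spam_score("claim your free prize"))   # close to 1 -> likely spam
print(spam_score("team meeting tomorrow"))   # low score -> likely not spam
```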

Data mining is a computer-science subfield devoted to using computers to discover patterns in data sets, especially data sets too large to be feasibly analyzed for such patterns by human operators. Computers can also find patterns that are difficult for humans to detect not because of the size of the data set but because of the nature of the data or the pattern; the way human vision operates, for instance, can obscure patterns in visual data that become apparent when the data is mined by a program. The methodology was first made possible by Bayes’s theorem, named for the eighteenth-century English statistician Thomas Bayes, whose work was also one of the first significant contributions to predictive analytics (the use of data analytics to evaluate patterns in data and draw conclusions about future behavior). Bayes’s work was used by Alan Turing and his team during World War II to break the German Enigma code.
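Bayes's theorem itself is compact: the probability of a hypothesis given the evidence, P(H|E), equals P(E|H) × P(H) / P(E). The snippet below works through a hypothetical predictive-analytics application (estimating how likely a flagged transaction is actually fraudulent); the probabilities are illustrative assumptions, not real figures.

```python
# Worked example of Bayes's theorem with hypothetical numbers:
# P(fraud | flagged) = P(flagged | fraud) * P(fraud) / P(flagged)
p_fraud = 0.01              # prior: assume 1% of transactions are fraudulent
p_flag_given_fraud = 0.95   # the model flags 95% of fraudulent transactions
p_flag_given_legit = 0.05   # false-positive rate on legitimate transactions

# Total probability that any given transaction is flagged.
p_flag = p_flag_given_fraud * p_fraud + p_flag_given_legit * (1 - p_fraud)

# Posterior probability that a flagged transaction is actually fraudulent.
p_fraud_given_flag = p_flag_given_fraud * p_fraud / p_flag
print(round(p_fraud_given_flag, 3))  # ~0.161: most flags are still false alarms
```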

Overview

Analytics is used across numerous industries and in many kinds of applications. In the early 2000s, Oakland Athletics general manager Billy Beane, drawing on the work of baseball statistician Bill James, revolutionized baseball by bringing analytics to the sport and to the ways the Athletics evaluated player performance and made draft decisions. James’s "sabermetrics" downplayed traditional player performance metrics such as home runs in favor of analyzing the large body of data available on baseball games to determine which player statistics had the most impact on team wins. Baseball lends itself especially well to analytics because of the large number of games played per season, the sport's more than century-long history, and the large number of players per team. Together, these attributes create a large and rich data set.
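A greatly simplified version of this kind of analysis is to measure how strongly a candidate statistic tracks team wins across seasons. The sketch below computes a correlation between invented team on-base percentages and win totals; actual sabermetric models are far more elaborate and control for many more variables.

```python
# Correlate a candidate statistic (team on-base percentage) with wins.
# The season data below is hypothetical, for illustration only.
from statistics import correlation  # requires Python 3.10+

team_obp = [0.310, 0.322, 0.335, 0.341, 0.355, 0.360]
team_wins = [71, 78, 84, 88, 95, 97]

# A coefficient near 1.0 suggests the statistic tracks wins closely.
print(round(correlation(team_obp, team_wins), 3))
```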

Analytics has become fundamental to the way the internet is monetized, especially after it became clear that e-commerce alone was not sufficient to do so. Web analytics is a specific area of marketing analytics that relies on data collected by websites about the internet users who visit them. Modern web analytics typically includes not only search keywords, "referrer data" (information about how a user arrived at the website), data about the user's engagement with the website itself (the pages viewed, for example), and basic identity information (such as geographic location), but also extended information about the user's activity on other websites, which, thanks to social media, can amount to a wealth of demographic information. Tracking these activities allows marketers to tailor online advertisements to specific demographics. For instance, a user who watches an online superhero-movie trailer may be more likely to be shown a targeted ad for a new comic book, while a person who changes their Facebook relationship status to "engaged" may be shown ads for wedding-planning services.
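As a rough illustration, event data of the kind described above can be aggregated into a per-user interest profile that an ad server might consult. The records, field names, and matching rules below are hypothetical stand-ins for what a real targeting system would use.

```python
# Aggregate hypothetical web-analytics events into simple per-user profiles
# that could drive ad targeting. All field names and values are invented.
from collections import defaultdict

events = [
    {"user": "u1", "referrer": "search", "keyword": "superhero movie trailer"},
    {"user": "u1", "referrer": "social", "keyword": "comic book release"},
    {"user": "u2", "referrer": "social", "keyword": "wedding venues"},
]

profiles = defaultdict(list)
for event in events:
    profiles[event["user"]].append(event["keyword"])

def pick_ad(user):
    """Crude keyword match standing in for a real targeting model."""
    interests = " ".join(profiles[user])
    if "comic" in interests or "superhero" in interests:
        return "new comic book ad"
    if "wedding" in interests:
        return "wedding-planning ad"
    return "generic ad"

print(pick_ad("u1"))  # new comic book ad
print(pick_ad("u2"))  # wedding-planning ad
```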

In the finance and banking industry, analytics is used to predict risk. Potential investments are analyzed according to their past performance when possible, or the performance of investments that fit a similar profile. Banks considering loans to individuals or businesses similarly rely on analytics to model the probable financial performance of the individual or business in the future. Naturally, there is some dispute about which factors are logical, practical, or acceptable to consider in these models.
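In practice, this kind of risk modeling often takes the form of a scoring model fit to the historical repayment behavior of similar borrowers. The sketch below uses a tiny logistic-style score with made-up weights and applicant features; it is a minimal illustration under assumed inputs, not a production credit model.

```python
# Toy risk score: combine applicant features with weights that, in a real
# system, would be learned from the repayment history of similar borrowers.
# All weights and feature values here are hypothetical.
import math

# Assumed weights: higher debt-to-income and more late payments raise risk;
# a longer credit history lowers it.
WEIGHTS = {"debt_to_income": 4.0, "years_of_credit_history": -0.15, "late_payments": 0.8}
BIAS = -1.0

def default_probability(applicant):
    """Logistic transform of a weighted feature sum -> probability of default."""
    score = BIAS + sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)
    return 1 / (1 + math.exp(-score))

low_risk = {"debt_to_income": 0.15, "years_of_credit_history": 12, "late_payments": 0}
high_risk = {"debt_to_income": 0.55, "years_of_credit_history": 2, "late_payments": 3}

print(round(default_probability(low_risk), 2))   # small probability of default
print(round(default_probability(high_risk), 2))  # much larger probability
```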

Applied mathematicians (or even engineers or physicists) who specialize in developing the mathematical and statistical approaches underlying analytics are called quantitative analysts, or "quants." The increased reliance on predictive analytics to guide a growing number of business decisions has sometimes been called big data, a term that can also refer to data sets too large to be processed by traditional means. The COVID-19 pandemic of the early 2020s highlighted the potential of big data analytics in the health-care field. Throughout the world, organizations, governments, and health-care facilities relied on data analytics based on large amounts of data gathered from a variety of digital and real-time sources to quickly gain accurate insight into the spread of the novel coronavirus and the best means of controlling it.
