Search engines and mathematics

Search engines are sophisticated tools designed to efficiently locate information across vast amounts of digital data, utilizing principles of mathematics and data analysis. They operate primarily by indexing information from various sources, including the Internet and personal computers, to provide relevant search results based on user queries. Different types of search engines exist, such as desktop search engines for individual computers, enterprise search engines for organizational data, and Web search engines that analyze billions of webpages.

Web search engines function through a three-step process: collecting information via crawlers, indexing that data for future retrieval, and presenting search results based on relevance. The algorithms used to sort and rank these results often incorporate mathematical methods, including probability and matrix algebra, which help ensure that more relevant pages appear higher in the search results list. Popular search engines like Google, Yahoo, and Bing exemplify the advancements in this field, continuously evolving to manage the growing complexity of online information. Understanding how mathematics underpins these technologies can enhance one’s grasp of their functionality and effectiveness.

Authored By: Bag, Sukantadev
Published In: 2022
Related Topics: Data mining; Algorithm (mathematics)

Summary: Using complex and sometimes proprietary algorithms, search engines locate and rank requested information, usually on the Internet or in a database.

Search engines find information in digitally stored data. Given a search criterion such as a word or phrase, a search engine locates matching information on the Internet or on a personal computer and presents the results in a useful order. It is an efficient tool for finding information across millions of Web sites and their Webpages. For example, information on movies or weather forecasts can easily be found on the Internet using a search engine. To sort through vast amounts of data, search engines use statistics, probability, data analysis, and other areas of mathematics.

Types of Search Engines

Different types of search engines are developed for different purposes. The simplest one is a desktop search engine, which is used for finding information stored within a computer. An enterprise search engine searches for digitally stored information within only one organization. A Web search engine looks for information on the World Wide Web (WWW). Sometimes, federated search engines are used for searching online databases or related items. Though there are different types, the term “search engine” generally refers to Web search engines.

Search Mechanisms

Searching for a word or phrase in a single document on a computer is simple and does not require a search engine. A program reads all or a selected part of the document, finds each place where the word or phrase occurs, and highlights those locations in the document.
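As a rough illustration of such a program, the following Python sketch finds every occurrence of a phrase in a piece of text and marks it. The function names find_occurrences and highlight, the bracket markers, and the sample sentence are hypothetical choices made for this example. The sketch matches without regard to letter case and assumes plain text in which lowercasing does not change the length of the string.

```python
def find_occurrences(text, phrase):
    """Return the start position of every case-insensitive match of phrase."""
    positions = []
    lowered_text = text.lower()
    lowered_phrase = phrase.lower()
    if not lowered_phrase:
        return positions  # an empty phrase matches nothing useful
    start = lowered_text.find(lowered_phrase)
    while start != -1:
        positions.append(start)
        start = lowered_text.find(lowered_phrase, start + 1)
    return positions


def highlight(text, phrase, left="[", right="]"):
    """Wrap each non-overlapping match in marker characters."""
    pieces = []
    cursor = 0
    for pos in find_occurrences(text, phrase):
        if pos < cursor:
            continue  # skip a match that overlaps one already highlighted
        pieces.append(text[cursor:pos])
        pieces.append(left + text[pos:pos + len(phrase)] + right)
        cursor = pos + len(phrase)
    pieces.append(text[cursor:])
    return "".join(pieces)


document = "Algebra is old. Linear algebra is newer."
print(find_occurrences(document, "algebra"))
print(highlight(document, "algebra"))
```

Because the program rereads the text on every search, the cost grows with the size of the document, which is acceptable for one file but not for an entire computer or the Web.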

Desktop search engines perform more complicated searches. These engines read all files and folders kept in the computer, collect information about them, and index it. Indexing is a method of storing information about files and folders that takes into account several factors, such as file names, contents, types, authors, and locations. It relies on mathematical manipulation of numbers and on data mining techniques. Once indexing is finished, the engine consults the index rather than rereading the files. For example, if a user searches a computer for the word algebra, the engine reads the index to find where the word is located (if anywhere) and shows the resulting files or folders.
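One common way to organize such an index is an inverted index, which maps each word to the files that contain it. The Python sketch below is a minimal example under stated assumptions: the files are represented by a hypothetical in-memory dictionary of names and contents rather than a real disk, only file names and contents are indexed, and the helper names tokenize, build_index, and search are invented for illustration. Actual desktop search engines record many more factors.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(files):
    """Map each word to the set of file names whose name or contents contain it."""
    index = defaultdict(set)
    for name, contents in files.items():
        for word in tokenize(name) + tokenize(contents):
            index[word].add(name)
    return index


def search(index, word):
    """Look the word up in the index instead of rereading every file."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical files standing in for the contents of a computer.
files = {
    "notes/algebra.txt": "Matrices and vectors",
    "notes/history.txt": "The history of algebra and geometry",
    "todo.txt": "Buy groceries",
}
index = build_index(files)
print(search(index, "algebra"))
print(search(index, "calculus"))
```

Building the index takes one pass over every file, but each later search is a single dictionary lookup, which is why the engine follows the index once it is finished.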

The most complicated and interesting search engines are Web search engines. The Web contains billions of Webpages, and each page contains information. Web search engines attempt to search nearly all of these pages. They generally work in three major steps: (1) collecting information from the Web, (2) indexing, and (3) presenting search results.

For reading Webpages and collecting information, almost all Web search engines have their own computer program, often called a “crawler.” A Web search engine may have one or more crawlers. The information collected by crawlers includes subject matter, hyperlinks, images, and other content. Next, the search engine indexes the collected data and stores it for future retrieval. The index is like a giant catalogue, and preparing it requires extensive mathematical computation. When a user submits a search criterion, the search engine consults this index, determines which Webpages contain the information, and presents the results as a list of links to those pages.
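The three steps can be sketched in a few lines of Python. This is a toy model, not a description of any real engine: the dictionary WEB is a hypothetical miniature Web with invented addresses, the crawler visits pages breadth-first by following hyperlinks, and the function names crawl, build_index, and query are assumptions made for this example. No network access is involved.

```python
from collections import deque

# Hypothetical miniature Web: address -> (page text, outgoing hyperlinks).
WEB = {
    "a.example/home": ("welcome to the math portal",
                       ["a.example/algebra", "b.example/news"]),
    "a.example/algebra": ("algebra lessons and matrix algebra",
                          ["a.example/home"]),
    "b.example/news": ("weather forecast and movie news",
                       ["a.example/algebra", "c.example/missing"]),
}


def crawl(seed):
    """Step 1: visit pages breadth-first from the seed, each address once."""
    collected = {}
    queue = deque([seed])
    seen = {seed}
    while queue:
        url = queue.popleft()
        if url not in WEB:
            continue  # a broken link: there is nothing to collect
        text, links = WEB[url]
        collected[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return collected


def build_index(pages):
    """Step 2: map each word to the set of addresses whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index


def query(index, word):
    """Step 3: present the matching pages as a list of links."""
    return sorted(index.get(word, set()))


pages = crawl("a.example/home")
index = build_index(pages)
print(sorted(pages))
print(query(index, "algebra"))
print(query(index, "weather"))
```

The results here are listed alphabetically; ordering them by relevance is a separate problem.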

A challenging task for Web search engines is to present the search results properly and quickly. Pages that are more relevant to the search criterion are expected to appear earlier in the results than less relevant pages. Different search engines use different algorithms for arranging pages by relevance. For example, the Google search engine uses an algorithm called PageRank for this purpose. PageRank draws on probability, data analysis, matrix algebra, and related fields.
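The commonly published form of PageRank can be sketched as follows; it should be read as a simplified textbook version and not as the system Google actually runs. The idea is probabilistic: a page's score approximates the chance that a reader who keeps following random hyperlinks, and occasionally jumps to a random page, is on that page at a given moment. The damping value of 0.85, the fixed number of iterations, the treatment of pages without outgoing links, and the four-page link graph are all assumptions made for this example. The sketch also assumes that every linked page appears as a key in the links dictionary.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate scores by repeated redistribution (power iteration).

    links maps each page to the list of pages it links to.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a base share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links[page]
            if targets:
                # A page divides its score equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical link graph: page -> pages it links to.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In matrix terms, each pass multiplies the vector of scores by a matrix of link probabilities, which is where matrix algebra enters. In this example page C, which receives links from three other pages, ends with the highest score, and page D, which receives none, ends with the lowest.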

Examples of Search Engines

Web search engines began to be developed in the 1990s and are constantly improving to handle the increasing size and content of the Web. Many of the individuals who develop and refine search engines have degrees in mathematics. Popular search engines like AltaVista (launched in 1995), Google (1998), Yahoo Search (2004), and Bing (2009) are only a few examples. Google Desktop, GNOME Storage, Windows Search, and Easyfind are among the most popular desktop search engines, while OpenSearchServer and DataparkSearch are good examples of enterprise search engines.
