Data Management
Data management refers to the comprehensive processes used to acquire, store, manage, and utilize data in an organization, particularly in the context of electronic information systems. It encompasses both content management systems (CMS) and database management systems (DBMS). CMS focus on managing entire documents or parts of documents, allowing for the organization and retrieval of various file types such as text and images. In contrast, DBMS deal with data at a much finer granularity, handling individual data values that are essential for transactional processes, like those used in banking.
The advent of information technology has revolutionized how organizations manage data and documents, enabling automation of traditional processes. Tasks that once required manual effort, such as document creation and data retrieval, can now be conducted more efficiently through electronic means. This shift not only streamlines operations but also fosters better collaboration across different departments within an organization, as data can be shared and accessed more readily.
Data management systems are crucial for maintaining data integrity, reducing redundancy, and ensuring that information is accessible when needed. By leveraging technology, organizations can effectively manage vast amounts of data, enhance productivity, and support informed decision-making. Overall, data management is an essential component for modern organizations aiming for operational efficiency and coherence in their data practices.
When thinking about managing electronic information, many people first think of an electronic analog of conventional business processes. However, information technology also helps businesses transform their processes and perform tasks in new ways. To be effective, information systems need to manage both content and data. Content management systems manage entire documents or parts of documents. Database management systems, which are software programs that allow users to manage the data in a database, work at a much finer level of granularity. These systems allow users to share data, thereby cutting costs and improving consistency across the organization.
The advent of computers and the ensuing Information Age have changed the way that many people live their lives and do business in the twenty-first century. At home, technology cooks the TV dinner in the microwave and automatically turns on the porch lights at dusk. At the office, the ubiquitous computer workstation on the desktop allows employees to write and distribute documents and to communicate not only with those across the street but also with those across the globe. Many tasks that were previously done by hand -- writing letters, balancing the monthly books, inputting time card information -- have been automated. However, these are not just cosmetic changes that allow people to do the same things with different technology. In many cases, the very business processes that an organization uses have been transformed through the application of information technology. For example, producing a document no longer requires handwritten drafts and multiple submissions to a typing pool for changes. Today, the pencils often stay in the drawer, and documents are composed and corrected directly on the computer, bypassing the need for a secretary or typist. Further, documents no longer need to be printed out for distribution but can be electronically transmitted to whoever needs them. Comments can be made and tracked electronically and then returned electronically to the original author for coordination and finalization. Document storage and retrieval have also been made easier through technology. One no longer has to search through piles of documents, file drawers, or dusty cardboard boxes but can easily retrieve documents from a hard drive, the cloud, or another storage device.
Information technology allows organizations to do all this and more. Information technology is the use of computers, communications networks, and knowledge in the creation, storage, and dispersal of data and information, and it comprises a wide range of items and abilities for those purposes. These technologies can be linked together into an information system that facilitates the flow of information and data between people or departments. However, as anyone who has ever experienced a computer crash knows, these technologies are only useful to the extent that they allow one to actually retrieve the data that are stored within them.
Functions of Information Technology
As shown in Figure 1, information technology has several basic functions that may occur sequentially or simultaneously. First, the information system captures data by compiling detailed records of activities for later analysis or processing. For example, data capture might include the collection of patron information and book information when a book is checked out of the library, the assignment of seats on an airplane or in a theatre, or the collection of customer information for orders taken over the Internet. These data are then processed by converting, analyzing, or synthesizing them into information that can be used by the organization and its employees. For example, word processing allows users to create text-based documents, and image processing converts visual information such as graphics and photographs into a format that can be stored or manipulated in the information system and/or transmitted across the network. Information technology can also be used to generate information by organizing data into a useful form, as in the generation of a document or multimedia presentation. In addition, data and information must be stored on a computer so that they can be retrieved and processed at a later time. Finally, they can be transmitted by information systems and distributed to other parties via a communications network.
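The capture, process, store, and transmit functions described above can be illustrated with a minimal Python sketch built on the library checkout example. The record fields, identifiers, and function names are hypothetical and chosen only for illustration; an in-memory list stands in for real storage, and a JSON string stands in for a message sent over a network.

```python
import json
from collections import Counter

# In-memory list standing in for stored records (an assumption for illustration).
captured = []

def capture(patron_id, book_id):
    """Capture: record one checkout event for later processing."""
    record = {"patron_id": patron_id, "book_id": book_id}
    captured.append(record)  # Store: keep the record so it can be retrieved later.
    return record

def process(records):
    """Process: turn raw checkout records into a count of checkouts per book."""
    return dict(Counter(r["book_id"] for r in records))

def transmit(summary):
    """Transmit: serialize the summary as it might be sent over a network."""
    return json.dumps(summary, sort_keys=True)

capture("P-01", "B-100")
capture("P-02", "B-100")
capture("P-01", "B-200")

summary = process(captured)   # retrieve the stored records and process them
message = transmit(summary)
print(message)
```

Here the raw records are data, while the per-book counts are information in the sense used above: the same values, organized into a form someone can act on.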
When thinking about managing electronic information, many people first think of an electronic analog of conventional business processes. For example, most businesses need to generate, store, and access various documents (e.g., word processing documents, spreadsheets, images and graphics). In the past, this would be done through manual means: a document might be typed, a ledger might be handwritten and manually computed, and photographs might be available as negative or positive images. These documents would then be stored in various physical filing systems and retrieved by hand. However, information technology allows people to perform these tasks using computers and mobile devices. Information technology also allows the electronic storage and retrieval of these items. These processes are handled by electronic document management systems, which track and store electronic documents and/or images of paper documents, or by content management systems, which allow users to manage the content of a collection of data including computer files, audio files, graphics and images, electronic documents, and web content.
Data Management Systems vs. Content Management Systems
In addition to managing documents, most information systems also have data that need to be managed. Data management systems allow the management of the data housed in an information system at a more granular level. As shown in Table 1, there are a number of parameters on which data management systems and content management systems differ. Content management systems manage items on a high level and are used both as electronic repositories for operational data and as archives for long-term data storage. Historically, these tasks were performed using a library (card catalogs, for example, are a non-electronic type of content management system) or filing cabinet or system. The goal of content management systems is to respond to user queries by identifying documents that may contain information of interest and presenting a list of these documents to the user. This information is readable and accessible by humans and includes various documents such as files used for business that are primarily text or image-based. An example of an item managed by a content management system would be a document such as a memo or report that contains text, photos, or graphics that humans can understand without further processing. Major users of content management systems include industries that need to be able to access documents. For example, the insurance industry needs to keep various documents on file such as policies, claims, reports, and photos of damage. To perform their tasks, content management systems rely on metadata that enable them to track documents. Examples of metadata include key words, titles, and time stamp information that allow the system to track the documents.
Parameter | Content Management | Data Management
Initial problem domain | Library and filing cabinet | General ledger and spreadsheet
Who uses it | Ordinary people, work groups | Applications, specialists
Type of data | Human readable with heavy text or image | Strongly data focused, heavily computational
Data granularity | Entire document or document fragment | Individual data values
Usage scenario | Archival and retrieval | Analytic, transactional application
Search task | Parametric and text, to find documents | Mostly parametric with text extension, to find data values
Typical data set sizes | Petabytes (one petabyte = one quadrillion bytes) | Terabytes (one terabyte = one trillion bytes)
Performance metric | Time to first response (user interactions) | Transactions/queries per second
Content management systems manage entire documents or parts of documents. Data management systems, on the other hand, deal with a much finer level of granularity. In fact, data management systems deal with individual data values that are of little value on their own to most humans but that are heavily computational. These values are analogous not to something stored in the stacks of a library or in a file cabinet, but to the entries contained in a ledger. In fact, industries such as banking that previously did much of their record keeping in the form of ledgers rather than text documents are the primary users of data management systems. Examples of the kinds of data managed in these systems are account codes, names, shipment dates, and balance information. The data tend to be short (e.g., numbers, dates, short character strings) and do not include much natural language or rich-media data. This type of data is used in applications such as automatic teller machines or point-of-sale devices in retail stores. As opposed to documents and other content, data are used by software application programs or by information technology specialists.
There are other differences between content management and data management as well. Content management is used to archive and retrieve information whereas data management is used to analyze or otherwise manipulate data. Content searches are parametric and text-based, and are used to find documents. Data searches, on the other hand, are primarily parametric and are used to find data values. Reflecting the differences in users and use, there are also differences in how the effectiveness of content and data management systems is judged. Content management systems are rated by how quickly they respond to user requests. Because these systems need to deal with large volumes of unstructured data, transaction time can be relatively long compared to that of data management systems. The performance of data management systems, on the other hand, is evaluated by throughput (e.g., how many transactions or queries they can handle per second). The size of the two types of systems also differs. Content management systems can deal with data sets on the order of petabytes. This larger size is needed because the content management system needs more than the simple tabulated format and must maintain various relationships. Data management systems, on the other hand, do not have this requirement and typically deal with data sets that are on the order of terabytes.
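The difference between the two search tasks can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents, keywords, account codes, and function names are hypothetical, and real systems would use indexes rather than linear scans. The content search combines a parametric filter on metadata with a text match and returns a list of candidate documents; the data search returns one individual value.

```python
# Hypothetical document collection: metadata (title, keywords) plus free text.
documents = [
    {"title": "Water damage claim", "keywords": {"claim", "photos"},
     "text": "Basement flooded after heavy rain."},
    {"title": "Homeowner policy", "keywords": {"policy"},
     "text": "Coverage applies to the insured dwelling."},
    {"title": "Adjuster report", "keywords": {"claim", "report"},
     "text": "Inspection confirmed flooded basement."},
]

def search_content(docs, keyword=None, phrase=None):
    """Content search: return titles of documents that may be of interest."""
    hits = []
    for doc in docs:
        if keyword is not None and keyword not in doc["keywords"]:
            continue  # parametric filter on metadata
        if phrase is not None and phrase.lower() not in doc["text"].lower():
            continue  # text filter on the document body
        hits.append(doc["title"])
    return hits

# Hypothetical ledger-style records: short values keyed by account code.
accounts = {
    "AC-1001": {"name": "Rivera", "balance": 250.00},
    "AC-1002": {"name": "Chen", "balance": 0.00},
}

def lookup_value(records, account_code, field):
    """Data search: return one individual data value."""
    return records[account_code][field]

claim_docs = search_content(documents, keyword="claim", phrase="flooded")
balance = lookup_value(accounts, "AC-1001", "balance")
print(claim_docs, balance)
```

The content search hands a person a list of documents to read, whereas the data lookup hands an application a single value to compute with, which mirrors the difference in granularity described above.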
Range & Purpose of Data
Organizations acquire and use a wide range of data for various purposes. For example, an organization may collect and store identifying information about its customers (e.g., name, address, phone number, account number) as well as other attributes (e.g., account balance, customer status). Frequently, these data are shared between users in the organization. For example, the billing department may need this information in order to send out a monthly statement. The accounting department may need the same data in order to track whether or not the bill was paid. The marketing department may use these data to distribute catalogs or other sales literature. Although each department could have its own database that housed only the information that it needed for its interactions with the customer, many of the data items are used by more than one department. As capturing, storing, and maintaining data in a database can be expensive, a company can save money by using a shared database rather than separate databases. In addition, sharing data means that everyone is working from the same data set. This practice improves consistency and ensures that everyone who needs access to the data has the information that he or she needs.
However, not everyone in the organization will need the same data from the database. For example, although the accounting department may need to know the status of a customer's account, the marketing department may not. In addition, even when multiple people or departments need the same data, they frequently need it organized in different ways. Although both accounting and billing may need the same inputs, the latter will need the information generated in the form of a bill or invoice whereas the former will not.
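One way to picture a shared database that serves departments with different needs is sketched below using Python's built-in sqlite3 module. The table, view names, and customer records are hypothetical and chosen only for illustration. Each department reads from the same stored rows, but marketing sees only names and addresses, accounting sees only account numbers and balances, and billing receives the data formatted as a statement.

```python
import sqlite3

# One shared, in-memory database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " account_number TEXT PRIMARY KEY,"
    " name TEXT, address TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        ("AC-1001", "Rivera", "12 Elm St", 250.0),
        ("AC-1002", "Chen", "7 Pine Rd", 0.0),
    ],
)

# Each department gets only the columns it needs from the same stored data.
conn.execute("CREATE VIEW marketing_view AS SELECT name, address FROM customers")
conn.execute(
    "CREATE VIEW accounting_view AS SELECT account_number, balance FROM customers"
)

def billing_statement(account_number):
    """Billing uses the same inputs but needs them organized as a statement."""
    name, address, balance = conn.execute(
        "SELECT name, address, balance FROM customers WHERE account_number = ?",
        (account_number,),
    ).fetchone()
    return f"{name}, {address}: amount due {balance:.2f}"

mailing_list = conn.execute(
    "SELECT * FROM marketing_view ORDER BY name"
).fetchall()
open_accounts = conn.execute(
    "SELECT account_number FROM accounting_view WHERE balance > 0"
).fetchall()
statement = billing_statement("AC-1001")
print(mailing_list, open_accounts, statement)
```

Because every view draws on the single customers table, a correction made once is seen by all three departments, which is the consistency benefit described above.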
Database Management Systems
Database management systems are software programs that allow users to manage the data in a database. Database management systems are designed to increase the accessibility of data and the productivity of the user. Database management systems work in tandem with other programs on the information system, maintaining the structure of the data and working with the other programs so that the data can be located and retrieved. Database management systems also accept new data from application programs and write these into the appropriate storage location on the system.
Database management systems perform five functions. First, database management systems integrate databases to provide the information required to solve problems or perform operations. Second, database management systems reduce data redundancy across the various databases so that a datum need only be stored in the system in one location rather than in multiple locations. Third, database management systems enable the information system to give access to and share information between employees at various locations. Fourth, database management systems maintain the integrity of the various databases by controlling access to data, providing security, and ensuring that data are available when they are needed. Finally, database management systems help databases evolve so that they continue to meet the needs of their users.
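The second function, reducing redundancy, can be illustrated with a small sketch that again uses Python's built-in sqlite3 module. The schema and values are hypothetical. A customer's address is stored in one location only, and invoices refer to the customer by account number, so a single update is reflected everywhere the address is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    account_number TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    address TEXT NOT NULL            -- the address is stored here and nowhere else
);
CREATE TABLE invoices (
    invoice_id INTEGER PRIMARY KEY,
    account_number TEXT NOT NULL REFERENCES customers(account_number),
    amount REAL NOT NULL             -- no copy of the address on each invoice
);
INSERT INTO customers VALUES ('AC-1001', 'Rivera', '12 Elm St');
INSERT INTO invoices (account_number, amount)
    VALUES ('AC-1001', 40.0), ('AC-1001', 15.5);
""")

# The customer moves: one update in one place.
conn.execute(
    "UPDATE customers SET address = ? WHERE account_number = ?",
    ("98 Oak Ave", "AC-1001"),
)

# Every invoice picks up the new address through the join.
rows = conn.execute("""
    SELECT i.invoice_id, c.address, i.amount
    FROM invoices AS i
    JOIN customers AS c ON c.account_number = i.account_number
    ORDER BY i.invoice_id
""").fetchall()
print(rows)
```

Had the address been copied onto each invoice, the update would have had to touch every copy, and any copy that was missed would leave the data inconsistent.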
Figure 2 shows the relationship of a database management system to other programs in the computer memory. The application program (a software program that performs functions not related to the running of the computer itself) requests the database management system to locate and retrieve data according to a schema -- a predefined structure. The database management system interacts with the operating systems for the computer and the network to transmit the data over the network.
Applications
An example of a database management system in action is that of Chubb Corporation. Chubb is one of the top ten publicly traded insurance organizations in the US and has more than 13,000 employees, with over 130 offices in more than 30 countries. The company also works with approximately 8,000 independent insurance agents and brokers across the globe. One of the characteristics that has made Chubb so successful is its ability to create custom insurance products quickly and cost-effectively. This complex task is further complicated by the fact that insurance laws differ widely not only from country to country, but from state to state.
To manage this complex task, Chubb created a system based on an object-oriented database that serves as the central resource for packaging the various data necessary into custom-designed policies for homeowner, vehicle, excess liability, and valuable article coverage. Object-oriented databases store reusable objects that contain data about themselves and how they are to be processed. The object-oriented database stores these data as well as information about the objects. Instead of storing different product rules for each state or location, Chubb's system stores the various rate and rule information in business objects that contain data about themselves and about how they are to be processed. The system contains more than 1,000 classes with approximately 18 methods for each class. For example, if writing a policy for jewelry, the system will include classes such as precious stones, gold, and silver. Within the precious stones class are methods such as valuation, risk of loss, allowable coverage, and premium calculation. When a Chubb agent wants to create a new policy, he or she enters the information about the customer's needs (e.g., type of insurance, amount of coverage desired) into the system. The system then retrieves the objects relevant to the situation and prepares a premium quote for the customer based on the stored methods describing state regulatory constraints. One of the benefits of this system is that business objects can be reused. For example, if multiple states have the same regulations concerning homeowners insurance, the same object can be used to prepare policies in all these states.
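The idea of a reusable business object can be sketched in Python. This is not Chubb's system: the class, its methods, the rates, the coverage limits, and the state labels are all hypothetical, chosen only to show how an object can carry both its data and the methods that process it, and how one object can serve several states that share the same regulations.

```python
class HomeownerRule:
    """Hypothetical business object: rate and rule data plus the methods that use them."""

    def __init__(self, rate_per_thousand, max_coverage):
        self.rate_per_thousand = rate_per_thousand  # illustrative rate, not a real tariff
        self.max_coverage = max_coverage            # illustrative regulatory limit

    def allowable_coverage(self, requested):
        """Cap the requested coverage at the limit this rule allows."""
        return min(requested, self.max_coverage)

    def premium(self, requested):
        """Compute a premium from the allowable coverage and the stored rate."""
        return self.allowable_coverage(requested) / 1000 * self.rate_per_thousand

# Two states with identical regulations reuse one object; a third has its own.
shared_rule = HomeownerRule(rate_per_thousand=3.5, max_coverage=500_000)
rules_by_state = {
    "State A": shared_rule,
    "State B": shared_rule,
    "State C": HomeownerRule(rate_per_thousand=4.0, max_coverage=400_000),
}

def quote(state, requested_coverage):
    """Retrieve the object relevant to the state and prepare a premium quote."""
    return rules_by_state[state].premium(requested_coverage)

print(quote("State A", 200_000), quote("State C", 450_000))
```

Changing the shared rule once changes the quotes for both states that use it, which is the reuse benefit described in the paragraph above.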
Terms & Concepts
Business Process: Any of a number of linked activities that transform an input to the organization into an output that is delivered to the customer. Business processes include management processes, operational processes (e.g., purchasing, manufacturing, marketing), and supporting processes (e.g., accounting, human resources).
Content Management System (CMS): A software system that allows users to manage the content of a collection of data including computer files, audio files, graphics and images, electronic documents, and web content. Examples of content management systems include web content management systems, document management systems, and workflow management systems for article publication.
Database: A collection of data items used for multiple purposes that is stored on a computer.
Database Management System (DBMS): A software program that allows users to manage the data in a database. Database management systems are designed to increase the accessibility of data and the productivity of the user.
Electronic Document Management System (EDMS): A computer system that tracks and stores electronic documents (e.g., word processing files) and/or images of paper documents.
Granularity: The relative size, scale, degree of detail, or depth of penetration that defines an object or activity. Actions or items at a finer level of granularity enable those at a coarser granularity. For example, a yardstick that is broken up into inches has finer granularity than a yardstick that is broken up into feet.
Information Technology: The use of computers, communications networks, and knowledge in the creation, storage, and dispersal of data and information. Information technology comprises a wide range of items and abilities for use in the creation, storage, and distribution of information.
Information System: A system that facilitates the flow of information and data between people or departments.
Metadata: Data that describe aspects of other data, especially the structure, context, and meaning of raw data. In information technology, metadata are used to organize and interpret data so that they can be converted into meaningful information. Types of metadata include detailed compilations (e.g., data dictionaries with information about individual data elements) and descriptions (e.g., title, key words in an HTML page).
Network: A set of computers that are electronically linked together.
Processing: The activity of converting, analyzing, computing, and synthesizing data or information stored in a computer so that it is in a useful form.
Spreadsheet: A table of values arranged in rows and columns in which the values have predefined relationships. Spreadsheet application software allows users to create and manipulate spreadsheets electronically.
Workstation: A desktop computer that is connected to a network. Workstations are also sometimes referred to as clients or nodes.