|
The quality of data input determines the quality of information output. Systems analysts can support accurate data entry through achievement of three broad objectives: effective coding, effective and efficient data capture and entry, and assuring quality through validation. Coding aids in reaching the objective of efficiency, since data that are coded require less time to enter and reduce the number of items entered. Coding can also help in appropriate sorting of data during the data transformation process. Additionally, coded data can save valuable memory and storage space. In establishing a coding system, systems analysts should follow these guidelines:
The simple sequence code is a number assigned to an item that needs to be numbered; it therefore has no relation to the data itself. Classification codes distinguish one group of data, with special characteristics, from another, and can consist of either a single letter or a single number. The block sequence code is an extension of the sequence code. Its advantage is that the data are grouped according to common characteristics while still keeping the simplicity of assigning the next available number within the block to the next item needing identification.
A mnemonic is a memory aid. Any code that helps the data-entry person remember how to enter the data, or the end user remember how to use the information, can be considered a mnemonic. Mnemonic coding can be less arbitrary, and therefore easier to remember, than numeric coding schemes. Compare, for example, a gender coding system that uses "F" for Female and "M" for Male with an arbitrary numeric coding in which perhaps "1" means Female and "2" means Male. Or should it be "1" for Male and "2" for Female? Or why not "7" for Male and "4" for Female? The arbitrary nature of numeric coding makes it more difficult for the user.
Date Formats
An effective format for the storage of date values is the eight-digit YYYYMMDD format, as it allows for easy sorting by date. Note the importance of using four digits for the year. This eliminates any ambiguity about whether a value such as 01 means the year 1901 or the year 2001. Using four digits also ensures that the correct sort sequence is maintained in a group of records that includes year values both before and after the turn of the century (e.g., 1999, 2000, 2001).
Remember, however, that the date format you use for storing a date value need not be the same format that you present to the user in the user interface or require of the user for data entry. While YYYYMMDD is useful for storing date values, it is not how people commonly write or read dates. A person is more likely to be familiar with dates in MMDDYY format. That is, a person is much more likely to be comfortable writing the date December 25, 2001 as "12/25/01" than as "20011225". Fortunately, it is a simple matter to code a routine that sits between the user interface or data entry routines and the data storage routines that read from or write to magnetic disk. Thus, date values can be saved on disk in whatever format is convenient for storage and sorting, while being presented in the user interface, data entry routines, and printed reports in whatever format is convenient and familiar for human users.
Data Entry Methods
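The conversion routine described in the date-format discussion above, sitting between data entry and storage, might be sketched as follows in Python. The function names and the MM/DD/YY entry format with slashes are assumptions made for illustration. Note that Python's two-digit-year parsing follows the POSIX convention (00 to 68 become 2000 to 2068, 69 to 99 become 1969 to 1999), which is itself a guess about what the user meant.

```python
from datetime import datetime

STORAGE_FORMAT = "%Y%m%d"    # YYYYMMDD, used on disk
DISPLAY_FORMAT = "%m/%d/%y"  # MM/DD/YY, assumed format shown to the user


def entry_to_storage(entered: str) -> str:
    """Convert a user-entered MM/DD/YY date to YYYYMMDD for storage."""
    return datetime.strptime(entered, DISPLAY_FORMAT).strftime(STORAGE_FORMAT)


def storage_to_display(stored: str) -> str:
    """Convert a stored YYYYMMDD date to MM/DD/YY for screens and reports."""
    return datetime.strptime(stored, STORAGE_FORMAT).strftime(DISPLAY_FORMAT)


# Four-digit years sort correctly as plain strings across the century boundary.
stored_dates = ["20000101", "19991231", "20010615"]
sorted_four_digit = sorted(stored_dates)

# The same dates with two-digit years (YYMMDD) sort in the wrong order.
sorted_two_digit = sorted(d[2:] for d in stored_dates)
```

Here `entry_to_storage("12/25/01")` yields `"20011225"`, and `sorted_two_digit` places the 1999 record last, which is the sorting problem that the four-digit year avoids.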
Tests for validating input data include: test for missing data, test for correct field length, test for class or composition, test for range or reasonableness, test for invalid values, test for comparison with stored data, setting up self-validating codes, and using check digits. Tests for class or composition are used to check whether data fields are correctly filled in with either numbers or letters. Tests for range or reasonableness do not permit a user to input a date such as October 32. This is sometimes called a sanity check. |
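Several of the validation tests listed above can be sketched as small Python functions. All function names here are hypothetical. The source does not say which check-digit scheme it has in mind, so the Luhn algorithm is used below purely as one common example.

```python
from datetime import date


def is_missing(value: str) -> bool:
    """Test for missing data: empty or whitespace-only input."""
    return value is None or value.strip() == ""


def has_length(value: str, expected: int) -> bool:
    """Test for correct field length."""
    return len(value) == expected


def is_numeric(value: str) -> bool:
    """Test for class or composition: ASCII digits only."""
    return value.isascii() and value.isdigit()


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Test for range or reasonableness: rejects dates such as October 32."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


def luhn_check_digit(payload: str) -> int:
    """Compute a Luhn check digit for a string of digits (assumed scheme)."""
    total = 0
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:  # double every second digit, starting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def has_valid_check_digit(number: str) -> bool:
    """Check that the last digit matches the Luhn digit of the rest."""
    if not is_numeric(number) or len(number) < 2:
        return False
    return luhn_check_digit(number[:-1]) == int(number[-1])
```

For example, `is_valid_date(2001, 10, 32)` returns `False`, and `has_valid_check_digit("79927398713")` returns `True` because 3 is the Luhn check digit for 7992739871.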