The Center for Biomedical Informatics
State University of Campinas, Brazil


Research Abstracts


A DISTRIBUTED MICROCOMPUTER-BASED SYSTEM FOR ANALYSIS OF MORTALITY DATA USING OPTICAL DISKS

Lobo da Costa Jr, M1; Sabbatini, RME2; Becker, RA3 and Tardelli, A4

1Department of Statistics and Epidemiology, School of Public Health, University of São Paulo, São Paulo; 2Center for Biomedical Informatics, State University of Campinas, Campinas; 3National Division of Epidemiology, Ministry of Health, Brasilia; 4Latin-American and Caribbean Center for Information in Health Sciences, Pan-American Health Organization, São Paulo, Brazil


The analysis of country-wide mortality data is very important for many epidemiological and public health projects at all levels of the health care and prevention system, particularly in developing countries, where data on morbidity are usually scarce. Many institutions and individuals at the local, national and international levels have a continuing interest in collecting, sorting, retrieving and analyzing the demographic, personal and nosological data available through official death certificates. However, although many countries have centralized computers holding mortality data, from which annual reports and customized searches and analyses can be produced, large geographic distances, high costs, budgetary constraints and other factors considerably hinder the remote use of such data.

After analyzing these aspects, and aiming at a different concept of database access by remote end-users, the National Division of Epidemiology of the Brazilian Federal Ministry of Health started a project in collaboration with the Center for Biomedical Informatics of the State University of Campinas, State of São Paulo; the School of Public Health of the University of São Paulo; and the Latin-American and Caribbean Center for Information in Health Sciences, Pan-American Health Organization, São Paulo. The project set out to develop and distribute: 1) Computer-readable copies of the national mortality database in a standard format, either in its entirety (on a CD-ROM (Compact Disk Read Only Memory) optical disk) or in logically divided subsets (on 5 1/4" magnetic diskettes), both forms being readable by standard IBM-PC-compatible microcomputers; 2) A user-friendly software package for microcomputers, capable of simple statistical analysis of the distributed mortality database, such as the production of cross-tables, time series and graphs.

The mortality database was produced by downloading the mainframe-based records to an IBM-PC microcomputer equipped with a micro-to-mainframe terminal emulator card. It comprised all the original death certificate (DC) records from 1979 to 1986, in excess of six million records, with a total size of almost 550 Mbytes. Two databases were produced: one, called the Reduced Set, had the 14 most important variables recorded on the DC; the other, called the Complete Set, had the full set of 39 variables. The recording format was a card-image, fixed-format, numerically coded ASCII string, so that it could be read by the most common microcomputer software packages.
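As a rough illustration of the card-image, fixed-format layout described above, the following Python sketch slices one numerically coded ASCII record into named fields. The field names, column positions and sample record are hypothetical; the abstract does not give the actual layout of either set.

```python
# Hypothetical layout: (field name, start column, width), columns 0-based.
# The real Reduced Set had 14 variables; only four invented ones are shown.
LAYOUT = [
    ("year", 0, 2),
    ("state", 2, 2),
    ("sex", 4, 1),
    ("cause", 5, 4),
]


def parse_record(line):
    """Slice one fixed-width record into a dict of code strings.

    Codes are kept as strings so that leading zeros are preserved.
    """
    return {name: line[start:start + width] for name, start, width in LAYOUT}


sample = "843510410"  # invented record, not real data
record = parse_record(sample)
print(record)
```

Because every field sits at a fixed column, any package able to read plain ASCII text can extract it by position alone, which is presumably why this format was chosen.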

The microcomputer software was written in the compiled Turbo BASIC language (Borland International, Inc.), for a minimal hardware platform consisting of an MS-DOS-based 16-bit microcomputer, PC-XT compatible, with 640 Kbytes of RAM, a 5 1/4" 360 Kbyte diskette drive, a 20 Mbyte or larger hard disk, CGA video graphics and a graphics-capable 80-column impact matrix printer. The software has the following characteristics: 1) User friendliness, achieved through a simple, point-and-choose, menu-based text interface, as well as interactive auxiliary code explanation files; 2) Speed, achieved through several programming techniques. Maximum performance was measured at ca. 130,000 records/minute in the simplest analytical mode, on a 386-based PC with a math coprocessor. The resulting graphs and tables may be sent, at the user's choice, to the screen, the printer or a disk file. In the last case, they can easily be incorporated into word processing files. The most recent analyses can be reviewed on screen without having to be repeated. The schema of the database is stored in a separate ASCII configuration file, which contains the names of the fields, their lengths, admissible codes and ranges, etc. The error, warning and prompt messages used by the program are likewise kept in a separate file. Thus, it is quite easy to customize the software for other languages or for mortality databases with different layouts. The program also has an option to produce new data files containing a subset of the distributed database, chosen by means of selection criteria, which is useful for speeding up data analysis or for distributing data subsets. The program modules are: 1) Definition of the file set; 2) Selective listing of the database; 3) Simple frequency statistics and histogram; 4) Cross-table analysis; 5) Time series analysis and graphs; 6) Review of previous analyses; 7) Production of database subsets; and 8) Access to auxiliary code databases.
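The idea of keeping the database schema in a separate ASCII configuration file, and of building cross-tables and subsets from it, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the original Turbo BASIC program: the configuration syntax, field names, codes and records are all invented for the example.

```python
from collections import Counter

# Hypothetical configuration text: one field per line as
# name,width,admissible codes (separated by "|"); "*" means any code.
CONFIG = """\
year,2,*
sex,1,1|2|9
cause,4,*
"""


def load_schema(text):
    """Turn the configuration text into (name, start, width, codes) tuples."""
    schema, start = [], 0
    for line in text.splitlines():
        name, width, codes = line.split(",")
        allowed = None if codes == "*" else set(codes.split("|"))
        schema.append((name, start, int(width), allowed))
        start += int(width)
    return schema


def parse(line, schema):
    """Slice a fixed-width record; reject codes the schema does not admit."""
    rec = {}
    for name, start, width, allowed in schema:
        value = line[start:start + width]
        if allowed is not None and value not in allowed:
            raise ValueError(f"{name}: inadmissible code {value!r}")
        rec[name] = value
    return rec


def cross_table(records, row, col):
    """Count records for each (row value, column value) pair."""
    return Counter((r[row], r[col]) for r in records)


def subset(records, **criteria):
    """Keep only records whose fields match every selection criterion."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]


schema = load_schema(CONFIG)
lines = ["8410410", "8420410", "8510410", "8511620"]  # invented records
records = [parse(line, schema) for line in lines]

table = cross_table(records, "year", "sex")
selected = subset(records, year="85", sex="1")
print(table)
print(len(selected))
```

In this sketch, adapting to a database with a different layout means editing only the configuration text, which mirrors the customization route the abstract describes.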

The diskette-based subsets of the national mortality database were distributed to all 26 Brazilian State Secretariats of Health, where they are currently being employed for many purposes, but mainly for epidemiological investigations covering the regions of interest of the Unified Health System offices. Moreover, an additional 80 copies of several customized subsets were distributed on demand to investigators and health authorities located in many cities, states, and even other countries. The CD-ROM-based full set of mortality records was also distributed to state authorities, medical and public health schools, and investigators in other countries. Of the more than 200 copies produced, only about 30 were distributed, mainly because CD-ROM drives are not yet widespread in Brazil and must be imported at higher cost than is usual in developed countries. A new project is underway to fund a wider distribution of the CD-ROM, to update it with data for the years 1987 to 1989, and to donate CD-ROM drives to the user institutions.

The present initiative was greeted with great enthusiasm by the users of mortality data in our country, as shown by the unexpectedly large number of persons and institutions that asked for the database. Not only were they freed from the delays and vagaries involved in getting statistical analyses from the central computing facility of the Ministry of Health, but they were also able to "play" freely with the data, performing as many exploratory data analyses as desired, in an interactive manner. The fast analysis times contributed to that. Investigations that had not previously been pursued because of the difficulty of gaining access to data suddenly became possible. Moreover, the long-term computing demand placed by external users on the Ministry was considerably reduced.

Hardware requirements were kept simple and inexpensive, so that practically all users were able to find the proper resources. The ease of operating the program, even for people without formal training in computing, contributed greatly to its popularity and actual use. Training requirements were kept to a minimum.


Presented at:

II Conferencia Internacional de Informatica Medica, Congreso Internacional de Informatica, La Habana, Cuba, February 1992

Last Updated: March 2, 1996

renato@sabbatini.com