Common Data Styles

on Monday, 15 June 2009

What does XML data look like? Three popular ways of modeling machine-readable information are:

- The tabular style, familiar from spreadsheets
- The graph style, familiar from relational databases and the World Wide Web
- The hierarchical style, familiar from computer file systems

XML can use any of the three styles, but it is optimized for the hierarchical (or tree) style. Database specialists are particularly fond of the graph style, as it simplifies database import and export and is most suited for fully normalized data. Developers of desktop applications often prefer the tabular style, especially for initialization files, as it is easy to set up; some use simple lists, which are equivalent to a single-column table. This section looks at the strengths and weaknesses of all three styles.
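As a rough illustration of the three styles, the sketch below holds the same two employees in a tabular, a graph, and a hierarchical shape using plain Python structures. The node identifiers (`u-db`, `u-sys`), the `works_in` link label, and the helper function names are assumptions made up for this example; they do not come from the text.

```python
# The same two employees in each of the three styles (illustrative shapes only).

# Tabular: fixed columns, one row per thing.
columns = ("employee", "id", "unit")
table = [
    ("Janet Mulville", "e000234", "Database"),
    ("Ahmed Said", "e000345", "Systems"),
]

# Graph: nodes plus explicit links between them, much like foreign keys.
# The unit identifiers and the "works_in" label are hypothetical.
nodes = {
    "e000234": {"name": "Janet Mulville"},
    "e000345": {"name": "Ahmed Said"},
    "u-db": {"name": "Database"},
    "u-sys": {"name": "Systems"},
}
edges = [
    ("e000234", "works_in", "u-db"),
    ("e000345", "works_in", "u-sys"),
]

# Hierarchical: every item is nested inside exactly one parent.
tree = {
    "Database": [{"name": "Janet Mulville", "id": "e000234"}],
    "Systems": [{"name": "Ahmed Said", "id": "e000345"}],
}


def units_from_table(rows):
    """Collect unit names by reading the 'unit' column of every row."""
    unit_index = columns.index("unit")
    return {row[unit_index] for row in rows}


def units_from_graph(graph_nodes, graph_edges):
    """Collect unit names by following 'works_in' links to their targets."""
    return {
        graph_nodes[target]["name"]
        for _, label, target in graph_edges
        if label == "works_in"
    }


def units_from_tree(root):
    """Collect unit names by reading the top level of the hierarchy."""
    return set(root)
```

All three helpers return the same set of unit names; what differs is where the relationship lives: in a column, in a link, or in the nesting.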

4.3.1. The Tabular Style
At their best, tables are a space-efficient way of representing structured information for machines or for humans. A column in a table represents a labeled field (one piece of information about a thing), whereas each row represents all the fields for the same thing, as in Table 4-1.

Table 4-1. Simple Data Table

| Employee | ID | Title | Unit | Specialization | Years |
|---|---|---|---|---|---|
| Janet Mulville | e000234 | Senior Consultant | Database | Data modeling | 9 |
| Ahmed Said | e000345 | Project Manager | Systems | System integration | 11 |
| Julie Fujikawa | e009122 | Intern | Systems | System integration | 1 |

The arrival of the consumer spreadsheet with VisiCalc in 1979 gave end users their first chance to work with structured information directly, and its tabular format proved easy to understand and work with. Twenty-five years later, the spreadsheet is still at the center of many business applications; in fact, much business software is nothing more than customizations built on top of Microsoft's Excel spreadsheet.

The tabular style is especially effective when every data object has roughly the same kinds and quantities of information. But this style quickly becomes awkward when different objects have different kinds of information or when information can repeat. In Table 4-1, it could be that the more senior employees have more than one specialization; for example, Janet Mulville might also have 6 years of experience with C++ programming and 3 years of experience with application server design. How would that fit into this table? The obvious first solution is to start repeating the Specialization and Years columns, as in Table 4-2.

Table 4-2. Messy Data Table

| Employee | ID | Title | Unit | Spec-1 | Years-1 | Spec-2 | Years-2 | Spec-3 | Years-3 |
|---|---|---|---|---|---|---|---|---|---|
| Janet Mulville | e000234 | Senior Consultant | Database | Data modeling | 9 | C++ | 6 | Appl. servers | 3 |
| Ahmed Said | e000345 | Project Manager | Systems | System integration | 11 | | | | |
| Julie Fujikawa | e009122 | Intern | Systems | System integration | 1 | | | | |

Because the structure is a table, Ahmed Said and Julie Fujikawa have blank fields for additional specializations that they do not have. If a new employee arrives with five specializations, the table will need extra columns in every row, so all the other employees will have still more unnecessary blank fields. When Julie is temporarily assigned to the Database unit for 50 percent of her time, an additional Unit column will be needed as well, and so on, until it becomes extremely difficult for a person or a machine to make much sense of the information.
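The growth in blank fields can be counted directly. The following is a minimal sketch, assuming each employee is held as a dictionary with a variable-length list of (specialization, years) pairs; the function names and the fifth employee ("New Hire", with placeholder skills) are hypothetical and serve only to show the effect.

```python
BASE_COLUMNS = ["Employee", "ID", "Title", "Unit"]


def to_wide_rows(employees):
    """Flatten employees into fixed-width rows, padding short rows with blanks."""
    width = max(len(e["specs"]) for e in employees)
    header = list(BASE_COLUMNS)
    for i in range(1, width + 1):
        header += [f"Spec-{i}", f"Years-{i}"]
    rows = []
    for e in employees:
        row = [e["name"], e["id"], e["title"], e["unit"]]
        for spec, years in e["specs"]:
            row += [spec, str(years)]
        # Every row must be as wide as the widest employee requires.
        row += [""] * (len(header) - len(row))
        rows.append(row)
    return header, rows


def count_blanks(rows):
    """Count the empty cells across all rows."""
    return sum(cell == "" for row in rows for cell in row)


employees = [
    {"name": "Janet Mulville", "id": "e000234", "title": "Senior Consultant",
     "unit": "Database",
     "specs": [("Data modeling", 9), ("C++", 6), ("Appl. servers", 3)]},
    {"name": "Ahmed Said", "id": "e000345", "title": "Project Manager",
     "unit": "Systems", "specs": [("System integration", 11)]},
    {"name": "Julie Fujikawa", "id": "e009122", "title": "Intern",
     "unit": "Systems", "specs": [("System integration", 1)]},
]

header_before, rows_before = to_wide_rows(employees)

# Hypothetical new employee with five specializations (placeholder values).
new_hire = {"name": "New Hire", "id": "e000000", "title": "Consultant",
            "unit": "Systems",
            "specs": [(f"Skill {n}", 1) for n in range(1, 6)]}
header_after, rows_after = to_wide_rows(employees + [new_hire])

print(len(header_before), count_blanks(rows_before))  # 10 8
print(len(header_after), count_blanks(rows_after))    # 14 20
```

One added row widens the table from 10 to 14 columns and raises the number of blank cells from 8 to 20, even though none of the existing employees changed.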

XML has techniques to make tabular information a little more readable and efficient, however. As a starting point, consider Listing 4-1, which is a direct rendition into XML of the information in Table 4-2.

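Listing 4-1. Raw Table in XML

[Element markup not preserved in this copy; the text content of each row follows, with empty entries marked.]

Employee Information

Employee | ID | Title | Unit | Spec-1 | Years-1 | Spec-2 | Years-2 | Spec-3 | Years-3
Janet Mulville | e000234 | Senior Consultant | Database | Data modeling | 9 | C++ | 6 | Appl. servers | 3
Ahmed Said | e000345 | Project Manager | Systems | System integration | 11 | (empty) | (empty) | (empty) | (empty)
Julie Fujikawa | e009122 | Intern | Systems | System integration | 1 | (empty) | (empty) | (empty) | (empty)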

XML can improve on that representation in several ways while still staying in the spirit of the tabular approach. First, because XML already labels every element, a header row labeling the columns is not needed; instead, the column labels can appear as element names. Then, once each entry is labeled, the blank ones need not be included.[2] The result is the much more readable and space-efficient XML table in Listing 4-2, although it is still far from ideal XML markup.

[2] In fact, the blank entries in Listing 4-1 could have been left out anyway, as they are all trailing; however, any nonblank entries at the ends of the rows would have had to be included.
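The two improvements described above (column labels as element names, blank entries dropped) can be sketched with Python's standard `xml.etree.ElementTree` module. The element names `table`, `title`, `head`, `body`, `row`, `entry`, `employees`, and `record` are assumptions chosen for this example and are not necessarily the names used in the original listings; the data values come from Table 4-2.

```python
import xml.etree.ElementTree as ET

# A raw table in the spirit of Listing 4-1: a header row plus blank entries.
# All element names here are assumed for illustration.
RAW = """\
<table>
  <title>Employee Information</title>
  <head>
    <row>
      <entry>Employee</entry><entry>ID</entry><entry>Title</entry>
      <entry>Unit</entry><entry>Spec-1</entry><entry>Years-1</entry>
      <entry>Spec-2</entry><entry>Years-2</entry><entry>Spec-3</entry>
      <entry>Years-3</entry>
    </row>
  </head>
  <body>
    <row>
      <entry>Janet Mulville</entry><entry>e000234</entry>
      <entry>Senior Consultant</entry><entry>Database</entry>
      <entry>Data modeling</entry><entry>9</entry><entry>C++</entry>
      <entry>6</entry><entry>Appl. servers</entry><entry>3</entry>
    </row>
    <row>
      <entry>Ahmed Said</entry><entry>e000345</entry>
      <entry>Project Manager</entry><entry>Systems</entry>
      <entry>System integration</entry><entry>11</entry>
      <entry/><entry/><entry/><entry/>
    </row>
    <row>
      <entry>Julie Fujikawa</entry><entry>e009122</entry>
      <entry>Intern</entry><entry>Systems</entry>
      <entry>System integration</entry><entry>1</entry>
      <entry/><entry/><entry/><entry/>
    </row>
  </body>
</table>
"""


def label_rows(raw_xml):
    """Turn header labels into element names and drop blank entries."""
    table = ET.fromstring(raw_xml)
    labels = [entry.text.strip() for entry in table.find("head/row")]
    result = ET.Element("employees")
    for row in table.findall("body/row"):
        record = ET.SubElement(result, "record")
        for label, entry in zip(labels, row):
            text = (entry.text or "").strip()
            if text:  # blank entries are simply left out
                ET.SubElement(record, label.lower()).text = text
    return result


labeled = label_rows(RAW)
print(ET.tostring(labeled, encoding="unicode"))
```

Because every value now carries its own label, a reader no longer needs to count positions against a header row, and the records for Ahmed Said and Julie Fujikawa shrink to six child elements each.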

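Listing 4-2. Readable Table in XML

[Element markup not preserved in this copy; the text content of each record follows.]

Janet Mulville | e000234 | Senior Consultant | Database | Data modeling | 9 | C++ | 6 | Appl. servers | 3
Ahmed Said | e000345 | Project Manager | Systems | System integration | 11
Julie Fujikawa | e009122 | Intern | Systems | System integration | 1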
