Page Model


ONE_HTML_PAGE ::= List( BUSINESS )

BUSINESS ::= < Name, Address, City, State, AreaCode, Phone >


NOTE:
Each HTML page is seen as a list of businesses, and each business description consists of the six items Name, Address, City, State, AreaCode, and Phone. In database terms, an HTML page represents a table with six columns (Name, Address, City, State, AreaCode, Phone), with one row per business.

Items to be extracted: Name, Address, City, State, AreaCode, Phone.
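
As an illustration of the page model above, the grammar can be sketched as plain data types. This is only a minimal sketch: the names Business, Page, COLUMNS, and page_to_table are assumptions introduced here, and the two records are made-up placeholder values, not data taken from any BigBook page.

```python
from typing import List, NamedTuple


class Business(NamedTuple):
    """One BUSINESS tuple; field order follows the grammar above."""
    name: str
    address: str
    city: str
    state: str
    area_code: str
    phone: str


# ONE_HTML_PAGE ::= List( BUSINESS )
Page = List[Business]

# Column names of the table that a page represents.
COLUMNS = ("Name", "Address", "City", "State", "AreaCode", "Phone")


def page_to_table(page: Page) -> List[tuple]:
    """Return the page as a header row followed by one row per business."""
    return [COLUMNS] + [tuple(business) for business in page]


# Made-up placeholder values, not taken from any BigBook page.
example_page: Page = [
    Business("Example Cafe", "1 Main St", "Springfield", "XX", "555", "0100"),
    Business("Sample Books", "2 Oak Ave", "Springfield", "XX", "555", "0101"),
]

table = page_to_table(example_page)
```

Area codes and phone numbers are kept as strings in this sketch so that leading zeros and any punctuation in the source text survive extraction unchanged.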


SAMPLE EXTRACTION OUTPUT: Because the pages have a highly regular structure (the six items are always present and always appear in the same order) and each document contains several business descriptions, we present a single example of a BigBook document.