S1: Ch. 22: XML and Databases
Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: Generic Motivation
  • Many web-based app. servers use a dedicated DBMS as a XML store
    • Store XML objects,e.g. documents, messages
    • ...Ex. Fig. 26.3 XML element < project > , pp. 847
    • ...Exercise: Contrast XML and HTML (Fig. 26.2, pp. 845)
    • Retreive a couple of XML object as per client request
    • During execution, they manipulate XML data internally
    • ...and send those to web browsers for display and manipulation!
  • Example Application Domains
    • Content management, e.g. Legal Documents
    • Work-flow management within/across organizations
    • ...Web-based EDI - XML based communication across organization
  • Workload assumptions
    • Database is dedicated to web-based applications
    • Limited set of queries - get/put/xpath query XML documents
    • Database is shared among only web-apps, not exposed to users
  • Comparison with relational model assumptions
    • Applications: what-if analysis, data mining, ...
    • Human may interact with DBMS directly or via DSS tools
    • Unlimited set of ad-hoc queries (e.g. 3NF)
    • Database may be shared across multiple applications
  • Optimizing DBMS dedicated to one application
    • Denormalize schema to favor workload
    • Reduce semantic gap with the application domain
    • ...Add or map ontologies of databases and application software
    • ...If application is web-centric, then SQL should add XML concepts


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: An Example - EDI
  • Business Goal: Electronic Data Interchange
    • Reduces delays, costs, and errors
    • Ability to react quick to market conditions
    • Reduce inventory costs via just-in-time management
  • EDI (Electronic Data Interchange)
    • Purpose: allows businesses to exchange data
    • ...via a electronic messaging services
    • Provides
    • ...a collection of standard message formats
    • ...a standard element dictionary
    • EDI Elements
    • ...Business Language
    • ...Business Practices
    • ...Trading Partner Profile
    • ...Logging and Archiving
    • ...Acknowledgements
    • ...Application APIs
    • ...Transaction Expertise
    • ...Message Standards
  • Problem: Data communication among DBMS across a supplier chain
    • Entities: Merchandise, Purchase Orders, Shipping Order,
    • ...Delivery Receipts, Invoices, Payments
    • Organizations: Retailers, Vendors, Shippers
    • Each organizations has its own DBMS
    • ...diverse relational schemas
    • ...diverse brand-names DBMS software
    • ...diverse data-type associated with common data-item, e.g. date
  • Issues - Domain Specific
    • Federated Schemas, e.g. element dictionary
    • Standard message formats for data exchange
    • Communication Protocols
    • Translators between SDEF and local DBMS
  • Approaches - domain specific standards
    • pre-internet: standard file formats, communication protocols
    • post-internet: XML data format, web-style http protocol
  • XML is a language to define standard data-structures
    • Pros: web-compliant, getting popular
    • Cons: text format, usually large


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: XML and XML/EDI
  • XML (eXtensible Markup Language)
    • a file format for representing data
    • a schema for describing data-structures
    • Mechanism to extend and annotate HTML with semantic info.
  • XML Terminology
    • Tagging Standard
    • Script Attachments
    • Transaction Validation
    • Search Techniques
    • Linking and Reference
    • Multimedia
    • WWW
    • Authoring Tools
  • %XML/EDI
    • A Standard to exchange EDI data
    • ...e.g. invoices, project status
    • Search, decode, manipulate, display data from
    • ...Database, catalogs, workflow document, message
    • ...messages, transaction (exchanged via an API)
    • Implement business language, rules and objects
    • ...EDI dictionaries


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: Conceptual Data Models
  • Choices
    • DTD - older, popular, weak in data modeling
    • ...Figure 26.4, pp. 848, DTD for projects
    • XML Schema - new, richer data modeling like ERDs
    • ...Ex. Figure 26.5, pp. 850-3, XML Schema for company
  • XML Schema Basics:
    • grammar based syntax with following concepts
    • schema, element, attributes
    • Type System: Types - simpleType, Complex
    • ...constructors - sequence, compositor
    • ...group (of elements), attributeGroup
    • Constraints - unique, key, keyref
    • ...Null and default values
    • Modulatization: abstract data types, namespaces
    • Documentation and other annotations
    • Ref.: www.xml.com/pub/a/2000/11/29/schemas/part1.html
  • Q. Study latest draft of XML schema standard from web source,
    • e.g. http://www.xml.com/pub/a/2000/11/29/schemas/part1.html
    • Does it model inheritance ?
    • Does it provide diagramming conventions to support analysis ?
  • Converting EER to XML - Issues
    • XML looks for a hierarchy of entities
    • Many hierarchies may be possible
    • ...Ex. choices for roots - Fig. 26.8, 26.10, 26.12
    • ...Choices for XML schemas: Fig. 26.9, 11, 13
    • EER can have Cycles, e.g. Figure 26.6 (pp. 857)
    • ...Breaking the cycle, Fig. 26.13 (pp. 862)


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: Conceptual Data Model: Comparison with EER
  • Compare EER diagram and XML Schema for company database
    • EER diagram: Fig 3.2 pp. 54, Fig 4.1 pp. 87,
    • XML Schema (Fig. 26.5, pp. 850-3)
  • XML vs. EER - detailed comparison
    • Pros: modularization primitives, e.g. namespace
    • Cons: Does not model relationships well!
    • ...Too detailed like SQL3, e.g. ADTs type constructors
    • ...Relationship name, min/max cardinality constraints
    • ...identifying relationships, weak entities
    • ...disjoint/overlapping sub-classes
  • XML Schema vs. EER - High level comparison
    • XML Schema - web-based comm. among applications
    • ...text based, nested tree like syntax
    • ...Readable by programmers but not by business users
    • ...Likely to be processed by diverse computer programs
    • ...Data transfer among multiple applications
    • EER - communication b/w analyst and business users
    • ...readable by business users, graphic notation ,
    • ...not easily processed by diverse computer programs
    • ...abstract - omits details e.g. attribute data-types
    • ...Shared database across multiple applications
  • Should one compare XML with EER ?


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: Logical Data Model
  • XML vs. Relational Model - High level comparison
    • XML - web-based comm. among applications
    • ...Goal: structure and exchange documents via web
    • ...More of a data format than a data model
    • ...Schema - explicit specification is optional
    • ...Schema may be implicit, partial, incomplete, and may change
    • ...Schema tags are duplicated across document
    • Relational Model-
    • ...Goals Store large volume of data, enable efficient access
    • ...Ensure data quality via integrity constraints, e.g. domain, ref.
    • ...Schema is mandatory, stored in a DBMS
    • ...Tuples are checked for consistency with schema before insertion
    • ... Schema evolution requires revision of legacy data
    • ...Schema stored once in a DBMS
  • An Example domain-specific XML
  • XML and Relational Model - Detailed Comparison
    • Pros:
    • ...Web-based data transfer, may become popular!
    • Cons:
    • ...Relationships are represented by predicates
    • ...Hierarchical - awkward for cycles
    • ..., e.g. two relationships between a entity pair
    • ...Inheritance: no distinction b/w part-of and sub-type
    • ...Normal Forms - not popular yet with XML Schema!


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: Query Languages
  • XML Standards
    • XPath - tree transformation language
  • Basic Elements of XPath
    • Path Expressions
    • Example - Figure 26.14, pp. 863
    • FLWR expressions
    • Example - Figure 26.15, pp. 864
  • Comparison with relational algebra
    • + Tree traversal for pre-materialized join
    • - Joins, ad-hoc queries
    • ...Relational algebra - declarative query language
    • ...Normal Forms - allow ad-hoc querying for what if analysis


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: Physical data storage
  • Storage Manager Choices
    • Relational Storage Manager
    • ...Exploring XML documents to tables
    • ...Extracting XML documents from tables
    • Object Storage Manager
    • Native Storage Data Manager
  • Indexing
    • Path expression index
  • Query Processing


Copyright: S. Shekhar, C. S. Dept., University of Minnesota, Minneapolis, MN 55455. ,,.(home)