S1: Ch. 22: XML and Databases
- Learning Goals:
- Learn about DBMS relevant XML standards
- Compare XML with Database Concepts
- Topics:
- Motivation
- Conceptual Data Model - XML Schema
- Logical Data Model - XML, XPath
- Physical Data Model - storage models
- Source: Textbook Chapter 26, Articles from Web
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S2: Generic Motivation
- Many web-based app. servers use a dedicated DBMS as a XML store
- Store XML objects,e.g. documents, messages
- ...Ex. Fig. 26.3 XML element < project > , pp. 847
- ...Exercise: Contrast XML and HTML (Fig. 26.2, pp. 845)
- Retreive a couple of XML object as per client request
- During execution, they manipulate XML data internally
- ...and send those to web browsers for display and manipulation!
- Example Application Domains
- Content management, e.g. Legal Documents
- Work-flow management within/across organizations
- ...Web-based EDI - XML based communication across organization
- Workload assumptions
- Database is dedicated to web-based applications
- Limited set of queries - get/put/xpath query XML documents
- Database is shared among only web-apps, not exposed to users
- Comparison with relational model assumptions
- Applications: what-if analysis, data mining, ...
- Human may interact with DBMS directly or via DSS tools
- Unlimited set of ad-hoc queries (e.g. 3NF)
- Database may be shared across multiple applications
- Optimizing DBMS dedicated to one application
- Denormalize schema to favor workload
- Reduce semantic gap with the application domain
- ...Add or map ontologies of databases and application software
- ...If application is web-centric, then SQL should add XML concepts
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S3: An Example - EDI
- Business Goal: Electronic Data Interchange
- Reduces delays, costs, and errors
- Ability to react quick to market conditions
- Reduce inventory costs via just-in-time management
- EDI (Electronic Data Interchange)
- Purpose: allows businesses to exchange data
- ...via a electronic messaging services
- Provides
- ...a collection of standard message formats
- ...a standard element dictionary
- EDI Elements
- ...Business Language
- ...Business Practices
- ...Trading Partner Profile
- ...Logging and Archiving
- ...Acknowledgements
- ...Application APIs
- ...Transaction Expertise
- ...Message Standards
- Problem: Data communication among DBMS across a supplier chain
- Entities: Merchandise, Purchase Orders, Shipping Order,
- ...Delivery Receipts, Invoices, Payments
- Organizations: Retailers, Vendors, Shippers
- Each organizations has its own DBMS
- ...diverse relational schemas
- ...diverse brand-names DBMS software
- ...diverse data-type associated with common data-item, e.g. date
- Issues - Domain Specific
- Federated Schemas, e.g. element dictionary
- Standard message formats for data exchange
- Communication Protocols
- Translators between SDEF and local DBMS
- Approaches - domain specific standards
- pre-internet: standard file formats, communication protocols
- post-internet: XML data format, web-style http protocol
- XML is a language to define standard data-structures
- Pros: web-compliant, getting popular
- Cons: text format, usually large
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S4: XML and XML/EDI
- XML (eXtensible Markup Language)
- a file format for representing data
- a schema for describing data-structures
- Mechanism to extend and annotate HTML with semantic info.
- XML Terminology
- Tagging Standard
- Script Attachments
- Transaction Validation
- Search Techniques
- Linking and Reference
- Multimedia
- WWW
- Authoring Tools
- %XML/EDI
- A Standard to exchange EDI data
- ...e.g. invoices, project status
- Search, decode, manipulate, display data from
- ...Database, catalogs, workflow document, message
- ...messages, transaction (exchanged via an API)
- Implement business language, rules and objects
- ...EDI dictionaries
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S5: Conceptual Data Models
- Choices
- DTD - older, popular, weak in data modeling
- ...Figure 26.4, pp. 848, DTD for projects
- XML Schema - new, richer data modeling like ERDs
- ...Ex. Figure 26.5, pp. 850-3, XML Schema for company
- XML Schema Basics:
- grammar based syntax with following concepts
- schema, element, attributes
- Type System: Types - simpleType, Complex
- ...constructors - sequence, compositor
- ...group (of elements), attributeGroup
- Constraints - unique, key, keyref
- ...Null and default values
- Modulatization: abstract data types, namespaces
- Documentation and other annotations
- Ref.: www.xml.com/pub/a/2000/11/29/schemas/part1.html
- Q. Study latest draft of XML schema standard from web source,
- e.g. http://www.xml.com/pub/a/2000/11/29/schemas/part1.html
- Does it model inheritance ?
- Does it provide diagramming conventions to support analysis ?
- Converting EER to XML - Issues
- XML looks for a hierarchy of entities
- Many hierarchies may be possible
- ...Ex. choices for roots - Fig. 26.8, 26.10, 26.12
- ...Choices for XML schemas: Fig. 26.9, 11, 13
- EER can have Cycles, e.g. Figure 26.6 (pp. 857)
- ...Breaking the cycle, Fig. 26.13 (pp. 862)
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S6: Conceptual Data Model: Comparison with EER
- Compare EER diagram and XML Schema for company database
- EER diagram: Fig 3.2 pp. 54, Fig 4.1 pp. 87,
- XML Schema (Fig. 26.5, pp. 850-3)
- XML vs. EER - detailed comparison
- Pros: modularization primitives, e.g. namespace
- Cons: Does not model relationships well!
- ...Too detailed like SQL3, e.g. ADTs type constructors
- ...Relationship name, min/max cardinality constraints
- ...identifying relationships, weak entities
- ...disjoint/overlapping sub-classes
- XML Schema vs. EER - High level comparison
- XML Schema - web-based comm. among applications
- ...text based, nested tree like syntax
- ...Readable by programmers but not by business users
- ...Likely to be processed by diverse computer programs
- ...Data transfer among multiple applications
- EER - communication b/w analyst and business users
- ...readable by business users, graphic notation ,
- ...not easily processed by diverse computer programs
- ...abstract - omits details e.g. attribute data-types
- ...Shared database across multiple applications
- Should one compare XML with EER ?
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S7: Logical Data Model
- XML vs. Relational Model - High level comparison
- XML - web-based comm. among applications
- ...Goal: structure and exchange documents via web
- ...More of a data format than a data model
- ...Schema - explicit specification is optional
- ...Schema may be implicit, partial, incomplete, and may change
- ...Schema tags are duplicated across document
- Relational Model-
- ...Goals Store large volume of data, enable efficient access
- ...Ensure data quality via integrity constraints, e.g. domain, ref.
- ...Schema is mandatory, stored in a DBMS
- ...Tuples are checked for consistency with schema before insertion
- ... Schema evolution requires revision of legacy data
- ...Schema stored once in a DBMS
- An Example domain-specific XML
- XML and Relational Model - Detailed Comparison
- Pros:
- ...Web-based data transfer, may become popular!
- Cons:
- ...Relationships are represented by predicates
- ...Hierarchical - awkward for cycles
- ..., e.g. two relationships between a entity pair
- ...Inheritance: no distinction b/w part-of and sub-type
- ...Normal Forms - not popular yet with XML Schema!
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S8: Query Languages
- XML Standards
- XPath - tree transformation language
- Basic Elements of XPath
- Path Expressions
- Example - Figure 26.14, pp. 863
- FLWR expressions
- Example - Figure 26.15, pp. 864
- Comparison with relational algebra
- + Tree traversal for pre-materialized join
- - Joins, ad-hoc queries
- ...Relational algebra - declarative query language
- ...Normal Forms - allow ad-hoc querying for what if analysis
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)
S9: Physical data storage
- Storage Manager Choices
- Relational Storage Manager
- ...Exploring XML documents to tables
- ...Extracting XML documents from tables
- Object Storage Manager
- Native Storage Data Manager
- Indexing
- Query Processing
Copyright: S. Shekhar, C. S. Dept.,
University of Minnesota, Minneapolis, MN 55455. ,,.(home)