Novell is now a part of Micro Focus

SQL Integrator: A Data Request Broker for Heterogeneous Data Access

Articles and Tips: article

STEVEN B. DAVIS, PH.D.
B2Systems, Inc.
5951 Encina Road
Santa Barbara, CA 93117

01 May 1998


http://www.b2systems.com

Find out why Novell recently licensed SQL Integrator from B2Systems to provide seamless access to data across a number of different database systems.

Introduction

SQL Integrator is a data request broker that provides a multiple-tier, multiple-database enterprise environment for connecting ODBC and JDBC compliant applications between client systems and database servers. Unlike conventional database drivers, which only access a single physical data source, SQL Integrator has a unique capability to access and join multiple data sources in a single database query. From the client viewpoint, SQL Integrator presents multiple distributed physical databases as a single logical database.

SQL Integrator is a product sold by Novell worldwide, under an OEM arrangement with B2Systems, Inc. This AppNote provides a conceptual introduction to SQL Integrator, describing its components and architecture. It then gives a brief description of how the product is installed and administered. It concludes with a look at programming aspects of SQL Integrator.

To learn more about SQL Integrator, contact your Novell Authorized Reseller or Novell at:

888-321-4272 http://www.novell.com

Components

SQL Integrator includes client, data broker, and data driver components. These components can reside on the same or different computer systems within a network.

  • The client components support the connection of SQL Integrator to end user applications or application development tools that use OBDC or JDBC. The client components also support a manager for maintaining the data dictionary and for direct interactive access of database tables.

  • The data broker components support client connectivity over a network, query parsing, optimization and distribution, and data joining across heterogeneous databases.

  • The data driver components support connectivity to physical databases or flat files.

The components can be summarized in the following table.


Component Name

Component Functionality

Client Components

SQL I - ODBC

Microsoft ODBC API

SQL I - JDBC

JavaSoft Java SQL classes

SQL I - MGR

Database Dictionary Manager and Interactive SQL

Data Broker Components

SQL I - DRB

Data Request Broker

SQL I - DD

Data Dictionary

Data Driver Components (local or remote to the Data Broker)

SQL I - ORACLE

Oracle Database Connection

SQL I - SYBASE

Sybase Database Connection

SQL I - MS-SQL

Microsoft SQL Server Database Connection

SQL I - DB2

IBM DB2 Database Connection

SQL I - INFORMIX

Informix Database Connection

SQL I - INGRES

Ingres Database Connection

SQL I - RDB

Oracle/RDB Database Connection

SQL I - FILES

Flat Files Database Connection

SQL I - REMOTE

SQL Integrator Remote Data Broker Connection

Architecture

The SQL Integrator architecture includes the client, data broker and data driver components listed in the above table. The architecture is designed for maximum flexibility in meeting the needs of enterprise wide application development and deployment. The components have been modularized so that they can be installed on multiple tiers within an enterprise network. The following examples illustrate several possible configurations. The examples show how a joined SQL query from a client is distributed by the data request broker to multiple databases.

Three Tier Application (Intranet or Internet)

A three tier application example is a Java applet running on a Windows system (client) that uses SQL I - JDBC to access a NetWare data broker (see Figure 1). The data broker accesses an Oracle database running on the same system (with a local data driver) and uses the network to access an MS-SQL database on a Windows NT server (using a remote data driver).

Figure 1: Three tier application.

Two Tier Application (Client-Server)

A two tier application example is a Visual Basic application running on a Windows system that uses SQL I - ODBC to access an HP-UX data broker that is also running Oracle and Sybase databases (using local data drivers), as shown in Figure 2.

Figure 2: Two tier application.

Single Tier Application (Single System)

A single tier application example is a Java applet running on an VAX OpenVMS system (client) that uses SQL I - JDBC to access RDB tables and flat RMS files on the same VAX system (data broker and local data driver), as shown in Figure 3.

Figure 3: Single tier application.

Architecture Diagram

The complete SQL Integrator component architecture is shown in Figure 4.

Figure 4: SQL Integrator component architecture.

In general, the data drivers can be configured on the same node as the data broker (local) or on a different node (remote), depending on the location of the physical database. The one exception is the data driver for flat files, which can only be configured for local access.

Client Components

SQL I - ODBC. ODBC (Open Database Connectivity) is an API (Application Programming Interface) developed by Microsoft. ODBC is Microsoft's implementation of the X/Open SQL CLI (Call Level Interface) that defines how client/server interactions are implemented for database applications. ODBC also supports the SQL grammar and syntax specified in the ANSI SQL-92 standard.

ODBC was developed to provide vendor neutral access to data sources from a client application. With ODBC, a client application is not restricted to any one vendor's proprietary interface, and the client application can connect to any ODBC-compliant data source.

SQL I - ODBC is a Microsoft compliant implementation of the ODBC API (version 2.5). As such, any client application or application development environment can connect to SQL I - ODBC. SQL I - ODBC provides an enhanced ODBC API that can access database tables from multiple heterogeneous networked databases and can join heterogeneous database tables in a single query.

SQL I - JDBC. JDBC (Java Database Connectivity) is a vendor neutral API developed by JavaSoft. The interface is very similar in concept to the Microsoft ODBC API and provides Java programmers with a uniform database interface to a wide range of relational databases. JDBC is a standard part of Java and is included in JDK (Java Development Kit) 1.1. Like ODBC, a Java JDBC application is not restricted to any one vendor's proprietary interface, and the Java application can connect to any JDBC-compliant data source.

SQL I - JDBC is a 100% Java JDBC compliant implementation of the JDBC API (version 1.1). Any Java application, applet, or application development environment can connect to SQL I - JDBC. SQL I - JDBC provides enhanced JDBC classes that can access database tables from multiple heterogeneous databases and can join heterogeneous database tables in a single query.

Dictionary Manager and Interactive SQL. SQL Integrator supports access to multiple physical databases by maintaining a data dictionary. The data dictionary resides on the same node as the SQL Integrator data broker components, which can be the same or different node as the other SQL Integrator components. This data dictionary has a metadata catalog with entries that point to physical database metadata.

A client application built using SQL Integrator will be independent of the physical database and can join heterogeneous database tables. The application is only aware of the data dictionary metadata information (table names, column names, etc.) and not the physical source of the data. The physical database source can be changed, for example, from local files to Oracle, or from Oracle to MS-SQL, with no change required in the application.

SQL I - MGR is an SQL Integrator client application for creating and maintaining SQL Integrator data dictionaries, for importing physical metadata information, and exporting SQL definitions. SQL I - MGR can also create and maintain local or remote flat files (sequential or indexed) for use in database applications where a physical database is not required or not available. Flat files can also be used as readily available data source for an application under development, especially if the metadata structure is also under development. A sample SQL I - MGR screen showing cataloged metadata is shown in Figure 5.

Figure 5: Sample SQL I-MGR screen.

SQL I - MGR also supports Structured Query Language (SQL), which is a well-defined ANSI (American National Standards Institute) standard language for database access. Database vendors strive to support the grammar and syntax of the SQL language, but inevitably change the implementation to support database specific features.

SQL I - SQL is a version of SQL that is based on the ANSI SQL-92 standard and that is database independent. SQL I - SQL eliminates the need to learn each vendor's dialect of SQL, not all of which are SQL-92 compatible.

SQL I - SQL also provides enhanced datatype support that may be missing from a vendor-specific database. SQL I - SQL is useful for interactive query and update of physical database tables and local files defined in the data dictionary, and can be used to test SQL syntax before embedding it in an application.

Data Broker Components

Data Dictionary. The SQL Integrator data dictionary resides on the same node as the SQL Integrator data broker components. The data dictionary has a metadata catalog, and the entries in the catalog point to the physical database tables or database views. Database views are very useful when the application only needs a subset of fields in a table, or the application requires frequent joins of multiple database tables.

The data dictionary can also have multiple entries (synonyms) for the same database table or view. Synonyms can be very useful in database applications that require self-joins, like in a parts explosion. In the discussion which follows, all references to physical database tables include views and synonyms.

Metadata information stored in the data dictionary includes the following information for each physical database table:

  • Table name

  • Column name, datatype, nullability

  • Primary and foreign keys

  • Indexes

  • Database type (Oracle, MS-SQL, and so on)

  • Database name and location

This metadata can be imported from a physical database to the data dictionary. The data dictionary also supports local files as a source of relational data.

The SQL Integrator data dictionary supports client applications that are truly database independent. The physical database can be changed without any effect on the application. In contrast, even though ODBC and JDBC provide vendor neutral programming interfaces, these interfaces themselves are not database independent. For example, some database vendors do not support ANSI SQL datatypes for timestamp and time intervals. Consequently, applications built with vendor supplied interfaces must be aware of which datatypes are supported. In contrast, SQL Integrator is truly database independent. If an ANSI SQL datatype is not supported by the physical database, the datatype is emulated by the data dictionary and mapped to an appropriate physical database type.

Data Request Broker. SQL Integrator contains a data request broker which is a sophisticated SQL processor, optimizer and distributor. The SQL processor supports full read/write capability for all database connections. SQL processing begins by decomposing an SQL statement (which may have embedded subqueries) into its constituent parts:

  • Data query statements (read data)

    • select

  • Data manipulation statements (write data)

    • update

    • insert

    • delete

Data queries are processed first, since the results may be needed for a data manipulation statement. For example, a statement like:

update emp set birthday = (select birthday from personnel where empno = 1234)

would require the select statement to be processed first, since the result is needed for the update statement.

Data Query Processing. SQL I - DRB can decompose and distribute data queries as close to the data source as possible. SQL I - DRB has the ability to decompose a query that includes tables from multiple databases into subqueries for each database, which are then executed within each database for maximum performance and minimum network traffic. SQL I - DRB also performs query pushdown and query optimization.

Query Pushdown. Decomposition of a query also leads to decisions regarding how much of the query can be pushed down into the physical database. Examples of pushdown operations are aggregates , join clauses, where clauses, and order by clauses. The more homogeneous the query (for example, strictly involving one database), the more SQL operations that can be executed in the physical database. SQL I - DRB will determine how much of the query can be sent directly to the database and how much must be processed by SQL I - DRB. SQL I - DRB attempts to push as much processing down to the underlying database as possible to maximize the overall performance.

Push down processing reduces network traffic by reducing data flow across the network. For example, the SUM aggregate is pushed down into the database if all the fields in the summation come from the same database. Only the result is returned across the network. If the summation was not pushed down, then all of the field values would be returned across the interface, which would result in increased network traffic.

Query Optimization. SQL I - DRB has the capability to rewrite a database query based on keys and indices in the physical database. SQL I - DRB also distributes a heterogeneous query involving more than one database based on the cardinality and selectivity of the tables involved in the query. This information determines the order and sequence of interactions with the various databases. The primary goal of the query optimization is to reduce the bandwidth requirements between SQL Integrator and the underlying physical databases.

Finally, SQL I - DRB will rewrite the query as necessary to handle database specific SQL dialect issues. For example, conversion of a date from a string (as used in a qualifier) to an Oracle date will use the Oracle TO_DATE function, and conversion to an MS-SQL date will use the MS-SQL CONVERT function.

Data Manipulation Processing. After the data query processing finishes, database change processing starts, perhaps using intermediate results from the query processing. For these commands, the SQL statement is also decomposed, processed, and forwarded to the physical database. The processing may include rewriting the statement for specific SQL dialects.

Data Driver Components

The SQL Integrator Data Request Broker connects to physical databases using the SQL Integrator Data Driver components. There is a data driver component for each supported database (or for flat files). The database drivers connect to the physical databases in one of three ways:

  • SQL Integrator direct connection to vendor supplied connectivity library

  • SQL Integrator network connection to vendor supplied connectivity library

  • SQL Integrator network connection to a remote SQL Integrator Data Request Broker

Direct Connection to Vendor Library. The most direct form of connection is to use the vendor-supplied connectivity library running on the same computer system as the SQL Integrator data broker and data driver components. The vendor library will then connect to the vendor database, which is executing on a database server that is the same computer system or another computer system.

The following table shows the SQL Integrator database drivers and the respective vendor libraries.


Database Component
Component Name
Vendor Library

SQL I - ORACLE

Oracle Database Connection

OCI

SQL I - SYBASE

Sybase Database Connection

DBLIB

SQL I - MS-SQL

Microsoft SQL Server Database Connection

DBLIB (Windows only)

SQL I - DB2

IBM DB2 Database Connection

DDCS

SQL I - INFORMIX

Informix Database Connection

NETLIB

SQL I - INGRES

Ingres Database Connection

INET

SQL I - RDB

Oracle/RDB Database Connection

DSRI (OpenVMS only)

SQL I - FILES

Flat Files Database Connection

--

A vendor library will only be available for a direct connection to the physical database if it runs on the same computer system as the SQL I-DRB. Not all vendor libraries are available on all types of platforms.

Network Connection to Vendor Library. If the vendor-supplied connectivity library is not available on the same node as the SQL Integrator data request broker, then the data request broker can connect across the network to an SQL Integrator database driver running on a remote system. The remote data driver then connects to the vendor supplied connectivity library running on the same remote system.

Network Connection to Remote Data Request Broker. In large enterprise systems, there could be a requirement to interconnect SQL Integrator Data Request Brokers. For example, there a group of financial applications in one location and a geographically distant group of manufacturing applications in a different location. Each group uses a separate data dictionary, but there is a corporate requirement to join financial data and manufacturing data in a single application for reporting purposes.

This requirement can be met using SQL I - REMOTE, which allows access to enterprise data which is organized into several data dictionaries. SQL I - REMOTE allows SQL I - DRBs to be interconnected as shown in Figure 6.

Figure 6: Interconnecting data request brokers.

Usage Overview

The steps required to use SQL Integrator will vary from site to site and within systems at a given site. These steps include the following:

  • DBA (Database Administrator) activities

    • Installation

    • Configuration

    • Testing

  • Application Development Activities

    • Programming

    • Testing

Database Administration

Installation. SQL Integrator installation should be planned around which computer systems are client systems, which systems are application or web servers, and which systems are database servers. Use of SQL Integrator may require third-party software such as ODBC driver managers, Java development kits, database vendor connectivity software, and database server software.

Configuration. SQL Integrator is configured using the client components SQL I - MGR and SQL I - SQL. These components run on a client system and access data dictionaries on a data broker system. The data dictionaries can then access data from physical databases.

SQL I - MGR is used to create the data dictionaries and to import metadata information from physical databases on database servers. SQL I - MGR can also be used to create relational files for use in database and application prototypes.

SQL I - SQL is used to query the data dictionary tables (and hence the physical database tables) using ANSI standard SQL statements. Thus the data dictionary can be tested before a client application is available.

SQL I - SQL can also be used for data migration and replication, even across heterogeneous databases. Since physical database tables from heterogeneous databases can be catalogued in a single data dictionary, relatively simple SQL statements (such as insert into mssql_table select * from oracle_table) can be used to migrate data from one database to another. The use of ANSI standard SQL datatypes allows migration (even joins) of dissimilar physical datatypes, such as Oracle and MS-SQL dates.

Testing. There is a sample data dictionary and several ODBC and JDBC programs that are supplied for testing purposes. These programs can test a database connection and retrieve joined data (between departments and employees). The dictionary and sample programs are supplied with source code (SQL, C, and Java) to provide a starting point for programmers who will be developing SQL Integrator applications.

Application Development

Programming. Once the data dictionary and data access has been tested, application development can begin. Application development typically uses one of the following environments:

  1. Application development tools that support ODBC, for example:

    • Microsoft Access

    • Microsoft Visual Basic

    • Powersoft PowerBuilder

    • Borland Delphi

  2. Application development tools that support JDBC, for example:

    • Borland JBuilder

    • Powersoft PowerJ

    • Symantec Visual Cafe

  3. Programming languages that call the ODBC API or Java SQL classes.

    • Microsoft Visual C++

    • Watcom C++

    • Microsoft Visual J++

Regardless of the choice of application development environments, the use of SQL Integrator is transparent because of the ODBC and JDBC standards.

Testing. The application tools and programming languages have testing capability included during the development process. In addition, SQL I - SQL may be used to execute SQL statements in isolation from an application. These SQL statements can be tested for syntax, performance and efficiency. Database timing and database traces can be used to look for performance bottlenecks such as joins using non-unique indexes, sequential searches through large tables, and failure to push down SQL statements from the application to the physical database.

Summary

SQL Integrator is a data access data request broker environment that provides a multiple-tier, multiple-database enterprise environment for connecting ODBC and JDBC compliant applications to heterogeneous database sources. The product includes client components for connecting to ODBC and JDBC environments and for maintenance and testing of SQL Integrator data dictionaries. It also includes data broker components for query parsing, optimizing, and distribution even among heterogeneous databases, as well as data driver components for connecting to vendor databases in the absence of a vendor connectivity library, thus providing additional connectivity not supplied by database vendors. SQL Integrator also supports data migration and/or replication across heterogeneous databases.

* Originally published in Novell AppNotes


Disclaimer

The origin of this information may be internal or external to Novell. While Novell makes all reasonable efforts to verify this information, Novell does not make explicit or implied claims to its validity.

© Copyright Micro Focus or one of its affiliates