How to customize a rest-based Service ODBC driver
REST Service can help developers in a simple uniform interface provide Service to end users.Data analysis application scenario, however, some mature data analysis tools (such as Tableau, Excel, etc.) requires the user to provide the ODBC data source, in this case, the REST Service cannot meet all the user demand for the use of the data.This article from the perspective of implementation was introduced in detail how to existing REST Service, on the basis of finishing the development of a custom ODBC driver.Article focuses on the realization of the ODBC driver principle, combining with the code details the ODBC data interaction between with REST Service, and at the end of the article introduces the principle of ODBC client calls ODBC API, as well as the actual development of the construction of the debugging environment.
May benefit from the reader
The current mainstream data analysis tools, such as the Tableau, Microstrategy, can only be ODBC Driver, excel to access to the underlying data source.That is to say, in the process of development and a database or data warehouse, even if we have achieved in line with the SQL standard data access interface, even provides its own JDBC driver, still there is no guarantee that the data users to effectively use our data.To this end, we need additional to customize an ODBC data source Driver.
If your data is exactly similar mongo, Hbase such common database products, maybe you can consider to purchase some commercial products directly from, for example, Simba ODBC Driver to solve your need, but it will mean a big spending.More difficult is the source of data for you is not the mainstream, haven't any Driver can directly purchase can be applied to it, then customize a own ODBC Driver might be your best choice.Even if you are an ignorant of the ODBC Driver developers, this article will bring you more or less help.
In a nutshell, we use Java development team for a particular SQL engine.At the beginning of the project we have only the JDBC driver and one for the service to the web client REST Server, but we don't have an ODBC driver, so most clients do not really use our products to complete their work.
In order to solve this problem, we designed the below solution: we use REST Server uniformly accept all from the client's request, including web client and the client using ODBC Driver.REST in the Server using the JDBC driver to access our database.Of course if your client is a Java program, you can directly visit our database through JDBC, thereby saving the steps of overhead.This photo does not show this kind of situation.
Depth on the client side, we customize a proprietary ODBC Driver, it to the upper application provides the standard ODBC API, encapsulate all implementation logic.On the
underlying implementation, it calls the REST of the c + + library, send the application to the SQL query encapsulated into a REST request, sent to us the REST of the Server, and after getting the results, and to comply with the ODBC specification, returned to the upper application.
Starting from the Hello World
For developers never contact the ODBC, understand an ODBC client behavior can help understand the custom an ODBC driver needs to implement what specific API.In the image below shows a simple ODBC client program implementation, every line of code with detailed comments explaining its behavior, read through the code, it is easy to have an intuitive understanding.In order to simplify the code, we omitted all the error checking code.All SQLXXX format function, is the ODBC API defined standards.
We will this program is divided into five area, marked as A ~ E, respectively.A zone and B zone, in turn, initializes the three associated with ODBC handles, respectively is:
; The Environment handle (hEnv) : contain one or more Connection
handle.At the same time, some of the global information are included,
such as ODBC version of the client need, and the diagnostic
information environment level.
; The Connection handle (hConn) : represents a DBMS/data source
Connection, containing Connection level of information, such as
Connection timeout, isolation level, and the diagnostic
information of the Connection level.
; Statement handle (hStmt) : it can be seen as a specific query request,
such as SELECT * FROM employee.
It is worth mentioning the ODBC specification only defines the data source and how to expose the data access interface, but there is no rules, how to implement this also includes three kinds of the realization of a handle.In fact, the three types in the code handles passing
through SQLHANDLE type, and SQLHANDLE is essentially a void * type, corresponding structure to our custom.
ODBC provides a series of C language style for the application of API to support access to the query.Driver is different from an object-oriented language, using ODBC Driver applications need to be returned data memory areas ready ahead of schedule, from this perspective, the task of ODBC is correctly will data to meet the needs of users, moving into the memory area of the user to specify (may have some data transformation, for example, if the application needs to support Unicode, so the ODBC Driver may need to char types of source data into wchar types).Below the initialization in A region of A series of handle and variables, including 307 ~ 305 line on the stack of program opens up such as cache memory area.In fact, in the region of the E, we passed the variables x and I quote, so we can put the 308 ~ 309 lines of two numerical variables such as storage also returns the area of memory.
In area C, we call SQLDriverConnect function, at the same time the incoming hConn handle and connect data source need the user name, password, names and other information.We in the realization of the ODBC Driver, completed a series of hConn assignment operation (the initialization has been completed in area B), makes hConn an available connection handle.When the return value is equal to the SQL_SUCCESS SQLDriverConnect a DBMS/data source connection is formally established.
In region D, the client program to handle the Statement hStmt first initialized, and then use SQLExecDirect API directly.The API of the second parameter receives the string, is the SQL query request.
In the last E area, the client to obtain the query results.In this section of the program, the client first in the line 331 SQLColAttribute interface is utilized to extract the first column of the column name attribute, the second parameter specifies the return to the results of the first category, the third constant SQL_DESC_NAME specifies the attributes needed to sign (column name).Next, the client using the SQLBindCol interface, told the ODBC Driver it wants to fill out the return results of the first column into szColData points to this period of memory, and with constant parameters SQL_C_TCHAR told the ODBC client wants to see the return type is char data type, this process is called binding (Bind).Everything in place, the client calls the SQLFetch interface to get the results the first column of the first line, because there is no binding other return column, so the SQLFetch actually returns only the first column content.
In a more realistic client code, may be first call SQLNumResultCols interface that returns the results a total of how many columns.For each returned by the column, the client using the more convenient than SQLColAttribute interface SQLDescribeCol interface, one-time get all the basic information of the column, including the column names, types, length, etc.According to the returned column information, the client a targeted adjustment SQLBindCol parameters, in order to correctly accept corresponding returns the result.Everything in place, the client calls the SQLFetch interface, need the query results are obtained.Because each time the SQLFetch returns a result set of a line, the client program need to be repeated calls SQLFetch, until the SQLFetch no longer return SQL_SUCCESS, but return SQL_NO_DATA, said no longer have more lines can be returned.Client can according to different requirements, the consideration is how to reuse the same piece of memory space to accept the results of different rows (in a line, use a line), or in the start application can accommodate all the large pieces of memory, each time binding to different memory location (take out all the lines together again after using the result data).
Began to customize the ODBC Driver
Through the last section of the ODBC Driver needs to provide the interface has a certain understanding, if we need to write a complete ODBC Driver from scratch, we need to complete two tasks:
1. To achieve the client need all of the API, MSDN gives the ODBC specification detailed definition of the each API
(http://msdn.microsoft.com/en-us/library/ms714562(v=vs.85).aspx, fortunately we don't need to implement each interface, only need according to the client's actions necessary to find the minimum set), in Windows, we will all implementations of the API packaged as an executable module, is usually a DLL file.
2. Let the application client program can properly find our Driver, in short, we need to correctly install the ODBC Driver to the client program running on a machine.
The due to the implementation complexity, the second step is obviously lower than the first step, in addition to the introduction of the second step also helps us to understanding of
ODBC Driver has integrity.So while the second step is in fact dependent on the completion of the first step, we still give priority to introduce the realization of the second step.Here we can assume that we have achieved all the necessary API, the implementation of these apis are packed in a named driver. The DLL file.
Step 1: install the ODBC Driver
Understand the ODBC architecture
In the ODBC architecture
(http://msdn.microsoft.com/en-us/library/aa266933(v=vs.60).aspx), there are four key modules, respectively is:
API: by calling the ODBC interface to connect to data sources, send and receive data, and close the connection.Here is the only API interface, did not realize, specific implementation needs to be done in the Driver module.
Driver Manager: provide applications such as data sources available information, on demand dynamic load Driver, provide inspection, etc.
ODBC Driver: processing function method, the management application and all interactions between specific DBMS/data source.If necessary, the Driver will also the request of the standard SQL statements to a target data source native SQL format.
Data Source: consists of Data and the database engine.
The API and Driver Manager has generally is the operating system.On the Windows, we can install MDAC (Microsoft Data Access
Components,http://www.microsoft.com/en-us/download/details.aspx?id=5793) to get all the required header files have the tools related resources.In the Unix environment, has a similar UnixODBC.In the following we will only consider the ODBC development under Windows.In the development of the ODBC Driver, the underlying data source in general, is ready.So we only need to ODBC Driver registration to the Driver Manager.
The registered ODBC Driver
Driver Manager through the registry that the list of all available ODBC Driver, and their respective details.Location in (assuming the target machine installed a 64 - bit Windows) :
The 32-bit drivers:
A 64 - bit drivers:
In 32-bit Windows, for example, ODBC Drivers we open the registry key, you can see in the system all the 32 bit ODBC Driver installed in it, we will own the ODBC called ebayODBCDriver, and add the corresponding line in the ODBC Driver:
After know the name of the ODBC Driver, Driver Manager will be in the ODBC Driver father nodes, namely ODBCINST. INI for detailed information on the appropriate ODBC Driver.Essential information including Driver Setup and attribute, respectively, to tell where the ODBC Manager looking for Driver and the Setup of an executable program.The Driver correspond to the ODBC Driver, we will implement the Setup program includes some use when setting DSN API, according to the usual practice, this part of the API and ODBC Driver API also compile together in the same DLL file, so we see under ebayODBCDriver, Driver and Setup point to the same DLL files.Here can also define some of the other properties, but these are optional.
Everything is ready, we will be able to Control Panel - Administrator Tools - Data Sources (ODBC) for our ODBD Driver to create a DSN.For 32-bit ODBC Driver, we need to use the C: \ Windows \ \ SysWOW64 \ odbcad32 exe the 32-bit version of the Data Sources (ODBC).It is worth mentioning in the use of Data Sources (ODBC) in the process of creating DSN, we use the above mentioned in the Setup program interface, especially ConfigDSN interface.(http://msdn.microsoft.com/en-us/library/ms709275(v=vs.85).aspx )
In order to simplify the installation steps, we can be in the form of a Windows Installer, packing all the registered ODBC Driver logic, let users can simply by installing an exe, to complete the installation of the ODBC Driver and registration.
The second step: to achieve the ODBC API
In the MSDN on the architecture of the ODBC, ODBC Driver module is the core function is to manage applications and all interactions between specific DBMS/data source.Interaction is the carrier of data, and data means memory space to store.Had mentioned above, the ODBC Driver itself is located in a data mover, it will transfer application request to the data source, and will return to the source data line by line handling to the application.In the process of the ODBC Driver needs at least two pieces of memory area, or simply buffer: a piece of used to cache from the data source as a result, the other is used to cache is handed over to the application of the results.These two pieces of buffer not only contains the data itself, but also including the description of the data.In the data returned
to the Application, for example, the ODBC Driver not only needs to maintain the data itself, also maintains the column data type, length and so on information, between the ODBC Driver and Application, the data and information are collectively referred to as Application Row Buffer Descriptor (ARD).Accordingly, the data source to the ODBC Driver is not only the data itself, also include a column for each return yuan information description, this part of the information and data are collectively referred to as Implementation Row Buffer Descriptor (IRD).
In fact ARD in the area of memory, save the data itself is by the application in the incoming call SQLBindCol, ARD is not responsible for this period of memory application and release, and the memory to store other information needed by the ODBC Driver is responsible for the maintenance.When the application calls for example SQLNumResultCols SQLColAttribute, SQLDescribeCol interface, such as ODBC Driver to find the corresponding content in ARD, returned to the caller.When the application calls the SQLBindCol and SQLFetch ODBC Driver through ARD that the return to the location of the data should be stored (pointer), read the latest row data from the IRD, exert some necessary according to the type conversion, the handle to the specified location.
And ARD, corresponding, the IRD ODBC standard also provides two other buffer, are respectively the Application parameter descriptor (APD) and Implementation parameter descriptor (IPD), to deal with dynamic parameters in the query, this article doesn't do these two kinds of buffer in detail.The four types of buffer constitutes the four major Descriptor in the ODBC world.For more information, readers can refer
In the concrete implementation, ARD and the IRD is defined as a special structure (struct), for and on behalf of the Statement of structure GenODBCStmt (Generic ODBC Statement).The figure below shows us a GenODBCStmt are part of a structure: the first is to identify the type of label (distinguish between Connection with representatives on behalf of the Environment and the structure of the body), and then the above mentioned four kinds of different USES of descriptor, then the level of the Statement of some attribute information, SQL statements, and so on.
We detailed analysis ARD, for example, ARD the implementation is not bound by the ODBC specification, can be free.In our implementation, we use structure GENODBCARD represents a Statement corresponds to ARD.Each ARD includes some all return list of Shared information, and contains the detailed information of the different returns the column with more fine structure GenODBCARDItem representative.
Observation of GenODBCARDItem member variable, easily and one-to-one
relationship between the ODBC API.For example the DataConciseType column type that corresponds to SQLColAttribute return information, whereas DataPtr member corresponding SQLBindCol incoming memory space by the pointer.To sum up, most of the realization of the ODBC API, is essentially to ARD and different members of the IRD variable access and modification.
Using the REST API to access the data source
ARD is responsible for the interaction between the ODBC Driver and application program, the initialization of the application calls the SQLBindCol process is complete.The IRD is responsible for the interaction between the ODBC Driver and data source, its initialization needs in the process of data exchange and data sources.
We first define the REST request interface:
wchar_t* rawSql, char* serverAddr, char* username, char* passwd);
SQLResponse class encapsulates a SQL requests all return content, for each SQL queries, the REST of the Server returns a SQLResponse instance.ColumnMetas members in this instance contains information each return columns, and members of the results is given in the form of a string returns the result of each line.