cpdo  Artifact Content

Artifact cd839c08c0ded056d5ada9e4b260cd2f782aa593:

Wiki page [cpdo_mysql5] by stephan 2011-05-17 18:39:29.
D 2011-05-17T18:39:29.095
L cpdo_mysql5
P 033ea4d2fa25c849925e882ad58770d89d5d194b
U stephan
W 6855
<strong>ACHTUNG: THE CPDO WIKI IS NOW (AS OF 2011-May-17) MAINTAINED ON A DEDICATED WIKI SITE:</strong> [http://whiki.wanderinghorse.net/wikis/cpdo/?page=cpdo_mysql5]

The MySQL driver binding was a bitch to implement, but appears to work as expected and supports all features required by the cpdo interface. In my tests it is, for reasons i cannot explain, notably faster than [cpdo_sqlite3]. (Weird!)

This driver was developed with MySQL v5 (i believe it needs 5.0.3 or higher), and might or might not compile as-is on newer or older versions. If it does not, please report it (but i have no intention of back-porting to MySQL v3 if it's a major hassle to do so).

The [DSN] format for this driver is <tt>"mysql5:option1=value1;option2=value2..."</tt>. The supported options are:

   *  <tt>dbname=name_of_database</tt> (required) specifies the server-side name of the database to connect to.
   *  <tt>host=ip_or_name_of_server</tt> (required) specifies the database server host.
   *  <tt>port=integer</tt> (default=0) is the port number on the MySQL server, but note that MySQL ignores the port number when connecting to localhost via a Unix socket (which is does by default if it can).
   *  <tt>autocommit=boolean</tt> specifies whether autocommit is on or off. By default the driver-level settings are used (i.e. we don't set it if you don't set it).
   *  <tt>fieldbuffersize=integer</tt> (default=unspecified (a few kb)) specifies the buffer size to use for string/blob fields when the library cannot figure out the maximum size of the field. When fetching string/blob data, the driver uses the max size specified for that column in the table definition. If it cannot do that for some reason, it uses this buffer size. The default value is only a few kb, and not suitable for large blobs and whatnot. The library may ignore small values and use its own default instead.
    *  <tt>enablenamedparams=bool</tt> (default=true) enables/disables the custom named bound parameter handling (MySQL does not support this feature natively). This option requires additional memory allocation, and should be disabled if the client is not using bound-by-name parameters. Bound parameters are written in SQL in the form <tt>:paramName</tt>.

<b>Compatibility Notes:</b>

   *  MySQL <tt>TIME</tt>, <tt>DATE</tt>, <tt>DATETIME</tt>,and <tt>TIMESTAMP</tt> fields are fetched as strings, but there is no direct support for setting them via bound parameters in this API. Sending them via non-bound parameters which you format yourself is fine. i believe i have found a way to allow binding them as formatted strings, but it's not yet implemented.
   *  MySQL does not natively allow us to bind parameters by name, but this driver supports binding them with the syntax <tt>:paramname</tt> by using a custom parser. Use the <tt>enablenamedparams=false</tt> [DSN] option to disable it if it causes any grief on otherwise valid SQL (and please report it as a bug!). See below for more details.


   *  The MySQL libraries leak memory when connecting, and i have been unable to reliably get it to free that memory (e.g. using <tt>mysql_library_end()</tt>). i can do nothing about this, and it makes checking my own code for leaks more difficult. (That said, i check for leaks using valgrind on a regular basis, and ignore only those 11 missing frees which MySQL introduces.) Strangely enough, when building in C++ mode i do <em>not</em> see this leak (maybe valgrind doesn't report it?), but in C mode i do.
   *  This driver does not properly report when no <tt>dbname=...</tt> is specified in the DSN. Is this a bug? What does MySQL do in that case (use some default?)? Adding an error string for this requires adding a bit of infrastructure to hold the custom error message (normally we get them directly from the underlying driver).
   *  It cannot directly report the problem if the custom bound-param naming code fouls up, because that code won't know it fouled up! The error will show up as mysteriously invalid SQL. See below for more details.

<b>Potential TODOs:</b>

   *  Add a [DSN] option which will tells the bind-related functions to do more strict type checking, e.g. not allowing us to bind a string to a numeric column. This would add a good deal of code to those routines, though.
   *  Add a DSN option to define the parameter prefix character for named parameters. It is currently hard-coded to <tt>':'</tt>, which is fairly standard across drivers (Postgres uses <tt>$</tt>, from what i understand).
   *  Add a DSN parameter to set the quoting style to either ANSI or MySQL (double-quotes). This requires first changing <tt>cpdo_driver_api::constants</tt> so that the quoting-related values are returned via functions instead of const values.

<h2>Binding Parameters by Name</h2>

As mentioned above, MySQL 5.x does not natively support binding parameters by name. So we had to hack together a solution (and i'm quite satisfied with it, actually). The parser for doing this is hand-written and not terribly complex, but is <em>believed</em> work error-free on any input <em>which itself is valid SQL</em>. Since invalid SQL will not be accepted by the driver <em>anyway</em>, mis-parsing in such cases is probably not going to cause any additional problems. (The support is also generic, so that it can be re-used in other driver implementations if we need it.) It <em>will</em> produce undesired results when given weird driver-specific SQL which itself uses colon characters.

Here are some notes to explain areas which might not be intuitive...

<b>Achtung:</b> this driver <em>does not allow a parameter name to appear more than once in a given statement</em>. Doing so will result in binding only the <em>first</em> instance of that parameter when binding by parameter name. The underlying SQL driver might notice that no value was set for that parameter and give us a useful error message, but it might also misbehave.

<b>Mixing named- and non-named parameters:</b>

Consider this SQL code:

INSERT INTO T VALUES( ?, :param2, :param3, ? )

i have NO idea how drivers normally handle (or not handle) mixed-type usage like this, but this driver does the following:

To get MySQL to swallow the SQL we have to translate the parameter name parts (<tt>:param2</tt> and <tt>:param3</tt>) to a <tt>?</tt>. As a side effect of this the named properties are at index positions 2 and 3 (remember that bind parameters use 1-based indexes). To see why, consider that the above SQL is internally translated into:

INSERT INTO T VALUES( ?, ?, ?, ? )

The driver then remembers the names and index positions of each named parameter so that calls to <tt>stmt->api->bind.param_index(stmt,":param2")</tt> will (in this case) return 2.

Z a92401a1be218d5c535bca579b3b86fb