Skip to content

Conversation

@ucko
Copy link
Contributor

@ucko ucko commented May 22, 2024

Hello again.

As noted in #219 (the remainder of which this PR supersedes), we've now advanced to 1.4.12. I've proceeded to forward-port my changes to the trunk, and would appreciate it if you could please take a look and incorporate at least some of them. As usual, I'm happy to address any questions or concerns you might have.

Thanks!

ucko added 30 commits May 21, 2024 15:55
* Conditionally stub out the tests that require threading (as of 1.4.12)
  and don't yet have such logic.
* include/freetds/thread.h: Never try to add extra checks around stubs.
* In the absence of an external SQLGetPrivateProfileString, fail the
  build only when expecting to use an external driver manager, and
  expose tds_SQLGetPrivateProfileString to the odbc "connect" unit test.
  Don't let that test attempt to use SQLGetPrivateProfileStringW even in
  Unicode builds.
* src/odbc/win*.c: Accommodate static builds, with no hinstFreeTDS; in
  particular, never attempt to show FreeTDS dialog boxes there.
* src/utils/unittests/challenge.c: #include <freetds/sysdep_private.h>
  to ensure the availability of strcasecmp.
This distinction will initially be useful only for ctlib and dblib,
but applying it across the board may avoid possible confusion if it
turns out to be useful elsewhere too.
* #include <config.h> from several more internal headers to facilitate
  injecting most renaming through there; some of these headers consult
  HAVE_* macros anyway.
* Account for possible renaming of cs_dt_crack (not just aliasing it
  to ..._v2), dbopen, tdsdump_dump_buf, and tdsdump_log; for the
  latter two, change the underlying functions' names to
  tdsdump_do_... modulo possible mass renaming.
* Introduce a <freetds/test_assert.h> that at baseline simply forwards
  to <assert.h>.
* Include it from */unittests/*.c (even those that didn't include
  <assert.h>) and for good measure from dblib/unittests/common.h (which
  did), as late as possible to avoid losing out to <assert.h>
  inclusions coming from elsewhere.
* Remove all direct <assert.h> inclusion from tests.
OpenServer instances may well report "Unknown language request" for
it, resulting in outright login failures.
When the user erroneously supplies NULL data for a column which does not
use a nullable type and is not otherwise known to be nullable, cleanly
fail.  Pressing on anyway (in hopes that a server-side default would
take effect?) was liable to trigger legitimate assertion failures in
tds_checks.c.  To that end, take the null_error callback and invoke it
before bailing unless it is itself NULL.
* odbc_sql2tds: Treat SYB5INT8 like other numeric types.
* tds_generic_get: Handle non-blob types with four-byte sizes (e.g.,
  XSYBCHAR at least some of the time).
* tds_generic_put_info: Send size 255 (0xff) for NULL short character
  types in output columns.
* Read defaults, send them back explicitly as needed, and clean up
  their storage afterwards.
* tds5_bcp_add_variable_columns:
** Send spaces in lieu of fully empty SYB(VAR)CHAR data.
** Send correct textpos values in all cases.
** When writing the adjustment table, avoid a potential Sybase "row
   format error" by eliding the (effective) column count when it would
   have repeated the largest adjustment table entry.
* tds_deinit_bcpinfo: Free and reset current_row for transfers in.
  Avoid potentially trying to interpret fragments of outgoing TDS
  data as BLOB pointers.
... i.e., of VARCHAR(MAX) and the like.
* tds.h (is_blob_col): Consider column_type in addition to column_varint_size.
* Retire 5 as a nominal column_varint_size value in favor of unified logic.
Call the registered interrupt handler (if any) for all three.
If an output column of type VARCHAR(n) or the like has no current
size, send the type's maximum size; otherwise, the server will fill in
only the first character/byte.
As of ASE 16.0, Sybase servers have started allowing dynamic query
(prepared statement) declarations with IMAGE or (N)TEXT parameters.
However, subsequent attempts to instantiate these queries have been failing
with message 3805, "The token datastream length was not correct."  In such
cases, switch on dynamic query emulation (as already needed for older
Sybase versions that immediately reject these declarations) and explicitly
discard column information to avoid misconstruing the status of subsequent
queries that yield no row results.
Don't bother trying to receive data following a packet marked as last;
rather, bail if still short.  Whenever bailing, close the socket on
the way out to contain the damage from having fallen out of sync.
Under these protocol versions, MS SQL Server describes these columns
as having NVARCHAR(n) types rather than traditional DATETIME types,
due to DATETIME's range limitations.  (Likewise for TIME, which
however still needs to come in as a preconverted UTF-16 string to
compensate for combined API and protocol limitations.)

- ctlib/blk.c: Perform character conversion on strings converted from
  fixed-width types such as CS_DATETIME_TYPE when a comparison of
  column-size limits appears to indicate an encoding difference.
- tds/config.c: Adjust STD_DATETIME_FMT to match MS's (ISO-ish)
  tastes; in particular, ensure that the entire date portion fits
  within the first 10 characters.
Accommodate iconv implementations (such as the Citrus one used by modern
FreeBSD systems) that ever indicate invalid input by reporting a positive
count of irreversible characters rather than by setting errno to EILSEQ.
Let column_size exceed column_cur_size for BLOBs, whose reported
maximum may have been merely nominal, and doesn't really matter
because their storage is generally allocated separately, based on
actual result sizes.
* tds.h (TDSMESSAGE): Add an osstr field.
* ctutil.c (_ct_handle_client_message): Propagate it to osstring(len).
* tds/login.c (tds_save): Clear osstr to avoid invalid reads when
  trying to connect to a dead server.
* tds/ncbi_strerror.c: New.
* tds/util.c:
** #include ncbi_strerror.c, whose definitions are all static.
** tdserror: Populate the osstr field; clean it when done.
* Introduce CS_NLONGCHAR_TYPE and CS_NVARCHAR_TYPE, both corresponding to
  SYBNVARCHAR (along with CS_UNICHAR_TYPE); have ct_param automatically
  promote CS_{LONG,VAR}CHAR_TYPE to CS_N{LONG,VAR}CHAR_TYPE for TDS 7+
  when non-ASCII characters appear in the data.  (Unnecessary use of
  Unicode types can make parameterized queries against 8-bit text
  columns run very slowly.)
* ctlib/cs.c (cs_convert): Formally support Unicode destination types
  (SYBNVARCHAR and SYBNTEXT).
* ctlib/ct.c (paraminfoalloc): When setting maximum length automatically,
  account for possible conversion to UCS-2.
* tds/convert.c (tds_convert), tds/tds_willconvert.*: Formally support
  Unicode types, giving type 103 (0x67) the benefit of the doubt as
  SYBNVARCHAR rather than SYBSENSITIVITY.
* tds/data.c (tds_set_param_type): Map SYBNVARCHAR to XSYBNVARCHAR for
  TDS 7+.
* tds/unittests/convert.c: Accommodate Unicode types, albeit with
  strings left narrow to match real-life usage, wherein encoding
  conversion occurs separately.
* Extend _cs_convert to take an additional handle argument (allowed to be
  NULL) to provide for the possibility of replacing the caller's original
  output buffer with the one tdsconvert allocated itself (as it always
  does for blobs), rather than copying the blob data.
* blk.c (_blk_get_col_data): Provide a non-NULL handle, both for
  efficiency's sake and because tds_alloc_bcp_column_data imposes a 4 KiB
  cap with the expectation that callers will take this approach.  (dblib
  already does.)
Most failure scenarios accidentally returned CS_FAIL, which ironically
corresponds to TDS_SUCCESS.
To that end, factor a static _blk_clean_desc function out of blk_done's
CS_BLK_ALL case.
* tds.h (struct tds_bcpinfo): Add more fields to enable the necessary
  bookkeeping.
* ctlib/blk.c: Implement blk_textfer; adjust other functions accordingly.
* tds/{bulk,data}.c: Allow for deferred, and possibly piecemeal, blob
  submission; in particular, make tds_bcp_send_record resumable.
* Automatically truncate strings that would overflow their destinations,
  mainly for the sake of bulk insertions.
* Fully honor *BINARY and IMAGE data buffer size limits in cs_convert.
* Update unit tests (ctlib cs_convert, dblib t0019) accordingly.
* Allow _cs_convert to accept an explicit TDS destination type; direct
  it to copy cres.dta if that destination type is SYBMS{DATE,TIME}*.
* _blk_get_col_data: Don't give _cs_convert_not_client any data to
  convert (blindly in the wrong direction!); rather, if it indicates
  that the column holds a new MS type, supply an explicit TDS
  destination type to _cs_convert.
* _cs_dt_crack_v2: Formally populate datetzone from dr.timezone (still
  always 0 in practice).
@ucko
Copy link
Contributor Author

ucko commented May 28, 2024

It looks like splitting out 1512645's changes to bulk.c, which are thankfully independent of its other changes, should let me split the largest series roughly in half. OTOH, git-explode may have missed some purely logical dependencies for which I'll need to account explicitly.

This was referenced May 30, 2024
@ucko
Copy link
Contributor Author

ucko commented May 30, 2024

Withdrawing in favor of smaller PRs (#558-#589).

@ucko ucko closed this May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants