Simple DB Access

Fri Aug 4 23:23:04 CEST 2006

I dont think you have to make it harder to use adhoc concat SQL, just
make it stupidly easy to use binds.  Screw the whole 'prepare
statement' b.s. - that is why people dont use bind variables, you have
to create tons of extra variables and tracking things just to use a
bind.  Instead, push all that down into the API, and just provide an
api like:

execSql( string SQLText, bind1value, bind2value, bind3value )

or something.  Hash the SQLText in the API so you can see if this was
already prepared or not.

Now, what is easier to write:

execSql( "insert into foo (a,b,c)  values(" + escapeForSql( variable1
) + "," + .  <etc etc> )

or

execSql( "insert into foo (a,b,c) values (:a,:b,:c)", variable1,
variable2, variable3 )

in my mythical pseudo code.

The issue isnt that we need to make using hand built sql harder - its
hard enough! - but that using binds needs to be as simple as possible.
 As soon as people see the whole 'prepared statement' they freak out
and create injection holes in their website.

So forget about register_statement and execute - that is not making it
easier to use.  Instead just have a singular, like so:

-export( exec/2 ) .
as:
exec( SqlText, ListOfBinds ) .

if it is a function that returns values, return a list of tuples
(result set).  Remember that for like 90% of the database APIs unless
you do "magic" the client side will retrieve the entire resultset from
the server anyways.  Many times I've thought that I could retrieve one
row at a time, then my client crashed because it tried to pull down
500,000 rows.  You need a special call series for those cases (which
are generally special) - optimize the most common case, that would be
returning a small to medium # of rows (eg: 1-500 rows).

Also, for SQL that returns no rows (update, insert, etc), the # of
rows affected is your resultset.  Don't provide 2 API calls that the
programmer needs to figure out which one to call depending on select
or update.

That is my API design guide.  Having using our own internal API to OCI
which behaves like I described more or less (except more like
sprintf/scanf in terms of providing binds, ie: pointers) vs JDBC,
Hibernate, PostgreSQL C api, i highly prefer our own API, despite the
C warts it has.

So for the love of all good API design, PLEASE don't make the
developer do all the work!

-ryan

On 8/4/06, Christian S <chsu79@REDACTED> wrote:
> On 8/2/06, Ryan Rawson <ryanobjc@REDACTED> wrote:
> > Being a long time developer on Oracle I now don't understand this
> > 'prepare' concept.  Every single SQL statement that takes in user
> > input should use bind variables.  If this makes you have to prepare
> > statements, then so be it (our oracle API doesnt require prepares, but
> > forces us to use binds).
>
> JDBC, java's sql api, only have the PreparedStatement class if one want to
> bind rather than construct SQL by string concatenation. I guess
> binding is a more
> accurate name for what I meant. Anyway, the postgres protocol, called
> FE/BE: http://developer.postgresql.org/docs/postgres/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY
>
> FE/BE in extended queries will first parse a string and refers to the
> result as a prepared statement. It can then be reused multiple time
> using binding. Binding one of these prepared statements yields, what
> the pg documentation calls, a 'portal'. The portal can then be
> executed. I'm not exactly sure why this extra step is a good idea, but
> it is the way it is.
>
> > With binds you don't have SQL injection holes.  It's as simple as that.
> >
> > Say no to constructed SQL.
>
> My line exactly. Too bad I havent figured out a way to make it harder
> to construct strings than using binding. Maybe if this edbc
> application itself required statements to be registered by a name, and
> then executed by name. Something like
>
> ok = edbc:register_statement(Db, foo, "SELECT foo FROM foobars WHERE
> foo = "++Bar),
> case edbc:execute_statement(Db, foo) of
>   PossibleResult -> ...
> end,
>
> And the binding version would be
> ok = edbc:register_statement(Db, prepared, "SELECT foo FROM foobars
> WHERE foo = ?foo"),
> case edbc:execute(Db, foo, [{foo, "bar"}]) of
>   PossibleResult ->
> end
>
> About the syntax for the "to-be-bounds" in statements. They vary from
> rdbm to rdbm, i used "?foo" other use "?" and refer to them by
> occuring order number, and then what?. To make it coherent across
> several databases one wants to parse the sql query to find them, and
> then produce a string again, a string that the target database's
> parser will understand.
> I think, how do other language's sql api deal with it?
>