For all intents and purposes, PostgreSQL has less of a need for CREATE PROCEDURE than other databases aside from looking more like other databases.
For example in SQL Server -> 2005 - although you can write functions that return tables and so forth, you have to resort to writing
CLR functions marked as unsafe to actually update data in a stored function. This gets pretty messy and has its own limitations so you have no choice but to use a stored procedures, which can not be called from within an SQL query.
In MySQL 5.1 the abilities of functions are even more limiting - they can't even return a dataset.
In PostgreSQL, you can
write a function marked as VOLATILE that updates data and that can do all sorts of wacky things that are useful
but considered by some to be perverse such as the following:
SELECT rule_id, rule_name, fnprocess_rule(rule_id) As process_result
WHERE brules.category = 'Pay Employees'
ORDER BY brules.rule_order
Another thing stored procedures can usually do that functions can not is to return multiple result sets. PostgreSQL can simulate such behavior
by creating a function that returns a set of refcursors. See this .NET example Getting full results in a DataSet object: Using refcursors way down the page,
that demonstrates creating a postgresql function that returns a set of refcursors to return multiple result sets using the Npgsql driver.
Prior to PostgreSQL 8.1, people could yell and scream, but PostgreSQL doesn't support Output Parameters. As weird as it is for a function to support
such a thing, PostgreSQL 8.1+ do support output parameters and ODBC drivers and such can even use the standard CALL interface to grab those values.
At a glance it appears that PostgreSQL functions do all that stored procedures do plus more. So the question is, is there any reason
for PostgreSQL to support bonafide stored procedures aside from the obvious To be more compatible with other databases and not have to answer the philosophical question, But you really don't support stored procedures?.
There must be some efficiency benefits to declaring something as a store procedure and having it called in that way. Not quite sure if anyone has done benchmarks on that.
So for the time being PostgreSQL functions have the uncanny role of having a beak like a duck and the flexibility of a beaver, but having the makeup of a Platypus.
The real question is: Are "Stored functions" (or whatever you will call them) pre-compiled by Postgres so that when you call them the database does not have to re-compile them? From what I understand, this is one of the major advantages of stored procedures in other databases, that by "stored" they mean "pre-compiled".
pre-compiled I've always found to be a confusing term as far as databases are concerned and especially when talking about Postgres because there are so many choices of languages to build functions.
When people usually ask that question, it seems what they really mean to ask is can functions use cached plans. The answer is yes. In fact Postgres can use cached answers as well using the STABLE, IMMUTABLE and other function qualifications and of course the C functions are obviously compiled.
The idea that only stored procedures can use cached plans is old and for all intensive purposes is a myth. In most sophisticated databases it hasn't been true for over 10 years and therefore is a poor reason to choose a stored procedure.
For example if you write a dynamic query like in php or asp.net or whatever, SQL Server, Postgres, Oracle etc. can tell whether to cache a plan or not based on the pattern of the statement (even sometimes when its not a parameterized statement) so even those use cached (pre-compiled plans).
You can use prepared statement outside stored procedures. But who do it? With stored procedures you have prepared statements gratis, without any more work. Second, with stored procedures you can better decompose application. But main impact on speed is minimalisation of network and conversion traffic. plpgsql variables are in native PostgreSQL format, when you would do same work outside, you have to do lot of conversation:
server,tcp tcp, libpq, drivers, com, ..
I think its common at least in .NET to write prepared statements w/o stored proces and a lot of that is done for you in .NET with data adapters etc. and a lot of ORM wrapper type classes I think do it too.
For simple inserts and updates we usually don't bother with stored procs, but like you said for real stuff where you've got lines and lines of code especially when its used in multiple sections of an app, we usually use stored procs and functions because it compartmentalizes the logic nicely and keeps network traffic (of transfering the statements across) to a minimum.
Using stored procs/fns provide increased database security and architectural decoupling for data-driven applications that cannot be matched through ORM or prepared statements. For example - you must take complicated measures to prevent SQL injection whenever ad-hoc or prepared statements are allowed. As for decoupling: the wide and deep table that worked well in proof-of concept stages of development may require refactoring, normalization, partitioning, etc when preparing for production. With the defined interface of a stored proc/fn, the application tier does not need to be made aware of the details of the database design changes. It is akin to providing a service interface without exposing the inner structures of the tier.
You sure your comment about prepared statements is right?
Writing prepared statements more or less forces you to define the type of parameters at least in jdbc and ado.net.
Although I guess it would be possible to create a badly implemented driver that doesn't protect you, but I haven't seen that.
For example in java jdbc one would write a prepared statement of the form
"Update atable set avalue = ?, anothervalue = ? WHERE anid = ?";
And then you would do
st.setString(1, "new value for avalue");
Note the setString, setInt etc. forces you to declare what is past in and throws an error if it fails. In all drivers I've worked with
setString properly escapes quotes etc.
Now on the other hand - I have seen people write easy to hack stored procs to get around the limitations of stored procs.
E.g. if you do a lot of analytical apps - writing every permutation of criteria as stored procs gets tiring. So
some misguided folks, thinking stored procs are the holy grail of sql injection protection
write a stored proc that does something like this the below. This is pseudo stored proc code.
CREATE PROCEDURE adhocsql(@somewhere varchar(8000)) AS
EXECUTE("SELECT * FROM atable WHERE " + @somewhere);
In this case - no matter how you call the above - even in a prepared call, you better really sanitize that @somewhere. You would have been better doing an adhoc sql query.
I have mixed feelings about this. On the one hand I think you are right its kind of unclear and probably should be avoided for clarity and maintainability, but on the other hand its useful for simulating for loops in sql that do something.
Take this example that kills all queries currently running by a particular user.
SELECT procpid, pg_cancel_backend(procpid)
WHERE usename = 'joeymemoryhogger';
Sure it may be considered a perverse thing to do, but sometimes perverse problems call for perverse solutions. :)
Yes you would write a plpgsql function. You'd have to go the plpgsql route since you need to write dynamic sql. The quote_ident is to both escape keywords and chars correctly and to prevent sql injection attacks.
CREATE FUNCTION Upd_date(tablename varchar(100))
EXECUTE ' UPDATE ' || quote_ident(tablename) || ' SET UpdDate = CURRENT_TIMESTAMP';
LANGUAGE 'plpgsql' VOLATILE;
You might also want to return something instead of void.
The other nice thing is you can run it against a list
From my point of view, the lack of real stored procedures is a big limitation of Postgres. My problem is that with functions one can not output a query without knowing the rowtype up front. Either the initializer has to specify it with the “AS” keyword for "SETOF RECORD" returning functions, or it has to be explicitely specified in the definition of the function. In other words, the function is not able to define it's own output for it to be equivalent to “SELECT * FROM function(args);”
Yes, I am familiar with the out parameters, but still, they require the output of the function to be specified in the definition. If I wanted to write what I want in IN/OUT parameters, then I would be something similar to “INOUT VARIADIC integer” as the argument of the function, or at least “OUT VARIADIC integer. But Postgres doesn't seem to support VIARIADIC in combination with OUT parameters... And all that should be equivalent with “SELECT * from f(INOUT VARIADIC integer)”.
One of the things the procedures give you is the possibility to get back a set without having to predefine an output nor let the initializer specify one. That latter one is, as I argued elsewhere, at best a bad workaround. I have tried many things in Postgres, but have not been able to return a not-predefined set with it's functions (excluding the cursor workaround).
I hear you about that limitation, but is this really an issue with stored procedures? Seems like you are more complaining about the stored function implementation. Sorry to get worked up about semantics here.
I don't know about other databases, but in SQL Server and MySQL, I can't use stored procedures in a FROM clause, so they don't satisfy your noted requirement either. So the equivalent in PostgreSQL would be to return refcursors which you have already noted? Still can't use in FROM.
Granted I can't just return a TABLE like I can in a SQL Server stored function and use that in the FROM; that I greatly miss in PostgreSQL. I agree that is limitation of the stored function implementation in PostgreSQL, but that is still about stored functions, not stored procedures.
I am writing an application in which I only want to call user-defined stored functions/procedures. (It allows me to change/optimize the applications functionality without having to recompile, redeploy, etc.) So, whether I constantly write “CALL f();”, and fetch the results, or “SELECT * FROM f();” and fetch the results, for me it doesn't really matter, as long as all SQL is handled by the database. The point is: In Postgres I can't do either.
Consider my comments as a reply to the emphasized sentence in the main text: “So the question is, is there any reason for PostgreSQL to support bonafide stored procedures aside from the obvious "To be more compatible with other databases" [...]” My answer is: yes, there is.
I agree with Davor.
And this is a limitation when you try to migrate to postgres, based on other sql languages. Why do I need to specify a list of fields to use the result of a stored function ?
I just want to call the proc and receive the result.
For example: if the result set of a stored function is changed, and this sp is used many times in an application, then I need to add the new result set in all call of my application...
Really, this is very hard to understand.
In other sql languages you can just write the proc, call it and work with the result set. Simple.