People coming from SQL Server and MySQL often complain about how you can't query other databases
from within a PostgreSQL database. In Microsoft SQL Server, there is a concept of querying across databases
on the same server
with dbname.dbo.sometable and querying across servers (even of different types e.g. Oracle) by setting up a linked server and doing
something such as servername.dbname.dbo.sometable or using the OPENROWSET(..) or OPENQUERY(...) syntax.
MySQL has a similar feature by using syntax dbname.sometable, but MySQL lacks schemas so there is no way to segregate a
database into nice buckets as you can with SQL Server and PostgreSQL.
In this article we shall provide some examples of using the contrib module - dblink to query local PostgreSQL databases and remote PostgreSQL databases.
DbLink is probably most comparable in structure to SQL Server's OpenRowset functionality.
It lacks the power of SQL Server's Linked Server
approach or OPENQUERY that allows for synchronized joins between linked servers/databases and local tables and updates/inserts on linked servers. This makes it not terribly useful
in cases where you need to join lots of data with local data. It does however come in handy for bulk copy operations from one database/server to another.
DbLink is located in share/contribs/dblink.sql of your PostgreSQL install.To use it, load the dblink.sql file in the database you would like to use it in.
Querying other local PostgreSQL Databases with DbLink
DBLink returns a generic data row type. One of the annoying things about this is that in order to use it,
you need to specify the output structure. Perhaps in future versions of PostgreSQL, this limitation of lack of introspection of
generic row types will be irradicated. The lack of introspection is an issue not only for DbLink, but for other generic row type returning functions such as TableFunc cross tab functions.
Below is an example of querying a database on the same server and cluster using DbLink. Note if no username and password is
specified, then DbLink connects with whatever account you are currently using.
FROM dblink('dbname=somedb','SELECT town, pop1980 FROM towns')
AS towns(town varchar(21), pop1980 integer);
If you are running on a non-standard port (something other than 5432), then you will need to specify the port as follows:
FROM dblink('dbname=ma_geocoder port=5433',
'SELECT gid, fullname, the_geom_4269
AS p(gid int,fullname varchar(100), the_geom geometry);
Querying Remote servers with DbLink and Bulk Insert
DBLink comes in particularly handy for querying remote PgServers and doing bulk inserts. To do so, you will need to specify the full credentials of the remote server.
Below is an example of this.
FROM dblink('dbname=somedb port=5432
host=someserver user=someuser password=somepwd',
'SELECT gid, area, perimeter, state, county,
tract, blockgroup, block,
AS blockgroups(gid int,area numeric(12,3),
perimeter numeric(12,3), state char(2),
county char(3), tract char(6),
block char(4), the_geom geometry);
Joining with Local Data
This is the major area where DbLink falls short. While you can join
remote dblinked tables with local tables, for large tables, this tends to be really slow as indexes are not used and
DBLinked function call needs to return the full dataset before they can be joined. Below is a simple example of a join:
SELECT realestate.address, realestate.parcel, s.sale_year, s.sale_amount,
FROM realestate INNER JOIN
dblink('dbname=somedb port=5432 host=someserver
'SELECT parcel_id, sale_year,
sale_amount FROM parcel_sales')
AS s(parcel_id char(10),sale_year int, sale_amount int)
ON realestate.parcel_id = s.parcel_id;
If sales table is relatively huge, then you may be better forcing a nested join by creating a function that queries dblink
and putting that in the SELECT clause and having a dynamic dblink query of the form.
CREATE TYPE payment AS
CREATE OR REPLACE FUNCTION fn_remote_payments(cust int)
RETURNS SETOF payment AS
SELECT * FROM dblink('dbname=pagila host=localhost user=pagila',
'SELECT payment_id, customer_id, staff_id,
rental_id, amount, payment_date
FROM payment WHERE customer_id = ' || $1) As
p(payment_id int,customer_id smallint,
rental_id integer, amount numeric(5,2),
payment_date timestamp )
LANGUAGE 'sql' VOLATILE;
SELECT rs.customer_id, realestate.address,
FROM realestate rs
WHERE rs.last_name = 'SMITH';
DBLink is not as feature rich as something like Microsoft SQL Server Linked Server or Oracle's DbLink.
We can think of 2 features which hopefully will be implemented in future versions of PostgreSQL that will improve DbLink among
- Introspection: As mentioned earlier in this article. This would allow you to do things like
SELECT p.* FROM dblink('...', 'SELECT a, b,c FROM sometable') p
without having to specify the output structure of the table. NOTE: That SQL Server supports this
syntax with OPENROWSET, OPENQUERY and linked servers where you can do
FROM OPENROWSET(connectionstring, 'SELECT a,b,c FROM sometable') p
SELECT p.* FROM
- Pipelining: This is the idea of a set returning function returning values before the full set of data is materialized.
We are not sure how helpful this would be but could conceivably help with creating synchronized joins.
Tracked: Jan 29, 01:40
Tracked: Jan 27, 05:38