PostgreSQL⾃定义⾃动类型转换操作(CAST)背景
PostgreSQL是⼀个强类型数据库,因此你输⼊的变量、常量是什么类型,是强绑定的,例如
在调⽤操作符时,需要通过操作符边上的数据类型,选择对应的操作符。
在调⽤函数时,需要根据输⼊的类型,选择对应的函数。
如果类型不匹配,就会报操作符不存在,或者函数不存在的错误。
postgres=# select '1' + '1';
ERROR: operator is not unique: unknown + unknown
LINE 1: select '1' + '1';
^
HINT: Could not choose a best candidate operator. You might need to add explicit type casts.
那么使⽤起来是不是很不⽅便呢?
PostgreSQL开放了类型转换的接⼝,同时也内置了很多的⾃动类型转换。来简化操作。
查看⽬前已有的类型转换:
postgres=# \dC+
List of casts
Source type  |  Target type  |  Function  | Implicit? | Description
-----------------------------+-----------------------------+--------------------+---------------+-------------
"char"      | character    | bpchar    | in assignment |
"char"      | character varying  | text    | in assignment |
"char"      | integer      | int4    | no  |
"char"      | text      | text    | yes  |
abstime      | date      | date    | in assignment |
abstime      | integer      | (binary coercible) | no  |
abstime      | time without time zone  | time    | in assignment |  ................................
timestamp without time zone | timestamp with time zone | timestamptz  | yes  |
timestamp without time zone | timestamp without time zone | timestamp  | yes  |
xml      | character    | (binary coercible) | in assignment |
xml      | character varying  | (binary coercible) | in assignment |
xml      | text      | (binary coercible) | in assignment |
(246 rows)
如果你发现有些类型转换没有内置,怎么办呢?我们可以⾃定义转换。
当然你也可以使⽤这种语法,对类型进⾏强制转换:
CAST(x AS typename)
or
x::typename
如何⾃定义类型转换(CAST)
⾃定义CAST的语法如下:
CREATE CAST (source_type AS target_type)
WITH FUNCTION function_name [ (argument_type [, ...]) ]
[ AS ASSIGNMENT | AS IMPLICIT ]
CREATE CAST (source_type AS target_type)
WITHOUT FUNCTION
[ AS ASSIGNMENT | AS IMPLICIT ]
CREATE CAST (source_type AS target_type)
WITH INOUT
[ AS ASSIGNMENT | AS IMPLICIT ]
解释:
1、WITH FUNCTION,表⽰转换需要⽤到什么函数。
2、WITHOUT FUNCTION,表⽰被转换的两个类型,在数据库的存储中⼀致,即物理存储⼀致。例如text和varchar的物理存储⼀致。不需要转换函数。
Two types can be binary coercible,
which means that the conversion can be performed “for free” without invoking any function.
This requires that corresponding values use the same internal representation.
For instance, the types text and varchar are binary coercible both ways.
Binary coercibility is not necessarily a symmetric relationship.
For example, the cast from xml to text can be performed for free in the present implementation,
but the reverse direction requires a function that performs at least a syntax check.
(Two types that are binary coercible both ways are also referred to as binary compatible.)
3、WITH INOUT,表⽰使⽤内置的IO函数进⾏转换。每⼀种类型,都有INPUT 和OUTPUT函数。使⽤这种⽅法,好处是不需要重新写转换函数。
除⾮有特殊需求,我们建议直接使⽤IO函数来进⾏转换。
List of functions
Schema |  Name  | Result data type | Argument data types | Type
------------+-----------------+------------------+---------------------+--------
pg_catalog | textin  | text    | cstring    | normal
pg_catalog | textout  | cstring  | text    | normal
pg_catalog | date_in  | date    | cstring    | normal
pg_catalog | date_out  | cstring  | date    | normal
You can define a cast as an I/O conversion cast by using the WITH INOUT syntax.
An I/O conversion cast is performed by invoking the output function of the source data type,
and passing the resulting string to the input function of the target data type.
In many common cases, this feature avoids the need to write a separate cast function for conversion.
An I/O conversion cast acts the same as a regular function-based cast; only the implementation is different.
4、AS ASSIGNMENT,表⽰在赋值时,⾃动对类型进⾏转换。例如字段类型为TEXT,输⼊的类型为INT,那么可以创建⼀个 cast(int as text) as ASSIGNMENT。
If the cast is marked AS ASSIGNMENT then it can be invoked implicitly when assigning a value to a column of the target data type.
For example, supposing that foo.f1 is a column of type text, then:
INSERT INTO foo (f1) VALUES (42);
will be allowed if the cast from type integer to type text is marked AS ASSIGNMENT,
cstring转为intotherwise not.
(We generally use the term assignment cast to describe this kind of cast.)
5、AS IMPLICIT,表⽰在表达式中,或者在赋值操作中,都对类型进⾏⾃动转换。(包含了AS ASSIGNMENT,它只对赋值进⾏转换)
If the cast is marked AS IMPLICIT then it can be invoked implicitly in any context,
whether assignment or internally in an expression.
(We generally use the term implicit cast to describe this kind of cast.)
For example, consider this query:
SELECT 2 + 4.0;
The parser initially marks the constants as being of type integer and numeric respectively.
There is no integer + numeric operator in the system catalogs, but there is a numeric + numeric operator.
The query will therefore succeed if a cast from integer to numeric is available and is marked AS IMPLICIT —
which in fact it is.
The parser will apply the implicit cast and resolve the query as if it had been written
SELECT CAST ( 2 AS numeric ) + 4.0;
6、注意,AS IMPLICIT需要谨慎使⽤,为什么呢?因为操作符会涉及到多个算⼦,如果有多个转换,⽬前数据库并不知道应该选择哪个?
Now, the catalogs also provide a cast from numeric to integer.
If that cast were marked AS IMPLICIT — (which it is not — )
then the parser would be faced with choosing between the above interpretation and
the alternative of casting the numeric constant to integer and applying the integer + integer operator.
Lacking any knowledge of which choice to prefer, it would give up and declare the query ambiguous.
The fact that only one of the two casts is implicit is the way in which we teach the parser to prefer resolution of
a mixed numeric-and-integer expression as numeric;
there is no built-in knowledge about that.
因此,建议谨慎使⽤AS IMPLICIT。建议使⽤AS IMPLICIT的CAST应该是⾮失真转换转换,例如从INT转换为TEXT,或者int 转换为numeric。
⽽失真转换,不建议使⽤as implicit,例如numeric转换为int。
It is wise to be conservative about marking casts as implicit.
An overabundance of implicit casting paths can cause PostgreSQL to choose surprising interpretations of commands,
or to be unable to resolve commands at all because there are multiple possible interpretations.
A good rule of thumb is to make a cast implicitly invokable only for information-preserving
transformations between types in the same general type category.
For example, the cast from int2 to int4 can reasonably be implicit,
but the cast from float8 to int4 should probably be assignment-only.
Cross-type-category casts, such as text to int4, are best made explicit-only.
注意事项 + 例⼦
不能嵌套转换。例⼦
1、将text转换为date
错误⽅法
create or replace function text_to_date(text) returns date as $$
select cast($1 as date);
$$ language sql strict;
create cast (text as date) with function text_to_date(text) as implicit;
嵌套转换后出现死循环
postgres=# select text '2017-01-01' + 1;
ERROR: stack depth limit exceeded
HINT: Increase the configuration parameter "max_stack_depth" (currently 2048kB), after ensuring the platform's stack depth limit is adequate. CONTEXT: SQL function "text_to_date" during startup
SQL function "text_to_date" statement 1
SQL function "text_to_date" statement 1
SQL function "text_to_date" statement 1
......
正确⽅法
create or replace function text_to_date(text) returns date as $$
select to_date($1,'yyyy-mm-dd');
$$ language sql strict;
create cast (text as date) with function text_to_date(text) as implicit;
postgres=# select text '2017-01-01' + 1;
column
------------
2017-01-02
(1 row)
我们还可以直接使⽤IO函数来转换:
postgres=# create cast (text as date) with inout as implicit;
CREATE CAST
postgres=# select text '2017-01-01' + 1;
column
------------
2017-01-02
(1 row)
补充:PostgreSQL 整型int与布尔boolean的⾃动转换设置(含⾃定义cast与cast规则介绍)
背景
在使⽤数据库时,经常会遇到⼀些因为客户端输⼊的类型与数据库定义的类型不匹配导致的错误问题。例如数据库定义的是布尔类型,⽽输⼊的是整型:
postgres=# create table cas_test(id int, c1 boolean);
CREATE TABLE
postgres=# \set VERBOSITY verbose
postgres=# insert into cas_test values (1, int '1');
ERROR: 42804: column "c1" is of type boolean but expression is of type integer
LINE 1: insert into cas_test values (1, int '1');
^
HINT: You will need to rewrite or cast the expression.
LOCATION: transformAssignedExpr, parse_target.c:591
⼜或者数据库定义的是时间,⽤户输⼊的是字符串:
postgres=# create table tbl123(id int, crt_time timestamp);
CREATE TABLE
postgres=# insert into tbl123 values (1, text '2017-01-01 10:00:00');
ERROR: column "crt_time" is of type timestamp without time zone but expression is of type text
LINE 1: insert into tbl123 values (1, text '2017-01-01 10:00:00');
^
HINT: You will need to rewrite or cast the expression.
从错误提⽰来看,数据库已经很清晰的告诉你为什么了。那么怎么让数据库⾃动转换呢?PostgreSQL有⼀个语法,⽀持数据类型的转换(赋值、参数、表达式等位置的⾃动转换)。postgres=# \h create cast
Command:  CREATE CAST
Description: define a new cast
Syntax:
CREATE CAST (source_type AS target_type)
WITH FUNCTION function_name [ (argument_type [, ...]) ]
[ AS ASSIGNMENT | AS IMPLICIT ]
CREATE CAST (source_type AS target_type)
WITHOUT FUNCTION
[ AS ASSIGNMENT | AS IMPLICIT ]
CREATE CAST (source_type AS target_type)
WITH INOUT
[ AS ASSIGNMENT | AS IMPLICIT ]
数据库内置了很多转换法则:
postgres=# \dC
List of casts
Source type  |  Target type  |  Function  | Implicit?
-
----------------------------+-----------------------------+---------------------------+---------------
abstime      | date      | date      | in assignment
abstime      | integer      | (binary coercible)  | no
abstime      | timestamp without time zone | timestamp    | yes
........
integer      | boolean      | bool      | no
类型的⾃动转换实际上也是有⼀定的规则的,例如赋值、参数算是两种规则。具体含义见如下⽂档:
《》
我们看到整型转布尔是有内置的转换规则的,那么为什么没有⾃动转呢?
postgres=# \dC
List of casts
Source type  |  Target type  |  Function  | Implicit?
-----------------------------+-----------------------------+---------------------------+---------------
integer      | boolean      | bool      | no
和⾃动转换的规则有关,no表⽰不会⾃动转换,只有当我们强制指定转换时,才会触发转换的动作:postgres=# select cast ((int '1') as boolean);
bool
------
t
(1 row)
pg_cast⾥⾯的context转换为可读的内容(e表⽰no, a表⽰assignment, 否则表⽰implicit)
如果让数据库赋值时⾃动将字符串转换为时间,⾃动将整型转换为布尔
1、如果数据库已经内置了转换规则,那么可以通过更新系统表的⽅式,修改⾃动转换规则。
例如,将这个INT转BOOLEAN的规则,修改为assignment的规则。
postgres=# update pg_cast set castcontext='a' where castsource ='integer'::regtype and casttarget='boolean'::regtype; UPDATE 1
修改后,我们再查看这个转换规则,就变成这样了
\dC
List of casts
Source type  |  Target type  |  Function  | Implicit?
-----------------------------+-----------------------------+---------------------------+---------------
integer      | boolean      | bool      | in assignment
现在你可以将int⾃动写⼊为BOOLEAN了。
postgres=# create table cas_test(id int, c1 boolean);
CREATE TABLE
postgres=# insert into cas_test values (1, int '1');
INSERT 0 1
2、如果系统中没有两种类型转换的CAST规则,那么我们需要⾃定义⼀个。
例如
postgres=# create cast (text as timestamp) with inout as ASSIGNMENT;
CREATE CAST
List of casts
Source type  |  Target type  |  Function  | Implicit?
-----------------------------+-----------------------------+---------------------------+---------------
text      | timestamp without time zone | (binary coercible)  | in assignment
这样就可以⾃动将TEXT转换为TIMESTAMP了。
postgres=# insert into tbl123 values (1, text '2017-01-01 10:00:00');
INSERT 0 1
postgres=# select * from tbl123;
id |  crt_time
----+---------------------
1 | 2017-01-01 10:00:00
(1 row)
删掉这个转换,就会报错。
postgres=# drop cast (text as timestamp);
DROP CAST
postgres=# insert into tbl123 values (1, text '2017-01-01 10:00:00');
ERROR: column "crt_time" is of type timestamp without time zone but expression is of type text
LINE 1: insert into tbl123 values (1, text '2017-01-01 10:00:00');
^
HINT: You will need to rewrite or cast the expression.