Oracle PL/SQL function to split strings into tokens
I was trying to find how to split or tokenise (tokenize) a string in Oracle SQL. Technically you can do it directly with just instr and substr but your statement becomes very long and hard to read, mainly because you can't reuse the result of an instr. I found this which was interesting but not what I needed, so I wrote my own function to do the job. I'll post it here in the hope that it will be useful to someone.
Update 17/5/07
I've added the notes above and colourised the code to make it look prettier. To the commenters below, you're welcome!
Update 12/9/08
This version provided by Anonymous below supports negative indexes, eg
create or replace function get_token( the_list varchar2, the_index number, delim varchar2 := ',' ) return varchar2 is start_pos number; end_pos number; begin if the_index = 1 then start_pos := 1; else start_pos := instr(the_list, delim, 1, the_index - 1); if start_pos = 0 then return null; else start_pos := start_pos + length(delim); end if; end if; end_pos := instr(the_list, delim, start_pos, 1); if end_pos = 0 then return substr(the_list, start_pos); else return substr(the_list, start_pos, end_pos - start_pos); end if; end get_token; /Example usage:
select get_token('foo,bar,baz',1), -- 'foo' get_token('foo,bar,baz',3), -- 'baz' -- get_token('a,,b',2), -- '' (null) get_token('a,,b',3), -- 'b' -- get_token('a|b|c',2,'|'), -- 'b' get_token('a|b|c',4,'|') -- '' (null) from dual /Notes
- Remember that the index starts at one not zero just like strings in PL/SQL.
- Empty tokens are counted. You get null if you ask for an empty token.
- You also get null if you ask for an index greater than the number of tokens.
- The delimiter defaults to a comma but you can specify any delimiter. A delimiter of more than one char should work also.
Update 17/5/07
I've added the notes above and colourised the code to make it look prettier. To the commenters below, you're welcome!
Update 12/9/08
This version provided by Anonymous below supports negative indexes, eg
get_token('foo,bar,baz',-1), -- 'baz' get_token('foo,bar,baz',-2), -- 'bar'Also check the comments below for a regex based version that can handle quoted csv style fields.
create or replace function get_token ( the_list varchar2, the_index number, delim varchar2 :=';' ) return varchar2 is start_pos number; end_pos number; begin if the_index = 1 then start_pos := 1; elsif the_index < 0 then start_pos := instr(the_list, delim, -1, abs(the_index)) + 1; else start_pos := instr(the_list, delim, 1, the_index - 1); if start_pos = 0 then return null; else start_pos := start_pos + length(delim); end if; end if; if the_index < 0 then end_pos := instr(the_list, delim, start_pos+1, 1); else end_pos := instr(the_list, delim, start_pos, 1); end if; if end_pos = 0 then return substr(the_list, start_pos); else return substr(the_list, start_pos, end_pos - start_pos); end if; end get_token;
No comments:
Post a Comment