Wednesday, August 3, 2011

Oracle PL/SQL function to split strings into tokens

I was trying to find how to split or tokenise (tokenize) a string in Oracle SQL. Technically you can do it directly with just instr and substr but your statement becomes very long and hard to read, mainly because you can't reuse the result of an instr. I found this which was interesting but not what I needed, so I wrote my own function to do the job. I'll post it here in the hope that it will be useful to someone.

create or replace function get_token(
the_list varchar2,
the_index number,
delim varchar2 := ','
)
return varchar2
is
start_pos number;
end_pos number;
begin
if the_index = 1 then
start_pos := 1;
else
start_pos := instr(the_list, delim, 1, the_index - 1);
if start_pos = 0 then
return null;
else
start_pos := start_pos + length(delim);
end if;
end if;

end_pos := instr(the_list, delim, start_pos, 1);

if end_pos = 0 then
return substr(the_list, start_pos);
else
return substr(the_list, start_pos, end_pos - start_pos);
end if;

end get_token;
/

Example usage:

select
get_token('foo,bar,baz',1), -- 'foo'
get_token('foo,bar,baz',3), -- 'baz'
--
get_token('a,,b',2), -- '' (null)
get_token('a,,b',3), -- 'b'
--
get_token('a|b|c',2,'|'), -- 'b'
get_token('a|b|c',4,'|') -- '' (null)
from
dual
/

Notes
  • Remember that the index starts at one not zero just like strings in PL/SQL.
  • Empty tokens are counted. You get null if you ask for an empty token.
  • You also get null if you ask for an index greater than the number of tokens.
  • The delimiter defaults to a comma but you can specify any delimiter. A delimiter of more than one char should work also.

This version provided by Anonymous below supports negative indexes, eg
get_token('foo,bar,baz',-1), -- 'baz' get_token('foo,bar,baz',-2), -- 'bar' 
Also check the comments below for a regex based version that can handle quoted csv style fields.
create or replace function get_token (
the_list varchar2,
the_index number,
delim varchar2 :=';'
)
return varchar2
is
start_pos number;
end_pos number;

begin
if the_index = 1 then
start_pos := 1;

elsif the_index < 0 then
start_pos := instr(the_list, delim, -1, abs(the_index)) + 1;

else
start_pos := instr(the_list, delim, 1, the_index - 1);

if start_pos = 0 then
return null;
else
start_pos := start_pos + length(delim);
end if;

end if;

if the_index < 0 then
end_pos := instr(the_list, delim, start_pos+1, 1);
else
end_pos := instr(the_list, delim, start_pos, 1);
end if;

if end_pos = 0 then
return substr(the_list, start_pos);
else
return substr(the_list, start_pos, end_pos - start_pos);
end if;

end get_token;

2 comments:

jeje said...

Le Nike Dunk Supreme 08 noir violet avis nike pas cher en ligne jaune. Les en-têtes doivent être assez basiques et sont généralement conçus pour contraster avec toute la marque ainsi que la page. Avec un ensemble de techniques Nike Air Max Traditionnel BW Dark Heavy Skies Azure Whitened Hommes Nike Air Max 91! Vous pouvez facilement effectuer correctement dans les aires de jeux. Le 1219C2 a ensuite été recouvert d'un adidas zx 700 noir pas cher fini mat phosphate non réfléchissant pour réduire l'éblouissement. Vous et moi savons tous les deux que si vous ne le faites pas, vous vous découvrirez à un moment donné en vous demandant et si? Il n'y a rien de pire que de devoir nous demander cela. Après le nettoyage, vous devez asics baskets femme essuyer les restes d'eau immédiatement avec un chiffon sec.

Anonymous said...

golden goose sneaker
curry 6
pg 1
spongebob kyrie 5
fear of god
off white
kyrie irving shoes
bape outlet
palm angels
supreme official