Monday, November 25, 2013

I have an ArrayList of Strings, and I want to remove repeated strings from it. How can I do this?

16 Answers

Accepted answer (164 votes)
If you don't want duplicates in a Collection, you should consider why you're using a Collection that allows duplicates. The easiest way to remove repeated elements is to add the contents to a Set (which will not allow duplicates) and then add the Set back to the ArrayList:
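import java.util.ArrayList;
import java.util.HashSet;

ArrayList<String> al = new ArrayList<String>();
// add elements to al, including duplicates
HashSet<String> hs = new HashSet<String>();
hs.addAll(al);   // the Set silently drops repeated elements
al.clear();
al.addAll(hs);   // al now holds each element once, in no particular order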
Of course, this destroys the ordering of the elements in the ArrayList.
See also LinkedHashSet, if you wish to retain the order. –  volley Dec 9 '09 at 20:38
 
But this just creates the set without duplicates. I want to know which number was duplicated, in O(n) time. –  Chetan Mar 29 '12 at 19:43
 
Chetan, finding the items in O(n) is possible if the set of possible values is small (think Byte or Short); a BitSet or similar can then be used to store and look up already encountered values in O(1) time. But then again, with such a small value set, doing it in O(n log n) might not be a problem anyway since n is low. (This comment does not apply to the original poster, who needs to do this with Strings.) –  volley May 3 '12 at 12:38
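The BitSet idea from the comment above could look roughly like this for byte values. This is only a sketch: the class and method names (SmallRangeDuplicates, findDuplicates) are made up for illustration, and it assumes the values fit into a range small enough to index a BitSet directly (here 0..255 after masking).

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of any library.
final class SmallRangeDuplicates {

    // Returns each byte value that occurs more than once, reported once,
    // in the order in which its first repeat is encountered.
    static List<Byte> findDuplicates(byte[] values) {
        BitSet seen = new BitSet(256);      // values encountered so far
        BitSet reported = new BitSet(256);  // values already added to the result
        List<Byte> duplicates = new ArrayList<>();
        for (byte b : values) {
            int index = b & 0xFF;           // map -128..127 onto 0..255
            if (seen.get(index)) {
                if (!reported.get(index)) {
                    reported.set(index);
                    duplicates.add(b);
                }
            } else {
                seen.set(index);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        byte[] data = {3, 7, 3, -1, 7, 3};
        System.out.println(findDuplicates(data)); // prints [3, 7]
    }
}
```

Each lookup and update on the BitSet is constant time, so the whole pass is O(n) in the length of the input.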
 
@Chetan To find all duplicates in an ArrayList in O(n), it's important that the objects in the list have correctly defined equals and hashCode methods (no problem for numbers): public Set findDuplicates(List list) { Set items = new HashSet(); Set duplicates = new HashSet(); for (Object item : list) { if (items.contains(item)) { duplicates.add(item); } else { items.add(item); } } return duplicates; } –  Ondrej Bozek Jun 20 '12 at 12:06
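A generic variant of the findDuplicates idea from the comment above might look like the following. It is a sketch rather than the commenter's exact code: the class name DuplicateFinder is invented here, and it relies on Set.add returning false when the element is already present instead of calling contains first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class for illustration.
final class DuplicateFinder {

    // Returns the elements that appear more than once in the list.
    // Elements must implement equals and hashCode consistently.
    static <T> Set<T> findDuplicates(List<T> list) {
        Set<T> seen = new HashSet<>();
        Set<T> duplicates = new LinkedHashSet<>(); // keeps order of first repeat
        for (T item : list) {
            if (!seen.add(item)) { // add returns false if item was already seen
                duplicates.add(item);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "a", "c", "b", "a");
        System.out.println(findDuplicates(names)); // prints [a, b]
    }
}
```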
 
this is great, and it gets even better if you change HashSet to LinkedHashSet –  Kevik Jul 23 at 11:07
Although converting the ArrayList to a HashSet effectively removes duplicates, if you need to preserve insertion order I'd suggest this variant instead:
// list is some List of Strings
Set<String> s = new LinkedHashSet<String>(list);
Then, if you need to get back a List reference, you can use the conversion constructor again, e.g. new ArrayList<String>(s).
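Putting both conversions together, a minimal sketch of the round trip could look like this. The class and method names (OrderedDedup, withoutDuplicates) are assumptions made for the example; only LinkedHashSet and ArrayList come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper class for illustration.
final class OrderedDedup {

    // Returns a new list with duplicates removed; the first occurrence of
    // each element keeps its relative position.
    static <T> List<T> withoutDuplicates(List<T> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(withoutDuplicates(list)); // prints [b, a, c]
    }
}
```

The original list is left untouched; the result is a fresh ArrayList.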
Does LinkedHashSet make any guarantees as to which of several duplicates are kept from the list? For instance, if position 1, 3, and 5 are duplicates in the original list, can we assume that this process will remove 3 and 5? Or maybe remove 1 and 3? Thanks. –  Matt Briançon May 1 '11 at 2:20
@Matt: yes, it does guarantee that. The docs say: "This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set." –  abahgat May 2 '11 at 9:00
 
Very interesting. I have a different situation here. I am not trying to sort Strings but objects of another class called AwardYearSource. This class has an int attribute called year, and I want to remove duplicates based on the year, i.e. if the year 2010 appears more than once, I want to remove the extra AwardYearSource objects. How can I do that? –  WowBow Apr 16 '12 at 15:27
 
@WowBow For example, you can define a Wrapper object that holds an AwardYearSource, and define the Wrapper's equals and hashCode methods based on the AwardYearSource's year field. Then you can use a Set of these Wrapper objects. –  Ondrej Bozek Jun 20 '12 at 12:19
 
@WowBow or implement Comparable/Comparator –  shrini1000 Jan 11 at 5:09
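One way to read the Comparator suggestion above is to put the objects into a TreeSet ordered by year, which treats two objects with the same year as duplicates. The sketch below makes several assumptions: the nested AwardYearSource class is a stand-in invented for this example (the real class is not shown in the thread), as are the label field and the DedupByYear and uniqueByYear names. Note that the result comes back sorted by year rather than in the original list order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical example class; names are illustrative only.
final class DedupByYear {

    // Stand-in for the AwardYearSource class mentioned in the comments.
    static final class AwardYearSource {
        final int year;
        final String label; // assumed extra field, only to tell objects apart

        AwardYearSource(int year, String label) {
            this.year = year;
            this.label = label;
        }
    }

    // Keeps the first object seen for each year; the result is sorted by year.
    static List<AwardYearSource> uniqueByYear(List<AwardYearSource> sources) {
        Set<AwardYearSource> byYear =
                new TreeSet<>(Comparator.comparingInt((AwardYearSource s) -> s.year));
        byYear.addAll(sources); // objects whose year is already present are skipped
        return new ArrayList<>(byYear);
    }

    public static void main(String[] args) {
        List<AwardYearSource> sources = Arrays.asList(
                new AwardYearSource(2010, "first"),
                new AwardYearSource(2009, "other"),
                new AwardYearSource(2010, "second"));
        for (AwardYearSource s : uniqueByYear(sources)) {
            System.out.println(s.year + " " + s.label);
        }
    }
}
```

If the original order matters, the wrapper approach with a LinkedHashSet would be the closer fit.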
If you don't want duplicates, use a Set instead of a List. To convert a List to a Set you can use the following code:
// list is some List of Strings
Set<String> s = new HashSet<String>(list);
If really necessary you can use the same construction to convert a Set back into a List.
There is also ImmutableSet from guava-libraries as an option:
ImmutableSet.copyOf(list);
Here's a way that doesn't affect your list ordering:
List<YourClass> l1 = new ArrayList<YourClass>();
List<YourClass> l2 = new ArrayList<YourClass>();

// copy each element of l1 into l2 unless l2 already contains an equal one
for (YourClass o : l1) {
    if (!l2.contains(o)) {
        l2.add(o);
    }
}
l1 is the original list, and l2 is the list without repeated items. (Make sure YourClass overrides the equals method to match what you want to count as equality.)