Sunday, January 5, 2014

What is the most efficient Java Collections library?
A few years ago, I did a lot of Java and had the impression back then that trove is the best (most efficient) Java Collections implementation. But when I read the answers to the question "Most useful free Java libraries?" I noticed that trove is hardly mentioned. So which Java Collections library is best now?
UPDATE: To clarify, I mostly want to know what library to use when I have to store millions of entries in a hash table etc. (need a small runtime and memory footprint).
share|improve this question
 
What are the keys and values in this table? If they're not primitives, what's wrong with the normal HashMap etc? –  Jon Skeet Mar 10 '09 at 12:31
 
For a very large map you might want a probing implementation, or even inlined like a database table. – Tom Hawtin - tackline Mar 10 '09 at 12:51
 
Interestingly I see no mention of Colt here which was subsequently subsumed into Mahout. –  smartnut007 Feb 10 '12 at 0:03
add comment

closed as not constructive by casperOne Sep 14 '12 at 15:11

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.If this question can be reworded to fit the rules in the help center, please edit the question.

12 Answers

up vote49down voteaccepted
From inspection, it looks like Trove is just a library of collections for primitive types - it's not like it's meant to be adding a lot of functionality over the normal collections in the JDK.
Personally (and I'm biased) I love Guava (including the former Google Java Collections project). It makes various tasks (including collections) a lot easier, in a way which is at least reasonably efficient. Given that collection operations rarely form a bottleneck in my code (in my experience) this is "better" than a collections API which may be more efficient but doesn't make my code as readable.
Given that the overlap between Trove and the Guava is pretty much nil, perhaps you could clarify what you're actually looking for from a collections library.
share|improve this answer
 
should be mentioned that for most tasks google collections is too complex, and java collections are more than sufficient. –  Andreas Petersson Mar 10 '09 at 12:14
3 
@Andreas: Can't say I agree. Not that it's a "one or the other" scenario - I use the regular collections (with helpers like the Lists class) and then use Iterables etc when I need to. Use the complexity only when it helps you. –  Jon Skeet Mar 10 '09 at 12:30
4 
after reading my own comment several months after using G-C extensively - I disagree with my past opinion, and agree fully with yours. use the helper methods/classes extensively, they make much of the code more readable and safer. –  Andreas Petersson Sep 7 '09 at 21:48
 
@Andreas: Thanks for coming back and saying so - I'm glad to hear that GJC is helping :) –  Jon Skeet Sep 8 '09 at 5:19
 
Hey, Jon, Google Java Collections is now Guava. You might want to update your post for future references :)–  Artur Czajka Oct 25 '11 at 18:24
show 1 more comment
The question is (now) about storing lots of data, which can be represented using primitive types likeint, in a Map. Some of the answers here are very misleading in my opinion. Let's see why.
I modified the benchmark from trove to measure both runtime and memory consumption. I also addedPCJ to this benchmark, which is another collections library for primitive types (I use that one extensively). The 'official' trove benchmark does not compare IntIntMaps to Java Collection's Map, probably storing Integers and storing ints is not the same from a technical point of view. But a user might not care about this technical detail, he wants to store data representable withints efficiently.
First the relevant part of the code:
new Operation() {

     private long usedMem() {
        System.gc();
        return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
     }

     // trove
     public void ours() {
        long mem = usedMem();
        TIntIntHashMap ours = new TIntIntHashMap(SET_SIZE);
        for ( int i = dataset.size(); i-- > 0; ) {
           ours.put(i, i);
        }
        mem = usedMem() - mem;
        System.err.println("trove " + mem + " bytes");
        ours.clear();
     }

     public void pcj() {
        long mem = usedMem();
        IntKeyIntMap map = new IntKeyIntOpenHashMap(SET_SIZE);
        for ( int i = dataset.size(); i-- > 0; ) {
           map.put(i, i);
        }
        mem = usedMem() - mem;
        System.err.println("pcj " + mem + " bytes");
        map.clear();
     }

     // java collections
     public void theirs() {
        long mem = usedMem();
        Map<Integer, Integer> map = new HashMap<Integer, Integer>(SET_SIZE);
        for ( int i = dataset.size(); i-- > 0; ) {
           map.put(i, i);
        }
        mem = usedMem() - mem;
        System.err.println("java " + mem + " bytes");
        map.clear();
     }
I assume the data comes as primitive ints, which seems sane. But this implies a runtime penalty for java util, because of the auto-boxing, which is not neccessary for the primitive collections frameworks.
The runtime results (without gc() calls, of course) on WinXP, jdk1.6.0_10:
                      100000 put operations      100000 contains operations 
java collections             1938 ms                        203 ms
trove                         234 ms                        125 ms
pcj                           516 ms                         94 ms
While this might already seem drastic, this is not the reason to use such a framework.
The reason is memory performance. The results for a Map containing 100000 int entries:
java collections        oscillates between 6644536 and 7168840 bytes
trove                                      1853296 bytes
pcj                                        1866112 bytes
Java Collections needs more than three times the memory compared to the primitive collection frameworks. I.e. you can keep three times as much data in memory, without resorting to disk IO which lowers runtime performance by magnitudes. And this matters. Read highscalability to find out why.
In my experience high memory consumption is the biggest performance issue with Java, which of course results in worse runtime performance as well. Primitive collection frameworks can really help here.
So: No, java.util is not the answer. And "adding functionality" to Java collections is not the point when asking about efficiency. Also the modern JDK collections do not "out-perform even the specialized Trove collections".
Disclaimer: The benchmark here is far from complete, nor is it perfect. It is meant to drive home the point, which I have experienced in many projects. Primitive collections are useful enough to tolerate fishy API -if you work with lots of data.
share|improve this answer
3 
Actually, I think your answer is misleading. Storing ints vs Integers is very different, and most likely the main reason for the increased memory usage. I agree a raw type collection framework could be useful, but it doesn't make trove or pcj "better" than java.util. –  Jorn Mar 10 '09 at 22:47
16 
The question is about storing int data efficiently. Not about storing Integers. For this task trove/pcj are more efficient, as I tried to show. Using Integers imposes runtime and memory inefficiencies. Since java.util doesn't allow usage of primitives, it is not the best choice for this task. –  the.duckman Mar 11 '09 at 9:12
2 
(for Russian community) here goes another benchmark: total-holywar.blogspot.com/2011/07/… –  dma_kAug 13 '11 at 11:17
 
Not sure if we don't use int as key,just normal String. What will be the workbench result for them? – Clark Bao Aug 14 '11 at 2:18
add comment
I know this is an old post and there are a ton of answers here. But, The answers above are superficial and over simplified in terms of suggesting a library. There is no one library that does well across the various benchmarks presented here. The only conclusion I derive is if you care about performance and memory and specifically dealing with primitive types, its more than worth looking at the non jdk alternatives.
Here is a more sound analysis, in terms of benchmark mechanics and the libraries covered. this is a thread in the mahout dev list.
The libraries covered are
  • HPPC
  • Trove
  • FastUtil
  • Mahout ( Colt )
  • Java Collections
The results of the three benchmarks Please look at the thread for a more detailed description of the three Benchmarks here.
share|improve this answer
1 
Thank you. This was very helpful .. considering the importance of the question it's hard to believe none of the other answers (other than the.duckman's) actually answer this question. –  Dexter Jul 10 '12 at 2:55
add comment
As other commentators have noticed, the definition of "efficient" casts a wide net. However no one has yet mentioned the Javolution library.
Some of the highlights:
  • Javolution classes are fast, very fast (e.g. Text insertion/deletion in O[Log(n)] instead of O[n] for standard StringBuffer/StringBuilder).
  • All Javolution classes are hard real-time compliant and have highly deterministic behavior (in the microsecond range). Furthermore (unlike the standard library), Javolution is RTSJ safe (no memory clash or memory leak when used with Java Real-Time extension).
  • Javolution's real-time collection classes (map, list, table and set) can be used in place of most standard collection classes and provide additional functionality.
  • The Javolution collections provide concurrency guarantees to make implementation of parallel algorithms easier.
The Javolution distribution includes a benchmark suite so you can see how they stack up against other libraries/the built-in collections.
share|improve this answer
add comment
Some collection libs to consider:
I would first and foremost reach for the JDK collection library. It covers most common things you need to do and is obviously already available to you.
Google Collections is probably the best high-quality library outside the JDK. It's heavily used and well supported.
Apache Commons Collections is older and suffers a bit from the "too many cooks" problem but has a lot of useful stuff as well.
Trove has very specialized collections for cases like primitive keys/values. These days we find that on modern JDKs and with the Java 5+ collections and concurrent use cases, the JDK collections out-perform even the specialized Trove collections.
If you have really high concurrency use cases, you should definitely check out stuff like the NonBlockingHashMap in the high-scale lib, which is a lock-free implementation and can stomp on ConcurrentHashMap if you have the right use case for it.
share|improve this answer
4 
"These days we find that on modern JDKs and with the Java 5+ collections and concurrent use cases, the JDK collections out-perform even the specialized Trove collections." Misleading - I have never seen a micro-benchmark where storing/retrieving primitive types in a specialized primitive-collection class like Trove didn't outperform the JDK collection classes in both memory usage and CPU time. If you are using objects though (and not primitive types), then I would agree with Alex, fretting over collection impl is not as big of a deal. – Riyad Kalla May 3 '11 at 13:47 
1 
This statement was based on heavy real-world usage (which I'll take over a micro-benchmark any day) of various collection impls where we had prior needed a Trove collection but were able to now pull it out. Late JDK 6 updates (circa late 2009) actually provided custom code for common map keys like Integer that have substantially improved some of the most common uses. –  Alex Miller May 3 '11 at 14:42
 
Alex, I don't doubt in your specific use-cases that pulling out primitive collections and going with JDK collections was fast enough, but waving your hand across the landscape that is collections and saying "All ye that pass, it is fast enough!" isn't accurate. If I am working on a 2D game engine, the overhead of boxing/unboxing my primitive types constantly is measurably expensive. If I'm working on a REST API then no, it probably doesn't make a measurable different at all with respect to much more expensive ops like the HTTP I/O. I just felt compelled to quantify your post is all. –  Riyad Kalla May 3 '11 at 16:24 
3 
I don't think anyone reading this should listen to either of us. They should test their own use case and see what has the best performance. My comments are based on my team's fairly aggressive performance tests with a variety of libraries. YMMV. –  Alex Miller May 4 '11 at 2:39
1 
I agree with @Riyad. I'm writing a high-performance finite automata suite and have implemented it with both Trove and the Java Collections Framework (jdk 6 latest update). Trove outperforms big time. In the order of tens of times better in both computation speed and memory consumption. –  Nico Huysamen Aug 3 '11 at 8:17
add comment
java.util
Sorry for the obvious answer, but for most uses, the default Java Collections are more than sufficient.
share|improve this answer
2 
For basic uses, yes. But I think the framework misses some basic and advanced features (like immutable collections, filters, multimaps, etc) and that's where (for example) Google Collections comes in –  Jorn Mar 10 '09 at 22:38
 
I think this answer misses the point. The JCF was probably awesome in 2002 when people didn't use use Java for much. Unfortunately it hasn't aged well, especially when compared to the collections support from other JVM languages. –  Ted Pennings Oct 17 '11 at 18:27
 
-1 The question is "most efficient for storing int" and any mentioned example is better than java.util – kommradHomer Oct 25 '13 at 10:38
add comment
To store millions of String in a map, take a look at http://code.google.com/p/flatmap
share|improve this answer
2 
+1 Can you introduce how it enhanced? –  Clark Bao Aug 14 '11 at 2:22 
 
There should be blog posts by the author of flatmap somewhere on the internet. –  akuhn Aug 20 '11 at 2:54
add comment
Depends on how we define "efficient".
Every data structure has its own Big-Oh behavior for reading, writing, iterating, memory footprint, etc. A linked list in one library is likely to be the same as any other. And a hash map will be faster for reading O(1) than a linked list O(n).
But when I read the answers to the question "Most useful free Java libraries?" I noticed that trove is hardly mentioned.
This doesn't sound like "most efficient". It sounds like "most popular" to me.
Just some feedback - I've never heard of it, and I don't know anyone who has used it. Collections built into the JDK, Google, or Apache Commons are well-known to me.
share|improve this answer
add comment
Trove offers a few advantages.
  • smaller memory footprint, it doesn't used Map.Entry objects
  • you can use hash strategies instead keys for maps, this saves memory and means you don't need to define a new key each time you want to cache an object on a new set of its attributes
  • it has primitive collection types
  • think it has some form of internal iterator
That said, a lot has been done to improve jdk collections since trove was written.
It's the hashing strategies that make it appealing to me though... Google for trove and read their overview.
share|improve this answer
add comment
If you want to store millions of records in a hash table, chances are that you will run into memory problems. This happened to me when I tried creating a map with 2.3 million String objects, for example. I went with BerkeleyDB, which is very mature and performs well. They have a Java API that wraps the Collections API, so you can easily create arbitrarily large maps with very little memory footprint. Access will be slower though (as it is stored on disk).
Follow-up question: is there a decent (and efficient), well maintained, library for immutable collections? Clojure has excellent support for this, and it would be nice to have something similar for Java.
share|improve this answer
1 
Google collections adds immutable Collections. –  the.duckman Mar 10 '09 at 20:19
add comment
I'm developer of happy-collections from happy-collections on source-forge
  1. Event based collections
  2. Unmodifiable
  3. SortedList
  4. Cache
share|improve this answer
add comment
ConcurrentHashMap as well as the java.util.concurrent package should be mentioned, if you plan to use the HashMap in multiple threads. small memory footprint is assued, since this is part of standard java.
share|improve this answer
add comment

7 comments:

oakleyses said...

louis vuitton handbags, oakley sunglasses, louboutin, longchamp outlet, nike shoes, louis vuitton outlet stores, chanel handbags, burberry outlet, prada outlet, jordan shoes, tiffany and co, michael kors outlet, tory burch outlet, louis vuitton outlet, longchamp handbags, nike free, true religion jeans, michael kors outlet, kate spade outlet, polo ralph lauren outlet, tiffany and co, prada handbags, polo ralph lauren outlet, michael kors outlet, michael kors outlet, longchamp handbags, oakley sunglasses, ray ban sunglasses, kate spade handbags, burberry outlet, louis vuitton outlet, louboutin outlet, louboutin, coach factory outlet, air max, air max, coach outlet, gucci outlet, christian louboutin shoes, michael kors outlet, coach purses, ray ban sunglasses, michael kors outlet, louis vuitton, coach outlet store online, true religion jeans, oakley sunglasses cheap

oakleyses said...

ralph lauren, lululemon, air max, hollister, north face, nike air max, polo lacoste, vanessa bruno, timberland, vans pas cher, louboutin, louis vuitton, oakley pas cher, air max pas cher, nike roshe run, air max, true religion outlet, barbour, sac longchamp, air force, hollister, sac louis vuitton, nike free, polo ralph lauren, nike trainers, louis vuitton uk, nike roshe, sac hermes, longchamp, michael kors, sac burberry, sac guess, mulberry, new balance pas cher, converse pas cher, sac louis vuitton, hogan outlet, nike tn, north face, true religion outlet, ray ban pas cher, michael kors, air jordan, nike blazer, nike free pas cher, michael kors pas cher, abercrombie and fitch, ray ban sunglasses

oakleyses said...

mac cosmetics, mont blanc, marc jacobs, canada goose outlet, nike huarache, vans shoes, soccer jerseys, hollister, giuseppe zanotti, beats by dre, abercrombie and fitch, longchamp, insanity workout, celine handbags, bottega veneta, ghd, nfl jerseys, north face outlet, chi flat iron, ugg boots, birkin bag, ugg australia, canada goose, herve leger, ugg pas cher, rolex watches, valentino shoes, canada goose uk, canada goose, ferragamo shoes, canada goose, ugg boots, uggs outlet, north face jackets, soccer shoes, asics running shoes, new balance shoes, p90x, lululemon outlet, canada goose jackets, mcm handbags, instyler, babyliss pro, ugg, wedding dresses, jimmy choo outlet, reebok outlet, nike roshe run

oakleyses said...

parajumpers, karen millen, air max, converse, pandora charms, moncler, louboutin, moncler, links of london, lancel, juicy couture outlet, oakley, hollister, pandora charms, supra shoes, thomas sabo, canada goose, gucci, wedding dresses, timberland boots, swarovski crystal, air max, coach outlet store online, moncler, ray ban, canada goose, moncler, ugg, louis vuitton, swarovski, hollister, montre homme, moncler, hollister clothing store, ralph lauren, rolex watches, moncler outlet, moncler, iphone 6 cases, baseball bats, juicy couture outlet, toms shoes, vans, pandora jewelry, ugg, converse shoes

Anna said...

Great and Useful Article.

Online Java Training

Online Java Course

Java EE course

Java Course in Chennai

Java Training in Chennai

Java Training Institutes in Chennai

Java Interview Questions

Java Interview Questions

Zheng junxai5 said...

zhengjx20160721
hermes outlet
burberry outlet online
louis vuitton outlet
true religion jeans
oakley vault
celine outlet
kids lebron shoes
jordan retro 8
ray ban sunglasses
air jordan femme
timberland outlet
longchamp outlet
polo ralph lauren
coach outlet online
cheap oakley sunglasses
true religion outlet
hollister clothing
ralph lauren clearance outlet
nike running shoes for men
coach factory outlet
ghd hair straighteners
ghd flat iron
north face jackets
lebron 13
kate spade handbags
louis vuitton outlet
louis vuitton bags
gucci belts
michael kors handbags
louis vuitton purses
jordan retro 13
louis vuitton outlet
coach outlet
coach factory outlet online
michael kors outlet online
true religion outlet
replica watches
true religion jeans

raybanoutlet001 said...

kobe byrant shoes
mlb jerseys
michael kors outlet store
cheap true religion jeans
oakley sunglasses
nike huaraches
links of london
cheap mlb jerseys
adidas nmd
michael kors handbags clearance
cheap jordan shoes
kobe shoes
michael kors handbags
yeezy boost
nike huarache
michael kors handbags sale
michael kors handbags
nike kobe sneakers
nike huarache
air jordan retro
tiffany and co jewellery
ralph lauren online
http://www.uggoutlet.uk