Friday, June 8, 2012

Why would iterating over a List be faster than indexing through it?


Reading the Java documentation for the ADT List it says:




The List interface provides four methods for positional (indexed) access to list elements. Lists (like Java arrays) are zero based. Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example). Thus, iterating over the elements in a list is typically preferable to indexing through it if the caller does not know the implementation.




What exactly does this mean? I don't understand the conclusion which is drawn.


Source: Tips4all

5 comments:

  1. In a linked list, each element has a pointer to the next element:

    head -> item1 -> item2 -> item3 -> etc.


    To access item3, you can see clearly that you need to walk from the head through every node until you reach item3, since you cannot jump directly.

    Thus, if I wanted to print the value of each element, if I write this:

    for(int i = 0; i < 4; i++) {
    System.out.println(list.get(i));
    }


    what happens is this:

    head -> print head
    head -> item1 -> print item1
    head -> item1 -> item2 -> print item2
    head -> item1 -> item2 -> item3 print item3


    This is horribly inefficient because every time you are indexing it restarts from the beginning of the list and goes through every item. This means that your complexity is effectively O(N^2) just to traverse the list!

    If instead I did this:

    for(String s: list) {
    System.out.println(s);
    }


    then what happens is this:

    head -> print head -> item1 -> print item1 -> item2 -> print item2 etc.


    all in a single traversal, which is O(N).

    Now, going to the other implementation of List which is ArrayList, that one is backed by a simple array. In that case both of the above traversals are equivalent, since an array is contiguous so it allows random jumps to arbitrary positions.

    ReplyDelete
  2. The answer is implied here:


    Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example)


    A linked list doesn't have an inherent index; calling .get(x) will require the list implementation to find the first entry and call .next() x-1 times (for a O(n) or linear time access), where an array-backed list can just index into backingarray[x] in O(1) or constant time.

    If you look at the JavaDoc for LinkedList, you'll see the comment


    All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.


    whereas JavaDoc for ArrayList has the corresponding


    Resizable-array implementation of the List interface. Implements all optional list operations, and permits all elements, including null. In addition to implementing the List interface, this class provides methods to manipulate the size of the array that is used internally to store the list. (This class is roughly equivalent to Vector, except that it is unsynchronized.)

    The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.


    A related question titled "Big-O Summary for Java Collections Framework" has an answer pointing to this resource, "Java Collections JDK6" which you might find helpful.

    ReplyDelete
  3. While the accepted answer is most certainly correct, might I point out a minor flaw. Quoting Tudor :


    Now, going to the other implementation of List which is ArrayList,
    that one is backed by a simple array. In that case both of the above
    traversals are equivalent, since an array is contiguous so it allows
    random jumps to arbitrary positions.


    This is not completely true. The truth is, that


    With an ArrayList, a hand-written counted loop is about 3x faster


    source: Designing for Performance, Google's Android doc

    Note that the handwritten loop refers to the indexed iteration. I suspect its because of the iterator which is used with enhanced for loops. It produces a minor performance in penalty in a structure which is backed by a contiguous array. I also suspect this might be true for the Vector class.

    My rule is, use the enhanced for loop whenever possible, and if you really care about performance, use indexed iteration only for either ArrayLists or Vectors. In most cases, you can even ignore this- the compiler might be optimizing this in the background.

    I merely want to point out that in the context of development in Android, both the traversals of ArrayLists are not necessarily equivalent. Just food for thought.

    ReplyDelete
  4. Iterating over a list with an offset for the lookup, such as i, is analogous to Shlemiel the painter's algorithm.


    Shlemiel gets a job as a street painter, painting the dotted lines
    down the middle of the road. On the first day he takes a can of paint
    out to the road and finishes 300 yards of the road. "That's pretty
    good!" says his boss, "you're a fast worker!" and pays him a kopeck.

    The next day Shlemiel only gets 150 yards done. "Well, that's not
    nearly as good as yesterday, but you're still a fast worker. 150 yards
    is respectable," and pays him a kopeck.

    The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts
    his boss. "That's unacceptable! On the first day you did ten times
    that much work! What's going on?"

    "I can't help it," says Shlemiel. "Every day I get farther and farther
    away from the paint can!"


    Source.

    This little story may make it easier to understand what is going on internally and why it is so inefficient.

    ReplyDelete
  5. To find the i-th element of a LinkedList the implementation goes through all elements up to the i-th.

    So

    for(int i = 0; i < list.length ; i++ ) {
    Object something = list.get(i); //Slow for LinkedList
    }

    ReplyDelete