A better way for a Python 'for' loop

We all know that the common way of executing a statement a certain number of times in Python is to use a for loop.

The general way of doing this is,

    # The loop variable is unused here;
    # only the number of iterations matters.
    for _ in range(count):
        pass

I believe nobody will argue that the code above is the common implementation. However, there is another option: exploiting the speed of Python list creation by multiplying references.

    # Uncommon way.
    for _ in [0] * count:
        pass
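To make the trade-off of this second form concrete, here is a minimal sketch (the variable names are mine, not part of the benchmark). It checks that `[0] * count` fills the list with references to one and the same object, and that the list still costs memory proportional to `count`, whereas a `range` object does not.

```python
import sys

count = 100_000
refs = [0] * count

# Every slot refers to the same int object; no per-element objects are created.
all_same = all(item is refs[0] for item in refs)

# The list still stores one pointer per slot, so its size grows with count,
# whereas a range object has a small fixed size.
list_bytes = sys.getsizeof(refs)
range_bytes = sys.getsizeof(range(count))

print(all_same, list_bytes > range_bytes)  # True True
```

So the speed comes at the price of a temporary list whose size depends on the iteration count.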

There is also the old-fashioned while loop.

    i = 0
    while i < count:
        i += 1

I tested the execution times of these approaches. Here is the code.

    import timeit

    repeat = 10
    total = 10

    setup = """
    count = 100000
    """

    test1 = """
    for _ in range(count):
        pass
    """

    test2 = """
    for _ in [0] * count:
        pass
    """

    test3 = """
    i = 0
    while i < count:
        i += 1
    """

    print(min(timeit.Timer(test1, setup=setup).repeat(repeat, total)))
    print(min(timeit.Timer(test2, setup=setup).repeat(repeat, total)))
    print(min(timeit.Timer(test3, setup=setup).repeat(repeat, total)))

    # Results
    0.02238852552017738
    0.011760978361696095
    0.06971727824807639

I would not have raised the subject if the difference were small, but the range loop takes roughly twice as long as the list-multiplication loop. Why doesn't Python encourage the second form if it is that much faster? Is there a better way?

The test is done with Windows 10 and Python 3.6.

Following @Tim Peters' suggestion,

    .
    .
    .
    test4 = """
    for _ in itertools.repeat(None, count):
        pass
    """
    print(min(timeit.Timer(test1, setup=setup).repeat(repeat, total)))
    print(min(timeit.Timer(test2, setup=setup).repeat(repeat, total)))
    print(min(timeit.Timer(test3, setup=setup).repeat(repeat, total)))
    print(min(timeit.Timer(test4, setup=setup).repeat(repeat, total)))

    # Gives
    0.02306803115612352
    0.013021619340942758
    0.06400113461638746
    0.008105080015739174

This is faster still, and it pretty much answers my question.
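For anyone who wants to rerun the comparison, here is one possible self-contained version of the benchmark. The `time_loops` helper and the `SETUP`/`TESTS` names are my own, and I am assuming `itertools` has to be imported in the setup string for the fourth test to run. Timings will of course differ by machine and Python version.

```python
import timeit

# Hypothetical helper names; the statements themselves match the tests above.
SETUP = """
import itertools
count = {count}
"""

TESTS = {
    "range": "for _ in range(count):\n    pass",
    "list multiply": "for _ in [0] * count:\n    pass",
    "while": "i = 0\nwhile i < count:\n    i += 1",
    "itertools.repeat": "for _ in itertools.repeat(None, count):\n    pass",
}


def time_loops(count=100_000, repeat=10, number=10):
    """Return the best-of-`repeat` time in seconds for each loop style."""
    setup = SETUP.format(count=count)
    return {
        name: min(timeit.Timer(stmt, setup=setup).repeat(repeat, number))
        for name, stmt in TESTS.items()
    }


for name, seconds in time_loops().items():
    print(f"{name:>17}: {seconds:.6f} s")
```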

Why is this faster than range, given that both produce their values lazily? Is it because the value never changes?

Using

    for _ in itertools.repeat(None, count):
        pass  # do something

is the non-obvious way of getting the best of all worlds: tiny constant space requirement, and no new objects created per iteration. Under the covers, the C code for repeat uses a native C integer type (not a Python integer object!) to keep track of the count remaining.
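Both claims can be checked from Python itself. The following is a small sketch (the variable names are mine): it confirms that every item handed out by `repeat` is the same object that was passed in, and that the iterator's size does not depend on the requested count.

```python
import itertools
import sys

sentinel = object()

# Every item yielded is the very object passed to repeat(); nothing new is created.
same_object = all(item is sentinel for item in itertools.repeat(sentinel, 5))

# The iterator only stores the object and a counter, so its size is
# independent of how many repetitions were requested.
small = sys.getsizeof(itertools.repeat(None, 10))
large = sys.getsizeof(itertools.repeat(None, 10**9))

print(same_object, small == large)  # True True
```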

For that reason, the count needs to fit in the platform C ssize_t type, whose maximum is generally 2**31 - 1 on a 32-bit box and, as shown here, 2**63 - 1 on a 64-bit box:

    >>> itertools.repeat(None, 2**63)
    Traceback (most recent call last):
        ...
    OverflowError: Python int too large to convert to C ssize_t

    >>> itertools.repeat(None, 2**63-1)
    repeat(None, 9223372036854775807)

Which is plenty big for my loops ;-)
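If you would rather not hard-code the limit, `sys.maxsize` reports the largest value that fits on the running build, so a sketch like this should behave the same on 32-bit and 64-bit interpreters (the `overflowed` flag is just for illustration):

```python
import itertools
import sys

# sys.maxsize is the largest value the platform's ssize_t-sized counter can hold.
largest = itertools.repeat(None, sys.maxsize)  # accepted

try:
    itertools.repeat(None, sys.maxsize + 1)  # one past the limit
    overflowed = False
except OverflowError:
    overflowed = True

print(overflowed)  # True
```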

From: stackoverflow.com/q/46996315
