Evaluating Python Implementations

A Comprehensive Evaluation of Seven Widespread Python Implementations

Python is a widely used general-purpose dynamic language. Due to its popularity, many implementations exist for the two distinct language versions, Python 2 and Python 3. We evaluated seven implementations covering both language versions to make it easier to choose among them. For this purpose, we carefully selected a collection of 523 programs to be executed in the seven Python implementations. Runtime performance and memory consumption are evaluated, considering different categories of code. The source code of the measured programs and the performance results obtained are available for download here. A more detailed description of the evaluation is presented in the following web pages:

  1. Benchmark suite: A description of the measured programs.
  2. Results: The runtime performance and memory consumption evaluation, presented with different figures and tables.
  3. Details of all the programs composing each benchmark and application suite.
  4. Categories: A classification of programs considering the kind of code used.
  5. Profiling: Detailed execution time of large-scale applications obtained with a Python profiler.
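
As an illustration of the kind of profiling mentioned in item 5, the following minimal Python 3 sketch collects per-function execution times with the standard-library cProfile and pstats modules. This page does not state which profiler was used in the evaluation, so cProfile is an assumption, and workload is a hypothetical stand-in for an application entry point.

```python
import cProfile
import io
import pstats


def workload(n):
    """Hypothetical stand-in for the entry point of a measured application."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Render the five most expensive entries, sorted by cumulative time.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(workload, 10_000)
print(report)
```

Sorting by cumulative time puts the functions that dominate total execution time at the top of the report, which is usually the most useful view for large applications.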

Downloads

In this section you can find the following downloads:

  • Benchmark suite: A zip file including a comprehensive Python benchmark suite plus a driver program used to run all the tests in the benchmarks. This file includes:

    • run-startup.cmd: A script that runs all the tests, measuring the execution time using the start-up methodology. All the measurements are saved in the results directory, as CSV files.
    • run-steady.cmd: A script that runs all the tests, measuring the execution time using the steady-state methodology. All the measurements are saved in the results directory, as CSV files.
    • run-single-execution.cmd: A script that runs all the tests with a single execution, measuring the execution time and saving the results as CSV files in the results directory. It does not provide a statistically rigorous measurement, but gives a quick estimate of start-up performance.
    • run-memory.cmd: A script that runs all the tests, measuring the memory consumption and saving the results as CSV files in the results directory.
    • run-visual-cfg.cmd: A script that runs a Python GUI application for configuring the evaluation: the tests to be run, the Python implementations to be measured, the methodology used, the paths where the implementations are installed, and the parameters of the data analysis methodology. It also runs all the tests and shows the results graphically in different figures.
    • /results: Directory where all the measurements are saved as CSV files.
    • /benchmarks: Directory where all the benchmarks are placed.
    • results.xlsx: An Excel file showing different figures of runtime performance and memory consumption. This file is automatically updated with all the information generated in the CSV files. Each tab presents a different figure.
    • /run: The Python implementation of the methodologies used in the evaluation.
    • /logs: Directory where the log files produced during the benchmark suite execution are placed.
    • install.txt: A description of the software to be installed before performing the evaluation.
    • readme.txt: A description of the benchmark suite.
  • CSV files obtained in the evaluation presented in the article A Comprehensive Evaluation of Widespread Python Implementations.

  • R script that runs the expectation-maximization algorithm on the micro-benchmark execution times of all the implementations for both methodologies (start-up and steady-state). It also plots the identified categories and analyzes whether the categories are statistically different (p-values below 0.05).
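
To give an idea of what the expectation-maximization step does, here is a minimal Python 3 sketch that fits a two-component, one-dimensional Gaussian mixture to a list of execution times. It is not a port of the R script: the mixture model, the number of components, the function names and the sample data are all assumptions made for illustration, and the actual script may use a different model or library.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, iterations=100, min_var=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances), each a list of length 2.
    """
    # Deterministic initialisation: start the two means at the extremes.
    means = [min(data), max(data)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [max(overall_var, min_var)] * 2
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])

        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue  # empty component: keep its previous parameters
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, min_var)  # floor avoids degenerate variances

    return weights, means, variances


# Hypothetical execution times (seconds) forming a fast and a slow group.
times = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, variances = em_two_gaussians(times)
print(weights, means, variances)
```

Each fitted component corresponds to one candidate category of programs. Whether two categories really differ would still need a separate significance test, such as the p-value check described above.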

Project Funding

This work was partially funded by the Department of Science and Innovation (Spain) under the National Program for Research, Development and Innovation: project TIN2011-25978, entitled Obtaining Adaptable, Robust and Efficient Software by Including Structural Reflection in Statically Typed Programming Languages.