I'm running Ubuntu 13.04 on an ODROID-XU Lite (an SoC development board based on the Samsung Exynos 5410, with a quad-core ARM Cortex-A15). I'm trying to build Python 2.7.4 from source to get a binary that is better optimized for the Cortex-A15, but it can't beat the Python that ships with Ubuntu (also 2.7.4) in terms of performance. This baffles me: as I understand it, distro packages are generally built generically, without gcc optimization flags that favor any specific ARM CPU, whereas my manual build was compiled with gcc optimization flags targeting the Cortex-A15. Yet the Python that comes with Ubuntu is 15% to 20% ahead of my build in the unladen-swallow benchmarks.
The OS:
Code:
Linux odroid-server 3.4.74 #1 SMP PREEMPT Tue Dec 17 11:45:23 CST 2013 armv7l armv7l armv7l GNU/Linux
My Python build options:
Code:
OPT="-O3 -mcpu=cortex-a15 -mfpu=neon-vfpv4 -mfloat-abi=hard -ffast-math" \
./configure --prefix=$INSTALL_DIR
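One way to check how each interpreter was actually configured is to ask it for the build variables it recorded at compile time. The following is a minimal sketch, assuming both builds expose the usual Makefile variables through the standard `sysconfig` module; the helper name `build_summary` and the file name `build_info.py` are hypothetical, chosen only for illustration.

```python
from __future__ import print_function  # keeps print() working on Python 2.7
import sys
import sysconfig

# Build-time variables that a Unix build normally records in its Makefile.
BUILD_VARS = ("CC", "OPT", "CFLAGS", "CONFIG_ARGS")


def build_summary():
    """Return the build settings this interpreter reports about itself.

    Hypothetical helper for illustration, not part of any benchmark suite.
    """
    summary = {"executable": sys.executable,
               "version": sys.version.split()[0]}
    for name in BUILD_VARS:
        # get_config_var returns None when a variable was not recorded.
        summary[name] = sysconfig.get_config_var(name)
    return summary


if __name__ == "__main__":
    for key, value in sorted(build_summary().items()):
        print("%-12s %s" % (key, value))
```

Saved as build_info.py, it could be run once with /usr/bin/python and once with $INSTALL_DIR/bin/python; comparing the two outputs side by side shows whether the flags passed through OPT reached the compiler and what the distro build used instead.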
In the unladen-swallow benchmark suite (http://code.google.com/p/unladen-swa...iki/Benchmarks), my build is 17.55% slower than the shipped Python in the nqueens benchmark:
Code:
odroid@odroid-server:/srv/samba/share/odroid/sources/unladen-bmarks$ python
Python 2.7.4 (default, Apr 19 2013, 19:49:55)
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
$INSTALL_DIR/bin/python
Python 2.7.4 (default, Dec 27 2013, 07:47:24)
[GCC 4.8.3 20131111 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
odroid@odroid-server:/srv/samba/share/odroid/sources/unladen-bmarks$ python perf.py -r --benchmarks=nqueens \
> /usr/bin/python \
> $INSTALL_DIR/bin/python
Running nqueens...
INFO:root:Running /srv/samba/share/odroid/tools/python-2.7.4/bin/python performance/bm_nqueens.py -n 100
INFO:root:Running /usr/bin/python performance/bm_nqueens.py -n 100
Report on Linux odroid-server 3.4.74 #1 SMP PREEMPT Tue Dec 17 11:45:23 CST 2013 armv7l armv7l
Total CPU cores: 1
### nqueens ###
Min: 0.924576 -> 1.085515: 1.1741x slower
Avg: 0.933247 -> 1.097013: 1.1755x slower
Significant (t=-197.757121)
Stddev: 0.00583 -> 0.00588: 1.0076x larger
Timeline: http://tinyurl.com/lpwcwm4
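For reference, the slowdown figures in the report above can be reproduced from the raw timings. This is only an arithmetic sketch: the helper name `slowdown` is hypothetical, and it assumes the report lists the base interpreter first and the custom build second.

```python
def slowdown(base_seconds, changed_seconds):
    """Return (ratio, percent slower) of the changed run relative to base."""
    ratio = changed_seconds / base_seconds
    return ratio, (ratio - 1.0) * 100.0


# Timings copied from the nqueens report (base -> changed).
min_ratio, min_pct = slowdown(0.924576, 1.085515)
avg_ratio, avg_pct = slowdown(0.933247, 1.097013)

print("Min: %.4fx slower (%.2f%%)" % (min_ratio, min_pct))
print("Avg: %.4fx slower (%.2f%%)" % (avg_ratio, avg_pct))
```

The average ratio is where the 17.55% figure comes from; the minimum timings give a slightly smaller gap.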