PyCon2010 Atlanta, GA Feb 17-25, 2010

General Notes

  • Largest PyCon ever: 1025 participants, 10-11% women
  • Next year, same place, but limiting to 1500 participants
  • Lots of hiring, e.g., CCP is hiring 200 python programmers

General References

Notes

From Mark Clark

Tutorials

Faster Python Programs through Optimization - Mike Muller

  • more of an academic approach, tutorial was apparently from a school (Python Academy)
  • handed out thick stack of notes covering all the material plus a CD

How fast is fast enough?

"Premature optimization is the root of all evil." - C. A. R. Hoare, e.g. picking a programming language first because it needs to be "fast"

Guidelines:

  1. make sure it is too slow, optimization has a price
  2. do not optimize as you go along
  3. select use cases wisely
  4. check the architecture
  5. check for bugs
  6. if really too slow, find bottlenecks (use module profile)
  7. use unit tests to prevent optimizing bugs slipping into the code
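
Guideline 6 can be sketched with the standard library. The example below uses cProfile, the C-accelerated sibling of the profile module, with the same interface. The functions slow_sum and main are made up for illustration and are not from the tutorial.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical hot spot: sums squares with a Python-level loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def main():
    return [slow_sum(20000) for _ in range(20)]


# Run main() under the profiler and collect per-function statistics.
profiler = cProfile.Profile()
profiler.runcall(main)

# Sort by cumulative time and show only the five most expensive entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report lists slow_sum near the top, which is the cue to look there first and leave the rest of the program alone.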

Usual suspects for tardiness: network connection, db access, system functions

Books with efficient algorithms: Python Cookbook, Python in a Nutshell

Highlights:

  • Measuring in Stones - test.pystone allows benchmarking of hardware and implementation (nice use of decorators)
  • Common anti-patterns - string concatenation and list/generator comprehension (we don't use generator comprehension enough I realize!)
  • Selecting the right data structure, e.g.:
    • list vs set
    • list vs deque
    • dict vs defaultdict
  • Caching
    • experience with this in first version of DSS scoring, but nice use of decorators in examples
    • show how to do both deterministic and non-deterministic caching (non-deterministic built into Django)
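
A rough sketch of the highlights above, not taken from the tutorial handouts. The memoize decorator and the sample data are illustrative assumptions, and "deterministic" is read here as "the result depends only on the arguments".

```python
import functools
from collections import defaultdict, deque

# Anti-pattern: repeated += builds a new string each time; join builds once.
words = ["spam"] * 1000
joined = "".join(words)

# A generator expression feeds sum() one item at a time, no temporary list.
total_length = sum(len(w) for w in words)

# list vs set: membership is O(n) on a list, O(1) on average on a set.
haystack = set(range(100000))
found = 99999 in haystack

# list vs deque: popping from the left is O(n) on a list, O(1) on a deque.
pending = deque([1, 2, 3])
first = pending.popleft()

# dict vs defaultdict: no explicit "is the key there yet?" check needed.
by_initial = defaultdict(list)
for name in ["ann", "bob", "amy"]:
    by_initial[name[0]].append(name)


def memoize(func):
    """Deterministic cache: remember results keyed by positional arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


@memoize
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))
```

A non-deterministic (time-limited) cache would additionally store a timestamp with each entry and recompute once the entry is too old.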

Mastering Python 3 I/O - David Beazley

I have a copy of all the overheads for those interested.

2to3 will not fix I/O changes because the changes involve semantics as well as syntax

Python 3 is the first non-backward-compatible version of python

The biggest change for Python 3 is I/O

  • str vs unicode
  • print is now a function (with non-% (C-like) formatting)
  • open() and file methods changed
  • standard library modules changed
  • str vs bytearray
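
A small Python 3 sketch of the text-versus-bytes split behind these changes. The file name and sample string are arbitrary choices for illustration.

```python
import io
import os
import tempfile

text = "caf\u00e9"               # str: a sequence of Unicode code points
data = text.encode("utf-8")      # bytes: the encoded form
assert len(text) == 4 and len(data) == 5   # the accented e takes two bytes
assert data.decode("utf-8") == text

buf = bytearray(data)            # bytearray: a mutable sequence of bytes
buf[0:1] = b"C"

# print() is a function with keyword arguments; str.format replaces %.
out = io.StringIO()
print("{0} has {1} characters".format(text, len(text)), file=out)

# Text mode encodes and decodes for you; binary mode hands back raw bytes.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "rb") as f:
    raw = f.read()
with open(path, "r", encoding="utf-8") as f:
    decoded = f.read()
```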

Django in Depth - James Bennett

  • This was too advanced for me, I was looking for Django 102, but this is all about customizing Django, e.g., what and how to redefine methods for special needs, getting to jQuery modules, handling non-relational databases, etc
  • Some description of major release Django 1.2

Creating Rich Client Applications using Dabo - Ed Leafe & Paul McNett

  • Dabo is a wrapper framework for wxPython
  • makes the calls to GUI objects more consistent
  • heavy use of attributes over methods
  • auto-connections to databases with a poor-man's version of an ORM
  • an 80% to your finished product GUI editor
  • uses a three-tier architecture
  • seems like dabo to wxPython is a translator, not a compiler, i.e., there are too many caveats to allow one to remain ignorant of the underlying layers
  • only 12 people attended including camp followers

Keynote and Invited Talks

Luminaries

Q&A with Guido using Twitter

Seemed a little lame to me (a non-Twitterer): lots of garbage/nonsense questions.

Guido spoke on Haskell - "admires from a distance". "There is a thick book [probably *Real World Haskell*] trying to prove you can do non-functional stuff in Haskell, so it produces complicated code to do things that would be easy in languages like python." Python is not a functional language.

Python 3 should have started 5 years ago when the community was smaller.

Django or TurboGears? Definitely Django (!)

Only dev tool he uses is an editor.

Language per se does not need more features, but better, more 3rd party packages (though I dream of a merger with functional languages it will probably not be python x, but another language)

State of IronPython - Dino E. Viehland (Microsoft)

IronPython is an implementation of python for the .NET Framework.

  1. Year in Review
    • 2.6
    • ctypes
    • sys._getframe
    • sys.settrace/pdb
    • New interpreter
  2. IronPython 2.6.1
    • improved import time
    • other performance tweaks
    • bug fixes
  3. .NET 4.0
    • C#/VB.NET can call IronPython code directly
  4. IronPython Tools for Visual Studio Prototype
  5. IronPython 2.7 by end of year
    • starting on Python 3.0
    • more Visual Studio
    • Python 3.0 by end of 2011 in IronPython 3.2

demo of some kind of IDE in Visual Studio

State of PyPy - Maciej Fijalkowski

PyPy is python written in python.

  • most of work on speed (JIT)
  • comparing with CPython, faster on most benchmarks (!)
  • JIT can consume memory
  • can recompile too many times
  • garbage collection pretty good
  • run-time algorithms are not as polished as CPython yet
  • next release next month (first release with JIT)
  • need sponsors

Cadence, Quality, and Design - Mark Shuttleworth (paying astronaut - ubuntu benefactor)

  1. importance of cadence for releases
    • releases invigorate the community
    • generates feedback
    • 3- or 6-month cycles are good for non-web projects because of yearly events
    • need automated testing
    • provides periods of stability for users
    • month-scale minor releases, year-scale major releases
    • coordination with other packages changes easier (communication)
  2. quality issues in open source software
    • people are interested and desire quality if it is valued by the organization
    • a complete suite of tests helps in distributed development and integration; it also makes everyone equal participants
    • code reviews
      • introduce new people
      • paired programming easier with tools like gobi(?)
    • have an assigned reviewer who is happy to help if you follow the rules
    • automatic crash reporting
  3. need to make room for design professionals, it is a different discipline
    • see larger issues that implementers will not recognize
    • has team that is aware of larger issues (a la our CCC?)

State of Jython - Frank Wierzbicki

  • 2.5.1 now, 2.5.2 by end of summer, then start on 3.x
  • why do you care? lots of Java out there, can use jython
  • what works: django, pylons, sqlalchemy, distribute, setuptools, etc
  • (compiles into byte code so python -> gwt is not likely)

Relentlessly Pursuing Opportunities With Python, or why the AIs will spare us all! - Antonio Rodriguez

  1. for a start-up company, everybody should program, therefore python is a good choice
    • every machine can run full system
    • everyone has check out
    • everyone can change
    • everyone can commit
    • everyone can push to production
  2. anyone can work on some small/easier part
  3. template languages are the introduction to programming
  4. must keep code very modular
  5. rejects business vs tech, e.g., HP started off doing everything
  6. "the emaciated startup"
  7. if everyone commits, then everyone reviews everything
  8. python is a high dynamic range language (fish eye photo), i.e., one can be productive knowing very little
  9. frameworks are like kissing a pig
  10. ewok languages, e.g., erlang or haskell: cute, but hiring, libraries, and reading your earlier code are real problems
  11. needed improvements
    • standard library - needs more stuff, i.e., web stuff, because the web is the new OS; keeping the web open so everyone can modify it is good (facebook, for example, is bad)
    • distribution: .deb/.rpm, distutils, setuptools, easy_install, eggs, PyPI, distribute (not CLASSPATH hell, e.g., jar stuff); start using distribute & pip
    • change initial python WEB page
  12. the AI thing
    • Recommended book Daemon - Daniel Suarez (AI view of the future)
    • Maybe if everyone contributes to the new world, then the AI world will like us

Scheduled Talks

Leafy Chat, DjangoDose, Hurricane, and PyCon, Lessons Learned with the Real-Time Web and Python - Alex Gaynor

http://us.pycon.org/2010/conference/schedule/event/10/

  • defined ajax
  • comet: handling real-time web, prevent polling
  • hard to do because http is stateless
  • comet makes the browser somewhat more like socket code

(most of talk covers server side)

A series of experimental programs for achieving non-polling server and multiple client interaction:

  1. LeafyChat
    • built in 48 hours, use twisted, orbited
    • orbited port and django port to browser
    • orbited proxied to twisted
    • passed json packets around a la messages
    • UI lib on client written jQuery that translates the packets
    • 2nd place in some contest
      • sort of works
      • does not scale
      • new functionality brings down everything
      • (not a real production system)
  2. DjangoDose:
    • built in a week uses twisted, orbited, Django and StompMQ
    • live feed of everything on twitter related to Django
    • initial and feed channels
    • same style json packets
    • works well when all clients are receiving the same channel, but not perfect
      • new user gets initial data several times
      • no isolation between comet and twitter connections (we all fall down)
  3. Hurricane
    • attempt at a framework
    • basic idea: producers and consumers
    • one producer, one consumer
    • technologies
      • long polling with jQuery
      • tornado server
      • producer/consumer run concurrently
      • queue abstraction over multi-processing
    • worked OK
      • entire application state was in memory, not good for states
      • producer/consumer in different processes, but not isolated
      • couldn't get the abstractions right for comet
      • which user is which?
      • which messages go to who?
  4. This year do the same thing as DjangoCon
    • Solution Redis
    • key-value store
    • real data structures list sets hashes ...
    • better isolation
    • did not reinvent orbited this time
      • buckets for data (ring buffer)
      • django, twisted, orbited

  • cannot have "generator based asynchronous programming rocks"
  • just learned that orbited is semi-defunct since working on a replacement
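
The "buckets for data (ring buffer)" idea from the last iteration can be sketched in plain Python. This is an in-memory stand-in, not the speaker's Redis-backed code. The class name, channel name, and message shape are assumptions.

```python
from collections import defaultdict, deque


class ChannelBuckets:
    """Keep only the most recent messages per channel (a ring buffer each)."""

    def __init__(self, size=3):
        self._buckets = defaultdict(lambda: deque(maxlen=size))

    def publish(self, channel, message):
        # Appending to a full deque silently drops the oldest entry.
        self._buckets[channel].append(message)

    def recent(self, channel):
        # A newly connected client would be sent this snapshot exactly once.
        return list(self._buckets[channel])


buckets = ChannelBuckets(size=3)
for n in range(5):
    buckets.publish("django", {"id": n, "text": "tweet %d" % n})

recent_ids = [m["id"] for m in buckets.recent("django")]
print(recent_ids)
```

Keeping the bucket outside the comet process (for example in a key-value store) is what gives the isolation that the earlier versions lacked.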

Python in the Browser - Jimmy Schementi (microsoft)

http://us.pycon.org/2010/conference/schedule/event/14/

  • requires python plug-in
  • depends on Silverlight by microsoft, which is a small .net framework
  • html contained gestalt.ironpython.net/dlr-latest.js
  • did hello world
  • created a console so had an interpreter running in the browser, e.g.,
    • s.innerHTML = "ouch!"
    • dir(document.message)
    • event handlers
  • can scope your control (what python script applies to) using class tags
  • html -> dlr.js -> dlr.xap -> text/python
  • performance note
    • 1st request -> 1.5 MB
    • nth request -> 8 KB
  • generates C# code
  • "cooler demos"
  • has majority of standard libraries, e.g., unittests
  • zip file holds libraries
  • supports "out of browser" applications, menu option allows installing on local machine (now python code runs on machine instead of browser)
  • extended python with C# for computationally intense stuff (fractal)
  • webcam example failed
  • debugging with pdb in html

Creating RESTful Web Services with restish - Grig Gheorghiu

http://us.pycon.org/2010/conference/schedule/event/21/

  • Talk basically taken from the book RESTful Web Services
  • Also see Nottingham's REST tutorials
  • web applications are consumed by applications, not humans or browsers
  • not a protocol, an architectural style
  • examples: rest, rpc style (soap, xml-rpc), hybrid rest-rpc
  • rest is based on get, post, put, delete; args go into URIs or http payloads
    • soap example looked verbose and opaque
    • xml-rpc looked good but did not take advantage of the http protocol
  • resource-oriented architecture
    • resource - anything important enough to be referenced a la object in OO
    • representation - concrete data is current state of resource
  • resources are addressable, i.e., linked, bookmarked, printed, sent in email, cached
  • openness vs opacity: rest vs rpc
  • state is only on the client side
  • stateless => visibility, reliability, scalability
  • uniform interface a la crud
    • get/head have no side effects, therefore safe
    • get/head/put/delete are idempotent
  • also see Joe Gregorio's article How to create a REST Protocol by a series of questions
  • uris should be nouns NOT VERBS!
  • gave an example using restish, which is a framework in python: specific arguments and decorators
  • post vs put - whether the client or server "is in charge of defining the id"
  • testing
    • WebTest for unit testing
    • twill for functional (html) testing
  • deployment
    • run web application properly, use grizzled.os
    • rotatelogs, scribe
    • automatic deployment fabric
    • nginx
  • transactions: turn verbs into noun resources

Python 3: The Next Generation - Wesley J. Chun

http://us.pycon.org/2010/conference/schedule/event/29/

  • note that no python 3 talks presented here
  • some use of 3, but most is still 2 (in fact 2.4 or 2.5)
  • 3 is backward incompatible
  • no rush, 2 will be around a long time yet (at least 2 years)
  • migration tools 2to3 and python 2.6+
  • corrects age-old regrets and warts
  • 3 is first release that purposely breaks backward compatibility
  • big differences are:
    • print and exec are now functions
    • strings to unicode
    • think text vs data, not unicode vs ascii
    • updated exceptions
    • true division
      • 1/2 = .5
      • 1//2 = 0
  • single class type
  • exceptions
    • multiple exceptions, added 'as' keyword
    • now only one way to throw (raise)
    • exceptions called as function
  • single integer type
  • new literal types (more consistent), e.g., number bases
  • new iterables are memory-conservative
  • tuples will have methods a la lists and dictionaries
  • better set support
  • new keywords: as with nonlocal True False
  • lots of porting/migration guides
  • use -3 command-line option to find incompatibilities in current code
  • 2.6 is a transition version, very stable
  • 2.x and 3.x being developed in parallel
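
Several of the differences listed above, shown as a short Python 3 script. The variable names are arbitrary; none of this is from the speaker's slides.

```python
import io

# print is a function: output target and separator are keyword arguments.
out = io.StringIO()
print("a", "b", sep="-", file=out)

# True division always yields a float; // is floor division.
half = 1 / 2
floor = 1 // 2

# Text (str) and data (bytes) are distinct types.
text = "abc"
data = text.encode("ascii")

# One way to raise, and "as" binds the caught exception.
try:
    raise ValueError("bad value")
except (ValueError, TypeError) as err:
    message = str(err)

# Consistent literals for number bases: octal, hex, binary.
bases = (0o17, 0x1F, 0b101)

# Memory-conservative iterables: map, range, and zip are lazy.
lazy = map(str, range(3))


def counter():
    count = 0

    def bump():
        nonlocal count          # new keyword: rebind a name in the enclosing scope
        count += 1
        return count

    return bump


bump = counter()
bump()

# Tuples have methods such as index() and count().
position = (1, 2, 2).index(2)
```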

Maximize your program's laziness - David Q. Mertz

http://us.pycon.org/2010/conference/schedule/event/32/

  • generators and iterators allow laziness by not creating data structures until they are needed
  • examples from haskell and lisp:
    • haskell example dealt with infinite lists
    • scheme example uses the same example with delay commands, which mark laziness and return a promise; a force is required to get the result
  • python's iterator and generators deal with sequential data, i.e., sequence-like
  • module itertools provides tools that handle lazy functions, e.g., imap, takewhile
  • things that deserve laziness:
    • expensive computations
    • large data sets
    • time-consuming background operations
        • db queries
        • retrieve network resources
        • waiting for external events
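
A minimal sketch of the infinite-list idea in Python 3, where the built-in map is already lazy (itertools.imap was the Python 2 spelling). The function names are invented for the example.

```python
import itertools


def naturals(start=0):
    """An infinite sequence; nothing is computed until a value is requested."""
    n = start
    while True:
        yield n
        n += 1


computed = []


def expensive_square(n):
    computed.append(n)          # record which inputs were actually evaluated
    return n * n


squares = map(expensive_square, naturals())
small = itertools.takewhile(lambda sq: sq < 50, squares)

result = list(small)
print(result)
print(computed)
```

Only nine inputs (0 through 8) are ever squared: takewhile stops as soon as it sees 64, so the infinite source is never exhausted.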

The Ring of Python - Holger Krekel

http://us.pycon.org/2010/conference/schedule/event/39/

python interpreters

  • CPython is the default, most-widely used implementation of the Python programming language. It is written in C
  • Python 3
  • Stackless Python - massive concurrency, e.g., eve game
  • Psyco 100x faster
  • Cython simplifies the writing of C extension modules for CPython
  • Unladen Swallow: optimization branch of CPython, intended to be fully compatible and significantly faster; uses a JIT and http://en.wikipedia.org/wiki/LLVM
  • Jython: implementation written in Java, uses Java Library instead of the python standard library, compiles into Java bytecode
  • IronPython implementation targeting the .NET Framework and Mono, can be used for client-side browser scripting with Silverlight
  • and more

  • Who is evil? microsoft, apple, google? - power corrupts therefore application control and/or data control are evil.
  • Must see Jonathan Zittrain The future of the internet and how to stop it
  • How to distribute and keep software free? private clouds?
  • Hybrid development - work on all systems, only python seems likely
  • Need testing that guarantees its availability
  • packaging and installation
  • execnet

First question was what did he use for presentation, answer prezi

Using Django in Non-Standard Ways - Eric Florenzano

http://us.pycon.org/2010/conference/schedule/event/54/

Why not django? There are good reasons, and not so good reasons for using Django.

non-standard: choosing alternatives to what is provided, or using django in other places

Speaker's basic theme: it is easier than expected to substitute pieces of Django or to use just some of them.

Choosing alternatives to that provided

  1. Jinja2
    • alternative template system - provides different trade-offs
    • can get better performance
    • how
      • mirrored django layout exactly, substituted Jinja2
      • modified imports
      • different load
      • modified render_to_response
  2. not use django.contrib.auth
    • example is facebook app
    • user id is a given, user is known
    • 45 min to create app
    • 1 hour to convert apps
    • resulted in cleaner code, still used django stack
  3. not using the ORM
    • if legacy db, unsupported (non-relational), no db
    • choices: drop django, ORM with upkeep, just use the current service
    • app tree looks familiar but models.py was empty
    • do lose admin

Using django in other places:

  1. use forms in pylons
    • tried other approaches
    • steps
      1. turn off internationalization
      2. new Form base class to coerce WebOb,
      3. new genshi wrapper
    • modified settings.configure()
    • 60 lines of code
  2. use ORM as standalone
  3. use WSGI in Django, e.g., scale images, merge js/css, aggregates python, profiling data[?] across all requests
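
The WSGI item can be sketched with the standard library alone. The middleware below wraps any WSGI application (a Django project exposes one) and collects per-request timings. hello_app and TimingMiddleware are hypothetical names, not code from the talk.

```python
import time
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Stand-in for the wrapped application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


class TimingMiddleware:
    """Wrap a WSGI app and record how long each call into it took."""

    def __init__(self, app):
        self.app = app
        self.timings = []          # (path, seconds) across all requests

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        try:
            return self.app(environ, start_response)
        finally:
            # Measures the call only, not the later iteration of the body.
            elapsed = time.perf_counter() - started
            self.timings.append((environ.get("PATH_INFO", ""), elapsed))


app = TimingMiddleware(hello_app)

# Drive one request by hand instead of starting a server.
environ = {}
setup_testing_defaults(environ)
status_seen = []
body = app(environ, lambda status, headers: status_seen.append(status))
```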

Decorators from Basics to Class Decorators to Decorator Libraries - Charles Merriam

http://us.pycon.org/2010/conference/schedule/event/67/

Consider decorators when you have two concepts and one affects the other. Note a good decorator should be easy to understand and read. When you describe a function, watch for "and": function foo does this and does that.

For example, @require_manager decorator has a conditional that always checks permission.

A concrete decorator simply takes a function and returns a function. A class decorator takes a class and returns a class.
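
A sketch of both kinds. require_manager is named in the talk, but the body below, the user dictionary, and the register class decorator are assumptions made for illustration.

```python
import functools


class PermissionDenied(Exception):
    pass


def require_manager(func):
    """Function decorator: takes a function, returns a function."""

    @functools.wraps(func)
    def wrapper(user, *args, **kwargs):
        # The permission check lives here, not in every decorated function.
        if not user.get("is_manager"):
            raise PermissionDenied(user.get("name"))
        return func(user, *args, **kwargs)

    return wrapper


REGISTRY = {}


def register(cls):
    """Class decorator: takes a class, returns a class (here, unchanged)."""
    REGISTRY[cls.__name__] = cls
    return cls


@require_manager
def approve_expense(user, amount):
    return "%s approved %d" % (user["name"], amount)


@register
class ExpenseReport:
    pass


print(approve_expense({"name": "ann", "is_manager": True}, 40))
```

The decorated function now reads as one concept (approve an expense) while the second concept (who may do it) sits in the decorator.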

common uses:
  • security, install check
  • pre/post conditions, login required
  • framework registration, lazy setup
  • locking and atomic, cache, trace/log/stats, contract programming, etc
  • framework & callback registration
  • dictionary transmogrify
  • non-inheritance mix-in madness
  • non-class descriptors

Unladen Swallow: fewer coconuts, faster Python - Collin Winter (google)

http://us.pycon.org/2010/conference/schedule/event/71/

python is one of 3 primary languages at google

Goals
  • 5x performance
  • source compatible with existing python code
  • c extensions
  • ease of migration
  • open source, back into cpython (approved by Guido)

Why is python slow? mostly because everything is an object. ? ?

Though the language allows lots of dynamic things, programs just do not do it, therefore assume the program is less dynamic than the language.

unladen swallow post processes the output of eval including apple's llvm.

llvm works on C output, so compile python to the same representation to allow llvm work on it

Need better benchmarks. Most of the current benchmarks were developed in the 70s and have been simply translated from language to language with no regard for whether they reflect actual use, e.g., fibonacci numbers. We have built 32 new ones so far. We hope it will become the standard benchmark suite for python.

Resultant code does use more memory because of JIT.

post mortem:

llvm is in osx leopard, so they assumed it was solid; it was not, so they improved it (also oprofile). Still hope to get 5x, need more people, inviting python programmers, can even get 10x (!)

I hoped to kill off GIL, but failed, more happening.

invitation to sprint, not complicated, not for compiler gurus.

Diversity as a Dependency - Anna M. Ravenscroft

http://us.pycon.org/2010/conference/schedule/event/77/

co-editor of Python Cookbook

moral, legal, political regular reasons

guilt - in diversity used too much - this is a guilt-free zone

so what is in it for me?

or what is in it for python?

small town study ~500 people
  • most influential people (based on newspapers)
  • those who had most input from outside the town (eg, subscriptions)

big corporation study ~10,000 people
  • network analysis, in groups, structural analysis
  • wider connections are more creative people

science lab (dunbar number)
  • how are unexpected results handled: individuals ignore, groups investigate
  • groups like analogies -> conceptual changes
  • need a diverse set of pools to draw analogies from

universal design: curb cuts (originally for wheelchairs), text-to-speech, voice control

Interactions among individuals with different perspectives, skill sets, needs, and motivations generate innovation and creativity.

Need uncountable skills and actions to make python get better

Problem solving requires different approaches, definitions, goals, ...

If we want python to be the best it can be, then need diversity

diversity is hard
  • differences cause friction
  • communication
  • goals and priorities
  • valuing other perspectives and skill sets

Harnessing diversity
  • education
  • clear goals
  • facilitating communication
  • willing to put out effort

Conclusion: hard, but worth the effort

Distributed Programming with Pyro - Alfreda Deza

http://us.pycon.org/2010/conference/schedule/event/89/

remote objects act exactly like local objects; heavily tested

(I was hoping for something new in this talk, but did not hear anything startling.)

needs a working DNS

one master, one slave is simplest

deadlock must be guarded against

validation and encryption - need ssl server

Actors: What, Why, and How - Donovan Preston

http://us.pycon.org/2010/conference/schedule/event/93/

actors are the primary programming abstraction
  • actors are processes, can change their own state (only), create new actors and get their addresses, send messages to known addresses
  • no blocking for sending messages, can wait for a message type

why this approach?
  • isolation - does not need locking, no race conditions
  • simple control flow - each actor is independent, straight line or a loop
  • message passing (not shared memory) - easy to distribute
  • simplify error handling
    • most exceptions occur while waiting, e.g., timeouts, network errors
    • isolate errors

Implemented using: erlang (functional language), io, Python (with PARLEY, etc)
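
A toy actor in plain Python, using a thread and a queue as the mailbox. This is not PARLEY or any library from the talk. Threads share memory, so the isolation here is by convention only. Class names and message shapes are assumptions.

```python
import queue
import threading


class Actor(threading.Thread):
    """One thread plus one mailbox; state is touched only by its own thread."""

    def __init__(self):
        super().__init__(daemon=True)
        self.mailbox = queue.Queue()

    def send(self, message):
        # Non-blocking for the sender: the message is just queued.
        self.mailbox.put(message)

    def run(self):
        # Simple control flow: loop, wait for a message, handle it.
        while True:
            message = self.mailbox.get()
            if message == "stop":
                break
            self.receive(message)


class Counter(Actor):
    def __init__(self):
        super().__init__()
        self.count = 0             # private state, no lock required

    def receive(self, message):
        kind, payload = message
        if kind == "add":
            self.count += payload
        elif kind == "report":
            payload.put(self.count)   # reply by sending to a known address


counter = Counter()
counter.start()
for n in (1, 2, 3):
    counter.send(("add", n))

reply_box = queue.Queue()
counter.send(("report", reply_box))
total = reply_box.get(timeout=5)
print(total)

counter.send("stop")
counter.join(timeout=5)
```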

Python Metaprogramming - Nicolas Lara

http://us.pycon.org/2010/conference/schedule/event/96/

most common method is exec() or eval()

allows isolation
  • can pass in globals and locals, but still get __builtins__, but can redefine them too

inspect - returns the source code for any module
  • can change program globally without affecting the source code, like changing imports

data model

magic methods for operators, e.g., __add__ or __call__

possible because of duck typing, e.g., HTML generator

decorators "most powerful method"
  • work around the function (timing)
  • mingle with parameters and return value
  • decorators with patterns
  • class based decorators (memoized)

Types can create functions programmatically (types.FunctionType())

To use metaclasses, one needs to understand how python constructs a class
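
A compact Python 3 sketch touching several of the techniques in these notes: exec with its own namespace, replaced builtins, types.FunctionType, magic methods, and a metaclass. All names (Tag, Registered, Plugin, add_ten) are invented for the example.

```python
import types

# exec with an explicit namespace: the code cannot see this module's globals.
namespace = {}
exec("def square(n):\n    return n * n", namespace)
square = namespace["square"]
builtins_added = "__builtins__" in namespace     # exec inserts it for you

# ...but the builtins themselves can be replaced.
restricted = {"__builtins__": {"len": lambda seq: 42}}
exec("answer = len('abc')", restricted)


# types.FunctionType builds a function from a code object and a globals dict.
def _template(x):
    return x + offset            # "offset" is resolved in the function globals


add_ten = types.FunctionType(_template.__code__, {"offset": 10}, "add_ten")


# Magic methods: instances support + and can be called like functions.
class Tag:
    def __init__(self, html):
        self.html = html

    def __add__(self, other):
        return Tag(self.html + other.html)

    def __call__(self, text):
        return Tag(self.html.replace("{}", text))


# A metaclass hooks into class construction itself.
class Registered(type):
    classes = []

    def __new__(mcls, name, bases, attrs):
        cls = super().__new__(mcls, name, bases, attrs)
        mcls.classes.append(name)
        return cls


class Plugin(metaclass=Registered):
    pass


page = Tag("<h1>{}</h1>")("Title") + Tag("<p>{}</p>")("Body")
print(page.html)
```

Replacing __builtins__ this way limits what the executed code sees by default, but it should not be mistaken for a security sandbox.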

Python's Dusty Corners - Jack Diederich

http://us.pycon.org/2010/conference/schedule/event/106/

python aims not to surprise by using object protocols: comparisons, attributes, lookup; python 3 makes the language smaller; contextlib; culture; short!

Tests and testability - Ned Batchelder

http://us.pycon.org/2010/conference/schedule/event/114/

why is it hard
  • different needs between real use and tests
  • many internal interfaces
  • baby steps
  • decide what your code does

Testable code implies more testing, fewer bugs, better design

Better tests are convenient, fast, unambiguous, and repeatable

Gave examples of unpeeling the sandwich (get to the meat), e.g., test option parsing and file writing (like our sparrow unit tests)

Examples using mocks and dependency injection, there exists a python module mock that provides universal mock objects.

Example using a class pdfReport which inherits from pdfWriter, which demonstrated the advantage of composition over inheritance.
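
A sketch of the composition variant with a mock standing in for the writer. It uses unittest.mock, the standard-library descendant of the mock module mentioned above. The method names add_heading, add_line, and save are hypothetical and not from the talk.

```python
from unittest import mock


class PdfReport:
    """Composition: the report *has* a writer instead of *being* one."""

    def __init__(self, writer):
        self.writer = writer           # dependency injected by the caller

    def render(self, title, rows):
        self.writer.add_heading(title)
        for row in rows:
            self.writer.add_line(", ".join(str(cell) for cell in row))
        return self.writer.save()


# In a test, a Mock stands in for the real PDF writer: no files, no disk.
fake_writer = mock.Mock()
fake_writer.save.return_value = "report.pdf"

report = PdfReport(fake_writer)
result = report.render("Sales", [(1, "apples"), (2, "pears")])

fake_writer.add_heading.assert_called_once_with("Sales")
assert fake_writer.add_line.call_count == 2
assert result == "report.pdf"
```

With inheritance, the same test would have to stub out methods on the class under test itself, which is harder to do cleanly.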

test-only code, i.e., conditionals that check whether you are testing (sometimes), e.g., use sqlite, turn off caching (needs extra asserts)

beware of implicit dependencies

use fixtures for handling input data

take data from convenient sources

mock out disks and dbs

better harvesting is worth the effort

Mom and apple pie:
  • assertMultiLineEqual
  • nose, py.test
  • invest in building infrastructure
  • 1:1 ratio of test to production code
  • test your tests
  • 70% unit, 20% function, 10% whole system

rapid multi-testing - Holger Krekel

http://us.pycon.org/2010/conference/schedule/event/127/

advantages of py.test
  • no boilerplate
  • informative failure report
  • multi-cpu
  • conditional skipping (decorators)
  • custom marking
  • fixtures - setting up an environment
    • funcargs - factory pattern
    • testresources - temp directories, stdout-capture
  • and much, much more

Modern version control: Mercurial internals - Dirkjan Ochtman

http://us.pycon.org/2010/conference/schedule/event/132/

DAG solution to darcs Patch Theory problems; heads are nodes without outbound edges

revlogs provide speed by using the time/space tradeoff

Extensions: just add a python module or package, e.g., color differences, git-like stuff; use: uisetup(), reposetup(), wrapcommand(), cmdtable (hgdiff?)

Q&A

  • is bazaar faster (also git)? we like our data format because it is light on seeks, so some things are faster, some slower
  • how much python? almost all except for 3 or 4 functions

Hg and Git: Can't we all just get along? - Scott Chacon (GitHub)

http://us.pycon.org/2010/conference/schedule/event/137/

(kind of felt like a Marc Antony speech to me)

why
  • lots of python projects use git
  • linux kernel needed a SCS, immediately two were born: hg and git (same month)

similarities (all good)
  • fully local repositories
  • directed graph histories, i.e., merges are easier
  • SHA1 content checksumming system
  • blinding speed - bzr, l.x, darcs, or arch are slow
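
As an illustration of content checksumming, this is how git derives the id of a file's contents (a blob). Mercurial's node ids are also SHA-1 based but are computed over different input, so the function below applies to git only. The function name is made up.

```python
import hashlib


def git_blob_id(content):
    """SHA-1 over a "blob <size>" header, a NUL byte, then the raw bytes."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
print(empty_id)
```

Identical content always hashes to the same id, which is why both systems can detect corruption and share unchanged files between revisions.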

know one, know the other: incredibly similar even in commands

"our enemy is SVN"

differences
  • revlog vs associative format
  • data normalization
  • duplicated data
  • branches
  • patch management (big plus for git)

hg-git plugin
  • allows a local hg repository to work with a git server
  • can borrow commands from each other
  • bidirectionality
  • salt lake city merges

From Joe Masters

Day 1 (Fri. 19 Feb)

Introductions
    • Van Lindberg
    • Steve Holden, Chairman Python Software Foundation
      • Organization
      • Diversity
      • Public Face of Foundation
      • Broaden Python popularity
    • Guido van Rossum
    • Questions via twitter
      • most were unanswered
      • need more female involvement
      • community involvement from the start after CWI release
      • '93-'96 C-API refactoring
      • Python 2.7 will likely be the last 2.x release
        • 2.7 will be summer time frame
        • 3rd party library porting to Py3K needed
        • python packaging is hateable, platforms are always a major pain
        • dictionary comprehension: mashup of dictionary and list comprehension
    • comprehensions are complicated to implement
    • why isn't android more python friendly
      • scripting env. just OK for your own phone
      • that will change in the future
    • opinion on Ruby?
      • syntax freedom gives me the creeps
      • callback structure is weird
      • I like for loops and iterators better
    • 100 years from now programming languages don't exist
      • too much attention is placed on strings of ascii characters
      • something more organic, dynamic will emerge
      • maybe from functional languages. difficult to predict.
      • I'm incredibly lucky Python went anywhere at all
    • hundreds of languages come about each year and most are forgotten
    • Unladen swallow just gave python a new lease (speed improvements)
    • Peoplefinder to help Haiti
    • Teaching Python at gradeschool level?
      • Can be successful for gifted children (emphasis his)
    • Better at middle and high school level
    • GIL? If you hate the GIL, start using Jython
      • most programs run just fine without worrying about the GIL
    • Merge CPython and stackless? No.
    • What text editor do you use? emacs. also vi. not very good at either.
    • Which OS do you use? Linux. Mac OS.
    • Looking at functional language
      • don't try to write a functional language in python
      • I admire it from a distance
      • Not sure how to make it useful
    • Neglected areas of Python?
      • Standard library is unevenly attended to.
    • 3 wishes for python? Everyone starts porting to Py3k x 3.
    • What dev env. do you use besides an editor?
      • Nothing, well actually a code review tool on web codereview.appspot.com
    • Is batteries included vs. sold separately better?
      • Included, but once stuck in standard library, you are a slave to python release cycle.

Python for Large Astronomical Data Reduction and Analysis Systems
(Francesco Pierfederici, CfA)

  • Python in "Big" Astronomy
  • Simulations are important
  • LSST: ready in 5-6 years in Chile (8.4m fully automatic optical)
    • huge field of view
    • 3.2 billion pixels, 64-bit
    • 70cm square CCD array
    • terabytes of data each night
    • petabytes to hundreds of petabytes image storage
    • approaching data volumes of high energy physics experiment
    • transient detection
    • 100k events / night for 10 years
    • <60 sec. latency for alerts
    • simulations: dome, telescope, camera, obs. scheduler, weather DB
      • also unforeseen event scheduler (?)
    • nearly everything in Python
    • used SimPy for simulations
    • used ctypes for wrapping: allows you to open a binary library and call functions in that library from Python
      • also SWIG
    • in middleware layer, distributed python allows 2000-3000 core computing
      • in pure python
    • use of MPI (example)
    • open source, open data (catalogs, images)
    • questions:
      • I asked about visualization and file formats: neither have been addressed
        • both are known issues for the future
      • how to distribute data? tap in to high bandwidth mining pipelines in Chile
      • MPI strategies? atomic piece of data on each machine. heavy data parallel.
    • processing from beginning to end for each chunk of data
  • GMT (Giant Magellan Telescope): 10 years, near LSST in Chile
    • 7 LSST sized mirrors together

VisTrails: A Python-Based Scientific Workflow and Provenance System
(David Koop, University of Utah and VisTrails, Inc.)

  • uses VTK
  • Visual workflow programming
  • leverage existing pipeline & workflows
  • logging execution of pipeline (hash of input data). history.
    • done with a version tree
  • open source
  • PyQt and Qt
  • LANL: cosmological visualization
  • formats?
  • matplotlib
  • gui pipeline constructor

Using Python to Create Robotic Simulations for Planetary Exploration
(Dr. Jonathan M Cameron, JPL)

  • works in Mobility and Robotics Systems Division
  • includes rover, airships, spacecraft
  • vehicle simulations for testing software in terms of physics and also devices
  • run 4 to 10x faster than real time
  • when is sun visible, etc.
  • simulation test environment for new vehicle ideas
  • modular code; mix and match for specific problems
    • need to mix/match often without recompilation
  • agile environment to build simulations quickly
  • use swig for binding
  • much is C++ code
  • get it working, get it right, get it optimized
    • not good in c++ (compiling, makefiles, linking)
  • python connects c++ code
  • use python for regression testing (doctests). haven't switched to pyunit yet.
  • results reporting through django website
  • high fidelity simulation
  • use hdf5
    • in visualization of terrain and zooming in / zooming out
  • use numpy, matplotlib
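
The doctest-based regression testing mentioned above might look roughly like this. The function and its numbers are hypothetical and are not taken from the JPL code; the sketch only shows how an example in a docstring doubles as a test.

```python
import doctest


def kinetic_energy(mass, speed):
    """Return the kinetic energy of a body (hypothetical example function).

    The interactive session below is the regression test:

    >>> kinetic_energy(2.0, 3.0)
    9.0
    """
    return 0.5 * mass * speed ** 2


# Collect the examples from the docstring and run them explicitly.
# module=False tells the finder not to look up the defining module.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(kinetic_energy, "kinetic_energy", module=False,
                        globs={"kinetic_energy": kinetic_energy}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)
```

In a normal module the same check is usually run with doctest.testmod(); the explicit finder and runner are used here only so the sketch works in any context.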

Python 3: The Next Generation
(Wesley Chun, Cyberweb Consulting)

Big Message: 3.x is here. So is 2.x.

  • most people are not using python 3
  • mainly new users encouraged to use the latest
  • most companies are still using 2.4
  • the larger the company the older the version
  • why 3? fix early design flaws
  • 2.x will live for a long time
  • both are developed in parallel
  • 2.7 in june. 3.2 in december
  • migration tools: 2to3
  • 3.x not backwards compatible
  • still recognizable
  • many small changes that break 2.x code
  • python "regrets" and "warts" papers as reference for flaws
  • 1st release to break backwards compatibility
  • print/exec changed to functions
  • strings: bytes/bytearray types
  • updated syntax for exceptions
  • more things are functions; more flexible b/c of extra parameters
  • everything changed to Unicode strings
  • text vs. data instead of Unicode vs. ascii
  • string type changed to byte type
  • classes and types are the same
  • one integer type (no long vs int)
  • 1/2 is 0.5, not 0 now; you always get a float
  • 0o is octal, 0x is hex
  • more saving memory
    • especially in iterators
    • iterators are everywhere
  • new io class type for files
  • tuples get methods for the first time (read-only; tuples are still immutable)
  • new reserved words: as, with, nonlocal, True, False
  • new built-in functions and methods
  • removed <> (use != instead) and `` (use repr instead)
  • use at least 2.6 to ease transition
    • '-3' command line switch to warn against incompatibility
  • python 2 will be here for several more years
  • some OSs have Py3 bundled
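
Several of the changes listed above can be seen in a few lines of Python 3. This is a small sketch with made-up values, not material from the talk.

```python
# Text vs. data: str holds Unicode text, bytes holds raw data.
text = "caf\u00e9"
data = text.encode("utf-8")        # explicit step from text to bytes
round_trip = data.decode("utf-8")  # and back again

# One integer type, and "/" is always true division.
half = 1 / 2       # a float
floored = 1 // 2   # floor division gives the old integer result
big = 2 ** 100     # no separate long type

# Octal literals use the 0o prefix.
octal = 0o17

# Updated exception syntax uses "as".
try:
    int("not a number")
except ValueError as err:
    caught = type(err).__name__

# Tuples have read-only methods but remain immutable.
point = (3, 4, 3)
threes = point.count(3)
where_four = point.index(4)

# print is a function, so it accepts keyword parameters such as sep.
print(text, len(text), len(data), sep=" | ")
```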

Maximize your program's laziness
(Dr. David Mertz, Gnosis Software)

  • haskell and scheme example for program laziness
    • call/generate on demand
  • iterators and generators exist in python
    • generate on demand
    • defer performing an action until needed
    • they are sequence-like
    • generator: more than a continuation, not quite a closure
    • itertools module
  • avoid expensive computations
  • do not hold large amounts of data in memory
  • memoize, weakref, Promise
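
A minimal sketch of two of the ideas above: generating values on demand with a generator and itertools, and avoiding repeated expensive computations by caching results. The function names are hypothetical, and the hand-written caching decorator is only one simple way to do it.

```python
import itertools


def naturals():
    """Infinite sequence 0, 1, 2, ... produced one value at a time."""
    n = 0
    while True:
        yield n
        n += 1


# Nothing is computed here; the generator expression is lazy too.
evens = (n for n in naturals() if n % 2 == 0)

# Work happens only when islice asks for the first five values.
first_evens = list(itertools.islice(evens, 5))


def memoize(func):
    """Cache results by argument so each value is computed only once."""
    cache = {}

    def wrapper(arg):
        if arg not in cache:
            cache[arg] = func(arg)
        return cache[arg]

    return wrapper


calls = []  # records every real (uncached) computation


@memoize
def slow_square(x):
    calls.append(x)
    return x * x


squares = [slow_square(4), slow_square(4), slow_square(5)]
print(first_evens, squares, calls)
```

The infinite generator never holds more than one value in memory at a time, which is the point of the "do not hold large amounts of data in memory" item.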

The Ring of Python
(Holger Krekel, merlinux GmbH)

  • Python interpreters
  • Variants and features
  • speed and competition

  • interpreters:
    • C api implementation (CPython)
      • mainstream python
    • python3
      • cleaned-up python
      • struggling for critical mass breakthrough
    • CPython variants
      • stackless python; massive concurrency
        • 10's of thousands of microthreads
        • EVE mmorpg
          • 95% stackless python
    • Psyco: speeds loops up to 100x (32-bit only)
    • Cython: fast extension modules (derived from Pyrex)
      • so you don't have to write C
    • Unladen Swallow: uses Low Level Virtual Machine (LLVM)
      • JIT compiling
      • intent is to merge back into CPython, probably py3 only
    • Jython
      • glues Java to Python in both directions
      • not victim to Global Interpreter Lock (GIL) for multiple OS threads
    • IronPython
      • Microsoft .NET integration
    • parrot / pynie
    • py. (pypy): true sandboxing, garbage collectors, virtualizing file systems, etc.
    • python tracing (jit)
    • not all these interpreters are competing; there is much collaboration

  • what is the new evil
    • data control (google)
    • application control (apple)
    • jonathan zittrain is an expert on fighting these trends
      • ironically up on google talks
      • book: the future of the internet and how to stop it
    • so how do we organize our own clouds, services, etc. and have tools for it
      • imitate google, apple, amazon but make it free

  • software is not "built" anymore
    • it is more "made available"

  • amazon ec2
  • google app engine
  • distutils 2
  • no python interpreter bridges, except for one called "execnet"

Python in Quantitative Finance
(Wes McKinney, AQR Capital Management)

Day 2 (Sat. 20 Feb)

Open Space -- HPC
(Josh Hemann, Rogue Wave Software)

  • air pollution modeling background
  • connection to U of Colorado
  • numpy, scipy, matplotlib
  • HPC for the layman

Continuous Testing
(Titus, Michigan State)
  • build and *run* your tests all the time
  • must haves: version control, automated build/compile, automated tests

  • simple form of automated tests: script + cron
  • but it'd be nice:
    • to manage configurations across systems
    • error reporting / notification: email, rss, twitter
  • options for continuous integration: buildbot, *hudson*, cruisecontrol
  • hudson is great for 80% of people's needs out of the box (written in Java)
  • new package by speaker called pony-build
  • E.T. phone home, build results from users sent back to you automatically
  • "stop creating, recommending and using crappy software"
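
The "script + cron" form of automated testing mentioned above could be as small as the sketch below. The function names and the stand-in test command are hypothetical; a real setup would run the project's own test suite and send the summary by email, RSS or similar.

```python
import subprocess
import sys


def run_build(command):
    """Run one build/test command; return (passed, combined output)."""
    proc = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return proc.returncode == 0, proc.stdout


def summarize(name, passed):
    """One-line status suitable for a notification."""
    return "%s: %s" % (name, "PASS" if passed else "FAIL")


# Stand-in for the real test command (an assumption for this sketch).
passed, output = run_build([sys.executable, "-c", "print('tests ok')"])
summary = summarize("nightly", passed)
print(summary)
```

Scheduling the script from cron gives the "all the time" part; tools such as buildbot or hudson add the configuration management and reporting that the script lacks.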

Tests and testability
(Ned Batchelder)

  • Why is it hard?
    • different needs
    • many interfaces
    • baby steps (many little things to do)
  • as important as scalability, portability, maintainability, etc.
  • testable code means:
    • more testing
    • fewer bugs
    • better design
  • lifecycle: test, configure, feed, run, review, fix
  • good tests: convenient, fast, unambiguous, repeatable
  • Mock class to mock an expensive call for testing purposes
    • just check the interfaces without the expensive work
  • turn off caching for better repeatability
  • do one thing well (reduce tests to tiny modular units)
  • "good fences make good neighbors"
  • py.test and nose are good packages
  • there should be as much test code as other code

  • TIP: testing in python
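
The idea of mocking an expensive call, noted above, can be sketched with unittest.mock from the current standard library. This is an assumption for illustration: it is not necessarily the Mock class shown in the talk, and the function and query below are hypothetical.

```python
from unittest import mock


def count_rows(fetch, query):
    """Code under test: relies on an expensive fetch callable."""
    return len(fetch(query))


# Replace the expensive call with a mock that returns canned data.
fake_fetch = mock.Mock(return_value=["row1", "row2", "row3"])

row_count = count_rows(fake_fetch, "SELECT * FROM observations")

# Check the interface: the call was made once, with the expected argument,
# without doing any of the expensive work.
fake_fetch.assert_called_once_with("SELECT * FROM observations")
print(row_count)
```

Because the mock always returns the same data, the test is also fast and repeatable, which matches the "good tests" criteria above.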

Day 3 (Sun. 21 Feb)

Lightning talks

  • Use PostgreSQL because you can write stored procedures in Python
  • Python Spring Cleanup
    • report and cleanup bugs
  • Python users group in Argentina
  • Cython in numpy
    • cimport numpy as np
  • PiCloud
    • > import cloud
    • > id = cloud.call(func)
    • > ret = cloud.result(id)
  • Molly: python for mobile smart phones
    • Tim Fernando, Oxford
  • Monkeypatching

Keynotes
  • Jython
  • Unladen Swallow, Collin Winter
    • branch of CPython focused on optimization and performance
    • will be merged into CPython development tree for 3.x line
  • Relentlessly Pursuing Opportunities with Python
    • Antonio, Tabblo/HP
    • crazy idea: anyone can commit anywhere and push to production
      • this includes marketing people, managers, etc.
    • python is an HDR programming language (simple to complex)
    • the web is the new OS
    • .deb -> distutils/setup.py -> easy_install -> .egg -> distribute
    • use distribute / pip!
    • book: Daemon by Daniel Suarez

Customizing your editor for maximum productivity
(Justin Lilly, HUGE)

  • how to be more productive with your editor
    • 1. learn to type
    • 2. use macros
    • 3. use syntax highlighting
  • split screen
  • terminal integration
  • autocompletion
  • eliminate distractions
  • linting (catch coding mistakes as you type them)
  • TextMate

unittest / PyUnit
(Chander Ganesan, Open Technology Group)

  • like JUnit, PHPUnit, etc.
  • group tests together hierarchically
  • run one test suite
  • test fixtures
  • setup/cleanup features [ setUp()/tearDown() ]
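
A minimal sketch of the pieces listed above: a test fixture created in setUp(), cleaned up in tearDown(), with the tests grouped into a suite and run together. The class and test names are hypothetical.

```python
import io
import unittest


class StackTest(unittest.TestCase):
    def setUp(self):
        # Fixture: runs before every test method.
        self.stack = [1, 2]

    def tearDown(self):
        # Cleanup: runs after every test method, even if it failed.
        self.stack = None

    def test_push(self):
        self.stack.append(3)
        self.assertEqual(self.stack, [1, 2, 3])

    def test_pop(self):
        # Sees a fresh [1, 2], unaffected by test_push.
        self.assertEqual(self.stack.pop(), 2)


# Group the tests into one suite and run it.
suite = unittest.TestSuite()
suite.addTests(unittest.defaultTestLoader.loadTestsFromTestCase(StackTest))

stream = io.StringIO()  # capture the runner's report instead of printing it
result = unittest.TextTestRunner(stream=stream, verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Suites can contain other suites, which is how tests are grouped hierarchically.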

Lightning Talks

  • PyWeek -- create a game in 1 week
  • Please Pirate -- alternative to copyright
  • Self publishing a book
  • Selenium testing library
  • Python in India
Topic revision: r3 - 2010-02-24, MarkClark