Python

Python is a commonly used language for machine learning. We assume that you have a basic understanding of programming in general and of Python in particular. There are various web pages available to help you if you need assistance. You will need to install Python on your computer.

Installing

For the class examples from the Manning Machine Learning in Action book, you should install Python version 2 (2.7 is recommended). Python 3, the latest version, is not completely compatible with our Python 2 example code, which would need changes to run under Python 3. This link can help you decide whether Python 3 is appropriate for your own projects; in general, Python 3 should be used for new projects and for other projects on the Internet, such as those from Kaggle.

Python is installed as the ‘SciPy stack’, which includes Python, NumPy, and SciPy, on Windows, OSX, and Linux. Please contact the organizer (on the Organizers section on group Meetup page) if you have questions about these instructions:

  • We recommend installing Anaconda, a self-contained Python distribution that simplifies the ‘SciPy stack’ installation and use, and that includes the Spyder Integrated Development Environment (IDE). The IDE permits viewing the source code and running it in separate windows at the same time, as well as debugging programs. Anaconda also facilitates switching between Python 2 and Python 3 should you need to do so. Steps to set up Anaconda are:
    • Download Anaconda from the downloads page.
    • Select your platform’s Python 2.7 installation, 64-bit or 32-bit as required (64-bit is almost always compatible; you can install the 32-bit version if that fails to run), then run the installation executable and follow the instructions.
    • Anaconda provides the program conda, which can be used to install the other version of Python and switch between versions if needed.
    • Launch the Spyder program to start the IDE.
      • On Windows, if Spyder does not launch or appears to launch but not display, make sure Spyder is closed and select the ‘Reset Spyder Settings’ Start menu entry, then launch Spyder again.
    • Learn how to use Spyder from a tutorial and the Spyder documentation.
  • A second choice is to install the Python 2 ‘SciPy stack’ for your platform’s command line, which is a longer and more complex process and requires more knowledge of operating system commands and the filesystem. If you already have the command-line Python 3 installed, you may install Python 2 as above and follow these instructions to switch between Python 3 and Python 2:
    • On Windows, there is a Python launcher to switch between Python 2 and 3.
    • On OSX, Python 2.7 is preinstalled. These instructions permit installing the rest of the ‘SciPy stack’.
    • On Linux, versions of Python 2 and 3 are installed; these are instructions to install the rest of the stack and to switch between versions.

Running Examples

Note the following differences between the two environments:

  • In Anaconda:
    • Run the ‘Spyder’ module first, and when the instructions call for loading a Python module, select File->Open and browse to the file, then select the green right-arrow (‘Run file’ or F5 in Windows) to load it.
    • When the instructions call to run ‘module-name.function(…)’, enter the text in the Console window but omit the ‘module-name.’ prefix from ‘function(…)’.
  • Using the command-line Python, change to the directory of the example files, run the Python executable, then to load a module type ‘import module-name’ (without the ‘.py’ file extension) as the first command.
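
As a rough sketch of what loading a module and calling one of its functions looks like in command-line Python, the script below writes a tiny module to a temporary directory, imports it, and calls its function. The names demo_module and greet are hypothetical and are not part of the class examples; the code is written so that it should run under both Python 2.7 and Python 3.

```python
import os
import sys
import tempfile

# Write a tiny, hypothetical module to a temporary directory.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, 'demo_module.py'), 'w') as handle:
    handle.write("def greet(name):\n    return 'hello ' + name\n")

# Make that directory importable. Changing to the directory before
# starting Python has a similar effect at the command line.
sys.path.insert(0, module_dir)

import demo_module  # the file is demo_module.py, but the import omits '.py'

# Functions are called with the module name as a prefix.
greeting = demo_module.greet('world')
print(greeting)
```

In Spyder, running the file directly places greet in the Console's namespace, which is why the prefix is dropped there.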

Testing

To test your installation, bring up Spyder if you installed Anaconda, or bring up a terminal window if you installed Python to run at the command line. Then run the following (type the text that follows the prompt):

  • In Spyder, type the following in the ‘Console 1/A’ window:
    In [1]: print "test"
    test
  • At the command line, type:
    $ python
    >>> print "test"
    test

You should have gotten the response ‘test’ in either case. If you received an error, either Python was not installed correctly, or you have Python 3 installed instead. If reviewing the installation instructions or reinstalling Python does not show how to correct the problem, please contact the organizer (on the Organizers section on group Meetup page).
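
If the test fails, it can help to find out which interpreter is actually running. The following is a minimal sketch, not part of the class materials, that reports the Python version and whether NumPy and SciPy can be imported. The helper names describe_python and stack_status are made up for this illustration, and the script is written to run under either Python 2.7 or Python 3.

```python
import sys


def describe_python():
    """Return the (major, minor) version of the running interpreter."""
    return sys.version_info[0], sys.version_info[1]


def stack_status(module_names=('numpy', 'scipy')):
    """Map each module name to True if it can be imported, else False."""
    status = {}
    for name in module_names:
        try:
            __import__(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


major, minor = describe_python()
print('Python %d.%d' % (major, minor))

status = stack_status()
for name in sorted(status):
    print('%s installed: %s' % (name, status[name]))
```

A major version of 3 would explain an error from the test above, because the print statement without parentheses is a syntax error in Python 3.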

A quick introduction to Python

Now that you have Python installed, we can go over a few of the features of the language that we will use. This is not an exhaustive description of Python; for that, we suggest you try “The Quick Python Book” by Vern Ceder. We will go over collection types and control structures, which are found in almost every programming language, and review how Python handles them. Finally, in this section we will review list comprehensions, which may be the most confusing part of getting started with Python.

1 Collection types

Python has a number of ways of storing a collection of items and many modules can be added to create more container types. Below is a short list of the commonly used containers in Python.

1. Lists – lists are an ordered collection of objects in Python. You can have anything in a list: floats, bools, strings, etc. To create a list you use square brackets. The following code illustrates creating a list called jj and adding an integer and a string.

>>> jj=[]
>>> jj.append(1)
>>> jj.append('nice hat')
>>> jj
[1, 'nice hat']

Python does have an array data type, which, similar to arrays in other programming languages, can contain only one type of data. This array type is faster than lists when you are looping. NumPy matrices are related; you can find a description of them here.
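
To make the single-type restriction concrete, here is a small sketch using the standard library's array module. The variable names are chosen only for this illustration, and the code should run under both Python 2.7 and Python 3.

```python
from array import array

# 'i' is the type code for signed integers: every element must be an int.
numbers = array('i', [1, 2, 3])
numbers.append(4)

try:
    numbers.append('nice hat')  # a string does not fit in an integer array
    rejected = False
except TypeError:
    rejected = True

# A plain list accepts the same mix of types without complaint.
mixed = [1, 2, 3]
mixed.append('nice hat')

print(numbers.tolist())
print(rejected)
print(mixed)
```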

2. Dictionaries – dictionaries are an unordered key-value storage container. You can use strings and numbers for the keys. In other languages, a dictionary may be called an associative array. In the following code we create a dictionary and add two items to it.

>>> jj={}
>>> jj['dog']='dalmatian'
>>> jj[1]=42
>>> jj
{1: 42, 'dog': 'dalmatian'}
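
Beyond adding items, a few everyday dictionary operations are sketched below: testing for a key, looking up a key with a fallback value, and deleting an entry. The variable names are only illustrative, and the code should behave the same under Python 2.7 and Python 3.

```python
pets = {}
pets['dog'] = 'dalmatian'
pets[1] = 42

# 'in' tests whether a key is present.
has_dog = 'dog' in pets
has_cat = 'cat' in pets

# get() returns a fallback instead of raising KeyError for a missing key.
cat_breed = pets.get('cat', 'no entry')

# del removes a key and its value.
del pets[1]

print(has_dog)
print(has_cat)
print(cat_breed)
print(pets)
```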

3. Sets — A set is just like a set in mathematics. If you are not familiar with that, it simply means a unique collection of items. You can create a set from a list by entering the following.

>>> a=[1, 2, 2, 2, 4, 5, 5]
>>> sA=set(a)
>>> sA
set([1, 2, 4, 5])

You can then perform mathematical operations on sets, such as union, intersection, and difference. The union is done with the pipe symbol |, the intersection with the & symbol, and the difference with the - symbol.

>>> sB=set([4, 5, 6, 7])
>>> sB
set([4, 5, 6, 7])
>>> sA-sB
set([1, 2])
>>> sA | sB
set([1, 2, 4, 5, 6, 7])
>>> sA & sB
set([4, 5])

2 Control structures

In Python, indentation matters. Some people complain about this, but it forces you to write clean, readable code. In for loops, while loops, and if statements, you use indentation to tell the interpreter which lines of code belong inside the block. In some other languages, you use braces: { }. By using indentation instead of braces, Python saves a lot of space. Let’s see how to write some common control statements.

1. If – the if statement is quite straightforward. You can use it on one line like so:

>>> if jj < 3: print "it's less than three man"

Or, for multiple lines, you can use an indent to tell the interpreter you have more than one line. You can use this indent with just one line of code if you prefer.

>>> if jj < 3:
...     print "it's less than three man"
...     jj = jj + 1
...

Multiple conditionals, like else if, are written as elif, and the keyword else is used for a default condition.

>>> if jj < 3: jj+=1
... elif jj==3: jj+=0
... else: jj = 0
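
The same if, elif, and else structure can also be written in a script file, where the role of indentation is easier to see. The sketch below adds a while loop, which is not shown elsewhere in this section. The function name classify and the variable names are made up for this illustration, and the code should run under both Python 2.7 and Python 3.

```python
def classify(value):
    # Each indented line belongs to the branch directly above it.
    if value < 3:
        return 'less than three'
    elif value == 3:
        return 'exactly three'
    else:
        return 'more than three'


counter = 0
labels = []
while counter < 5:
    # Both indented lines are inside the while loop.
    labels.append(classify(counter))
    counter = counter + 1

# This line is not indented, so it runs once, after the loop ends.
print(labels)
```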

2. For – a for loop in Python is like the enhanced for loop in Java or the range-based for loop in C++. If you are not familiar with those, it simply means that the for loop goes over every item in a collection. Let’s look at examples with a set and a dictionary.

>>> sB=set([4, 5, 6, 7])
>>> for item in sB:
...     print item
...
4
5
6
7

Now let’s see how we loop over a dictionary.

>>> jj={'dog': 'dalmatian', 1: 45}
>>> for item in jj:
...     print item, jj[item]
...
1 45
dog dalmatian

The items iterated over are actually the dictionary keys.
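
If you want the keys and the values together, one common alternative is the dictionary's items() method, which yields key and value pairs. This is a minimal sketch with illustrative variable names; it should run under both Python 2.7 and Python 3, and the order in which the pairs appear should not be relied upon.

```python
pets = {'dog': 'dalmatian', 1: 45}

# Looping over the dictionary itself yields only the keys.
keys_seen = []
for key in pets:
    keys_seen.append(key)

# items() yields (key, value) pairs, unpacked here into two names.
pairs = []
for key, value in pets.items():
    pairs.append((key, value))
    print('%s -> %s' % (key, value))
```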

3 List comprehensions

I think the most confusing thing for people new to Python is list comprehensions. List comprehensions are an elegant way of generating a list without writing a lot of code. However, the way they work is a little bit backwards. Let’s see one in action. Then we will discuss it.

>>> a=[1, 2, 2, 2, 4, 5, 5]
>>> myList = [item*4 for item in a]
>>> myList
[4, 8, 8, 8, 16, 20, 20]
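
One way to read a list comprehension is to compare it with the equivalent for loop: the expression at the front of the comprehension is what the loop body would append. The sketch below shows both forms side by side, plus an optional if clause that filters items, which goes slightly beyond the example above. The variable names are illustrative, and the code should run under both Python 2.7 and Python 3.

```python
a = [1, 2, 2, 2, 4, 5, 5]

# The long way: build the list with an explicit loop.
times_four_loop = []
for item in a:
    times_four_loop.append(item * 4)

# The comprehension: the expression comes first, then the loop.
times_four = [item * 4 for item in a]

# An optional trailing 'if' keeps only the items that pass the test.
large_only = [item * 4 for item in a if item > 2]

print(times_four)
print(large_only)
```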