Closure¶
def outer():
    x = 1
    def inner():
        print x # 1
    return inner
foo = outer()
foo.func_closure # doctest: +ELLIPSIS
#(<cell at 0x...: int object at 0x...>,)
At first glance, inner should not be able to access x once outer has returned, because x is local to outer and its lifetime ends with the call. It still works because of function closures: inner functions defined in a non-global scope remember what their enclosing namespaces looked like at definition time. This can be seen by looking at the func_closure attribute of our inner function, which contains the variables from the enclosing scopes.
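A minimal sketch of a closure that keeps state between calls. The names make_counter and increment are made up for illustration. The code uses __closure__, which is available in Python 2.6+ and Python 3 (func_closure is the Python 2 spelling), and is written to run on either version.

```python
def make_counter(start=0):
    """Return a function that remembers `count` between calls."""
    # A one-element list, so the inner function can mutate it
    # without needing to rebind the name.
    count = [start]

    def increment():
        count[0] += 1
        return count[0]

    return increment

counter = make_counter()
first = counter()
second = counter()

# The captured variable lives in a closure cell on the inner function.
cell_value = counter.__closure__[0].cell_contents
print(first)
print(second)
print(cell_value)
```

Each call to make_counter creates a fresh cell, so two counters built this way do not share state.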
Variables and Objects: "everything is an object"¶
Function¶
Functions are first-class objects in Python.
issubclass(int, object) # all objects in Python inherit from a common baseclass
#True
def foo():
    pass
foo.__class__
issubclass(foo.__class__, object)
Variables¶
# some_guy is a name and 'Fred' is a string object.
# some_guy is bound to the string object containing 'Fred'.
some_guy = 'Fred'
# first_names is bound to an empty list object.
first_names = []
first_names.append(some_guy) # So far only two objects exist: the string and the list.
# Name-to-name assignment does not create a new object.
# another_list_of_names is just bound to the same list as first_names.
another_list_of_names = first_names
another_list_of_names.append('George')
some_guy = 'Bill'
print (some_guy, first_names, another_list_of_names)
Passing arguments¶
#Mutable data type
def foo(bar):
    bar.append(42)
    print(bar) # [42]
answer_list = []
foo(answer_list)
print(answer_list)
def try_to_change_list_reference(the_list):
    print 'got', the_list
    the_list = ['and', 'we', 'can', 'not', 'lie']
    print 'set to', the_list
outer_list = ['we', 'like', 'proper', 'English']
print 'before, outer_list =', outer_list
try_to_change_list_reference(outer_list)
print 'after, outer_list =', outer_list
Immutable data type: At most it can create a bar in foo namespace¶
# Rebinding bar inside foo can at most create a local name bar in foo's namespace.
def foo(bar):
    bar = 'new value'
    print (bar) # >> 'new value'
answer_list = 'old value'
foo(answer_list)
print(answer_list) # >> 'old value'
Think of arguments as being passed by assignment rather than by reference or by value. That way it is always clear what is happening, as long as you understand what happens during normal assignment. So, when passing a list to a function/method, the list is assigned to the parameter name. Appending to the list modifies the list itself. Reassigning the parameter inside the function does not change the original list, because it only rebinds the local name:
a = [1, 2, 3]
b = a
b.append(4)
b = ['a', 'b']
print a, b # prints [1, 2, 3, 4] ['a', 'b']
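The same behaviour can be checked with the is operator, which compares object identity. This is a small sketch with made-up function names (append_item, rebind), written to run on Python 2 or 3.

```python
def append_item(items):
    # Mutates the object the caller passed in.
    items.append(4)
    return items

def rebind(items):
    # Rebinds the local name only; the caller's list is untouched.
    items = ['a', 'b']
    return items

original = [1, 2, 3]
same_object = append_item(original) is original   # mutation keeps identity
new_object = rebind(original) is original         # rebinding returns another object
print(original)
print(same_object)
print(new_object)
```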
Passing a variable as an argument and wanting to change it¶
x = 0
def changeval(a):
    a += 1
print x
changeval(x)
print x
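Just like primitive data types in Java, the value of x does not change here. Python passes a reference to the integer object, but integers are immutable, so a += 1 only rebinds the local name a to a new object and leaves x untouched. The solution is to put the value inside a mutable container.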
x = [0]
def changeval(a):
    a[0] += 1
print x[0]
changeval(x)
print x[0]
Data Structure in Python.¶
String¶
Python strings are immutable (i.e. they can't be modified in place). There are a lot of reasons for this. If you need to change individual characters, work with a list for as long as you can and turn it into a string only at the end.
mystring = 'here'
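mystring[0] = "U" # raises TypeError: 'str' object does not support item assignment
# Solution: convert to a list, modify it, then join it back into a string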
s = list(mystring)
s
s[0] = 'w'
s
"".join(s)
Dictionary¶
Since an ordered dictionary remembers its insertion order, it can be used in conjunction with sorting to make a sorted dictionary:
# A regular dictionary doesn't maintain insertion order in Python 2 (it is guaranteed only from Python 3.7 on).
from collections import OrderedDict
d = {'banana': 3, 'apple':4, 'pear': 1, 'orange': 2}
# dictionary sorted by key
OrderedDict(sorted(d.items(), key=lambda t: t[0]))
# dictionary sorted by value
OrderedDict(sorted(d.items(), key=lambda t: t[1]))
# dictionary sorted by length of the key string
OrderedDict(sorted(d.items(), key=lambda t: len(t[0])))
List¶
#append: Appends object at end
x = [1, 2, 3]
x.append([4, 5])
print x
# insert at an index i
x = [1, 2, 3]
x.insert(1,22)
print x
#extend: extends list by appending elements from the iterable
x = [1, 2, 3]
x.extend([4, 5])
print x
Sorting a list¶
x = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
print x.sort() # sort() sorts the list in place and returns None, so this prints None.
print x
x = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
print sorted(x) # sorted() returns a new sorted list and leaves x unchanged.
print x
print x.sort(reverse=True) # prints None again; x is now sorted in descending order.
print x
x = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
print sorted(x, reverse=True)
print x
Sorting using your own comparator¶
Like many other programming languages, Python lets you sort a list with a custom comparator. You define a compare function that implements your ordering logic and pass it to sort() or sorted(). The first example below is a plain comparator-based sort; the second adds some custom logic.
def comparator(x, y):
    return x-y
## Initial Array
array = [8, 2, 9, 0, 1, 2, 5]
array.sort(comparator)
print array
array = [8, 2, 9, 0, 1, 2, 5]
print sorted(array, comparator, reverse = True)
print array
Now with some logic into the function
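## custom sort
def comparator(x, y):
    ## Just a condition for example,
    ## you can add as many as you need.
    ## Note: returning x for odd x does not define a consistent ordering;
    ## it only shows where custom logic would go.
    if x % 2:
        return x
    return x - y
## Initial Array
array = [8, 2, 9, 0, 1, 2, 5]
array.sort(comparator)
print array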
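Passing a comparator positionally to sort() or sorted() only works in Python 2. A portable alternative is functools.cmp_to_key (available in Python 2.7 and 3.2+), which wraps a comparator into a key function. The comparator evens_first below is a made-up example with a consistent ordering: even numbers first, then ascending.

```python
from functools import cmp_to_key

def evens_first(x, y):
    """Hypothetical comparator: even numbers before odd, then ascending."""
    if x % 2 != y % 2:
        return (x % 2) - (y % 2)   # even (0) sorts before odd (1)
    return x - y

array = [8, 2, 9, 0, 1, 2, 5]
result = sorted(array, key=cmp_to_key(evens_first))
print(result)
```

A comparator must return a negative number, zero, or a positive number, and must give opposite answers when its arguments are swapped; otherwise the resulting order is not well defined.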
Sorting List of List¶
Using itemgetter¶
from operator import itemgetter
L=[[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']]
#not in place using sorted:
sorted(L, key=itemgetter(2))
#in place:
L.sort(key=itemgetter(0))
print L
Using lambda¶
l = [[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']]
#In-place
l.sort(key=lambda x: x[2])
print l
#not in place using sorted:
sorted(l, key=lambda x: x[1])
Tuple¶
# Tuple: think of a tuple as a list that you can't change
mytuple = (1,2,3,4)
mytuple[2] = 4 # raises TypeError: 'tuple' object does not support item assignment
#List to tuple
seq =[1,1,2,3,4]
tuple(seq)
#To write a tuple containing a single value you have to include a comma, even though there is only one value:
tup1 = (50,)
Set¶
#Set
s = set()
s = set([1,1,2,3,4])
print s
s = {1,1,2,3,4}
print s
Stack¶
#Stack in Python i.e. List
stack = [3, 4, 5]
stack.append(6)
stack.append(7)
stack
stack.pop()
stack
Queue¶
There are two options: collections.deque and Queue.Queue.
Queue.Queue and collections.deque serve different purposes. Queue.Queue is intended for allowing different threads to communicate using queued messages/data, whereas collections.deque is simply intended as a data structure. That's why Queue.Queue has methods like put_nowait(), get_nowait(), and join(), whereas collections.deque doesn't. Queue.Queue isn't intended to be used as a collection, which is why it lacks the likes of the in operator.
It boils down to this: if you have multiple threads and you want them to be able to communicate without the need for explicit locks, you're looking for Queue.Queue; if you just want a queue or a double-ended queue as a data structure, use collections.deque.
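To illustrate the thread-communication side, here is a small sketch of one producer and one consumer sharing a Queue. The worker function and the None sentinel are conventions chosen for this example, not part of the Queue API. The import is written so that it works with both the Python 2 module name (Queue) and the Python 3 one (queue).

```python
import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

def worker(tasks, results):
    """Hypothetical consumer: doubles each number until it sees None."""
    while True:
        item = tasks.get()        # blocks until an item is available
        if item is None:          # sentinel: no more work
            tasks.task_done()
            break
        results.append(item * 2)
        tasks.task_done()

tasks = queue.Queue()
results = []
thread = threading.Thread(target=worker, args=(tasks, results))
thread.start()
for number in [1, 2, 3]:
    tasks.put(number)
tasks.put(None)
tasks.join()     # wait until every item put on the queue is marked done
thread.join()
print(results)
```

No explicit lock is needed here, because the queue handles the synchronisation between the two threads.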
import collections
d = collections.deque('abcdefg')
len(d) == 0 # is empty
From Left¶
d[0] # Peek first element
d.popleft() # Pop from first
d
d.appendleft('z') # Append from first side
d
From Right¶
d[-1] # Peek last element
d.pop() # Pop from last
d.append('x') # Append from last side
d
Heap¶
Implementing HeapSort i.e. Priority Queue¶
import heapq
def heapsort(iterable):
    'Equivalent to sorted(iterable)'
    h = []
    for value in iterable:
        heapq.heappush(h, value)
    return [heapq.heappop(h) for i in range(len(h))]
# Note: after "import heapq", heappush and heappop must be written as heapq.heappush and heapq.heappop. The bare names raise a NameError.
heap = heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, -1])
print heap
Peek into heap without popping it¶
Heaps are arrays for which heap[k] <= heap[2*k+1] and heap[k] <= heap[2*k+2] for all k, counting elements from zero. For the sake of comparison, non-existing elements are considered to be infinite. The interesting property of a heap is that heap[0] is always its smallest element.
heap[0] # This is the python equivalent to peek the top of heap.
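The heap invariant described above can be checked directly. The helper is_min_heap is a made-up function for illustration and is not part of heapq.

```python
import heapq

def is_min_heap(items):
    """Check items[k] <= items[2*k+1] and items[k] <= items[2*k+2] for all k."""
    n = len(items)
    for k in range(n):
        for child in (2 * k + 1, 2 * k + 2):
            # Missing children count as infinite, so they are skipped.
            if child < n and items[k] > items[child]:
                return False
    return True

data = [9, 4, 7, 1, 8, 2]
heapq.heapify(data)        # rearranges data in place into a min-heap
smallest = data[0]         # peek at the top without popping
print(is_min_heap(data))
print(smallest)
```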
Using a priority queue: heap elements can be tuples. This is useful for assigning comparison values (such as task priorities) alongside the main record being tracked.
h = []
heapq.heappush(h, (5, 'write code'))
heapq.heappush(h, (7, 'release product'))
heapq.heappush(h, (1, 'write spec'))
heapq.heappush(h, (3, 'create tests'))
heapq.heappop(h)
Note: heapq builds a min-heap by default, and tuples are compared by their first element first, so the entry with the lowest priority number is popped first.
Making a MaxHeap¶
By default heapq builds a min-heap. A common hack is to put a minus sign in front of each number, which makes the min-heap behave like a max-heap. However, this only works for values that can be negated; it breaks down for strings and other non-numeric items.
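A minimal sketch of the negation trick for plain numbers: values are negated on the way in and negated again on the way out.

```python
import heapq

numbers = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
max_heap = [-n for n in numbers]      # store negated values
heapq.heapify(max_heap)

largest = -heapq.heappop(max_heap)    # pop the largest original value
heapq.heappush(max_heap, -10)         # push 10 by pushing its negation
new_largest = -max_heap[0]            # peek at the current maximum
print(largest)
print(new_largest)
```

The two methods that follow avoid the negation and therefore also cover values that cannot be negated.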
Method 1:
import functools
@functools.total_ordering
class ReverseCompare(object):
    def __init__(self, obj):
        self.obj = obj
    def __eq__(self, other):
        return isinstance(other, ReverseCompare) and self.obj == other.obj
    def __le__(self, other):
        return isinstance(other, ReverseCompare) and self.obj >= other.obj
    def __str__(self):
        return str(self.obj)
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, self.obj)
import heapq
letters = 'axuebizjmf'
heap = map(ReverseCompare, letters)
heapq.heapify(heap)
print heapq.heappop(heap)
Method 2:
There appear to be some undocumented functions for a max-heap: _heapify_max, _heappushpop_max, _siftdown_max, and _siftup_max. However, there is no heapq._heappop_max.
import math
from cStringIO import StringIO
def show_tree(tree, total_width=36, fill=' '):
    """Pretty-print a tree."""
    output = StringIO()
    last_row = -1
    for i, n in enumerate(tree):
        if i:
            row = int(math.floor(math.log(i+1, 2)))
        else:
            row = 0
        if row != last_row:
            output.write('\n')
        columns = 2**row
        col_width = int(math.floor((total_width * 1.0) / columns))
        output.write(str(n).center(col_width, fill))
        last_row = row
    print output.getvalue()
    print '-' * total_width
    print
    return
import heapq
listForTree = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
heapq.heapify(listForTree) # for a min heap
print type(listForTree)
show_tree(listForTree)
heapq._heapify_max(listForTree) # for a maxheap!!
show_tree(listForTree)
heapq.heappop(listForTree)
show_tree(listForTree)
heapq._heapify_max(listForTree)
heapq.heappop(listForTree)
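It looks like _heapify_max does not work for dynamically added or removed elements: heappush and heappop maintain the min-heap invariant, so they break the max-heap ordering that _heapify_max set up.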
def heapsort(iterable):
    h = []
    for value in iterable:
        heapq.heappush(h, value)
    return [heapq.heappop(h) for i in range(len(h))]
def heapsort2(iterable):
    h = []
    heapq._heapify_max(h)
    for value in iterable:
        heapq.heappush(h, value)
    return [heapq.heappop(h) for i in range(len(h))]
print heapsort([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
print heapsort2([1, 3, 5, 7, 9, 2, 4, 6, 8, 0])
Operators in Python¶
You can find all of these operators in the Python language reference, though you'll have to scroll around a bit to find them all.
i. The ** operator does exponentiation. a ** b is a raised to the b power. The same ** symbol is also used in function argument and calling notations, with a different meaning (passing and receiving arbitrary keyword arguments).
ii. The ^ operator does a binary xor. a ^ b will return a value with only the bits set in a or in b but not both. This one is simple!
iii. The % operator is mostly used to find the modulus of two integers. a % b returns the remainder after dividing a by b. Unlike the modulus operators in some other programming languages (such as C), in Python the result has the same sign as b, rather than the same sign as a. The same operator is also used for the "old" style of string formatting, so a % b can return a string if a is a format string and b is a value (or tuple of values) which can be inserted into a.
iv. The // operator does Python's version of integer division. Python's integer division is not exactly the same as the integer division offered by some other languages (like C), since it rounds towards negative infinity, rather than towards zero. Together with the modulus operator, you can say that a == (a // b) * b + (a % b). In Python 2, floor division is the default behavior when you divide two integers (using the normal division operator /). Since this can be unexpected (especially when you're not picky about what types of numbers you get as arguments to a function), Python 3 has changed to make "true" (floating point) division the norm for division that would be rounded off otherwise, and it will do "floor" division only when explicitly requested. (You can also get the new behavior in Python 2 by putting from __future__ import division at the top of your files. I strongly recommend it!)
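A quick sketch of the four operators, with variable names chosen only for this example. It deliberately avoids / on two integers, since that result differs between Python 2 and Python 3.

```python
power = 2 ** 10                 # exponentiation
xor = 0b1100 ^ 0b1010           # bits set in exactly one operand
remainder = -7 % 3              # takes the sign of the divisor
floored = -7 // 3               # rounds towards negative infinity
formatted = 'Hello, %s!' % 'world'   # old-style string formatting

a, b = -7, 3
identity_holds = a == (a // b) * b + (a % b)

print(power)
print(xor)
print(remainder)
print(floored)
print(formatted)
print(identity_holds)
```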
Generators and Yield¶
The idea of generators is to calculate a series of results one-by-one on demand (on the fly). In the simplest case, a generator can be used as a list, where each element is calculated lazily.
#List comprehension to create list
the_list = [2**x for x in range(5)]
len(the_list)
# The syntax is much like a list comprehension, but pay close attention to the wrapping brackets: () and not [].
# That small change makes it a generator expression.
the_generator = (x+x for x in range(5))
len(the_generator) # raises TypeError: a generator has no len(), because its items are computed on demand.
Example 1:
def search(keyword, filename):
    print('generator started')
    f = open(filename, 'r')
    # Looping through the file line by line
    for line in f:
        if keyword in line:
            print "found it"
            # If keyword found, return it
            yield line
    f.close()
the_generator = search('Seal', 'directory.txt') # Nothing happens
When we call the search function, its body code does not run. The generator function will only return the generator object, acting as a constructor.
print type(search)
print type (the_generator)
To make the newly-created generator calculate something, we need to access it via the iterator protocol, i.e. call its next method.
print next(the_generator)
Now let's request the next match.
print next(the_generator)
Note that "generator started" was not printed this time, because the generator resumed at the last yield statement and ran through the loop until it hit yield again.
print next(the_generator) # if there are no more matches, this raises StopIteration.
Example 2:
def hold_client(name):
    yield 'Hello, %s! You will be connected soon' % name
    yield 'Dear %s, could you please wait a bit.' % name
    yield 'Sorry %s, we will play a nice music for you!' % name
    yield '%s, your call is extremely important to us!' % name
mygenerator = hold_client("Deb") # Nothing happened again
print next(mygenerator)
print next(mygenerator)
Example 3:
def fibonacci(n):
    curr = 1
    prev = 0
    counter = 0
    while counter < n:
        yield curr
        prev, curr = curr, prev + curr
        counter += 1
fib_gen = fibonacci(4)
# to iterate over the whole of the generator
for item in fib_gen:
    print item
for item in fib_gen:
    print item
Note that the second loop prints nothing, because the generator is exhausted: an iterator can be traversed only once, unlike a list, which can be traversed any number of times.
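If the values are needed more than once, one option is to materialise them with list(), and another is to call the generator function again to get a fresh generator. The sketch below repeats the fibonacci logic in a slightly shorter form so that it is self-contained.

```python
def fibonacci(n):
    """Yield the first n Fibonacci numbers."""
    prev, curr = 0, 1
    for _ in range(n):
        yield curr
        prev, curr = curr, prev + curr

fib_gen = fibonacci(4)
first_pass = list(fib_gen)        # consumes the generator
second_pass = list(fib_gen)       # already exhausted, so this is empty
fresh_pass = list(fibonacci(4))   # a new call gives a new generator

print(first_pass)
print(second_pass)
print(fresh_pass)
```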
Writing your own Iterator¶
There are four ways to build an iterative function:
i. create a generator (uses the yield keyword)
ii. use a generator expression (genexp)
iii. create an iterator (defines __iter__ and __next__ (or next in Python 2.x))
iv. create a function that Python can iterate over on its own (defines __getitem__)
i. Generator¶
def uc_gen(text):
    for char in text:
        yield char.upper()
ii. Generator Expression¶
def uc_genexp(text):
    return (char.upper() for char in text)
iii. Iterator Protocol¶
class uc_iter():
    def __init__(self, text):
        self.text = text
        self.index = 0
    def __iter__(self):
        return self
    def __next__(self):
        try:
            result = self.text[self.index].upper()
        except IndexError:
            raise StopIteration
        self.index += 1
        return result
    next = __next__ # Python 2.x looks for next() instead of __next__()
iv. Getitem Method¶
class uc_getitem():
    def __init__(self, text):
        self.text = text
    def __getitem__(self, index):
        result = self.text[index].upper()
        return result
Testing all the above 4 methods¶
for iterator in uc_gen, uc_genexp, uc_iter, uc_getitem:
    for ch in iterator('abcde'):
        print ch,
    print
Enumerate¶
enumerate is a better option than looping over range(len(myarr)), because it yields each index together with its element.
# By default, enumerate starts from 0.
myarr = ['a','b','c','d']
for i, elem in enumerate(myarr):
    print i, elem
# If you want enumerate to start from 1, you can do that as well.
for i, elem in enumerate(myarr, 1):
    print i, elem
# If you want to enumerate in reversed order, you can do it like this:
for i, elem in reversed(list(enumerate(myarr))):
    print i, elem
NOTE: reversed() itself creates no copy; the elements are traversed in reverse on the fly, and the list is not modified (a copy would require O(N) additional memory). This is unlike sorted(), which builds a new list. In the example above, it is the list(enumerate(myarr)) call that builds a temporary list, because reversed() needs a sequence. If you need to modify the list and actually reverse it, use alist.reverse(); if you need a copy of the list in reversed order, use alist[::-1]. Also, reversed(alist) returns an iterator, whereas alist[::-1] returns a list.
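The three ways of reversing mentioned in the note can be compared side by side. The variable names are chosen only for this sketch.

```python
alist = [1, 2, 3, 4]

rev_iter = reversed(alist)               # lazy iterator over alist, no copy
rev_copy = alist[::-1]                   # new list in reversed order
iter_is_list = isinstance(rev_iter, list)
from_iter = list(rev_iter)               # consume the iterator into a list
unchanged = (alist == [1, 2, 3, 4])      # neither call modified alist

in_place_result = alist.reverse()        # reverses alist itself, returns None
print(from_iter)
print(rev_copy)
print(alist)
```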
Implementing GROUP BY in Python.¶
If you need to group by just one field, a dictionary keyed on that field does the trick. However, when you need to group by multiple fields, it gets a bit tricky, because the key has to combine several values. Python doesn't allow you to use mutable data as keys in dictionaries, because changes after insertion would make the object un-findable. You can use tuples as keys though.
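hash([1,2,3]) # raises TypeError: unhashable type: 'list'
hash((1,2,3)) # tuple: works
hash(set((1,2,3))) # raises TypeError: unhashable type: 'set'
hash(frozenset((1,2,3))) # frozenset: works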
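A sketch of a GROUP BY over two fields using tuple keys. The sales records and field names are invented for illustration; collections.defaultdict is used so that missing groups start at zero.

```python
from collections import defaultdict

# Hypothetical records: (city, year, amount)
sales = [
    ('Paris', 2020, 10),
    ('Paris', 2020, 5),
    ('Paris', 2021, 7),
    ('Rome', 2020, 3),
]

totals = defaultdict(int)
for city, year, amount in sales:
    totals[(city, year)] += amount   # the tuple (city, year) is the group key

for key in sorted(totals):
    print(key)
    print(totals[key])
```

This corresponds roughly to SELECT city, year, SUM(amount) ... GROUP BY city, year.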
File Operations¶
with open("directory.txt") as f:
    data = f.read()
print len(data),data
with open("directory.txt") as f:
    line = f.readline()
print len(line),line
As you can see, readline() returns only the first line.
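The difference between read(), readline(), and iterating over the file can be seen on a small throwaway file. The file content below is made up, and tempfile is used only to keep the sketch self-contained.

```python
import os
import tempfile

# Create a small temporary file so the example does not depend on directory.txt.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('first line\nsecond line\nthird line\n')

with open(path) as f:
    whole = f.read()                 # entire file as one string

with open(path) as f:
    first = f.readline()             # only the first line, newline included

with open(path) as f:
    lines = [line.rstrip('\n') for line in f]   # iterate line by line

os.remove(path)
print(len(whole))
print(first)
print(lines)
```

Iterating over the file object reads one line at a time, which keeps memory use low for large files.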
Functional Programming¶
# The built-in globals() returns a dictionary containing all the global variable names Python knows about.
a_string = "This is a global variable"
print globals()['a_string'] # This is a global variable
# globals() itself looks like {..., 'a_string': 'This is a global variable'}
def foo():
    print locals()
foo() #{} Since the local namespace of foo is empty.
Map¶
def cube(x): return x*x*x
map(cube, range(1, 11)) # is equivalent to [cube(1), cube(2), ..., cube(10)]
Filter¶
print filter(cube, range(2, 25))
# is equivalent to the result after running
result = []
for i in range(2, 25):
    if cube(i):
        result.append(i)
print result
Zip¶
Return a list of tuples, where each tuple contains the i-th element from each of the argument sequences. The returned list is truncated in length to the length of the shortest argument sequence.
zip( range(5), range(1,20,2) )
Join¶
separator = "-" # avoid naming this variable str, which would shadow the built-in type
seq = ("a", "b", "c") # This is a sequence of strings.
print separator.join( seq ) # a-b-c
range & xrange¶
For the most part, xrange and range are the same in terms of functionality. They both provide a way to generate a sequence of integers for you to use however you please. The only difference is that range returns a Python list object, while xrange returns an xrange object that produces its values on demand. That means that if you need a really gigantic range, say one billion, xrange is the function to use. This is especially true if you are working on a memory-sensitive system such as a cell phone, because range will use as much memory as it needs to create the full list of integers, which can result in a MemoryError and crash your program. It's a memory-hungry beast.
for i in xrange(-1, -10, -1): print(i),
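The memory difference can be observed with sys.getsizeof. In Python 3 xrange no longer exists and range itself is lazy, so the sketch below picks whichever lazy type is available; lazy_range is a name introduced only for this example.

```python
import sys

try:
    lazy_range = xrange      # Python 2
except NameError:
    lazy_range = range       # Python 3: range is already lazy

n = 100000
lazy = lazy_range(n)
eager = list(lazy_range(n))           # what Python 2's range() would build

lazy_size = sys.getsizeof(lazy)       # small, does not grow with n
eager_size = sys.getsizeof(eager)     # grows with n (the int objects are not even counted)
print(lazy_size < eager_size)
```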