CSC223 Advanced Python for Data Manipulation, Dr. Dale E. Parson, Fall 2023

Contents
    Week 1 Overview and recap of Python features covered in CSC123.
    Week 2 on varieties of function types.
    Week 3 is the sorting example and Assignment 1 overview.
Week 1
Recap of basic Python features covered in CSC123 because not all CSC223 students have taken CSC123.
Read and work along with Sections 1 through 5 of the Python Tutorial in parallel to our class time examination of Python basics.
~parson/Scripting/CSC223f23SORTassn0.solution.zip is also available for download here.
    ^^^ That is not an assignment. It is demo code for class. ^^^

Python Resources

Please log into acad or mcgonagall (ssh mcgonagall from acad) and run the following commands:

$ python -V
Python 3.7.7
$ ipython -V
7.14.0

If you see earlier version numbers, edit a file called .bash_profile in your login directory and add the following 2 lines at the top:

alias python="/usr/local/bin/python3.7"
alias ipython="/usr/local/bin/ipython3"

Log out, log back in, and check the version numbers again. Let me know if you run into problems.

Windows users can download the WinSCP file transfer client in the Computer Science sub-menu below here.

    We will be using the 3.x version of Python.
    Try running python -V to see that you are getting Python 3.x.x as your default.
        From the mcgonagall machine (ssh mcgonagall from acad) do the following actions in bold:
        Edit a file called .bash_profile in your login directory (create it if needed) and add these 3 lines near the top.
                export PATH="/usr/local/bin:${PATH}"
                alias python="/usr/local/bin/python3.7"
                alias ipython="/usr/local/bin/ipython3"
        Save the file and exit the editor, log out and log back into mcgonagall.
        Now type this:
                python -V    # You should see this:
                    Python 3.7.7
            If you install python on your own machine, just running python will get you the simpler-to-use interpreter.
            I will use ipython in lecture.
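
    The version check can also be done from inside Python itself. This is a minimal sketch using only the standard sys module; the names major, minor, and is_python3 are illustrative and not part of the course handouts.

```python
import sys

# sys.version_info is a named tuple such as (3, 7, 7, 'final', 0).
major, minor = sys.version_info[:2]
is_python3 = (major == 3)

# sys.version starts with the same number that "python -V" reports.
print("Running Python", sys.version.split()[0])
if not is_python3:
    print("This course needs Python 3.x; check the aliases in .bash_profile.")
```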

    The Python website is at http://www.python.org/.
    The official site version 3.7 Tutorial is Here and the 3.7 Library Reference is Here.
    The IPython site is here.
    We have Python installed on acad, but if you want your own copy:
    You can download Python 3.x from here. Use the most recent stable 3.x for this course.

Free on-line textbooks used by previous instructors:
    A Whirlwind Tour of Python
    Python Data Science Handbook


Most of our assignments this semester will run on acad or mcgonagall, using a makefile per project to drive testing and project submission.
    That may change when we get to generating graphical data visualizations.

For students new to using our department's Linux servers:

    cmd ssh acad.kutztown.edu
Libraries:
    A Tutorial and an Overview of the Standard Library
    Python math and statistics and random libraries.
    NumPy for numeric processing. We may use numpy.random.Generator Distributions.
    SciPy for scientific programming.
    scikit-learn for machine learning. We may sample. CSC523 Advanced Scripting for Data Science uses it heavily.

Python Basics

Python’s read-eval-print UI.

You can interact with Python to compute interactively. It can also interpret script files. You create variables on the fly. They hold whatever type of data you put into them.
$ python -V # Must be version 3.x for our course.
$ python
>>> a = 2 ; b = 4.7 ; (a - 7) * b
-23.5
# “;” and newline are command separators. Use \ to continue a line onto the next line.
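
The separators and the line-continuation rule mentioned above can be tried in a script file as well. The following is a small sketch; the variable names total and same_total are made up for illustration.

```python
a = 2 ; b = 4.7          # two statements on one line, separated by ";"

# A backslash at the very end of a line continues the statement.
total = (a - 7) * b \
        + 100

# Inside parentheses, brackets, or braces no backslash is needed.
same_total = ((a - 7) * b
              + 100)

print(total, same_total)
```

Many style guides prefer the parenthesized form, since a stray space after a backslash is a syntax error.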

Python uses indentation, not {}, to delimit flow-of-control constructs
>>> a = 7
>>> if a <= 7:
        print(a, "Is low")
    else:
        print(a, "is high")
7 Is low
# Do NOT mix leading spaces with TABS in assignments.
# Use leading spaces to be compatible with handouts.

A for loop iterates over a sequence of values.
>>> a = 7
>>> mylist = [a, 'a', "Strings use either delimiter"]
>>> for s in mylist:
        print(s)
7
a
Strings use either delimiter

range() creates a lazy sequence of numbers (a range object, not a list)
>>> r = range(1,3)
>>> r
range(1, 3)
>>> type(r)
<class 'range'>
>>> for i in r:
        print("i is", i) # Note that the final value is exclusive
i is 1
i is 2
>>> for i in range(3,-3,-2): # -2 here is an increment
        print("i is", i)
i is 3
i is 1
i is -1
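
Strictly speaking, a range object is a lazy sequence rather than a generator: it computes values on demand but also supports len(), indexing, membership tests, and repeated iteration. A short sketch of those operations, reusing the range from the session above:

```python
r = range(3, -3, -2)      # start at 3, stop before -3, step by -2

print(len(r))             # 3
print(r[0], r[-1])        # 3 -1
print(1 in r, 0 in r)     # True False

values = list(r)          # materialize the values into a real list
print(values)             # [3, 1, -1]

# Unlike a generator, a range can be iterated more than once.
first_pass = [i for i in r]
second_pass = [i for i in r]
print(first_pass == second_pass)   # True
```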

Use and, or, not instead of &&, ||, ! as used in Java or C++
>>> a = 1 ; b = 5
>>> while (a <= 3) and (b >= 3):
        print("a, b", a, b)
        a += 1 ; b = b - 2
a, b 1 5
a, b 2 3
>>> print("a, b", a, b)
a, b 3 1
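
Like && and || in Java or C++, Python's and and or short-circuit: the right operand is evaluated only when it is needed. The sketch below uses a hypothetical helper, check(), that records which operands actually run.

```python
calls = []   # records which operands actually get evaluated

def check(label, value):
    # Hypothetical helper for this demo only.
    calls.append(label)
    return value

result_and = check("A", False) and check("B", True)   # B is skipped
result_or = check("C", True) or check("D", False)     # D is skipped
print(calls)                    # ['A', 'C']
print(result_and, result_or)    # False True

# and/or return one of their operands, not necessarily a bool.
name = "" or "default"
print(name)                     # default
print(not name)                 # False
```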

Basic data types
Basic data types include strings, ints, floats, and None, which is Python’s “no value” type.
Use a raw string to make escape sequences literal.
>>> a = "a string" ; b = 'another string' ; c = -45 ; d = 4.5 ; e = None
>>> print(a,b,c,d,e)
a string another string -45 4.5 None
>>> raws = r'a\n\nraw string'
>>> print(raws)
a\n\nraw string
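
To see what the r prefix changes, compare the lengths of the raw and ordinary versions of the same literal. Raw strings are also the usual way to write regular expressions; the re.findall call below is a standard-library example added here for illustration, not something from the class session.

```python
import re

cooked = 'a\n\nraw string'    # each \n becomes one newline character
raws = r'a\n\nraw string'     # each \n stays as two characters: \ and n

print(len(cooked))            # 13
print(len(raws))              # 15

# In a regular expression, r'\d+' passes the backslash through to re.
numbers = re.findall(r'\d+', 'Week 1, Week 2')
print(numbers)                # ['1', '2']
```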

Aggregate data types

A list is a mutable sequence of values. A tuple is an immutable sequence.
>>> L = ['a', 1, ["b", 2]]
>>> for e in L:
        print(e)
a
1
['b', 2]
>>> T = tuple(L)
>>> T
('a', 1, ['b', 2])
>>> L
['a', 1, ['b', 2]]
>>> L[1] = 11
>>> L
['a', 11, ['b', 2]]
>>> T[1] = 22
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
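
One subtlety worth trying: a tuple's immutability is shallow. The tuple cannot be re-pointed at different objects, but a mutable object stored inside it can still change, and tuple(L) shares the nested list with L. A minimal sketch continuing the example above:

```python
L = ['a', 1, ["b", 2]]
T = tuple(L)            # copies the three references, not the nested list

T[2].append(3)          # allowed: mutates the inner list, not the tuple
print(T)                # ('a', 1, ['b', 2, 3])
print(L)                # ['a', 1, ['b', 2, 3]]
print(T[2] is L[2])     # True, both hold the same inner list object

try:
    T[2] = []           # rebinding a tuple slot is still an error
except TypeError as err:
    print("TypeError:", err)
```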

A dictionary maps keys to values.
>>> m = {'a': 1, "b" : 2} ; m['c'] = 3
>>> for k in m.keys():
        print(k, m[k])
a 1
b 2
c 3
>>>  'b' in m   #  same as 'b' in m.keys()
    # Python 2.x allows: m.has_key('b')
True
>>> 'z' in m
False
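
A few other dictionary operations come up constantly. The sketch below assumes Python 3.7 or later, where dictionaries are guaranteed to keep insertion order.

```python
m = {'a': 1, "b": 2}
m['c'] = 3

for key, value in m.items():   # key/value pairs, in insertion order
    print(key, value)

print(m.get('z'))        # None, instead of raising KeyError
print(m.get('z', 0))     # 0, a caller-supplied default

del m['a']               # remove a key
print(list(m))           # ['b', 'c']
```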

A set is an unordered collection of distinct values. A frozenset is immutable.

>>> L = [1, 2, 1, 3, 66, 1, 66, 2, 4]
>>> S = set(L)
>>> L
[1, 2, 1, 3, 66, 1, 66, 2, 4]
>>> S
{1, 2, 3, 4, 66}
>>> F = frozenset(S)
>>> F
frozenset({1, 2, 3, 4, 66})
>>> S.add(108)
>>> S
{1, 2, 3, 4, 66, 108}
>>> F.add(108)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'
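
Sets support the usual mathematical operations, and a frozenset, being immutable and hashable, can serve as a dictionary key where a plain set cannot. A brief sketch; the names other and lookup are illustrative.

```python
S = {1, 2, 3, 4, 66}
other = {3, 4, 5}

# sorted() is used for printing because set display order is not guaranteed.
print(sorted(S | other))    # union: [1, 2, 3, 4, 5, 66]
print(sorted(S & other))    # intersection: [3, 4]
print(sorted(S - other))    # difference: [1, 2, 66]

F = frozenset(S)
lookup = {F: "a frozenset works as a dict key"}
print(lookup[frozenset({1, 2, 3, 4, 66})])

try:
    bad = {S: "a plain set does not"}
except TypeError as err:
    print("TypeError:", err)
```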

Python has functions and classes
>>> def f(a, b):
...     return a + b
...
>>> f(1, 3.5)
4.5
>>> f("prefix", 'suffix')
'prefixsuffix'
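
The session above shows only a function. For completeness, here is a minimal class sketch; Point is a hypothetical class invented for this example. Because it defines __add__, its objects work with the same f(a, b) as numbers and strings do.

```python
class Point:
    """Hypothetical 2-D point, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called for point1 + point2.
        return Point(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return "Point({}, {})".format(self.x, self.y)

def f(a, b):
    return a + b

p = f(Point(1, 2), Point(3, 4))
print(p)    # Point(4, 6)
```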

Week 2
Named and anonymous functions, functions as first-class objects, list comprehensions.
Higher order functions, built-in generator types, custom closures and generators.

I will record interactions with ipython during class time and post edited versions accessible here.

$ ipython --logfile=logdemo     # or --logappend
Activating auto-logging. Current session state plus future input saved.
Filename       : logdemo
Mode           : backup
Output logging : False
Raw input log  : False
Timestamping   : False
State          : active
Python 3.10.1 (tags/v3.10.1:2cd268a, Dec  6 2021, 19:10:37) [MSC v.1929 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 7.30.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: %logstop

In [2]: %logstart -o  # needed to log output from interpreter
Activating auto-logging. Current session state plus future input saved.
Filename       : # needed to log output from interpreter
Mode           : backup
Output logging : True
Raw input log  : False
Timestamping   : False
State          : active

Third-class functions can be invoked (called).

In [4]: def sum(a, b, c=None): # Function parameters can have default values
   ...:     result = a + b
   ...:     if c is not None: # "is not None" is the idiomatic test for None
   ...:         result += c
   ...:     return result
In [5]: sum(11, 22)
Out[5]: 33
In [6]: sum(11, 22, 33)
Out[6]: 66
In [7]: sum(11, 22.2, 33)
Out[7]: 66.2

# Implicit parametric polymorphism means variables and function parameters can take many forms
# (many types). The objects themselves, such as integer, float, or string variables, must support
# the operations used.

In [8]: sum('prefix', '_infix_')
Out[8]: 'prefix_infix_'
In [9]: sum('prefix', '_infix_','postfix')
Out[9]: 'prefix_infix_postfix'
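
The same function accepts any objects that support +, and fails at run time when they do not. The sketch below repeats the session's logic under the name total, a renaming assumed here so that Python's built-in sum() stays available.

```python
def total(a, b, c=None):
    # Same logic as the session's sum(); renamed for this sketch only.
    result = a + b
    if c is not None:
        result += c
    return result

print(total([1, 2], [3]))          # [1, 2, 3]
print(total((1,), (2,), (3,)))     # (1, 2, 3)
print(total(1, 2, c=0))            # 3, c=0 is still added since 0 is not None

try:
    total(1, 'one')                # int + str is not defined
except TypeError as err:
    print("TypeError:", err)
```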

Second-class functions are third-class functions that can be passed as parameters.

In [23]: def applyBinaryFunction(f, arg1, arg2):
    ...:     return f(arg1, arg2)
In [24]: applyBinaryFunction(sum, 1, 2)
Out[24]: 3
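
Built-in functions and the functions in the standard operator module can be passed the same way as user-defined ones. A short sketch that repeats applyBinaryFunction so it runs on its own:

```python
import operator

def applyBinaryFunction(f, arg1, arg2):
    return f(arg1, arg2)

print(applyBinaryFunction(operator.add, 1, 2))    # 3
print(applyBinaryFunction(operator.mul, 3, 4))    # 12
print(applyBinaryFunction(max, 1, 2))             # 2
print(applyBinaryFunction(pow, 2, 10))            # 1024
```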

First-class functions are second-class functions that can be stored in variables and returned from functions.
Lambda expressions are expressions that define anonymous (unnamed) functions.

In [25]: applyBinaryFunction(lambda x, y : x * y, 3, 4)
Out[25]: 12
In [26]: divvy = lambda x, y : x / y
In [27]: applyBinaryFunction(divvy, 3, 4)
Out[27]: 0.75
In [34]: from types import FunctionType
In [35]: def makeReturnFunction(sourceCode):
    ...:     f = eval(sourceCode) # eval() evaluates an expression string
    ...:     if not (type(f) == FunctionType):
    ...:         raise TypeError('NOT A FUNCTION: ' + str(sourceCode))
    ...:     return f
In [36]: subby = makeReturnFunction('lambda x, y : x-y')
In [37]: subby(20, 30)
Out[37]: -10
In [38]: oopsie = makeReturnFunction('5 > 3')
TypeError: NOT A FUNCTION: 5 > 3
In [39]: oopsie = makeReturnFunction('if a == b:')
  File "<string>", line 1
    if a == b:
    ^
SyntaxError: invalid syntax
eval(string) interprets its string argument as an expression
exec(string) compiles its statement into executable code and runs it
compile(string) just does the compile part for later exec

In [40]: a = -3
In [41]: exec('a = 4')
In [42]: a
Out[42]: 4
In [46]: c = compile('a = 5',filename='nofile',mode='exec')
In [47]: a
Out[47]: 4
In [48]: exec(c)
In [49]: a
Out[49]: 5
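
Both exec() and eval() accept an optional dictionary to use as the namespace, which keeps generated code from touching your own variables. The sketch below also uses compile() with mode='eval' for an expression; the names namespace, expr, and caught are illustrative. Note that eval() and exec() run arbitrary code, so they should not be given untrusted input.

```python
namespace = {'a': -3}

exec('a = 4', namespace)        # the assignment happens inside the dict
print(namespace['a'])           # 4

expr = compile('a * 10', filename='<demo>', mode='eval')
print(eval(expr, namespace))    # 40

# compile() reports a syntax error before any code runs.
caught = False
try:
    compile('if a == b:', filename='<demo>', mode='exec')
except SyntaxError as err:
    caught = True
    print("SyntaxError:", err.msg)
```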

Higher-order functions accept functions as arguments and direct their application to data.

In [53]: from functools import reduce
In [54]: help(map)
Help on class map in module builtins:
class map(object)
 |  map(func, *iterables) --> map object
 |
 |  Make an iterator that computes the function using arguments from
 |  each of the iterables.  Stops when the shortest iterable is exhausted.
In [55]: help(filter)
Help on class filter in module builtins:
class filter(object)
 |  filter(function or None, iterable) --> filter object
 |
 |  Return an iterator yielding those items of iterable for which function(item)
 |  is true. If function is None, return the items that are true.
In [56]: help(reduce)
Help on built-in function reduce in module _functools:
reduce(...)
    reduce(function, iterable[, initial]) -> value

    Apply a function of two arguments cumulatively to the items of a sequence
    or iterable, from left to right, so as to reduce the iterable to a single
    value.  For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
    ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
    of the iterable in the calculation, and serves as a default when the
    iterable is empty.
In [57]: l = range(0,10)
In [58]: l
Out[58]: range(0, 10)
In [59]: l = list(l)
In [60]: l
Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [61]: m = map(lambda x : x * 11, l)
In [62]: m
Out[62]: <map at 0x243a498b6a0>
In [63]: m = list(m)
In [64]: m
Out[64]: [0, 11, 22, 33, 44, 55, 66, 77, 88, 99]
In [65]: r = reduce(sum, m)
In [66]: r
Out[66]: 495
In [67]: 11+22+33+44+55+66+77+88+99
Out[67]: 495
In [69]: f = filter(lambda x : (x & 1) == 0, m) # matches even numbers
In [70]: f
Out[70]: <filter at 0x243a4b80550>
In [71]: list(f)
Out[71]: [0, 22, 44, 66, 88]
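
List comprehensions, listed in this week's topics, express the same map and filter steps without lambda. The sketch below reproduces the results of the session above; it uses Python's built-in sum() in place of reduce(), and the name evens is illustrative.

```python
l = list(range(0, 10))

m = [x * 11 for x in l]                  # like list(map(lambda x: x * 11, l))
evens = [x for x in m if (x & 1) == 0]   # like list(filter(lambda x: ..., m))
r = sum(m)                               # like reduce(lambda x, y: x + y, m)

print(m)        # [0, 11, 22, 33, 44, 55, 66, 77, 88, 99]
print(evens)    # [0, 22, 44, 66, 88]
print(r)        # 495

# A generator expression computes the same total without building a list.
print(sum(x * 11 for x in l))   # 495
```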

Custom generators

In [73]: def mygen(listOfValues): # calling mygen constructs a generator
    ...:     for v in listOfValues:
    ...:         yield v # returns control to caller, can be resumed later
In [74]: g = mygen(range(0, 100, 5))
In [76]: g
Out[76]: <generator object mygen at 0x00000243A4B5C4A0>
In [77]: for value in g:
    ...:     print(value)
    ...:     print("Do something else")
0
Do something else
5
Do something else
10
Do something else
...
90
Do something else
95
Do something else
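
A for loop drives a generator by calling next() on it until it raises StopIteration. The sketch below does that by hand and shows that a generator can be consumed only once. countFrom is a hypothetical infinite generator added for illustration; itertools.islice from the standard library takes a finite slice of it.

```python
from itertools import islice

def mygen(listOfValues):
    for v in listOfValues:
        yield v

g = mygen(range(0, 15, 5))
print(next(g))      # 0
print(next(g))      # 5
rest = list(g)      # consumes whatever is left
print(rest)         # [10]
print(list(g))      # [], the generator is now exhausted

try:
    next(g)
except StopIteration:
    print("StopIteration: nothing left to yield")

def countFrom(start):
    # Hypothetical infinite generator; values are produced only on demand.
    n = start
    while True:
        yield n
        n += 1

firstThree = list(islice(countFrom(7), 3))
print(firstThree)   # [7, 8, 9]
```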

Custom closures return inner functions that have access to outer parameters & variables.
They are similar to object-oriented objects that house state variables & methods (member functions).

In [83]: def constructor(initialValue):
    ...:     localvar = 4
    ...:     def inner(parameter):
    ...:         return (initialValue + localvar) * parameter
    ...:     return inner
In [84]: f = constructor(3)
In [85]: f(2)
Out[85]: 14
In [86]: f(-1)
Out[86]: -7
In [87]: f
Out[87]: <function __main__.constructor.<locals>.inner(parameter)>
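
To make the comparison with objects concrete, a closure can also update the state it captures by declaring the variable nonlocal. In this sketch makeCounter and increment are hypothetical names; each call to makeCounter creates an independent count, much like a separate object instance.

```python
def makeCounter(start):
    count = start                 # state captured by the inner function

    def increment(step=1):
        nonlocal count            # rebind the enclosing variable, not a new local
        count += step
        return count

    return increment

c1 = makeCounter(10)
c2 = makeCounter(100)
print(c1(), c1(), c1(5))    # 11 12 17
print(c2())                 # 101, c2 has its own count
```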

Week 3 is the sorting example and Assignment 1 overview.

~parson/Scripting/CSC223f23SORTassn0.solution.zip is also available for download here.

Assignment 1 Specification: code is due by the end of Friday, September 29, via make turnitin on acad or mcgonagall.