CSC 543 - Multiprocessing & Concurrent Programming, Student Topics.

Dr. Dale E. Parson

These are the student topics and assignments for spring 2015. We need to resolve topics and time of presentation.
Each student MUST:
    1. Sign up for a topic.
    2. Complete some amount of demonstration code and perform related library research, including using the Rohrbach Library IEEE and/or ACM databases for at least 1 paper.
    3. Give a 15-minute talk to the class on April 29 or May 6.
    4. Turn in all code, a README.txt file explaining how to build & run your code, your presentation slides, and a short write-up in a .txt text file or Word .doc file format, to me by midnight May 6.

    5. In your presentation and short write-up, give a reference to at least one related article that you found via searching each of these Rohrbach databases (2 references minimum, 1 for each): 
        ACM: http://library.kutztown.edu/acm
        IEEE: http://library.kutztown.edu/csdl

    April 29       15-minute student presentations. (Adib, Andrew, Robert, Jairus, David C., David D., Cory, Benjamin)
        1. 6:00 Concurrent Map / Reduce in Java, Adib Farah
        2. 6:20 Java ReentrantReadWriteLock class, David Day
        3. 6:40 Multithreading on Android, Cory Ma
        4. 7:00 C++0X Multithreading in Assignment 1, David Clymer
        10-minute BREAK 7:20
        5. 7:30 C++0X atomics (in Assignment 1?), Benjamin Swearingen
        6. 7:50 Search algorithm on CUDA GPU, Andrew Wernicki
        7. 8:10 Concurrency in CPython versus Jython, Robert Brotzman-Smith
        8. 8:30 Python asyncio package, Jairus Martin
    May 6          15-minute student presentations. All student project code is due by midnight. (Matt B., Casey, Nick, Sadia, Brandon, Adam, Matt T. & Scott)

        1. 6:00 java.nio Channels ("new I/O") & multithreading, Sadia Tanveer
        2. 6:20 Java serialization and compression for distributed communication, Casey Hennessey
        3. 6:40 Introducing Scala, with emphasis on functional programming constructs and why they are useful to concurrent programming, Matt Tothero
        4. 7:00 Actor-based concurrency in Scala, Scott Twombly
        10-minute BREAK 7:20
        5. 7:30 OpenMP concurrent programming API, Adam Houghton
        6. 7:50 WordCA Cellular Automaton on GPU/CUDA, Matt Bachman
        7. 8:10 Neural networks on the Web using parallel processing, Nick Evans
        8. 8:30 Hash cracking on GPUs or distributed systems, Brandon Trumble

***Benjamin (or other C++0X) -- you may want to look at *The Art of Multiprocessor Programming* by Herlihy & Shavit as an on-line book in Rohrbach, and write one of those Java collection classes in C++0X and report on that. Implementing a container class that uses C++ atomic type(s) is another C++0X possibility. Make sure to use
/opt/gcc4/bin/g++ on Hermione or Dumbledore in command lines like this:

/opt/gcc4/bin/g++ -c  -I.  -std=c++0x -pthread -O3 Demo.cxx -o Demo.o
/opt/gcc4/bin/g++ -std=c++0x -pthread Demo.o  -lpthread -lstdc++ -o Demo
 
Here is a draft list of topics and some associated resources. All of them require writing at least some amount of extension code of your own. Check with me on how much. Also, *feel free to propose your own topic!*

NVIDIA Graphics Processing Units (GPUs) / CUDA software support and applications. WHO?
    There could be several GPU projects. Dumbledore houses an NVIDIA C2070 card as its second graphics card, and Luna houses a Tesla K20C that has not yet been used. Note that there is information on the C2070 at that latter link.
    A. I wrote some research-grade bidirectional search code on the C2070 in summer 2012; see parson/multip/multip/pennydime/*Cuda* on acad. I have a draft write-up on that work. I would love to see my code verified to work with the current version of CUDA on Dumbledore's C2070, then benchmarked on Luna's K20C, and then perhaps enhanced to get additional benefits out of the K20C. My CUDA code includes all-GPU solutions that communicate only text IO to the Linux process, and also a few hybrid solutions that split work between the Linux CPUs and the GPU. I also have some partially outdated CUDA books. Andrew Wernicki, April 29.
    B. A port of the WordCA engine to one of the GPU cards would be very interesting. We'd have to figure out how to refactor the application architecture. Matt Bachman, May 6.
    C. There are many possible projects at the General Purpose Processing on a GPU site and the NVIDIA site, and undoubtedly others.

C++0X / C++11 added support for a memory model, thread-safe library classes and atomics that are mostly a subset of what Java provides. David Clymer, April 29. KUIT and I have resolved build problems associated with this code base. David Clymer will port Assignment 1 to C++0X.
    A. My draft paper cited above for Cuda also discusses some C++0X code I wrote and benchmarked with Java in summer 2012. There are subdirectories CPPCoinPuzzle/, ClikeCoinPuzzle/ and GPPCPUEmulatesGPU/ under ~parson/multip/multip/pennydime with C++0X code that may need to be mildly ported to C++11. Rohrbach library has a hard copy of C++ Concurrency in Action, which is the book I used. I tried my benchmarks in January 2015 with the new Linux csit/acad system, and found that C++11 revised and broke the C++0X libraries for atomics. They changed names to make sure no one assumes backward compatibility. The book will be out of date for those issues, but it is largely useful. This talk would include discussion of the C++0X and C++11 enhancements for concurrency, porting my code as necessary to work on C++11, and making extensions that we agree on.
    B. Benjamin Swearingen, demo algorithm (e.g., FIFO queue) and a test driver in C++0x using C++0X atomics, April 29.

Networked communications & concurrency. Option A is Sadia Tanveer on May 6. Option C is Adam Houghton on May 6.
    A. This would basically be taking the socket code that I have supplied for assignments 3 & 4, and replacing it with library classes from java.nio.channels and associated packages in the library. Take a look at:
    http://www.ibm.com/developerworks/java/tutorials/j-nio/j-nio.html
    https://www3.ntu.edu.sg/home/ehchua/programming/java/J5b_IO_advanced.html
    The java.nio packages in http://docs.oracle.com/javase/6/docs/api/index.html
    NIO is supposed to be more efficient for certain classes of IO. Two reasons that it interests me are:
        1. The IO libraries we are currently using for Socket IO do not support interrupt(). A blocking read() or write() on a Socket will not sense an interrupt. It is necessary for another thread to close() the Socket to get the blocked thread to unblock. That is fine if the purpose of the interrupt() is shutdown (our assignment 3), but if an application is using interrupt() for other purposes, then we wouldn't want to close() the Socket in order to get read() or write() to throw an exception. I am pretty certain that java.nio.channels supports interruption from another thread. Your code would be to rewrite the basic Socket code in assignments 3 & 4 to use java.nio.channels to support interruptible Socket IO, and then give a talk on that to the class.
        2. It also appears to support the select capability (java.nio.channels.Selector), which allows a single thread to block on multiple incoming & outgoing Sockets until one of them is ready to read() or write() as required by the application.
    B. MPI /  on several of our Linux machines (or possibly several Windows machines):
        http://mpitutorial.com/tutorials/mpi-hello-world/
        It may be possible to do this in Java without requiring KUIT to install software. I don't know. I have never used MPI.
    C. OpenMP is a pragma-based approach for letting the compiler parallelize inner loops in C/C++, similar to the inner loop parallelization we did by hand in assignment 2. It is not intrinsically about networking. I have never used it.
        http://openmp.org/wp/
    D. There could be additional networking and distributed application topics if there is interest.
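
For option A, the interruption behavior of java.nio.channels can be sketched as follows. This is only an illustration under stated assumptions, not the assignment 3 & 4 code: the class name InterruptibleReadSketch and the method interruptBlockedRead() are hypothetical, the peer is a loopback connection created inside the sketch, and it assumes Java 7 or later (for ServerSocketChannel.bind and try-with-resources).

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of the supplied assignment code.
final class InterruptibleReadSketch {

    /** Blocks a thread in SocketChannel.read(), interrupts it, and reports what the reader saw. */
    static String interruptBlockedRead() throws IOException, InterruptedException {
        final AtomicReference<String> outcome = new AtomicReference<>("no result");
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 asks the OS for any free port on the loopback interface.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                final CountDownLatch aboutToRead = new CountDownLatch(1);
                Thread reader = new Thread(() -> {
                    ByteBuffer buf = ByteBuffer.allocate(64);
                    try {
                        aboutToRead.countDown();
                        int n = client.read(buf); // blocks: the peer never writes
                        outcome.set("read returned " + n);
                    } catch (ClosedByInterruptException e) {
                        // Thrown in the interrupted thread; no other thread called close().
                        outcome.set("interrupted, channel open = " + client.isOpen());
                    } catch (IOException e) {
                        outcome.set("other IOException: " + e);
                    }
                });
                reader.start();
                aboutToRead.await();
                Thread.sleep(200);   // crude pause so the reader is likely blocked in read()
                reader.interrupt();  // an interrupt, not a close(), unblocks the reader
                reader.join(5000);
            }
        }
        return outcome.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(interruptBlockedRead());
    }
}
```

One design point to check before relying on this: per the InterruptibleChannel documentation, the channel is closed as a side effect of the interrupt, so interrupt() unblocks the thread but the connection does not survive it. An application that uses interrupt() for purposes other than shutdown would still need to reconnect, or use non-blocking channels with a Selector instead.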

The Scala language and its library support for Actor concurrency and alternatives. Scala runs on the JVM. In addition to on-line resources, Rohrbach library has some books. I have some loaner books. Scala is installed on one or more of our machines.
     1. Matt Tothero: Introducing Scala, with emphasis on functional programming constructs and why they are useful to concurrent programming. May 6
     2. Scott Twombly: Actor-based concurrency in Scala. May 6

 
Other uses of Functional Programming Techniques as they apply to multiprocessing & concurrent programming would be good.

Concurrency in some class of applications or application framework.

Dr. Frye has a draft paper "Parallel Model for Complex Attack" to which I added comments during spring break. I believe her code could benefit from conversion from using blocking Java library classes to non-blocking poll()ing similar to some of our projects. I have not seen the code, and she has not run it in some time. If someone is interested, you will need to consult with her, get it running on one of our Linux platforms, consult with me on proposed speedup, implement those & document your results. This work should lead to co-publication.

Please suggest your own topic.


Concurrent Map / Reduce in Java, Adib Farah, April 29.

Neural networks on the web using parallel processing, Nick Evans, May 6.

Hash cracking on GPUs or distributed systems, Brandon Trumble, May 6.

Concurrency in CPython versus Jython, Robert Brotzman-Smith, April 29.


Python asyncio package, Jairus Martin, April 29.

David Day, Java ReentrantReadWriteLock class, including background on how it is implemented, and a benchmark showing differences between fair and unfair ReentrantReadWriteLock objects (via the "fair" constructor parameter) on the same timed sequence of reader & writer threads, April 29.
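
A minimal sketch of what such a fairness comparison might look like is given here. The class name FairnessBenchSketch, the run() method, and the workload (a shared counter with a fixed number of reader and writer threads) are assumptions made for illustration; a real benchmark would need warm-up runs, repeated trials, and a workload that matches the talk.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical benchmark harness, for illustration only.
final class FairnessBenchSketch {
    private long value = 0;                       // guarded by lock
    private final ReentrantReadWriteLock lock;

    FairnessBenchSketch(boolean fair) {
        // The boolean constructor parameter selects the fair or nonfair ordering policy.
        lock = new ReentrantReadWriteLock(fair);
    }

    long read() {
        lock.readLock().lock();
        try { return value; } finally { lock.readLock().unlock(); }
    }

    void increment() {
        lock.writeLock().lock();
        try { value++; } finally { lock.writeLock().unlock(); }
    }

    /** Runs the same workload under the chosen policy; returns {elapsed ns, final counter}. */
    static long[] run(boolean fair, int readers, int writers, int opsPerThread)
            throws InterruptedException {
        final FairnessBenchSketch shared = new FairnessBenchSketch(fair);
        ExecutorService pool = Executors.newFixedThreadPool(readers + writers);
        long start = System.nanoTime();
        for (int i = 0; i < writers; i++) {
            pool.execute(() -> { for (int k = 0; k < opsPerThread; k++) shared.increment(); });
        }
        for (int i = 0; i < readers; i++) {
            pool.execute(() -> { for (int k = 0; k < opsPerThread; k++) shared.read(); });
        }
        pool.shutdown();
        if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
            throw new IllegalStateException("workload timed out");
        }
        long elapsed = System.nanoTime() - start;
        return new long[] { elapsed, shared.read() };
    }

    public static void main(String[] args) throws InterruptedException {
        for (boolean fair : new boolean[] { false, true }) {
            long[] r = run(fair, 6, 2, 5_000);
            System.out.println("fair=" + fair + " elapsed ms=" + r[0] / 1_000_000
                    + " counter=" + r[1]);
        }
    }
}
```

The final counter is returned alongside the elapsed time so that each run can be checked for correctness (it should equal writers times opsPerThread) before its timing is compared.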

Casey Hennessey, Java Serialization & compression approaches for distributed object communication, May 6.
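
One simple way to combine the two, sketched here under assumptions, is to wrap an ObjectOutputStream around a GZIPOutputStream so that the serialized bytes are compressed before they would be sent. The class name GzipSerializationSketch and its pack(), unpack(), and plain() helpers are hypothetical, and the sketch writes to a byte array rather than a socket; it is not necessarily the approach the talk will take.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
final class GzipSerializationSketch {

    /** Serializes obj with standard Java serialization and gzip-compresses the bytes. */
    static byte[] pack(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the ObjectOutputStream also closes the GZIP stream, which writes its trailer.
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Reverses pack(): decompresses, then deserializes. */
    static Object unpack(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return in.readObject();
        }
    }

    /** Uncompressed serialization, used only to compare sizes. */
    static byte[] plain(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // A highly repetitive payload, chosen so that compression has something to remove.
        String payload = new String(new char[5000]).replace('\0', 'a');
        byte[] packed = pack(payload);
        System.out.println("plain bytes:  " + plain(payload).length);
        System.out.println("packed bytes: " + packed.length);
        System.out.println("round trip ok: " + payload.equals(unpack(packed)));
    }
}
```

Compression only pays off when the serialized form is large or repetitive; for small objects the gzip header and trailer can make the packed form larger than the plain one, which is worth measuring for the message sizes used in the assignments.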

Cory Ma, Multithreading on Android, April 29.