CPSC 523 - Scripting for Data Science, Fall 2024, Tuesday 6:00-8:50 PM, Old Main 158.

    Pitfalls of trying to use off-the-shelf AIs to do your work.

Dr. Dale E. Parson. Class will be live face-to-face or online at class time via Zoom.
Tue 6-8:50 PM, Zoom classes & recordings, https://faculty.kutztown.edu/parson
Class-time Zoom link for CSC523: See D2L Course CSC523 -> Content -> Overview for the link.
If you don’t want to be recorded or are a minor, use PRIVATE ZOOM CHAT to me for questions.
Please fill out this permission-to-record slip and email it to Dr. Parson. I will use it to take attendance in week 1.

Dr. Dale E. Parson, parson@kutztown.edu, Office hours: https://kutztown.zoom.us/j/94322223872
Office Hours: Mon 11 AM-1 PM, Wed 12-2 PM, Th 4-5 PM, or by appt. All available via Zoom.

You will need to use the acad Linux server in this course. Starting this fall, you
must connect to it through a VPN. Here are the instructions for that. Download the VPN from here.
KU offers a 4-course Graduate Certificate in Data Analytics. Talk with me if you want to sign up.

Our department is adding a Scripting Certificate, a Data Science major, and a Data Science minor in fall 2024.
    Instructions to Change, Add, Remove an UNDERGRADUATE Major, Minor or Certificate Program

First day handout (syllabus that is specific to this semester).


Initial Linux environment Setup on our new Linux server using Python 3.11. That page contains a Python review.

Scikit-learn is the primary machine-learning library that we will use to analyze data relationships.

Compilation of Weka slides on Instance Based Learning and Clustering.
A graph of information entropy, which relates to building rules & decision trees.
A page describing Bayes' theorem and related matters.
A Bayes computer for a 52-card deck is on acad at ~parson/DataMine/BayesCards.py
Weka slides on evaluating numeric prediction.
A summary of the Kappa Statistic.
A subset of Weka Chapter 5 on Evaluation and 7 on Data Transformations.
Weka Chapter 12 on Ensemble Learning.
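
Several of the topics listed above (information entropy, Bayes' theorem on a 52-card deck, and the Kappa statistic) come down to short formulas. The following is a minimal Python sketch of those formulas using only the standard library. It is not the contents of BayesCards.py or of any course handout; the function names, the king/face-card example, and the confusion matrix are all hypothetical and chosen only for illustration.

```python
import math
from fractions import Fraction


def entropy(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b


def kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows = actual, columns = predicted)."""
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of instances on the diagonal.
    observed = Fraction(sum(confusion[i][i] for i in range(len(confusion))), total)
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    expected = sum(
        Fraction(sum(confusion[i]) * sum(row[i] for row in confusion), total * total)
        for i in range(len(confusion))
    )
    return (observed - expected) / (1 - expected)


# Hypothetical example: entropy of drawing a suit from a full deck (4 equally likely suits).
suit_entropy = entropy([0.25] * 4)

# Hypothetical example: P(king | face card) in a standard 52-card deck.
# 4 kings, 12 face cards (J, Q, K in 4 suits), and every king is a face card.
p_king_given_face = bayes(Fraction(1), Fraction(4, 52), Fraction(12, 52))

# Hypothetical 2-class confusion matrix, invented for illustration only.
example_kappa = kappa([[20, 5], [10, 15]])

print(suit_entropy, p_king_given_face, example_kappa)
```

Fractions are used for the Bayes and Kappa calculations so that the results are exact; scikit-learn provides its own metrics for real data sets, and this sketch is only meant to make the arithmetic visible.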

There is a 10% per day late penalty for projects that come in after the due date.

Run ssh K120023GEMS.kutztown.edu after logging into acad to perform make test.

Assignment 1 is due by 11:59 PM on Friday, September 20,
via make turnitin on acad or K120023GEMS.


August 27 Introduction to the class, information entropy, Bayes Theorem, start of instance-based learning.
August 28 CPSC223: the first 40 minutes cover how to set up your Linux account, for students new to acad.
A student's excellent, recently written summary of using our Linux server.