CSC 523 - Scripting for Data Science, Fall 2023, Assignment 5.
Assignment 5 is due by 11:59 PM on Saturday December 16 via "make
turnitin". You must test on mcgonagall.
11/27 Clarification is below.
Assignment 5 is a redo of Assignment 3 using new regressors and
new classifiers with new configuration parameters.
Our "final exam" class on 12/11 will be a work
session.
There is a 10% per day penalty for late assignments in my courses.
I need this by end of Sunday 12/17 (late) to get grades in on
time.
Assignment 3:
1. Replace all regressors and all classifiers in
your solution CSC523f23AudioAssn3_generator.py file with new ones.
You can also reuse ensemble classifiers for which you make
significant changes to their base models and configuration
parameters.
11/27 Clarification:
Half or more of the regressors, and half or more of the
classifiers, must be new, where "new" means:
A) a completely new regressor or classifier, and/or
B) AdaBoost or Bagging with a non-default estimator, i.e., not the
DecisionTree estimator. Use a new underlying regressor or classifier
instead of DecisionTree.
This "half or more" requirement applies to the regressors and to the
classifiers separately. Each group must have "half or more new".
The remaining models (fewer than half) may be new per the above
definition, or they may be handout ones with a configuration-parameter
change that has a measurable effect, whether small or large.
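One way to check whether a configuration-parameter change has a
measurable effect is to score the same model under two settings on the
same train/test split. The sketch below is only an illustration: the
synthetic data, the choice of DecisionTreeClassifier, the max_depth
values, and the helper name compare_max_depth are all assumptions and
are not taken from the handout code.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def compare_max_depth(depths=(1, None), seed=0):
    """Return {max_depth: test accuracy} for one fixed train/test split."""
    # Synthetic stand-in data (assumption); the assignment uses its own dataset.
    X, y = make_classification(n_samples=400, n_features=10,
                               n_informative=6, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    scores = {}
    for depth in depths:
        # Only max_depth changes between runs, so any score difference
        # can be attributed to that one parameter.
        model = DecisionTreeClassifier(max_depth=depth, random_state=seed)
        model.fit(X_train, y_train)
        scores[depth] = model.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    for depth, accuracy in compare_max_depth().items():
        print(f"max_depth={depth}: test accuracy {accuracy:.3f}")
```

If the two scores differ, the change is measurable. If they are
identical, try a different parameter or a wider range of values.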
11/30 Clarification: We did not use Bagging or AdaBoost ensemble
regressors or classifiers in Assignment 3, so you MAY use them with a
default DecisionTree base model in Assignment 5 if you like.
https://scikit-learn.org/stable/supervised_learning.html
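As a rough illustration of option B in the 11/27 clarification, the
sketch below wraps a non-DecisionTree base model in a Bagging ensemble.
The synthetic data, the choice of KNeighborsClassifier and Ridge as
base models, the parameter values, and the function names are
assumptions made for the example and are not part of the handout code.
The base model is passed positionally because its keyword name differs
between scikit-learn releases (base_estimator in older ones, estimator
in newer ones).

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import BaggingClassifier, BaggingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def bagged_knn_accuracy(seed=0):
    """Test accuracy of Bagging over k-nearest-neighbor classifiers."""
    # Synthetic stand-in data (assumption).
    X, y = make_classification(n_samples=300, n_features=8,
                               n_informative=5, random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    # First positional argument is the base model, replacing the
    # default DecisionTree.
    model = BaggingClassifier(KNeighborsClassifier(n_neighbors=5),
                              n_estimators=10, random_state=seed)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


def bagged_ridge_r2(seed=0):
    """Test R^2 of Bagging over Ridge regressors."""
    X, y = make_regression(n_samples=300, n_features=8, noise=10.0,
                           random_state=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    model = BaggingRegressor(Ridge(alpha=1.0), n_estimators=10,
                             random_state=seed)
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


if __name__ == "__main__":
    print("Bagging + KNN accuracy:", round(bagged_knn_accuracy(), 3))
    print("Bagging + Ridge R^2:", round(bagged_ridge_r2(), 3))
```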
If you want to start with my
CSC523f23AudioAssn3_generator.py solution code instead of code
for which you had bugs and lost points, email me and I'll send
you my solution copy of that file as your starting point. (Update
11/28: I am sending this to everyone who received a grade for
Assignment 3.)
You will get diffs when running make test but the tests should
complete without bombing. When you get a diff like this:
$ make clean test
/usr/local/bin/python3.7 CSC523Fall2022Classify_main.py CSC523Fall2022ClassifyTrace.txt CSC523Fall2022Classify_generator month_aggregate_HMS_goodyears.arff.gz '' > CSC523Fall2022ClassifyOut.txt
diff --ignore-trailing-space --strip-trailing-cr CSC523Fall2022ClassifyOut.txt CSC523Fall2022ClassifyOut.txt.ref > CSC523Fall2022ClassifyOut.txt.dif
make: *** [test] Error 1
Inspect the .dif, output, and reference files
(CSC523Fall2022ClassifyOut.txt.dif, CSC523Fall2022ClassifyOut.txt,
and CSC523Fall2022ClassifyOut.txt.ref in this example). Copy the
output to the reference file only AFTER ensuring that the output is
correct, like this:
cp CSC523Fall2022ClassifyOut.txt CSC523Fall2022ClassifyOut.txt.ref
After you have verified all of the diffs in this way, running
make clean test will complete without errors.
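If you would rather review the differences from Python than read the
.dif file directly, the inspect-then-copy workflow can be sketched
roughly as follows. The helper names diff_lines and accept_output are
hypothetical and are not part of the assignment makefile. Stripping
trailing whitespace is meant only to approximate the
--ignore-trailing-space and --strip-trailing-cr options of the diff
command, not to reproduce them exactly.

```python
import difflib
import shutil
from pathlib import Path


def diff_lines(out_path, ref_path):
    """Return unified-diff lines from the reference file to the output file."""
    # rstrip() drops trailing spaces and carriage returns before comparing.
    out = [line.rstrip() for line in Path(out_path).read_text().splitlines()]
    ref = [line.rstrip() for line in Path(ref_path).read_text().splitlines()]
    return list(difflib.unified_diff(ref, out,
                                     fromfile=str(ref_path),
                                     tofile=str(out_path),
                                     lineterm=""))


def accept_output(out_path, ref_path, verified):
    """Copy the output over the reference, but only once it is verified."""
    if not verified:
        return False
    shutil.copyfile(out_path, ref_path)  # same effect as the cp command
    return True


if __name__ == "__main__":
    out_file = "CSC523Fall2022ClassifyOut.txt"
    ref_file = out_file + ".ref"
    for line in diff_lines(out_file, ref_file):
        print(line)
    # Call accept_output(out_file, ref_file, verified=True) only after
    # reading the diff and confirming that the new output is correct.
```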
2. Edit README.txt
At the top of the README.txt file list all of the regressor and/or
classifier changes you have made including exploration of
configuration parameters.
Rewrite each README Qn question as needed and answer it. Some
questions may not fit your new models, or you may think of better
questions. In those cases, just rewrite the question and its answer
to explain something that you discovered.
3. Run make turnitin by the due date.
Have a good winter break!