Assignment 3 due 11:59 PM Thursday March 21 via

Download compressed ARFF data file

You must answer questions in

Start Weka, bring up the Explorer GUI, and

Set

This ARFF file has 45 attributes (columns) and 226 instances (rows) of monthly aggregate data from August through December of 1976 through 2021.

Here are the attributes in the file. It is a monthly aggregate of daily aggregates of (mostly 1-hour) observation periods.

year 1976-2021

month 8-12

HMtempC_mean mean for month of temp Celsius during observation times

WindSpd_mean same for wind speed in km/hour

HMtempC_median median for month

WindSpd_median

HMtempC_pstdv population standard deviation

WindSpd_pstdv

HMtempC_min minimum & maximum

WindSpd_min

HMtempC_max

WindSpd_max

wndN tally of North winds for all observations in the month, etc.

wndNNE

wndNE

wndENE

wndE

wndESE

wndSE

wndSSE

wndS

wndSSW

wndSW

wndWSW

wndW

wndWNW

wndNW

wndNNW

wndUNK

HMtempC_24_mean Changes in magnitude (absolute value of change) over 24, 48, and 72 hours

HMtempC_48_mean

HMtempC_72_mean

HMtempC_24_median

HMtempC_48_median

HMtempC_72_median

HMtempC_24_pstdv

HMtempC_48_pstdv

HMtempC_72_pstdv

HMtempC_24_min The min & max are their signed values.

HMtempC_48_min

HMtempC_72_min

HMtempC_24_max

HMtempC_48_max

HMtempC_72_max

SS_All Tally of sharp-shinned hawk observations during each month 8-12, 1976-2021. Target attribute.
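As a rough illustration of how the attribute suffixes above relate to the underlying numbers, the following Python sketch computes the five summary statistics for a short list of readings, then the same statistics for lagged changes. The data values and the function names summarize and summarize_changes are made up for this example. It assumes one reading per day, so that a lag of 1 stands in for 24 hours. It is not the script that produced the ARFF file.

```python
import statistics

def summarize(values):
    """Five summary statistics matching the _mean, _median, _pstdv, _min, _max suffixes."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "pstdv": statistics.pstdev(values),  # population (not sample) standard deviation
        "min": min(values),
        "max": max(values),
    }

def summarize_changes(daily_values, lag_days):
    """Summarize changes over lag_days: magnitudes for mean/median/pstdv, signed for min/max."""
    signed = [later - earlier
              for earlier, later in zip(daily_values, daily_values[lag_days:])]
    magnitude = [abs(change) for change in signed]
    return {
        "mean": statistics.mean(magnitude),
        "median": statistics.median(magnitude),
        "pstdv": statistics.pstdev(magnitude),
        "min": min(signed),   # largest drop keeps its negative sign
        "max": max(signed),   # largest rise
    }

# Hypothetical daily temperatures in Celsius, not taken from the dataset.
temps = [10.0, 12.0, 9.0, 9.0, 14.0]
month_stats = summarize(temps)
change_24 = summarize_changes(temps, 1)
print(month_stats)
print(change_24)
```

Note that the min of the signed changes can be negative even though every magnitude is zero or greater.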

You can examine its contents and sort based on attributes by opening the file in the Preprocess

Enter

Be careful

Enter

Enter

This step compresses the SS_All tally. (You can compare the min and the max of the original attribute in Weka's Preprocess upper-right panel against the sqrt by using a calculator. This approach also works for log10 in the next step. Do NOT try it with the mean: sqrt and log10 are nonlinear, so the mean of the transformed values is not the transform of the original mean.)
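The calculator check can also be tried out in a few lines of Python. This is only a sketch with made-up tallies (the list counts is not taken from the dataset). It shows that the min and max carry over through sqrt while the mean does not.

```python
import math
import statistics

# Made-up tallies standing in for SS_All; not real data.
counts = [0, 4, 9, 100, 400]
roots = [math.sqrt(c) for c in counts]

# sqrt never reorders values, so the min and max of the derived
# attribute equal the sqrt of the original min and max.
print(min(roots) == math.sqrt(min(counts)))
print(max(roots) == math.sqrt(max(counts)))

# The mean does not carry over, because sqrt is nonlinear.
mean_of_roots = statistics.mean(roots)
root_of_mean = math.sqrt(statistics.mean(counts))
print(mean_of_roots, root_of_mean)
```

Here the mean of the square roots is 7.0, while the square root of the original mean (102.6) is a little over 10.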

Enter

This step compresses SS_All even more.

The reason for adding 1 to aN is to avoid taking log(0), which is undefined, for SS_All counts of 0. No counts are negative, so aN+1 is always at least 1.
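A small Python sketch of the same idea, using made-up counts rather than the real SS_All values: a count of 0 maps to log10(1) = 0, and log10(0) itself raises an error.

```python
import math

# Made-up tallies; 0 is included on purpose.
counts = [0, 9, 99, 999]

# Adding 1 before the log keeps a count of 0 in the valid domain.
logs = [math.log10(c + 1) for c in counts]
print(logs)

# Without the +1, Python refuses to take the log of 0.
try:
    math.log10(0)
    log_of_zero_failed = False
except ValueError:
    log_of_zero_failed = True
print(log_of_zero_failed)
```

Each tenfold increase in count+1 adds only 1 to the result, which is why log10 compresses the tally more strongly than sqrt does.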

Set the Discretize

Set the Discretize

Save this dataset as

Figures 1 through 5 show the statistical distributions of SS_All and these 4 derived attributes. Use the Preprocess tab to make sure yours look the same.

ZeroR predicts class value: N.N (This is the

Correlation coefficient N.N

Mean absolute error N.N

Root mean squared error N.N

Relative absolute error N %

Root relative squared error N %

Total Number of Instances 226
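For reference, the error measures in these result blocks can be computed by hand from actual and predicted values. The Python sketch below uses the standard textbook definitions, with the two relative measures expressed against a baseline that always predicts the mean of the actual values, as ZeroR does. The function name regression_metrics and the sample numbers are made up for this example. Reporting a correlation of 0 for constant predictions is an assumption made here to avoid dividing by zero, and Weka's own bookkeeping under cross-validation may differ in detail.

```python
import math

def regression_metrics(actual, predicted):
    """Return correlation, MAE, RMSE, relative absolute %, root relative squared %."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Baseline errors: always predict the mean of the actual values.
    base_abs = sum(abs(mean_a - a) for a in actual)
    base_sq = sum((mean_a - a) ** 2 for a in actual)

    # Pearson correlation; 0.0 is returned when either side has no variance
    # (an assumption for this sketch, to avoid division by zero).
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    corr = cov / math.sqrt(base_sq * var_p) if base_sq > 0 and var_p > 0 else 0.0

    return {
        "correlation": corr,
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae_percent": 100.0 * abs_err / base_abs,
        "rrse_percent": 100.0 * math.sqrt(sq_err / base_sq),
    }

# Made-up target values; the baseline predicts their mean for every instance.
actual = [1.0, 2.0, 3.0, 6.0]
baseline = regression_metrics(actual, [3.0] * len(actual))
better = regression_metrics(actual, [1.5, 2.0, 3.5, 5.0])
print(baseline)
print(better)
```

A model that always predicts the mean scores 100% on both relative measures, so anything below 100% is an improvement over that baseline.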

Correlation coefficient N.N

Mean absolute error N.N

Root mean squared error N.N

Relative absolute error N %

Root relative squared error N %

Total Number of Instances 226

Correlation coefficient N.N

Mean absolute error N.N

Root mean squared error N.N

Relative absolute error N %

Root relative squared error N %

Total Number of Instances 226

M5 pruned model tree:

(using smoothed linear models)

LM num: 1

| | | | ATTRIBUTE_NAME <= N.N : LM4 (13/26.17%)

In that leaf, 13 is the COUNT of the total 226 observation instances reaching that decision, 26.17% is an

What month or months have the lowest

https://acad.kutztown.edu/~parson/HawkMtnDaleParson2022/#SS
