CF-Miner Task

The CF-Miner (Category Fishery Miner) looks for interesting histograms — it finds conditions under which the distribution of a target categorical attribute shifts in a notable way. For example, you can find circumstances under which accident severity is rising or falling compared to the overall dataset distribution.

Task creation is handled through a three-step wizard: Task Setup → Logic Configuration → Quantifiers.


Step 1 — Task Setup

[Figure: CF-Miner Step 1 - Task Setup]

The first step captures the basic task information:

Field             Description
Task Name         A name to identify this task
Procedure Method  Select CFMiner
Dataset           The dataset to mine, selected from your uploaded datasets
Project           Optionally assign the task to a project (can be left empty)

Click Next Step to proceed to logic configuration.


Step 2 — Logic Configuration

[Figure: CF-Miner Step 2 - Logic Configuration]

The CF-Miner has a simpler cedent structure than the other procedures. The logic configuration step has a single tab — Condition (Filter) — plus a dedicated Target Attribute selector at the top.

Target Attribute

Select the categorical column whose histogram you want to analyse. This is the attribute whose distribution will be examined across different conditions. For example, selecting Severity will tell the miner to look for conditions under which the distribution of Fatal, Serious, and Slight accidents shifts.
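To make the idea concrete, the following minimal sketch compares the overall histogram of a target attribute with its histogram under one condition. The records, the Weather column, and the histogram helper are hypothetical and only illustrate the concept; they are not part of the application.

```python
from collections import Counter

# Hypothetical toy records; column names and values are illustrative only.
records = [
    {"Severity": "Slight", "Weather": "Fine"},
    {"Severity": "Slight", "Weather": "Fine"},
    {"Severity": "Serious", "Weather": "Fine"},
    {"Severity": "Slight", "Weather": "Rain"},
    {"Severity": "Serious", "Weather": "Rain"},
    {"Severity": "Fatal", "Weather": "Rain"},
]

# Ordered categories of the target attribute.
CATEGORIES = ["Fatal", "Serious", "Slight"]


def histogram(rows, target, categories):
    """Count rows per target category, in the given category order."""
    counts = Counter(row[target] for row in rows)
    return [counts.get(cat, 0) for cat in categories]


overall = histogram(records, "Severity", CATEGORIES)
rainy = histogram(
    [r for r in records if r["Weather"] == "Rain"], "Severity", CATEGORIES
)

print(overall)  # [1, 2, 3]
print(rainy)    # [1, 1, 1]
```

In this toy data the overall histogram rises towards Slight, while the histogram under the condition Weather = Rain is flat. A shift of this kind is what the miner searches for.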

Condition (Filter)

The condition defines the search space — the combinations of attribute values the miner will explore to find interesting histograms. Configure it the same way as any other cedent:

  • Cedent Type: toggle between Conjunction (AND) and Disjunction (OR) using the Switch Type button
  • Cedent Length (Min / Max): controls how many attributes can be combined in a single condition

For each attribute added:

Field      Description
Column     Select an attribute from the dataset
Type       How the attribute's values are grouped (see Literal Types below)
Min / Max  The minimum and maximum number of values to combine for this attribute

Use + Add Attribute to add more columns, and the ✕ button to remove one.

Literal Types

Type    Description
subset  Any subset of the attribute's categories (unordered)
seq     Sequences of consecutive ordered values
lcut    Left cut: takes values from the left end of the ordered range
rcut    Right cut: takes values from the right end of the ordered range
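As a rough illustration of how the four literal types differ, the sketch below enumerates the value groups each type would produce for one ordered attribute, given the Min / Max setting. The literals function and the speed-limit values are hypothetical; the actual miner may enumerate candidates differently.

```python
from itertools import combinations


def literals(values, kind, min_len, max_len):
    """Enumerate candidate value groups for one attribute.

    `values` must be listed in their natural order for seq, lcut and rcut.
    """
    n = len(values)
    out = []
    for k in range(min_len, min(max_len, n) + 1):
        if kind == "subset":
            # Any k values, regardless of order or adjacency.
            out.extend(list(c) for c in combinations(values, k))
        elif kind == "seq":
            # Every window of k adjacent values.
            out.extend(values[i:i + k] for i in range(n - k + 1))
        elif kind == "lcut":
            # The first k values of the ordered range.
            out.append(values[:k])
        elif kind == "rcut":
            # The last k values of the ordered range.
            out.append(values[n - k:])
        else:
            raise ValueError(f"unknown literal type: {kind}")
    return out


speeds = ["30", "40", "50", "60"]  # hypothetical ordered categories
print(literals(speeds, "lcut", 1, 2))  # [['30'], ['30', '40']]
print(literals(speeds, "rcut", 1, 2))  # [['60'], ['50', '60']]
print(len(literals(speeds, "seq", 1, 2)))     # 7
print(len(literals(speeds, "subset", 2, 2)))  # 6
```

The counts show why subset widens the search space fastest: it grows combinatorially with the number of categories, whereas lcut and rcut yield only one group per length.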

Click Next Step to proceed to quantifier setup.


Step 3 — Quantifiers

[Figure: CF-Miner Step 3 - Quantifiers]

CF-Miner quantifiers are based on the shape and size of the discovered histograms rather than on confidence or probability. Only results that satisfy every quantifier you set are returned.

Base

Quantifier     Description
Base           Minimum number of records satisfying the condition
Relative Base  Minimum base as a fraction of the total dataset size
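The two base quantifiers can be read as a simple check on how many records a condition covers. The sketch below is illustrative only: base_ok and its parameters are hypothetical names, and thresholds are assumed to be inclusive (greater than or equal).

```python
def base_ok(n_matching, n_total, base=None, rel_base=None):
    """Return True if a condition covers enough records.

    A threshold of None means the field was left as Not set.
    """
    if base is not None and n_matching < base:
        return False
    if rel_base is not None and n_matching / n_total < rel_base:
        return False
    return True


# A condition matching 50 of 1000 records (5% of the dataset).
print(base_ok(50, 1000, base=30))        # True
print(base_ok(50, 1000, rel_base=0.10))  # False
```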

Histogram Steps

These quantifiers describe how the histogram values change across the target attribute's ordered categories:

Quantifier      Description
Steps Up        Minimum number of consecutive increases between adjacent category counts
Steps Down      Minimum number of consecutive decreases between adjacent category counts
Any Steps Up    Minimum total number of increases anywhere in the histogram
Any Steps Down  Minimum total number of decreases anywhere in the histogram
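One plausible reading of these four quantifiers is sketched below. It assumes that "consecutive" means the longest uninterrupted run of increases (or decreases) and that equal neighbours break a run; the tool's exact counting rules may differ, and step_stats is a hypothetical helper.

```python
def step_stats(hist):
    """Summarise the changes between adjacent category counts."""
    ups = downs = 0            # total increases / decreases anywhere
    run_up = run_down = 0      # current uninterrupted runs
    best_up = best_down = 0    # longest runs seen so far
    for prev, cur in zip(hist, hist[1:]):
        if cur > prev:
            ups += 1
            run_up += 1
            run_down = 0
        elif cur < prev:
            downs += 1
            run_down += 1
            run_up = 0
        else:
            # Equal neighbours end both runs (an assumption).
            run_up = run_down = 0
        best_up = max(best_up, run_up)
        best_down = max(best_down, run_down)
    return {
        "steps_up": best_up,
        "steps_down": best_down,
        "any_steps_up": ups,
        "any_steps_down": downs,
    }


print(step_stats([5, 8, 12, 7, 9]))
# {'steps_up': 2, 'steps_down': 1, 'any_steps_up': 3, 'any_steps_down': 1}
```

Under this reading, the histogram [5, 8, 12, 7, 9] would pass Steps Up = 2 and Any Steps Up = 3, but not Steps Up = 3.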

Extremes

These quantifiers constrain the absolute or relative size of the histogram's highest and lowest category counts:

Quantifier     Description
Max Value      Minimum absolute value of the largest category count
Min Value      Minimum absolute value of the smallest category count
Rel Max        Minimum relative share of the largest category (out of total)
Rel Min        Minimum relative share of the smallest category (out of total)
Rel Max Upper  Maximum relative share of the largest category (sets an upper bound)
Rel Min Upper  Maximum relative share of the smallest category (sets an upper bound)

Note: Upper bound quantifiers

The upper bound variants (Rel Max Upper, Rel Min Upper) use a less-than-or-equal condition instead of the usual greater-than-or-equal. They are useful for finding histograms whose values are relatively balanced. For example, setting Rel Max Upper together with Rel Min keeps both the largest and the smallest category share close to the average.

Leave any field empty (Not set) to skip that threshold.
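The extremes quantifiers, including the upper bound variants and the Not set behaviour, can be sketched as a single filter function. The function name and parameters below are hypothetical; relative shares are assumed to be computed against the sum of all category counts in the histogram.

```python
def extremes_ok(hist, max_value=None, min_value=None, rel_max=None,
                rel_min=None, rel_max_upper=None, rel_min_upper=None):
    """Check a histogram against the extremes thresholds.

    A threshold of None means the field was left as Not set.
    """
    total = sum(hist)
    if total == 0:
        return False
    hi, lo = max(hist), min(hist)

    # Thresholds the actual value must reach (greater than or equal).
    lower_bounds = [
        (max_value, hi),
        (min_value, lo),
        (rel_max, hi / total),
        (rel_min, lo / total),
    ]
    # Thresholds the actual value must not exceed (less than or equal).
    upper_bounds = [
        (rel_max_upper, hi / total),
        (rel_min_upper, lo / total),
    ]
    for threshold, actual in lower_bounds:
        if threshold is not None and actual < threshold:
            return False
    for threshold, actual in upper_bounds:
        if threshold is not None and actual > threshold:
            return False
    return True


# A balanced histogram passes a cap on the largest share...
print(extremes_ok([30, 35, 35], rel_max_upper=0.4, rel_min=0.25))  # True
# ...while a histogram dominated by one category does not.
print(extremes_ok([10, 20, 70], rel_max_upper=0.4))                # False
```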

Submitting the Task

At the bottom of the quantifiers step, two actions are available:

  • Save Task — saves the task configuration for later execution
  • Run Task — saves the task and immediately dispatches it to the execution pipeline

Tip: A common starting configuration is to set Base (to ensure conditions are backed by enough records) and Steps Down or Steps Up (to filter for conditions where the histogram actually changes shape in the direction you're interested in).