
Weka console commands

java weka.classifiers.trees.J48 -t data/weather.arff

java is followed by the fully qualified class name; -t indicates that the next argument is the name of the training dataset.

 java weka.classifiers.trees.J48 -h

This prints the meaning of each command-line option:

-h or -help
    Output help information.
-synopsis or -info
    Output synopsis for classifier (use in conjunction with -h)
-t <name of training file>
    Sets training file.
-T <name of test file>
    Sets test file. If missing, a cross-validation will be performed
    on the training data.
-c <class index>
    Sets index of class attribute (default: last).
-x <number of folds>
    Sets number of folds for cross-validation (default: 10).
-no-cv
    Do not perform any cross validation.
-force-batch-training
    Always train classifier in batch mode, never incrementally.
-split-percentage <percentage>
    Sets the percentage for the train/test set split, e.g., 66.
-preserve-order
    Preserves the order in the percentage split.
-s <random number seed>
    Sets random number seed for cross-validation or percentage split
    (default: 1).
-m <name of file with cost matrix>
    Sets file with cost matrix.
-disable <comma-separated list of evaluation metric names>
    Comma separated list of metric names not to print to the output.
    Available metrics:
    Correct,Incorrect,Kappa,Total cost,Average cost,KB relative,KB information,
    Correlation,Complexity 0,Complexity scheme,Complexity improvement,
    MAE,RMSE,RAE,RRSE,Coverage,Region size,TP rate,FP rate,Precision,Recall,
    F-measure,MCC,ROC area,PRC area
-l <name of input file>
    Sets model input file. In case the filename ends with '.xml',
    a PMML file is loaded or, if that fails, options are loaded
    from the XML file.
-d <name of output file>
    Sets model output file. In case the filename ends with '.xml',
    only the options are saved to the XML file, not the model.
-v
    Outputs no statistics for training data.
-o
    Outputs statistics only, not the classifier.
-i
    Outputs detailed information-retrieval statistics for each class.
-k
    Outputs information-theoretic statistics.
-classifications "weka.classifiers.evaluation.output.prediction.AbstractOutput + options"
    Uses the specified class for generating the classification output.
    E.g.: weka.classifiers.evaluation.output.prediction.PlainText
-p range
    Outputs predictions for test instances (or the train instances if
    no test instances provided and -no-cv is used), along with the
    attributes in the specified range (and nothing else).
    Use '-p 0' if no attributes are desired.
    Deprecated: use "-classifications ..." instead.
-distribution
    Outputs the distribution instead of only the prediction
    in conjunction with the '-p' option (only nominal classes).
    Deprecated: use "-classifications ..." instead.
-r
    Only outputs cumulative margin distribution.
-z <class name>
    Only outputs the source representation of the classifier,
    giving it the supplied name.
-g
    Only outputs the graph representation of the classifier.
-xml filename | xml-string
    Retrieves the options from the XML-data instead of the command line.
-threshold-file <file>
    The file to save the threshold data to.
    The format is determined by the extensions, e.g., '.arff' for ARFF
    format or '.csv' for CSV.
-threshold-label <label>
    The class label to determine the threshold data for
    (default is the first label)

Options specific to weka.classifiers.trees.J48:

-U
    Use unpruned tree.
-O
    Do not collapse tree.
-C <pruning confidence>
    Set confidence threshold for pruning.
    (default 0.25)
-M <minimum number of instances>
    Set minimum number of instances per leaf.
    (default 2)
-R
    Use reduced error pruning.
-N <number of folds>
    Set number of folds for reduced error
    pruning. One fold is used as pruning set.
    (default 3)
-B
    Use binary splits only.
-S
    Don't perform subtree raising.
-L
    Do not clean up after the tree has been built.
-A
    Laplace smoothing for predicted probabilities.
-J
    Do not use MDL correction for info gain on numeric attributes.
-Q <seed>
    Seed for random data shuffling (default 1).
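
To make the interplay of -t, -T and -x concrete, here is a small Java sketch that only assembles the argument list for such a command. WekaCommand and its build() method are hypothetical helpers written for this note; they are not part of Weka. The sketch assumes that the -x fold count is only worth passing when no test file is given, which is how the help text above describes cross-validation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Weka): assembles the argument list for
// running a Weka classifier from the command line.
class WekaCommand {

    /**
     * className: fully qualified classifier class, e.g. weka.classifiers.trees.J48
     * trainFile: value for -t (training file)
     * testFile:  value for -T, or null to fall back to cross-validation
     * folds:     value for -x, only added when testFile is null
     */
    static List<String> build(String className, String trainFile, String testFile, int folds) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add(className);
        cmd.add("-t");
        cmd.add(trainFile);
        if (testFile != null) {
            // A separate test set was supplied, so no cross-validation option is needed.
            cmd.add("-T");
            cmd.add(testFile);
        } else {
            cmd.add("-x");
            cmd.add(Integer.toString(folds));
        }
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("weka.classifiers.trees.J48", "data/weather.arff", null, 10);
        // Prints: java weka.classifiers.trees.J48 -t data/weather.arff -x 10
        System.out.println(String.join(" ", cmd));
        // Actually launching it requires weka.jar on the classpath, for example:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

The launch line is left commented out because it only works on a machine where the Weka classes can be found by the java command.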

 

weka.core  

The core Weka package; almost every other class depends on it.

Key classes in the core package:

Attribute: contains the attribute's name, its type, and, in the case of a nominal or string attribute, its possible values

Instance: contains the attribute values of a particular instance

Instances: holds an ordered set of instances, in other words, a dataset
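
The relationship between these three classes is easier to see in code. The sketch below uses simplified stand-ins (ToyAttribute, ToyInstance, ToyDataset), all invented for this note; they are not the real weka.core classes and have none of their methods. It follows Weka's convention of storing each value as a double, with a nominal value stored as the index of its label. The two rows are illustrative values in the style of the weather data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for weka.core.Attribute: a name plus, for nominal attributes,
// the list of allowed values (null means the attribute is numeric).
class ToyAttribute {
    final String name;
    final List<String> nominalValues;

    ToyAttribute(String name) { this(name, null); }

    ToyAttribute(String name, List<String> nominalValues) {
        this.name = name;
        this.nominalValues = nominalValues;
    }

    boolean isNominal() { return nominalValues != null; }
}

// Stand-in for weka.core.Instance: one row of attribute values.
class ToyInstance {
    final double[] values;

    ToyInstance(double... values) { this.values = values; }
}

// Stand-in for weka.core.Instances: attribute definitions plus ordered rows.
class ToyDataset {
    final List<ToyAttribute> attributes;
    final List<ToyInstance> rows = new ArrayList<>();

    ToyDataset(List<ToyAttribute> attributes) { this.attributes = attributes; }

    void add(ToyInstance inst) {
        if (inst.values.length != attributes.size()) {
            throw new IllegalArgumentException("row length does not match attribute count");
        }
        rows.add(inst);
    }

    // Renders one value as text, mapping nominal indices back to labels.
    String valueAsString(int row, int attr) {
        ToyAttribute a = attributes.get(attr);
        double v = rows.get(row).values[attr];
        return a.isNominal() ? a.nominalValues.get((int) v) : Double.toString(v);
    }
}

class ToyDatasetDemo {
    static ToyDataset weather() {
        ToyDataset data = new ToyDataset(Arrays.asList(
                new ToyAttribute("outlook", Arrays.asList("sunny", "overcast", "rainy")),
                new ToyAttribute("temperature"),
                new ToyAttribute("play", Arrays.asList("yes", "no"))));
        data.add(new ToyInstance(0, 85, 1)); // sunny, 85, no
        data.add(new ToyInstance(1, 83, 0)); // overcast, 83, yes
        return data;
    }

    public static void main(String[] args) {
        ToyDataset d = weather();
        System.out.println(d.valueAsString(0, 0) + ", " + d.valueAsString(0, 2));
    }
}
```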

 

weka.classifiers

Contents: contains implementations of most of the algorithms for classification and numeric prediction

Key abstract class: Classifier, which defines the general structure of any scheme for classification or numeric prediction

It has three core methods: buildClassifier(), classifyInstance(), and distributionForInstance()

An example that extends this abstract class:

  • weka.classifiers.trees.DecisionStump
  • Overrides distributionForInstance()
  • Provides getRevision(), which simply returns the revision number of the classifier; Weka maintainers use it when diagnosing and debugging problems reported by users
  • Provides globalInfo(), which returns a string describing the classifier; this is displayed, along with the scheme's options, in the generic object editor
  • Provides toString(), which returns a textual representation of the classifier
  • Provides toSource(), which is used to obtain a source code representation of the learned classifier
  • Provides main(), which is called when you ask for a decision stump from the command line; it is the entry point for running this class
  • Provides getCapabilities(), which is called by the generic object editor to provide information about the capabilities of a learning scheme
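
The way the three core methods fit together can be sketched without Weka on the classpath. ToyClassifier and ToyMajorityClassifier below are invented for this note and are much simpler than the real classes: rows are plain double[] arrays whose last element is the class index, whereas Weka's methods take Instances and Instance objects. This is not how DecisionStump works internally; it only shows the pattern of a subclass overriding distributionForInstance() and inheriting a classifyInstance() that picks the most probable class. As far as I know, recent Weka releases make Classifier an interface and put the shared base code in AbstractClassifier, so treat the exact class layout as version dependent.

```java
// Simplified stand-in for Weka's abstract classifier. Each row is a double[]
// whose last element is the class index (an assumption of this sketch).
abstract class ToyClassifier {

    // Learns a model from the training rows.
    abstract void buildClassifier(double[][] train, int numClasses);

    // Returns one probability per class for the given row.
    abstract double[] distributionForInstance(double[] row);

    // Default behaviour: predict the class with the highest probability.
    double classifyInstance(double[] row) {
        double[] dist = distributionForInstance(row);
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) {
                best = i;
            }
        }
        return best;
    }
}

// Toy subclass: ignores all attributes and predicts the class frequencies
// observed in the training data.
class ToyMajorityClassifier extends ToyClassifier {
    private double[] classProbs;

    @Override
    void buildClassifier(double[][] train, int numClasses) {
        if (train.length == 0) {
            throw new IllegalArgumentException("training data is empty");
        }
        classProbs = new double[numClasses];
        for (double[] row : train) {
            classProbs[(int) row[row.length - 1]]++;
        }
        for (int i = 0; i < numClasses; i++) {
            classProbs[i] /= train.length;
        }
    }

    @Override
    double[] distributionForInstance(double[] row) {
        return classProbs.clone();
    }

    public static void main(String[] args) {
        double[][] train = { {85, 0}, {83, 0}, {70, 1} }; // value, class index
        ToyMajorityClassifier c = new ToyMajorityClassifier();
        c.buildClassifier(train, 2);
        System.out.println("predicted class index: " + c.classifyInstance(new double[] {75, 0}));
    }
}
```

Only distributionForInstance() is overridden here; classifyInstance() comes from the base class, which mirrors the DecisionStump bullet above.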

 

Other important packages

weka.associations

: contains association-rule learners

weka.clusterers 

: contains methods for unsupervised learning

weka.datagenerators

: generates artificial data

weka.estimators

: computes different types of probability distributions

weka.filters

: provides methods for data cleaning

 
