Moses decoder parameters: the option list the program prints when run, recorded here
Moses - A beam search decoder for phrase-based statistical machine translation models
Copyright (C) 2006 University of Edinburgh

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

***********************************************************************

Built on Aug 17 2014 at 00:05:32

WHO'S FAULT IS THIS GODDAM SOFTWARE:
Marcello Federico
    contact: federico at itc at it
    Researcher at ITC-irst, Trento, Italy
    I'll answer question on: IRST language model
Christine Moran
    contact: weird building at MIT
Ondrej Bojar
    czech this out!
Chris Callison-Burch
    contact: anytime, anywhere
    international playboy
Chris Dyer
    contact: can't. i'll be out driving my mustang
    driving my mustang
Philipp Koehn
    contact: only between 2 and 4am
    I'll answer question on: Nothing fazes this dude
Richard Zens
    contact: richard at aachen dot de
    I'll answer question on: ambiguous source input, confusion networks, confusing source code
Evan Herbst
    contact: Small college in upstate New York
Hieu Hoang
    contact: http://www.hoang.co.uk/hieu/
    phd student at Edinburgh Uni. Original Moses developer
    I'll answer question on: general queries/ flames on Moses.
Nicola Bertoldi
    contact: 911
    I'll answer question on: scripts & other stuff
Brooke Cowan
    contact: brooke@csail.mit.edu
    if you're going to san francisco, be sure to wear a flower in your hair
Alexandra Constantin
    eu sunt varza
Wade Shen
    contact: via morse code
    buying another laptop

Usage:
    -alignment-output-file: print output word alignments into given file
    -alternate-weight-setting (aws): alternate set of weights to used per xml specification
    -beam-threshold (b): threshold for threshold pruning
    -clean-lm-cache: clean language model caches after N translations (default N=1)
    -config (f): location of the configuration file
    -consensus-decoding (con): use consensus decoding (De Nero et. al. 2009)
    -cube-pruning-diversity (cbd): How many hypotheses should be created for each coverage. (default = 0)
    -cube-pruning-lazy-scoring (cbls): Don't fully score a hypothesis until it is popped
    -cube-pruning-pop-limit (cbp): How many hypotheses should be popped for each stack. (default = 1000)
    -decoding-graph-backoff (dpb): only use subsequent decoding paths for unknown spans of given length
    -default-non-term-for-empty-range-only: Don't add [X] to all ranges, just ranges where there isn't a source non-term. Default = false (ie. add [X] everywhere)
    -description: Source language, target language, description
    -disable-discarding (dd): disable hypothesis discarding
    -distortion: configurations for each factorized/lexicalized reordering model.
    -distortion-file: source factors (0 if table independent of source), target factors, location of the factorized/lexicalized reordering tables
    -distortion-limit (dl): distortion (reordering) limit in maximum number of words (0 = monotone, -1 = unlimited)
    -dlm-model: DEPRECATED. DO NOT USE. Order, factor and vocabulary file for discriminative LM. Use * for filename to indicate unlimited vocabulary.
    -drop-unknown (du): drop unknown words instead of copying them
    -early-discarding-threshold (edt): threshold for constructing hypotheses based on estimate cost
    -early-distortion-cost (edc): include estimate of distortion cost yet to be incurred in the score [Moore & Quirk 2007]. Default is no
    -factor-delimiter (fd): specify a different factor delimiter than the default
    -feature: All the feature functions should be here
    -feature-add: Add a feature function on the command line. Used by mira to add BLEU feature
    -feature-name-overwrite: Override feature name (NOT arguments). Eg. SRILM-->KENLM, PhraseDictionaryMemory-->PhraseDictionaryScope3
    -feature-overwrite: Override arguments in a particular feature function with a particular key. Format: -feature-overwrite "FeatureName key=value"
    -generation-file: DEPRECATED. DO NOT USE. location and properties of the generation table
    -glm-feature: DEPRECATED. DO NOT USE. discriminatively trained global lexical translation feature, sparse producer
    -global-lexical-file (gl): DEPRECATED. DO NOT USE. discriminatively trained global lexical translation model file
    -include-lhs-in-search-graph (lhssg): When outputting chart search graph, include the label of the LHS of the rule (useful when using syntax)
    -include-segmentation-in-n-best: include phrasal segmentation in the n-best list. default is false
    -input-factors: list of factors in the input
    -input-file (i): location of the input file to be translated
    -input-scores: DEPRECATED. DO NOT USE. 2 numbers on 2 lines - [1] of scores on each edge of a confusion network or lattice input (default=1). [2] Number of 'real' word scores (0 or 1. default=0)
    -inputtype: text (0), confusion network (1), word lattice (2), tree (3) (default = 0)
    -labeled-n-best-list: print out labels for each weight type in n-best list. default is true
    -lattice-hypo-set: to use lattice as hypo set during lattice MBR
    -lattice-samples: generate samples from lattice, in same format as nbest list. Uses the file and size arguments, as in n-best-list
    -link-param-count: DEPRECATED. DO NOT USE. Number of parameters on word links when using confusion networks or lattices (default = 1)
    -lmbr-map-weight: weight given to map solution when doing lattice MBR (default 0)
    -lmbr-p: unigram precision value for lattice mbr
    -lmbr-pruning-factor: average number of nodes/word wanted in pruned lattice
    -lmbr-r: ngram precision decay value for lattice mbr
    -lmbr-thetas: theta(s) for lattice mbr calculation
    -lminimum-bayes-risk (lmbr): use lattice miminum Bayes risk to determine best translation
    -lmodel-dub: DEPRECATED. DO NOT USE. dictionary upper bounds of language models
    -lmodel-file: DEPRECATED. DO NOT USE. location and properties of the language models
    -lmodel-oov-feature: add language model oov feature, one per model
    -mapping: description of decoding steps
    -mark-unknown (mu): mark unknown words in output
    -max-chart-span: maximum num. of source word chart rules can consume (default 10)
    -max-partial-trans-opt: maximum number of partial translation options per input span (during mapping steps)
    -max-phrase-length: maximum phrase length (default 20)
    -max-trans-opt-per-coverage: maximum number of translation options per input span (after applying mapping steps)
    -mbr-scale: scaling factor to convert log linear score probability in MBR decoding (default 1.0)
    -mbr-size: number of translation candidates considered in MBR decoding (default 200)
    -minimum-bayes-risk (mbr): use miminum Bayes risk to determine best translation
    -minlexr-memory: Load lexical reordering table in minlexr format into memory
    -minphr-memory: Load phrase table in minphr format into memory
    -mira: do mira training
    -monotone-at-punctuation (mp): do not reorder over punctuation
    -n-best-factor: factor to compute the maximum number of contenders (=factor*nbest-size). value 0 means infinity, i.e. no threshold. default is 0
    -n-best-list: file and size of n-best-list to be generated; specify - as the file in order to write to STDOUT
    -no-cache: Disable all phrase-table caching. Default = false (ie. enable caching)
    -non-terminals: list of non-term symbols, space separated
    -output-factors: list if factors in the output
    -output-hypo-score: Output the hypo score to stdout with the output string. For search error analysis. Default is false
    -output-search-graph (osg): Output connected hypotheses of search into specified filename
    -output-search-graph-extended (osgx): Output connected hypotheses of search into specified filename, in extended format
    -output-search-graph-hypergraph: Output connected hypotheses of search into specified directory, one file per sentence, in a hypergraph format (see Kenneth Heafield's lazy hypergraph decoder). This flag is followed by 3 values: 'true (gz|txt|bz) directory-name'
    -output-search-graph-slf (slf): Output connected hypotheses of search into specified directory, one file per sentence, in HTK standard lattice format (SLF) - the flag should be followed byy a directory name, which must exist
    -output-unknowns: Output the unknown (OOV) words to the given file, one line per sentence
    -output-word-graph (owg): Output stack info as word graph. Takes filename, 0=only hypos in stack, 1=stack + nbest hypos
    -phrase-boundary-source-feature: DEPRECATED. DO NOT USE. Source factors for phrase boundary feature
    -phrase-boundary-target-feature: DEPRECATED. DO NOT USE. Target factors for phrase boundary feature
    -phrase-drop-allowed (da): if present, allow dropping of source words
    -phrase-length-feature: DEPRECATED. DO NOT USE. Count features for source length, target length, both of each phrase
    -phrase-pair-feature: DEPRECATED. DO NOT USE. Source and target factors for phrase pair feature
    -placeholder-factor: Which source factor to use to store the original text for placeholders. The factor must not be used by a translation or gen model
    -print-alignment-info: Output word-to-word alignment to standard out, separated from translation by |||. Word-to-word alignments are takne from the phrase table if any. Default is false
    -print-alignment-info-in-n-best: Include word-to-word alignment in the n-best list. Word-to-word alignments are takne from the phrase table if any. Default is false
    -print-all-derivations: to print all derivations in search graph
    -print-id: prefix translations with id. Default if false
    -recover-input-path (r): (conf net/word lattice only) - recover input path corresponding to the best translation
    -references: Reference file(s) - used for bleu score feature
    -report-all-factors: report all factors in output, not just first
    -report-all-factors-in-n-best: Report all factors in n-best-lists. Default is false
    -report-segmentation (t): report phrase segmentation in the output
    -report-segmentation-enriched (tt): report phrase segmentation in the output with additional information
    -rule-limit: a little like table limit. But for chart decoding rules. Default is DEFAULT_MAX_TRANS_OPT_SIZE
    -search-algorithm: Which search algorithm to use. 0=normal stack, 1=cube pruning, 2=cube growing, 4=stack with batched lm requests (default = 0)
    -show-weights: print feature weights and exit
    -sort-word-alignment: Sort word alignments for more consistent display. 0=no sort (default), 1=target order
    -source-label-overlap: What happens if a span already has a label. 0=add more. 1=replace. 2=discard. Default is 0
    -source-word-deletion-feature: DEPRECATED. DO NOT USE. Count feature for each unaligned source word
    -stack (s): maximum stack size for histogram pruning. 0 = unlimited stack size
    -stack-diversity (sd): minimum number of hypothesis of each coverage in stack (default 0)
    -start-translation-id: Id of 1st input. Default = 0
    -target-word-insertion-feature: DEPRECATED. DO NOT USE. Count feature for each unaligned target word
    -text-type: DEPRECATED. DO NOT USE. should be one of dev/devtest/test, used for domain adaptation features
    -threads (th): number of threads to use in decoding (defaults to single-threaded)
    -time-out: seconds after which is interrupted (-1=no time-out, default is -1)
    -translation-all-details (Tall): for all hypotheses, report translation details to the given file
    -translation-details (T): for each best hypothesis, report translation details to the given file
    -translation-option-threshold (tot): threshold for translation options relative to best for input phrase
    -tree-translation-details (Ttree): for each hypothesis, report translation details with tree fragment info to given file
    -ttable-file: DEPRECATED. DO NOT USE. location and properties of the translation tables
    -unknown-lhs: file containing target lhs of unknown words. 1 per line: LHS prob
    -unpruned-search-graph (usg): When outputting chart search graph, do not exclude dead ends. Note: stack pruning may have eliminated some hypotheses
    -verbose (v): verbosity level of the logging
    -weight: weights for ALL models, 1 per line 'WeightName value'. Weight names can be repeated
    -weight-add: Add weight for FF if it doesn't exist, i.e weights here are added 1st, and can be override by the ini file or on the command line. Used to specify initial weights for FF that was also specified on the copmmand line
    -weight-bl (bl): DEPRECATED. DO NOT USE. weight for bleu score feature
    -weight-d (d): DEPRECATED. DO NOT USE. weight(s) for distortion (reordering components)
    -weight-dlm (dlm): DEPRECATED. DO NOT USE. weight for discriminative LM feature function (on top of sparse weights)
    -weight-e (e): DEPRECATED. DO NOT USE. weight for word deletion
    -weight-file (wf): feature weights file. Do *not* put weights for 'core' features in here - they go in moses.ini
    -weight-generation (g): DEPRECATED. DO NOT USE. weight(s) for generation components
    -weight-glm (glm): DEPRECATED. DO NOT USE. weight for global lexical feature, sparse producer
    -weight-i (I): DEPRECATED. DO NOT USE. weight(s) for word insertion - used for parameters from confusion network and lattice input links
    -weight-l (lm): DEPRECATED. DO NOT USE. weight(s) for language models
    -weight-lex (lex): DEPRECATED. DO NOT USE. weight for global lexical model
    -weight-lr (lr): DEPRECATED. DO NOT USE. weight(s) for lexicalized reordering, if not included in weight-d
    -weight-overwrite: special parameter for mert. All on 1 line. Overrides weights specified in 'weights' argument
    -weight-pb (pb): DEPRECATED. DO NOT USE. weight for phrase boundary feature
    -weight-pp (pp): DEPRECATED. DO NOT USE. weight for phrase pair feature
    -weight-slm (slm): DEPRECATED. DO NOT USE. weight(s) for syntactic language model
    -weight-t (tm): DEPRECATED. DO NOT USE. weights for translation model components
    -weight-u (u): DEPRECATED. DO NOT USE. weight for unknown word penalty
    -weight-w (w): DEPRECATED. DO NOT USE. weight for word penalty
    -weight-wt (wt): DEPRECATED. DO NOT USE. weight for word translation feature
    -word-translation-feature: DEPRECATED. DO NOT USE. Count feature for word translation according to word alignment
    -xml-brackets (xb): specify strings to be used as xml tags opening and closing, e.g. "{{ }}" (default "< >"). Avoid square brackets because of configuration file format. Valid only with text input mode
    -xml-input (xi): allows markup of input with desired translations and probabilities. values can be 'pass-through' (default), 'inclusive', 'exclusive', 'constraint', 'ignore'

Available feature functions:
BleuScoreFeature ConstrainedDecoding ControlRecombination CountNonTerms
CoveredReferenceFeature Distortion ExternalFeature Generation GlobalLexicalModel
HyperParameterAsWeight InputFeature KENLM LexicalReordering
MaxSpanFreeNonTermSource NieceTerminal OpSequenceModel PhraseBoundaryFeature
PhraseDictionaryALSuffixArray PhraseDictionaryBinary PhraseDictionaryDynSuffixArray
PhraseDictionaryFuzzyMatch PhraseDictionaryMemory PhraseDictionaryMultiModel
PhraseDictionaryMultiModelCounts PhraseDictionaryOnDisk PhraseDictionaryScope3
PhraseDictionaryTransliteration PhraseLengthFeature PhrasePairFeature PhrasePenalty
ReferenceComparison RuleScope SetSourcePhrase SkeletonChangeInput SkeletonLM
SkeletonPT SkeletonStatefulFF SkeletonStatelessFF SoftMatchingFeature
SoftSourceSyntacticConstraintsFeature SourceGHKMTreeInputMatchFeature
SourceWordDeletionFeature SpanLength SparseHieroReorderingFeature SyntaxRHS
TargetBigramFeature TargetNgramFeature TargetWordInsertionFeature
TreeStructureFeature UnknownWordPenalty WordPenalty WordTranslationFeature
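
As a rough illustration of how a few of the non-deprecated options listed above combine on one command line, here is a small Python sketch that only assembles an argument list; it does not run the decoder. The function name build_moses_command, the binary name "moses", and the file names input.txt and nbest.txt are assumptions made for this example and do not come from the listing. The flags themselves (-config, -input-file, -threads, -n-best-list, -search-algorithm, -cube-pruning-pop-limit) and their meanings are taken from the usage text.

```python
import shlex
from typing import List, Optional


def build_moses_command(
    config: str,
    input_file: str,
    threads: int = 1,
    nbest_file: Optional[str] = None,
    nbest_size: int = 100,
    cube_pruning: bool = False,
    pop_limit: int = 1000,
    moses_bin: str = "moses",  # assumption: name/path of the decoder binary
) -> List[str]:
    """Assemble an argv list from flags that appear in the usage listing."""
    cmd = [moses_bin, "-config", config, "-input-file", input_file]
    if threads > 1:
        # The listing says decoding defaults to single-threaded.
        cmd += ["-threads", str(threads)]
    if nbest_file is not None:
        # -n-best-list takes a file and a size; "-" as the file means STDOUT.
        cmd += ["-n-best-list", nbest_file, str(nbest_size)]
    if cube_pruning:
        # -search-algorithm 1 selects cube pruning; the pop limit defaults to 1000.
        cmd += ["-search-algorithm", "1",
                "-cube-pruning-pop-limit", str(pop_limit)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_moses_command("moses.ini", "input.txt", threads=4,
                              nbest_file="nbest.txt", nbest_size=50,
                              cube_pruning=True)
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list rather than a single string avoids shell-quoting problems if it is later handed to something like subprocess.run.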