public final class RDFUpdate extends MLUpdate<String>

Fields inherited from class MLUpdate: MODEL_FILE_NAME

| Constructor and Description |
|---|
| RDFUpdate(com.typesafe.config.Config config) |

| Modifier and Type | Method and Description |
|---|---|
| org.dmg.pmml.PMML | buildModel(org.apache.spark.api.java.JavaSparkContext sparkContext, org.apache.spark.api.java.JavaRDD<String> trainData, List<?> hyperParameters, org.apache.hadoop.fs.Path candidatePath) |
| double | evaluate(org.apache.spark.api.java.JavaSparkContext sparkContext, org.dmg.pmml.PMML model, org.apache.hadoop.fs.Path modelParentPath, org.apache.spark.api.java.JavaRDD<String> testData, org.apache.spark.api.java.JavaRDD<String> trainData) |
| List<HyperParamValues<?>> | getHyperParameterValues() |

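
To illustrate how getHyperParameterValues() and buildModel(...) relate, here is a minimal sketch: the first supplies candidate values per hyperparameter, and the second receives one ordered list of chosen values. It assumes exhaustive enumeration of all combinations, which may not match the actual selection strategy. The class `HyperParamCombos`, its `combinations` method, and the sample values (tree depths and impurity names) are hypothetical and not part of this API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of this library: enumerates every
// combination of candidate hyperparameter values (a Cartesian product).
final class HyperParamCombos {

  private HyperParamCombos() {}

  /**
   * @param valuesPerParam one list of candidate values per hyperparameter, in a fixed order
   * @return every combination, each an ordered list holding one value per hyperparameter
   */
  static List<List<Object>> combinations(List<List<?>> valuesPerParam) {
    List<List<Object>> combos = new ArrayList<>();
    combos.add(new ArrayList<>()); // start from the single empty combination
    for (List<?> candidates : valuesPerParam) {
      List<List<Object>> extended = new ArrayList<>();
      for (List<Object> prefix : combos) {
        for (Object value : candidates) {
          // Extend each existing partial combination with each candidate value
          List<Object> combo = new ArrayList<>(prefix);
          combo.add(value);
          extended.add(combo);
        }
      }
      combos = extended;
    }
    return combos;
  }

  public static void main(String[] args) {
    // Assumed example ranges: two tree depths and two impurity measures
    List<List<?>> ranges = new ArrayList<>();
    ranges.add(Arrays.asList(8, 16));
    ranges.add(Arrays.asList("entropy", "gini"));
    for (List<Object> combo : combinations(ranges)) {
      System.out.println(combo);
    }
  }
}
```

Each inner list keeps the same ordering as the input ranges, which is why the hyperParameters argument of buildModel is described as an ordered list: position, not name, identifies the hyperparameter. With no ranges at all, the sketch yields a single empty combination.
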
Methods inherited from class MLUpdate: canPublishAdditionalModelData, getTestFraction, publishAdditionalModelData, runUpdate, splitNewDataToTrainTest

public List<HyperParamValues<?>> getHyperParameterValues()

Overrides: getHyperParameterValues in class MLUpdate<String>

Returns: a list with one HyperParamValues per hyperparameter. Different combinations of the values derived from the list will be passed back into MLUpdate.buildModel(JavaSparkContext,JavaRDD,List,Path).

public org.dmg.pmml.PMML buildModel(org.apache.spark.api.java.JavaSparkContext sparkContext,
org.apache.spark.api.java.JavaRDD<String> trainData,
List<?> hyperParameters,
org.apache.hadoop.fs.Path candidatePath)

Specified by: buildModel in class MLUpdate<String>

Parameters:
- sparkContext - active Spark context
- trainData - training data on which to build a model
- hyperParameters - ordered list of hyperparameter values to use in building the model
- candidatePath - directory where additional model files can be written

Returns: PMML representation of a model trained on the given data

public double evaluate(org.apache.spark.api.java.JavaSparkContext sparkContext,
org.dmg.pmml.PMML model,
org.apache.hadoop.fs.Path modelParentPath,
org.apache.spark.api.java.JavaRDD<String> testData,
org.apache.spark.api.java.JavaRDD<String> trainData)

Specified by: evaluate in class MLUpdate<String>

Parameters:
- sparkContext - active Spark context
- model - model to evaluate
- modelParentPath - directory containing model files, if applicable
- testData - data on which to test the model performance
- trainData - data on which the model was trained, which can also be useful in evaluating unsupervised learning problems

Copyright © 2014–2018. All rights reserved.