Commit be4097e
Improve the docs
1 parent d5128e5

14 files changed: +90 -68 lines changed

docs/basic-introduction.md (+11 -10)

@@ -67,7 +67,7 @@ We can verify that the learner has been trained by calling the `trained()` metho
 var_dump($estimator->trained());
 ```
 
-```sh
+```
 bool(true)
 ```
 
@@ -96,16 +96,17 @@ $dataset = new Unlabeled($samples);
 
 $predictions = $estimator->predict($dataset);
 
-var_dump($predictions);
+print_r($predictions);
 ```
 
-```sh
-array(4) {
-  [0] => 'married'
-  [1] => 'divorced'
-  [2] => 'divorced'
-  [4] => 'married'
-}
+```php
+Array
+(
+    [0] => married
+    [1] => divorced
+    [2] => divorced
+    [3] => married
+)
 ```
 
 The output of the estimator is the predicted class labels of the unknown samples. We could either trust these predictions as-is or we could proceed to further evaluate the model. In the next section, we'll learn how to test its accuracy using a process called cross validation.
@@ -152,4 +153,4 @@ The return value is the accuracy score which can be interpreted as the degree to
 More info can be found in the [Cross Validation](cross-validation.md) section of the docs.
 
 ## Next Steps
-Congratulations! You've completed the basic introduction to machine learning in PHP with Rubix ML. For a more in-depth tutorial using the K Nearest Neighbors classifier and a real dataset, check out the [Divorce Predictor](https://github.com/RubixML/Divorce) tutorial and example project. Have fun!
+Congratulations! You've completed the basic introduction to machine learning in PHP with Rubix ML. For a more in-depth tutorial using the K Nearest Neighbors classifier and a real dataset, check out the [Divorce Predictor](https://github.com/RubixML/Divorce) tutorial and example project. Have fun!
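A quick sanity check on a predictions array like the one in the diff above is to tally the predicted labels and eyeball the class balance. This is a plain-PHP sketch; the `$predictions` array here is a hard-coded stand-in for the estimator's actual output:

```php
<?php

// Stand-in for the output of $estimator->predict($dataset) in the example above.
$predictions = ['married', 'divorced', 'divorced', 'married'];

// Tally how many samples were assigned to each class label.
$counts = array_count_values($predictions);

print_r($counts); // [married] => 2, [divorced] => 2
```

`array_count_values()` preserves first-seen order of the labels, which is handy when comparing the tally against the known class distribution of the training set.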

docs/cross-validation.md (+2 -2)

@@ -39,7 +39,7 @@ $score = $metric->score($predictions, $testing->labels());
 echo $score;
 ```
 
-```sh
+```
 0.85
 ```
 
@@ -167,7 +167,7 @@ $score = $validator->test($estimator, $dataset, new FBeta());
 echo $score;
 ```
 
-```sh
+```
 0.9175
 ```
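Under the hood, an accuracy score like the `0.85` above is just the proportion of predictions that match the ground-truth labels. A minimal hand-rolled sketch in plain PHP (illustrative only, not the library's Accuracy metric class):

```php
<?php

// Hypothetical predictions and their ground-truth labels.
$predictions = ['cat', 'dog', 'dog', 'frog'];
$labels      = ['cat', 'dog', 'frog', 'frog'];

// Count the positions where the prediction equals the label.
$correct = 0;

foreach ($predictions as $i => $prediction) {
    if ($prediction === $labels[$i]) {
        ++$correct;
    }
}

$score = $correct / count($predictions);

echo $score; // 0.75
```

Three of the four predictions match, so the score is 0.75. The library's metric objects follow the same idea but validate their inputs and support other scoring functions such as FBeta.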

docs/extracting-data.md (+1 -1)

@@ -29,7 +29,7 @@ We can check the number of records that were imported by calling the `numSamples
 echo $dataset->numSamples();
 ```
 
-```sh
+```
 5000
 ```
 

docs/hyper-parameter-tuning.md (+6 -2)

@@ -23,6 +23,10 @@ $score = $metric->score($predictions, $testing->labels());
 echo $score;
 ```
 
+```
+-4.75
+```
+
 ## Hyper-parameter Optimization
 In distinction to manual tuning, Hyper-parameter optimization is an AutoML technique that employs search and meta-learning strategies to explore various algorithm configurations. In Rubix ML, hyper-parameter optimizers are implemented as meta-estimators that wrap a base learner whose hyper-parameters we wish to optimize.
 
@@ -58,7 +62,7 @@ We can also dump the selected hyper-parameters by calling the `params()` method
 print_r($estimator->base()->params());
 ```
 
-```sh
+```php
 Array
 (
     [k] => 3
@@ -85,4 +89,4 @@ use Rubix\ML\Helpers\Params;
 $params = [
     Params::ints(1, 10, 4), [true, false], // ...
 ];
-```
+```

docs/inference.md (+6 -6)

@@ -23,7 +23,7 @@ $predictions = $estimator->predict($dataset);
 print_r($predictions);
 ```
 
-```sh
+```php
 Array
 (
     [0] => cat
@@ -41,7 +41,7 @@ $probabilities = $estimator->proba($dataset);
 print_r($probabilities);
 ```
 
-```sh
+```php
 Array
 (
     [0] => Array
@@ -52,9 +52,9 @@ Array
         )
     [1] => Array
         (
-            [cat] => 3.0
-            [dog] => 6.0
-            [frog] => 1.0
+            [cat] => 0.3
+            [dog] => 0.6
+            [frog] => 0.1
         )
     [2] => Array
         (
@@ -74,7 +74,7 @@ $scores = $estimator->score($dataset);
 print_r($scores);
 ```
 
-```sh
+```php
 Array
 (
     [0] => 0.35033

docs/learner.md (+1 -1)

@@ -24,6 +24,6 @@ public trained() : bool
 var_dump($estimator->trained());
 ```
 
-```sh
+```
 bool(true)
 ```

docs/model-persistence.md (+5 -7)

@@ -2,7 +2,7 @@
 Model persistence is the ability to save and subsequently load a learner's state in another process. Trained estimators can be used for real-time inference by loading the model onto a server or they can be saved to make predictions in batches offline at a later time. Estimators that implement the [Persistable](persistable.md) interface are able to have their internal state captured between processes. In addition, the library provides the [Persistent Model](persistent-model.md) meta-estimator that acts as a wrapper for persistable estimators.
 
 ## Serialization
-Serialization occurs in between saving and loading a model and can be thought of as packaging the model's parameters. The data can be in a lightweight format such as with PHP's [Native](serializers/native.md) serializer or in a more robust format such as with the library's own [RBX](serializers/rbx.md) serializer. In this example, we'll demonstrate how to encode a Persistable learner using the compressed RBX format, save the encoding with a [Persister](persisters/api.md), and then how to deserialize the encoding.
+Serialization occurs in between saving and loading a model and can be thought of as packaging the model's parameters. The data can be in a lightweight format such as with PHP's [Native](serializers/native.md) serializer or in a robust format such as [RBX](serializers/rbx.md). In this example, we'll demonstrate how to encode a Persistable learner using the compressed RBX format, save the encoding with a [Persister](persisters/api.md), and then how to deserialize the encoding.
 
 ```php
 use Rubix\ML\Classifiers\RandomForest;
@@ -15,13 +15,11 @@ $serializer = new RBX();
 
 $encoding = $serializer->serialize($estimator);
 
-
-
 $estimator = $serializer->deserialize($encoding);
 ```
 
 !!! note
-    Due to a limitation in PHP, anonymous classes and functions (*closures*) are not able to be deserialized. Avoid adding anonymous classes or functions to an object that you intend to persist.
+    Due to a limitation in PHP, anonymous classes and functions (*closures*) are not able to be deserialized. Therefore, avoid anonymous classes or functions if you intend to persist the model.
 
 ## Persistent Model Meta-estimator
 The persistence subsystem can be interfaced at a low level with Serializer and Persister objects or it can be interacted with at a higher level using the [Persistent Model](persistent-model.md) meta-estimator. It is a decorator that provides `save()` and `load()` methods giving the estimator the ability to save and load itself.
@@ -38,7 +36,7 @@ $estimator->save();
 ```
 
 ## Persisting Transformers
-In addition to Learners, the persistence subsystem can be used to individually save and load any Stateful transformer that implements the [Persistable](persistable.md) interface . In the example below we'll fit a transformer to a dataset and then save it to the [Filesystem](persisters/filesystem.md).
+In addition to Learners, the persistence subsystem can be used to individually save and load any Stateful transformer that implements the [Persistable](persistable.md) interface. In the example below we'll fit a transformer to a dataset and then save it to the [Filesystem](persisters/filesystem.md).
 
 ```php
 use Rubix\ML\Transformers\OneHotEncoder;
@@ -47,10 +45,10 @@ use Rubix\ML\Persisters\Filesystem;
 
 $transformer = new OneHotEncoder();
 
-// Fit transformer
-
 $serializer = new RBX();
 
+$transformer->fit($dataset);
+
 $serializer->serialize($transformer)->saveTo(new Filesystem('example.rbx'));
 
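The serialize/deserialize round trip in the diff above mirrors what PHP's built-in serializer does, which is what the library's Native serializer is based on. This plain-PHP sketch (using a hypothetical `Model` class, not a Rubix ML estimator) shows the idea, and also why the note about closures applies: an anonymous class or closure has no stable named class for `unserialize()` to restore.

```php
<?php

// A named class can round-trip through PHP's native serializer.
class Model
{
    public array $params = ['k' => 3, 'weighted' => true];
}

$model = new Model();

// Package the object's state as a string...
$encoding = serialize($model);

// ...and rebuild an equivalent object from it later, possibly in another process.
$restored = unserialize($encoding);

var_dump($restored->params['k']); // int(3)
```

Formats like RBX add compression and integrity checks on top of this basic packaging step.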

docs/online.md (+1 -1)

@@ -10,7 +10,7 @@ public partial(Dataset $dataset) : void
 ```php
 $folds = $dataset->fold(3);
 
-$estimator->partial($folds[0]);
+$estimator->train($folds[0]);
 
 $estimator->partial($folds[1]);
 
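The fold-by-fold pattern in the corrected example (an initial `train()` followed by `partial()` updates) works because an Online learner carries enough state between batches to fold new data into what it has already learned. A running mean is about the simplest illustration of that principle, sketched here in plain PHP rather than the library's API:

```php
<?php

// Three 'folds' of a toy numeric dataset.
$folds = [[1, 2, 3], [4, 5], [6]];

// State carried between partial updates: a running sum and count.
$sum = 0;
$n = 0;

foreach ($folds as $fold) {
    // Each batch updates the state rather than restarting from scratch.
    $sum += array_sum($fold);
    $n += count($fold);
}

$mean = $sum / $n;

echo $mean; // 3.5
```

Processing the folds one at a time yields exactly the mean of the full dataset, which is the property Online learners aim for when training data is too large to fit in memory at once.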

docs/persistable.md (+14 -1)

@@ -1,2 +1,15 @@
 # Persistable
-An estimator that implements the Persistable interface can be saved and loaded by a [Persister](persisters/api.md) object or using the [Persistent Model](persistent-model.md) meta-estimator. The interface provides no additional methods otherwise.
+An estimator that implements the Persistable interface can be serialized by a [Serializer](serializers/api.md) or saved and loaded using the [Persistent Model](persistent-model.md) meta-estimator.
+
+To return the current class revision hash:
+```php
+public revision() : string
+```
+
+```php
+echo $persistable->revision();
+```
+
+```
+e7eeec9a
+```

docs/probabilistic.md (+20 -16)

@@ -10,22 +10,26 @@ public proba(Dataset $dataset) : array
 ```php
 $probabilities = $estimator->proba($dataset);
 
-var_dump($probabilities);
+print_r($probabilities);
 ```
 
-```sh
-array(2) {
-  [0] => array(2) {
-    ['monster'] => 0.975,
-    ['not monster'] => 0.025,
-  }
-  [1] => array(2) {
-    ['monster'] => 0.2,
-    ['not monster'] => 0.8,
-  }
-  [2] => array(2) {
-    ['monster'] => 0.6,
-    ['not monster'] => 0.4,
-  }
-}
+```php
+Array
+(
+    [0] => Array
+        (
+            [monster] => 0.6
+            [not monster] => 0.4
+        )
+    [1] => Array
+        (
+            [monster] => 0.5
+            [not monster] => 0.5
+        )
+    [2] => Array
+        (
+            [monster] => 0.2
+            [not monster] => 0.8
+        )
+)
 ```
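A predicted class label is simply the class with the highest probability in each row of the `proba()` output. This plain-PHP sketch recovers predictions from probability distributions shaped like the ones above (hard-coded here as a stand-in for the estimator's output):

```php
<?php

// Stand-in for the output of $estimator->proba($dataset) above.
$probabilities = [
    ['monster' => 0.6, 'not monster' => 0.4],
    ['monster' => 0.5, 'not monster' => 0.5],
    ['monster' => 0.2, 'not monster' => 0.8],
];

$predictions = [];

foreach ($probabilities as $dist) {
    // array_keys() with a search value returns every class tied for the
    // maximum; take the first as the prediction.
    $predictions[] = array_keys($dist, max($dist))[0];
}

print_r($predictions); // [0] => monster, [1] => monster, [2] => not monster
```

Note that ties (like sample 1's 0.5/0.5 split) resolve to whichever class comes first, so inspecting the full distribution is more informative than the bare prediction when confidence matters.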

docs/ranks-features.md (+9 -8)

@@ -12,14 +12,15 @@ $estimator->train($dataset);
 
 $importances = $estimator->featureImportances();
 
-var_dump($importances);
+print_r($importances);
 ```
 
-```sh
-array(4) {
-  [0]=> float(0.047576266783176)
-  [1]=> float(0.3794817175945)
-  [2]=> float(0.53170249909942)
-  [3]=> float(0.041239516522901)
-}
+```php
+Array
+(
+    [0] => 0.04757
+    [1] => 0.37948
+    [2] => 0.53170
+    [3] => 0.04123
+)
 ```

docs/scoring.md (+8 -7)

@@ -10,13 +10,14 @@ public score(Dataset $dataset) : array
 ```php
 $scores = $estimator->score($dataset);
 
-var_dump($scores);
+print_r($scores);
 ```
 
-```sh
-array(3) {
-  [0]=> float(0.35033859096744)
-  [1]=> float(0.40992076925443)
-  [2]=> float(1.68163357834096)
-}
+```php
+Array
+(
+    [0] => 0.35033
+    [1] => 0.40992
+    [2] => 1.68153
+)
 ```
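Anomaly scores like those above are typically turned into decisions by thresholding: samples scoring above some cutoff are flagged. A plain-PHP sketch with an arbitrary cutoff (the `1.0` here is purely illustrative, not a library default):

```php
<?php

// Stand-in for the output of $estimator->score($dataset) above.
$scores = [0.35033, 0.40992, 1.68153];

$threshold = 1.0; // illustrative cutoff, tune to your data

// Flag the sample offsets whose score exceeds the threshold.
$anomalies = array_keys(array_filter($scores, fn ($score) => $score > $threshold));

print_r($anomalies); // [0] => 2
```

`array_filter()` preserves the original keys, so `array_keys()` yields the offsets of the anomalous samples in the dataset.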

docs/training.md (+5 -5)

@@ -46,7 +46,7 @@ $estimator->setLogger(new Screen());
 $estimator->train($dataset);
 ```
 
-```sh
+```
 [2020-09-04 08:39:04] INFO: Logistic Regression (batch_size: 128, optimizer: Adam (rate: 0.01, momentum_decay: 0.1, norm_decay: 0.001), alpha: 0.0001, epochs: 1000, min_change: 0.0001, window: 5, cost_fn: Cross Entropy) initialized
 [2020-09-04 08:39:04] INFO: Epoch 1 - Cross Entropy: 0.16895133388673
 [2020-09-04 08:39:04] INFO: Epoch 2 - Cross Entropy: 0.16559247705179
@@ -101,9 +101,9 @@ print_r($importances);
 ```sh
 Array
 (
-    [0] => 0.047576266783176
-    [1] => 0.3794817175945
-    [2] => 0.53170249909942
-    [3] => 0.041239516522901
+    [0] => 0.04757
+    [1] => 0.37948
+    [2] => 0.53170
+    [3] => 0.04123
 )
 ```
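Feature importances like those above are proportions that sum to (approximately) 1, so ranking them identifies the most informative feature columns. A plain-PHP sketch, using the rounded values from the diff as a stand-in:

```php
<?php

// Stand-in for the output of $estimator->featureImportances() above.
$importances = [0.04757, 0.37948, 0.53170, 0.04123];

// Sort descending while keeping the original column offsets as keys.
arsort($importances);

// The first key after sorting is the most important feature column.
$topFeature = array_key_first($importances);

echo $topFeature; // 2
```

Here column 2 dominates, which suggests it carries most of the signal the model relies on.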

docs/verbose.md (+1 -1)

@@ -24,7 +24,7 @@ $estimator->setLogger(new Screen('example'));
 $estimator->train($dataset);
 ```
 
-```sh
+```
 [2020-08-05 04:26:11] INFO: Learner init Adaline {batch_size: 128, optimizer: Adam {rate: 0.01, momentum_decay: 0.1, norm_decay: 0.001}, alpha: 0.0001, epochs: 100, min_change: 0.001, window: 5, cost_fn: Huber Loss {alpha: 1}}
 [2020-08-05 04:26:11] INFO: Training started
 [2020-08-05 04:26:11] example.INFO: Epoch 1 - Huber Loss {alpha: 1}: 0.36839299586132
