</p> <h2 class="code-line" data-line-end="1" data-line-start="0"><a id="Why_Cross_Validation_0"/>Why Cross Validation</h2> <p class="has-line-data" data-line-end="6" data-line-start="1">Cross validation is a widely used model validation approach. After a machine learning model is built, it is essential to measure its performance before deployment. Cross validation lets us assess model performance in terms of both bias and variance, which makes it a sound and reliable approach.<br />In many cases, to get a quick sense of model performance, you simply split the dataset into train and test sets once and compute one or more model quality metrics on the test set. But a measurement obtained that way can be biased. For instance, a single split could leave some patterns that the model needs to learn only in the test set, unseen in the training set. The model will then perform poorly on the test set, which leads to underestimating its performance. On the other hand, the model is sometimes trained on data that includes sparse features. If, by coincidence, such a feature reappears in the test set, the result could look very good, even though the model is actually overfitted and may not perform as well on new data. This is related to the concept of model variance, which characterizes how the model's predictions vary across different pairs of train and test sets. With only one split, we cannot get a full picture of how the model behaves across different train/test splits.<br />The widely accepted way to evaluate model quality while avoiding bias and assessing variance is K-fold cross validation. The dataset is divided into K segments of almost equal size; one segment is held out as the test set and the model is trained on the other K-1 segments. This train-and-validate process is repeated K times, once per segment, which results in K sets of models and metrics. 
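</p> <p class="has-line-data">As a rough illustration of the procedure described above, the following minimal sketch splits row indices into K folds in plain Python. It is not how OML4Py implements its KFold function; the helper name k_fold_indices, the seed argument, and the toy size of 13 rows are assumptions made purely for illustration.</p> <pre> <code>
```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Split row indices 0..n_rows-1 into k (train, test) index pairs.

    Hypothetical helper for illustration only; not part of OML4Py.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Slicing with a stride of k yields k segments of almost equal size.
    segments = [indices[i::k] for i in range(k)]
    pairs = []
    for i, test in enumerate(segments):
        # Train on the other k-1 segments, test on segment i.
        train = [row for j, seg in enumerate(segments) if j != i for row in seg]
        pairs.append((train, test))
    return pairs


pairs = k_fold_indices(n_rows=13, k=5)
for train, test in pairs:
    # Every row lands in exactly one test segment across the k pairs.
    print(len(train), len(test))
```
</code></pre> <p class="has-line-data">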
By doing this, we can obtain both the mean and the variance of the metrics, and thus a clear picture of how the model behaves.<br />The newly released Oracle Machine Learning for Python (<a href="https://blogs.oracle.com/machinelearning/introducing-oracle-machine-learning-for-python-v2">OML4Py</a>) API brings benefits similar to those of <a href="https://www.oracle.com/database/technologies/datawarehouse-bigdata/oml4r.html">OML4R</a>: a transparency layer, in-database algorithms, and embedded Python execution. New in OML4Py is automated machine learning.<br />Cross validation can be handled conveniently and efficiently using the OML4Py transparency layer. Moreover, by leveraging embedded Python execution, we can parallelize the K train/test processes. We cover this topic in a two-part blog series. Part I focuses on cross validation for models trained with OML in-database algorithms. Part II focuses on cross validation for open source models. In this blog, we show our approach to cross validation in OML4Py.</p> <h2 class="code-line" data-line-end="8" data-line-start="7"><a id="Data_Overview_7"/>Data Overview</h2> <p class="has-line-data" data-line-end="9" data-line-start="8">For our demonstration, we use the customer insurance lifetime value dataset, an Oracle-produced dataset. The use case involves an insurance company targeting customers likely to buy insurance based on each customer's lifetime value, demographic, and financial features. 
The following is a glimpse into this dataset with a subset of the columns.</p> <p class="has-line-data" data-line-end="12" data-line-start="10"><img alt="" src="https://i2.wp.com/cdn.app.compendium.com/uploads/user/e7c690e8-6ff9-102a-ac6d-e4aebca50425/00b40098-051d-415c-be23-4ceb933d5311/Image/10d53ea6a2828a4249e5f031000c193a/woe_overview1.png?w=1440&ssl=1" style="width: 950px; height: 236px;" data-recalc-dims="1"/><img alt="" src="https://i1.wp.com/cdn.app.compendium.com/uploads/user/e7c690e8-6ff9-102a-ac6d-e4aebca50425/00b40098-051d-415c-be23-4ceb933d5311/Image/a1dbc4bc2a540f7c738a13bba09cd715/woe_overview2.png?w=1440&ssl=1" style="width: 833px; height: 241px;" data-recalc-dims="1"/></p> <p 
class="has-line-data" data-line-end="15" data-line-start="13">Based on the column names, we can see that the dataset contains user demographic features such as state, region, gender, and marital status, and financial features such as income and credit card limits.<br />The main business problem here is to find out which customers are likely to buy an insurance policy. In the dataset, the ground truth, that is, whether a customer has purchased the insurance policy, is given by the column BUY_INSURANCE. This is a typical binary classification problem, and we can use all the feature columns provided in this dataset to build a model.</p> <h2 class="code-line" data-line-end="16" data-line-start="15"><a id="Metrics_for_Model_Evaluation_15"/>Metrics for Model Evaluation</h2> <p class="has-line-data" data-line-end="17" data-line-start="16">To validate the model, we need to choose a metric. The choice depends on the type of machine learning task. For classification problems, we can use metrics such as accuracy, AUC (area under the curve), precision, recall, and F1 score. For regression problems, we can choose metrics such as mean squared error (MSE), mean absolute error (MAE), and R squared. Here we have a classification task, and we pick AUC as the metric because it does not depend on a specific threshold on the prediction score and is widely used in data science projects.</p> <h2 class="code-line" data-line-end="19" data-line-start="18"><a id="Cross_Validation_using_the_OML4Py_Transparency_Layer_18"/>Cross Validation using the OML4Py Transparency Layer</h2> <p class="has-line-data" data-line-end="21" data-line-start="19">The OML4Py transparency layer provides the KFold function to split the data into K folds to support cross validation. 
Although it looks similar to sklearn.model_selection.KFold, the OML4Py KFold function returns the splits as OML DataFrames, and no data is pulled from the database to create them.<br />Let us see an example of 5-fold cross validation with the code below.</p> <pre> <code class="has-line-data" data-line-end="26" data-line-start="23">fold = <span class="hljs-number">5</span>
pairs = CUST_SUBSET_DF.KFold(n_splits = fold)
</code></pre> <p class="has-line-data" data-line-end="29" data-line-start="27">After running the code, we have a tuple that contains 5 pairs of OML DataFrames. Each pair is a tuple (train set, test set), and each element is an OML DataFrame.<br />While we could view the data directly, this would involve pulling it to the client. Since we are already comfortable with the dataset as a whole, a better option is to print the length of each pair along with the type and dimensions of its elements:</p> <pre> <code class="has-line-data" data-line-end="35" data-line-start="31"><span class="hljs-keyword">for</span> pair <span class="hljs-keyword">in</span> pairs:
    print(len(pair), type(pair[<span class="hljs-number">0</span>]), pair[<span class="hljs-number">0</span>].shape, type(pair[<span class="hljs-number">1</span>]), pair[<span class="hljs-number">1</span>].shape)
</code></pre> <p class="has-line-data" data-line-end="36" data-line-start="35">The output is</p> <pre> <code class="has-line-data" data-line-end="43" data-line-start="37">2 <class 'oml.core.frame.DataFrame'> (11105, 14) <class 'oml.core.frame.DataFrame'> (2775, 14)
2 <class 'oml.core.frame.DataFrame'> (11100, 14) <class 'oml.core.frame.DataFrame'> (2780, 14)
2 <class 'oml.core.frame.DataFrame'> (11148, 14) <class 'oml.core.frame.DataFrame'> (2732, 14)
2 <class 'oml.core.frame.DataFrame'> (11117, 14) <class 'oml.core.frame.DataFrame'> (2763, 14)
2 <class 'oml.core.frame.DataFrame'> (11050, 14) <class 'oml.core.frame.DataFrame'> (2830, 14)
</code></pre> <p class="has-line-data" data-line-end="47" data-line-start="44">We can see that the entire dataset (13880 rows) is 
divided into 5 segments of the sizes (2775, 2780, 2732, 2763, 2830). Each pair contains one particular segment as the test set and the rest as the train set. This provides a convenient way to iterate over the pairs for model training, scoring, and validation.<br />Consider the following use case. We want to train a generalized linear model and do cross validation to get the best estimate of model accuracy. In the validation, we use the metric AUC (Area Under the Curve) to check model performance.<br />For scalability, we implement the AUC computation as a SQL query run through oml.cursor. The query uses Oracle SQL window functions, which makes the AUC computation efficient and fast.</p> <pre> <code class="has-line-data" data-line-end="84" data-line-start="49"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">auc_score</span><span class="hljs-params">(table_name, prob, target)</span>:</span>
    <span class="hljs-keyword">import</span> oml
    cr = oml.cursor()
    query_template = <span class="hljs-string">"""
    WITH pos_prob_and_counts AS (
        SELECT <PROB1> pos_prob,
               DECODE(<TARGET>, 1, 1, 0) pos_cnt
        FROM <TABLE>
    ),
    tpf_fpf AS (
        SELECT pos_cnt,
               SUM(pos_cnt) OVER (ORDER BY pos_prob DESC) / SUM(pos_cnt) OVER () tpf,
               SUM(1 - pos_cnt) OVER (ORDER BY pos_prob DESC) / SUM(1 - pos_cnt) OVER () fpf
        FROM pos_prob_and_counts
    ),
    trapezoid_areas AS (
        SELECT 0.5 * (fpf - LAG(fpf, 1, 0) OVER (ORDER BY fpf, tpf))
                   * (tpf + LAG(tpf, 1, 0) OVER (ORDER BY fpf, tpf)) area
        FROM tpf_fpf
        WHERE pos_cnt = 1 OR (tpf = 1 AND fpf = 1)
    )
    SELECT SUM(area) auc
    FROM trapezoid_areas"""</span>
    <span class="hljs-comment"># Fill in the probability column, target column, and table placeholders</span>
    query = query_template.replace(<span class="hljs-string">'<PROB1>'</span>, prob)
    query = query.replace(<span class="hljs-string">'<TARGET>'</span>, target)
    query = query.replace(<span class="hljs-string">'<TABLE>'</span>, table_name)
    _ = cr.execute(query)
    auc = cr.fetchall()
    cr.close()
    <span class="hljs-keyword">return</span> auc[<span class="hljs-number">0</span>][<span class="hljs-number">0</span>]
</code></pre> <p class="has-line-data" data-line-end="86" data-line-start="85">We put both training and testing into the following function.</p> <pre> <code class="has-line-data" data-line-end="110" data-line-start="88"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">train_and_validate</span><span class="hljs-params">(TRAIN_DF, TEST_DF, target, case_id, idx_fold, prefix, stats_map)</span>:</span>
    train_x = TRAIN_DF.drop([target])
    train_y = TRAIN_DF[target]
    test_x = TEST_DF
    test_y = TEST_DF[target]
    model_name = prefix + <span class="hljs-string">'_'</span> + str(idx_fold)
    <span class="hljs-comment"># Drop any existing model with the same name before training</span>
    <span class="hljs-keyword">try</span>:
        oml.drop(model= model_name)
    <span class="hljs-keyword">except</span> Exception:
        print(model_name + <span class="hljs-string">" not found"</span>)
    print(<span class="hljs-string">'Training model '</span> + model_name)
    setting = {<span class="hljs-string">'GLMS_RIDGE_REGRESSION'</span>: <span class="hljs-string">'GLMS_RIDGE_REG_ENABLE'</span>}
    glm_mod = oml.glm(<span class="hljs-string">"classification"</span>, **setting)
    glm_mod.fit(train_x, train_y, case_id = case_id, model_name = model_name)
    <span class="hljs-comment"># Score the test set and join predictions with class probabilities</span>
    GLM_RES_DF = glm_mod.predict(test_x, supplemental_cols = test_x[[case_id, target]])
    GLM_RES_PROB = glm_mod.predict_proba(test_x, supplemental_cols = test_x[case_id])
    GLM_RES_DF = GLM_RES_DF.merge(GLM_RES_PROB, how = <span class="hljs-string">"inner"</span>, on = case_id, suffixes = [<span class="hljs-string">""</span>, <span class="hljs-string">""</span>])
    GLM_RES_DF = GLM_RES_DF.materialize()
    stats_map[idx_fold] = auc_score(GLM_RES_DF, <span class="hljs-string">'PROBABILITY_OF_1'</span>, target)
    <span class="hljs-keyword">return</span> glm_mod
</code></pre> <p class="has-line-data" data-line-end="111" data-line-start="110">We can loop through all K pairs of train/test sets in OML DataFrame format and use a Python dictionary to record all the AUCs we 
obtained as follows.</p> <pre> <code class="has-line-data" data-line-end="126" data-line-start="112">fold = <span class="hljs-number">5</span>
pairs = CUST_SUBSET_DF.KFold(n_splits = fold)
stats_map = {}
models = []
<span class="hljs-keyword">for</span> i, pair <span class="hljs-keyword">in</span> enumerate(pairs):
    TRAIN_DF, TEST_DF = pair
    print(<span class="hljs-string">'Running fold %s '</span> % i)
    print(<span class="hljs-string">'Training data:'</span>)
    print(TRAIN_DF.shape)
    print(<span class="hljs-string">'Test data:'</span>)
    print(TEST_DF.shape)
    model = train_and_validate(TRAIN_DF, TEST_DF, <span class="hljs-string">'BUY_INSURANCE'</span>, <span class="hljs-string">'CUSTOMER_ID'</span>, i, <span class="hljs-string">'GLM_LTV_MDL'</span>, stats_map)
    models.append(model)
</code></pre> <p class="has-line-data" data-line-end="127" data-line-start="126">After the loop, we want to check the K models built in this process. In the code, we saved the models returned by the function, so we can check them right away from the list of models. 
Another way to retrieve a model is to pass its name to the OML model constructor.</p> <pre> <code class="has-line-data" data-line-end="133" data-line-start="129"><span class="hljs-keyword">for</span> id <span class="hljs-keyword">in</span> range(fold):
    model = oml.glm(model_name = <span class="hljs-string">'LTV_MDL_'</span> + str(id))
    print(model)
</code></pre> <p class="has-line-data" data-line-end="135" data-line-start="134">We can take a look at the information of one of the models</p> <pre> <code class="has-line-data" data-line-end="159" data-line-start="137">Model Name: LTV_MDL_0
Model Owner: JIE
Algorithm Name: Generalized Linear Model
Mining Function: CLASSIFICATION
Target: BUY_INSURANCE
Settings:
   setting name                   setting value
<span class="hljs-number">0</span>  ALGO_NAME                      ALGO_GENERALIZED_LINEAR_MODEL
<span class="hljs-number">1</span>  CLAS_WEIGHTS_BALANCED          OFF
<span class="hljs-number">2</span>  GLMS_CONF_LEVEL                <span class="hljs-number">.95</span>
<span class="hljs-number">3</span>  GLMS_FTR_GENERATION            GLMS_FTR_GENERATION_DISABLE
<span class="hljs-number">4</span>  GLMS_FTR_SELECTION             GLMS_FTR_SELECTION_DISABLE
<span class="hljs-number">5</span>  ODMS_DETAILS                   ODMS_ENABLE
<span class="hljs-number">6</span>  ODMS_MISSING_VALUE_TREATMENT   ODMS_MISSING_VALUE_AUTO
<span class="hljs-number">7</span>  ODMS_SAMPLING                  ODMS_SAMPLING_DISABLE
<span class="hljs-number">8</span>  PREP_AUTO                      ON
</code></pre> <p class="has-line-data" data-line-end="161" data-line-start="160">The benefit of cross validation is that we can compute the mean and standard deviation of the AUC scores to get a clearer picture of how the model performs across all the data.</p> <pre> <code class="has-line-data" data-line-end="167" data-line-start="163"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

auc_scores = list(stats_map.values())
print(np.around(auc_scores, <span class="hljs-number">4</span>))
print(<span class="hljs-string">'Average AUC %s, STD %s'</span> % (np.round(np.mean(auc_scores),<span class="hljs-number">4</span>), np.round(np.std(auc_scores),<span class="hljs-number">5</span>)))
</code></pre> <p class="has-line-data" data-line-end="168" data-line-start="167">Let’s check the output:</p> <pre> <code class="has-line-data" data-line-end="172" data-line-start="169">[0.7677 0.7744 0.7788 0.764 0.7694]
Average AUC 0.7708, STD 0.00519
</code></pre> <p class="has-line-data" data-line-end="173" data-line-start="172">The result shows that the average AUC score is 0.7708. We also have a sense of the variance: the standard deviation is 0.00519.</p> <h2 class="code-line" data-line-end="175" data-line-start="174"><a id="Run_Cross_Validation_in_Parallel_174"/>Run Cross Validation in Parallel</h2> <p class="has-line-data" data-line-end="177" data-line-start="175">In the last example, we iterated through all K folds and repeated the same train and test processes K times. Depending on the data size, this can be time-consuming. Alternatively, we can run the K computations in parallel using Embedded Python Execution (EPE) in OML4Py. EPE allows the user to specify the desired number of Python engines that the database environment should start to run a user-defined Python function. EPE functions support loading database table data into the user-defined function either automatically or manually. In this example, we show how to submit multiple in-database OML model training and testing jobs in parallel through index_apply().<br />The first thing we need to do is upload the Python script to the OML4Py script repository. This submits the source code of the two functions we are using to Autonomous Database (ADB), so that the code can be used by Python engines that the database spawns and controls. 
We can use the oml.script.create function to achieve that.</p> <pre> <code class="has-line-data" data-line-end="239" data-line-start="180">auc_src = <span class="hljs-string">'''
def auc_score(table_name, prob, target):
    import oml
    cr = oml.cursor()
    query_template = """
    WITH pos_prob_and_counts AS (
        SELECT <PROB1> pos_prob,
               DECODE(<TARGET>, 1, 1, 0) pos_cnt
        FROM <TABLE>
    ),
    tpf_fpf AS (
        SELECT pos_cnt,
               SUM(pos_cnt) OVER (ORDER BY pos_prob DESC) / SUM(pos_cnt) OVER () tpf,
               SUM(1 - pos_cnt) OVER (ORDER BY pos_prob DESC) / SUM(1 - pos_cnt) OVER () fpf
        FROM pos_prob_and_counts
    ),
    trapezoid_areas AS (
        SELECT 0.5 * (fpf - LAG(fpf, 1, 0) OVER (ORDER BY fpf, tpf))
                   * (tpf + LAG(tpf, 1, 0) OVER (ORDER BY fpf, tpf)) area
        FROM tpf_fpf
        WHERE pos_cnt = 1 OR (tpf = 1 AND fpf = 1)
    )
    SELECT SUM(area) auc
    FROM trapezoid_areas"""
    query = query_template.replace('<PROB1>', prob)
    query = query.replace('<TARGET>', target)
    query = query.replace('<TABLE>', table_name)
    _ = cr.execute(query)
    auc = cr.fetchall()
    cr.close()
    return auc[0][0]'''</span>

train_src = <span class="hljs-string">"""
def train_and_validate(TRAIN_DF, TEST_DF, target, case_id, idx_fold, prefix, stats_map):
    import oml
    auc_score = oml.script.load('auc_score')
    train_x = TRAIN_DF.drop([target])
    train_y = TRAIN_DF[target]
    test_x = TEST_DF
    test_y = TEST_DF[target]
    model_name = prefix + '_' + str(idx_fold)
    try:
        oml.drop(model= model_name)
    except Exception:
        print(model_name + " not found")
    print('Training model ' + model_name)
    setting = {'GLMS_RIDGE_REGRESSION': 'GLMS_RIDGE_REG_ENABLE'}
    glm_mod = oml.glm("classification", **setting)
    glm_mod.fit(train_x, train_y, case_id = case_id, model_name = model_name)
    GLM_RES_DF = glm_mod.predict(test_x, supplemental_cols = test_x[[case_id, target]])
    GLM_RES_PROB = glm_mod.predict_proba(test_x, supplemental_cols = test_x[case_id])
    GLM_RES_DF = GLM_RES_DF.merge(GLM_RES_PROB, how = "inner", on = case_id, suffixes = ["", ""])
    GLM_RES_DF = GLM_RES_DF.materialize()
    stats_map[idx_fold] = auc_score(GLM_RES_DF, 'PROBABILITY_OF_1', target)
    return glm_mod"""</span>

oml.script.create(<span class="hljs-string">'auc_score'</span>, auc_src, is_global = <span class="hljs-keyword">True</span>, overwrite = <span class="hljs-keyword">True</span>)
oml.script.create(<span class="hljs-string">'train_and_validate'</span>, train_src, is_global = <span class="hljs-keyword">True</span>, overwrite = <span class="hljs-keyword">True</span>)
</code></pre> <p class="has-line-data" data-line-end="241" data-line-start="240">To further improve runtime performance, we can materialize the K-fold train and test tables as follows.</p> <pre> <code class="has-line-data" data-line-end="260" data-line-start="243">fold = <span class="hljs-number">5</span>
pairs = CUST_SUBSET_DF.KFold(n_splits = fold)
<span class="hljs-keyword">for</span> i, pair <span class="hljs-keyword">in</span> enumerate(pairs):
    <span class="hljs-comment"># Table names are numbered from 1 up to the number of folds</span>
    train_tbl_name = <span class="hljs-string">'CUST_TRAIN_TBL_'</span>
    test_tbl_name = <span class="hljs-string">'CUST_TEST_TBL_'</span>
    train_tbl_name += str(i+<span class="hljs-number">1</span>)
    test_tbl_name += str(i+<span class="hljs-number">1</span>)
    <span class="hljs-keyword">try</span>:
        oml.drop(table = train_tbl_name)
        oml.drop(table = test_tbl_name)
    <span class="hljs-keyword">except</span> Exception:
        print(<span class="hljs-string">"No such table"</span>)
    _ = pair[<span class="hljs-number">0</span>].materialize(table = train_tbl_name)
    _ = pair[<span class="hljs-number">1</span>].materialize(table = test_tbl_name)
    print(train_tbl_name)
    print(test_tbl_name)
</code></pre> <p class="has-line-data" data-line-end="262" data-line-start="261">After that, we can write a function that calls the <em>train_and_validate</em> script uploaded above to do the cross validation. The input argument is the run id of each job. 
In this example, it corresponds to the i-th fold.</p> <pre> <code class="has-line-data" data-line-end="274" data-line-start="263"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">build_oml</span><span class="hljs-params">(idx)</span>:</span>
    <span class="hljs-keyword">import</span> oml
    train_and_validate = oml.script.load(<span class="hljs-string">'train_and_validate'</span>)
    stats_map = {}
    train_tbl_name = <span class="hljs-string">'CUST_TRAIN_TBL_'</span>
    test_tbl_name = <span class="hljs-string">'CUST_TEST_TBL_'</span>
    TRAIN_DF = oml.sync(table = train_tbl_name + str(idx))
    TEST_DF = oml.sync(table = test_tbl_name + str(idx))
    train_and_validate(TRAIN_DF, TEST_DF, <span class="hljs-string">'BUY_INSURANCE'</span>, <span class="hljs-string">'CUSTOMER_ID'</span>, idx, <span class="hljs-string">'GLM_LTV_MDL'</span>, stats_map)
    <span class="hljs-keyword">return</span> stats_map[idx]
</code></pre> <p class="has-line-data" data-line-end="275" data-line-start="274">Supplying this function as the input to oml.index_apply lets all K=5 folds run at the same time. Each job outputs the AUC score for its fold.</p> <pre> <code class="has-line-data" data-line-end="280" data-line-start="276">res = oml.index_apply(times=<span class="hljs-number">5</span>, func=build_oml, oml_connect=<span class="hljs-keyword">True</span>, parallel = <span class="hljs-number">5</span>)
res

[<span class="hljs-number">0.7127303485647599</span>, <span class="hljs-number">0.7054260012897073</span>, <span class="hljs-number">0.6969313963582076</span>, <span class="hljs-number">0.7049151140304264</span>, <span class="hljs-number">0.6859887524179962</span>]
</code></pre> <p class="has-line-data" data-line-end="283" data-line-start="280">One may wonder how much time this parallel execution saves. 
In general, it depends on many factors, such as the dataset size, the number of OCPUs, the service level (low, medium, high), and of course the number of cross validation folds K. In our example, the dataset is relatively small (13880 rows). We tested under the medium service level using 16 OCPUs. The entire parallel process takes around 28 seconds, while the loop approach takes around 36 seconds, which is not a major difference. However, as data volumes grow and the number of folds increases, the difference can become more significant. With a dataset just 20 times larger, the loop approach takes 68 seconds, while the parallel approach takes 37 seconds. If we go even further and test a K=20 cross validation, the loop approach takes 4 minutes 34 seconds and the parallel approach takes 1 minute 15 seconds. We are therefore better off with the parallel approach in this case. This is also an illustration of how embedded Python execution can be used in creative ways to reduce elapsed runtimes.<br />As noted above, multiple factors impact the actual runtime. To get a sense of how performance changes in different settings, see our two blog posts discussing <a href="https://blogs.oracle.com/machinelearning/machine-learning-performance-on-autonomous-database">model build performance</a> and <a href="https://blogs.oracle.com/machinelearning/machine-learning-scoring-performance-on-autonomous-database">scoring performance</a>.<br />If the user has purely open source training/testing jobs, index_apply will likely lead to performance benefits because each user-defined function invocation runs in one Python engine. We will show how to achieve that in cross validation Part II.</p> <h2 class="code-line" data-line-end="284" data-line-start="283"><a id="Conclusion_283"/>Conclusion</h2> <p class="has-line-data" data-line-end="285" data-line-start="284">In this blog, we demonstrated the convenience of running cross validation on in-database models with OML4Py. 
By using KFold function and AUC computation in transparency layer, the cross validation is made easy. We also show how to submit multiple jobs in parallel using index apply and python script repository. We can see the benefit of running in database algorithms in parallel for this dataset example and also discussed performance variation in different situations.</p> </p></div> <p><br /> <br /><a href="https://blogs.oracle.com/machinelearning/cross-validation-using-oml4py-part-i"> Source link </a></p> </div> </div><!-- .entry-content --> <div class="screen-reader-text" itemprop="datePublished" itemtype="https://schema.org/Date">2021-05-06</div> </article><!-- .entry --> </div><!-- #content-wrap --> </main><!-- #content --> <aside id="sidebar-primary" class="sidebar sidebar-primary hgrid-span-3 layout-narrow-right " role="complementary" itemscope="itemscope" itemtype="https://schema.org/WPSideBar"> <div class=" sidebar-wrap"> <section id="tag_cloud-3" class="widget widget_tag_cloud"><h3 class="widget-title"><span>Categories</span></h3><div class="tagcloud"><a href="https://machinelearningmastery.in/category/articles/" class="tag-cloud-link tag-link-404 tag-link-position-1" style="font-size: 11.529411764706pt;" aria-label="Articles (7 items)">Articles</a> <a href="https://machinelearningmastery.in/category/automation-anywhere/" class="tag-cloud-link tag-link-158 tag-link-position-2" style="font-size: 9.0588235294118pt;" aria-label="Automation Anywhere (2 items)">Automation Anywhere</a> <a 
href="https://machinelearningmastery.in/category/certification/" class="tag-cloud-link tag-link-12 tag-link-position-3" style="font-size: 10.352941176471pt;" aria-label="Certification (4 items)">Certification</a> <a href="https://machinelearningmastery.in/category/cloud/" class="tag-cloud-link tag-link-289 tag-link-position-4" style="font-size: 10.352941176471pt;" aria-label="Cloud (4 items)">Cloud</a> <a href="https://machinelearningmastery.in/category/code/" class="tag-cloud-link tag-link-511 tag-link-position-5" style="font-size: 8pt;" aria-label="Code (1 item)">Code</a> <a href="https://machinelearningmastery.in/category/database-2/" class="tag-cloud-link tag-link-593 tag-link-position-6" style="font-size: 8pt;" aria-label="Database (1 item)">Database</a> <a href="https://machinelearningmastery.in/category/data-science/" class="tag-cloud-link tag-link-9 tag-link-position-7" style="font-size: 12.117647058824pt;" aria-label="Data Science (9 items)">Data Science</a> <a href="https://machinelearningmastery.in/category/data-science-topics/" class="tag-cloud-link tag-link-530 tag-link-position-8" style="font-size: 9.0588235294118pt;" aria-label="data science topics (2 items)">data science topics</a> <a href="https://machinelearningmastery.in/category/data-science-update/" class="tag-cloud-link tag-link-13 tag-link-position-9" style="font-size: 22pt;" aria-label="Data Science Update (478 items)">Data Science Update</a> <a href="https://machinelearningmastery.in/category/deep-learning/" class="tag-cloud-link tag-link-290 tag-link-position-10" style="font-size: 11.235294117647pt;" aria-label="Deep Learning (6 items)">Deep Learning</a> <a href="https://machinelearningmastery.in/category/financial-assistance/" class="tag-cloud-link tag-link-8 tag-link-position-11" style="font-size: 8pt;" aria-label="Financial assistance (1 item)">Financial assistance</a> <a href="https://machinelearningmastery.in/category/google-cloud/" class="tag-cloud-link tag-link-583 
tag-link-position-12" style="font-size: 9.0588235294118pt;" aria-label="Google Cloud (2 items)">Google Cloud</a> <a href="https://machinelearningmastery.in/category/interview-tips/" class="tag-cloud-link tag-link-181 tag-link-position-13" style="font-size: 8pt;" aria-label="Interview tips (1 item)">Interview tips</a> <a href="https://machinelearningmastery.in/category/machine-learning/" class="tag-cloud-link tag-link-11 tag-link-position-14" style="font-size: 15.647058823529pt;" aria-label="Machine Learning (39 items)">Machine Learning</a> <a href="https://machinelearningmastery.in/category/open-data-source/" class="tag-cloud-link tag-link-207 tag-link-position-15" style="font-size: 9.7647058823529pt;" aria-label="Open Data Source (3 items)">Open Data Source</a> <a href="https://machinelearningmastery.in/category/power-bi/" class="tag-cloud-link tag-link-341 tag-link-position-16" style="font-size: 9.0588235294118pt;" aria-label="Power BI (2 items)">Power BI</a> <a href="https://machinelearningmastery.in/category/project-management/" class="tag-cloud-link tag-link-409 tag-link-position-17" style="font-size: 9.7647058823529pt;" aria-label="Project Management (3 items)">Project Management</a> <a href="https://machinelearningmastery.in/category/python/" class="tag-cloud-link tag-link-2 tag-link-position-18" style="font-size: 12.117647058824pt;" aria-label="Python (9 items)">Python</a> <a href="https://machinelearningmastery.in/category/quiz-of-the-day/" class="tag-cloud-link tag-link-429 tag-link-position-19" style="font-size: 8pt;" aria-label="Quiz of the Day (1 item)">Quiz of the Day</a> <a href="https://machinelearningmastery.in/category/robotic-process-automation/" class="tag-cloud-link tag-link-159 tag-link-position-20" style="font-size: 9.0588235294118pt;" aria-label="Robotic Process Automation (2 items)">Robotic Process Automation</a> <a href="https://machinelearningmastery.in/category/r-programming/" class="tag-cloud-link tag-link-157 tag-link-position-21" 
style="font-size: 9.0588235294118pt;" aria-label="R Programming (2 items)">R Programming</a> <a href="https://machinelearningmastery.in/category/sas/" class="tag-cloud-link tag-link-156 tag-link-position-22" style="font-size: 10.823529411765pt;" aria-label="SAS (5 items)">SAS</a> <a href="https://machinelearningmastery.in/category/statistics/" class="tag-cloud-link tag-link-337 tag-link-position-23" style="font-size: 10.352941176471pt;" aria-label="Statistics (4 items)">Statistics</a> <a href="https://machinelearningmastery.in/category/tableau/" class="tag-cloud-link tag-link-340 tag-link-position-24" style="font-size: 9.0588235294118pt;" aria-label="Tableau (2 items)">Tableau</a> <a href="https://machinelearningmastery.in/category/visualization/" class="tag-cloud-link tag-link-10 tag-link-position-25" style="font-size: 12.352941176471pt;" aria-label="visualization (10 items)">visualization</a></div> </section><section id="newsletterwidget-2" class="widget widget_newsletterwidget"><div class="tnp tnp-widget"><form method="post" action="https://machinelearningmastery.in/?na=s"> <input type="hidden" name="nr" value="widget"><input type="hidden" name="nlang" value=""><div class="tnp-field tnp-field-email"><label for="tnp-email">Email</label> <input class="tnp-email" type="email" name="ne" value="" required></div> <div class="tnp-field tnp-field-button"><input class="tnp-submit" type="submit" value="Subscribe" > </div> </form> </div></section> <section id="recent-posts-2" class="widget widget_recent_entries"> <h3 class="widget-title"><span>Recent Posts</span></h3> <ul> <li> <a href="https://machinelearningmastery.in/2021/09/20/how-to-label-time-series-efficiently-and-boost-your-ai/">How to label time series efficiently – and boost your AI</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/20/how-to-be-a-data-scientist-without-a-stem-degree/">How to be a Data Scientist without a STEM degree</a> </li> <li> <a 
href="https://machinelearningmastery.in/2021/09/17/arcelik-hosts-global-aws-deepracer-league-using-new-live-feature-to-educate-over-200-employees-on-machine-learning/">Arçelik hosts global AWS DeepRacer League using new LIVE feature to educate over 200 employees on machine learning</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/17/paradoxes-in-data-science-kdnuggets/">Paradoxes in Data Science – KDnuggets</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/17/what-2-years-of-self-teaching-data-science-taught-me/">What 2 years of self-teaching data science taught me</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/17/introducing-tensorflow-similarity-kdnuggets/">Introducing TensorFlow Similarity – KDnuggets</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/16/what-is-the-real-difference-between-data-engineers-and-data-scientists/">What Is The Real Difference Between Data Engineers and Data Scientists?</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/16/adventures-in-mlops-with-github-actions-iterative-ai-label-studio-and-nbdev/">Adventures in MLOps with Github Actions, Iterative.ai, Label Studio and NBDEV</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/16/the-machine-deep-learning-compendium-open-book/">The Machine & Deep Learning Compendium Open Book</a> </li> <li> <a href="https://machinelearningmastery.in/2021/09/16/easy-sql-in-native-python/">Easy SQL in Native Python</a> </li> </ul> </section> </div><!-- .sidebar-wrap --> </aside><!-- #sidebar-primary --> </div><!-- .main-content-grid --> </div><!-- #main --> <footer id="footer" class="site-footer footer hgrid-stretch inline-nav" role="contentinfo" itemscope="itemscope" itemtype="https://schema.org/WPFooter"> <div class="hgrid"> <div class="hgrid-span-6 footer-column"> <section id="hootkit-ticker-9" class="widget widget_hootkit-ticker"> <div class="ticker-widget ticker-usercontent ticker-simple ticker-userstyle 
ticker-withbg ticker-style1" style="background:#f1f1f1;color:#ff4530;" ><i class="fa-weixin fab ticker-icon"></i> <div class="ticker-msg-box" data-speed='0.03'> <div class="ticker-msgs"> <div class="ticker-msg"><div class="ticker-msg-inner">Subscribe for the latest news, updates, tips and more delivered right to your inbox.</div></div> </div> </div> </div></section> </div> <div class="hgrid-span-3 footer-column"> <section id="media_image-13" class="widget widget_media_image"><img width="220" height="49" src="https://i2.wp.com/machinelearningmastery.in/wp-content/uploads/2019/12/Machine-Learning-Mastery-banner.gif?fit=220%2C49&ssl=1" class="image wp-image-127 attachment-full size-full jetpack-lazy-image" alt="" loading="lazy" style="max-width: 100%; height: auto;" data-lazy-src="https://i2.wp.com/machinelearningmastery.in/wp-content/uploads/2019/12/Machine-Learning-Mastery-banner.gif?fit=220%2C49&ssl=1&is-pending-load=1" srcset="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" /></section> </div> <div class="hgrid-span-3 footer-column"> <section id="hootkit-social-icons-8" class="widget widget_hootkit-social-icons"> <div class="social-icons-widget social-icons-small"><a href="https://github.com/machinelearningmasteryindia" class=" social-icons-icon fa-github-block" target="_blank"> <i class="fa-github fab"></i> </a><a href="mailto:machinelearningmasteryindia@gmail.com" class=" social-icons-icon fa-envelope-block"> <i class="fa-envelope fas"></i> </a><a href="https://www.linkedin.com/in/machine-learning-b065081a9/" class=" social-icons-icon fa-linkedin-block" target="_blank"> <i class="fa-linkedin-in fab"></i> </a><a href="https://twitter.com/sitworld" class=" social-icons-icon fa-twitter-block" target="_blank"> <i class="fa-twitter fab"></i> </a></div></section> </div> </div> </footer><!-- #footer --> <div id="post-footer" class=" post-footer hgrid-stretch linkstyle"> <div class="hgrid"> <div class="hgrid-span-12"> <p class="credit 
small"> <a class="privacy-policy-link" href="https://machinelearningmastery.in/privacy-policy/">Privacy Policy</a> Designed using <a class="theme-link" href="https://wphoot.com/themes/unos/" title="Unos WordPress Theme">Unos</a>. Powered by <a class="wp-link" href="https://wordpress.org">WordPress</a>. </p><!-- .credit --> </div> </div> </div> </div><!-- #page-wrapper --> <!--googleoff: all--><div id="cookie-law-info-bar" data-nosnippet="true"><span>This website uses cookies to improve your experience. We'll assume you're ok with this, but you can opt-out if you wish. <a role='button' tabindex='0' class="cli_settings_button" style="margin:5px 20px 5px 20px;" >Cookie settings</a><a role='button' tabindex='0' data-cli_action="accept" id="cookie_action_close_header" class="medium cli-plugin-button cli-plugin-main-button cookie_action_close_header cli_action_button" style="display:inline-block; margin:5px; ">ACCEPT</a></span></div><div id="cookie-law-info-again" style="display:none;" data-nosnippet="true"><span id="cookie_hdr_showagain">Privacy & Cookies Policy</span></div><div class="cli-modal" data-nosnippet="true" id="cliSettingsPopup" tabindex="-1" role="dialog" aria-labelledby="cliSettingsPopup" aria-hidden="true"> <div class="cli-modal-dialog" role="document"> <div class="cli-modal-content cli-bar-popup"> <button type="button" class="cli-modal-close" id="cliModalClose"> <svg class="" viewBox="0 0 24 24"><path d="M19 6.41l-1.41-1.41-5.59 5.59-5.59-5.59-1.41 1.41 5.59 5.59-5.59 5.59 1.41 1.41 5.59-5.59 5.59 5.59 1.41-1.41-5.59-5.59z"></path><path d="M0 0h24v24h-24z" fill="none"></path></svg> <span class="wt-cli-sr-only">Close</span> </button> <div class="cli-modal-body"> <div class="cli-container-fluid cli-tab-container"> <div class="cli-row"> <div class="cli-col-12 cli-align-items-stretch cli-px-0"> <div class="cli-privacy-overview"> <h4>Privacy Overview</h4> <div class="cli-privacy-content"> <div class="cli-privacy-content-text">This website uses cookies to 
improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may have an effect on your browsing experience.</div> </div> <a class="cli-privacy-readmore" aria-label="Show more" tabindex="0" role="button" data-readmore-text="Show more" data-readless-text="Show less"></a> </div> </div> <div class="cli-col-12 cli-align-items-stretch cli-px-0 cli-tab-section-container"> <div class="cli-tab-section"> <div class="cli-tab-header"> <a role="button" tabindex="0" class="cli-nav-link cli-settings-mobile" data-target="necessary" data-toggle="cli-toggle-tab"> Necessary </a> <div class="wt-cli-necessary-checkbox"> <input type="checkbox" class="cli-user-preference-checkbox" id="wt-cli-checkbox-necessary" data-id="checkbox-necessary" checked="checked" /> <label class="form-check-label" for="wt-cli-checkbox-necessary">Necessary</label> </div> <span class="cli-necessary-caption">Always Enabled</span> </div> <div class="cli-tab-content"> <div class="cli-tab-pane cli-fade" data-id="necessary"> <div class="wt-cli-cookie-description"> Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information. 
</div> </div> </div> </div> <div class="cli-tab-section"> <div class="cli-tab-header"> <a role="button" tabindex="0" class="cli-nav-link cli-settings-mobile" data-target="non-necessary" data-toggle="cli-toggle-tab"> Non-necessary </a> <div class="cli-switch"> <input type="checkbox" id="wt-cli-checkbox-non-necessary" class="cli-user-preference-checkbox" data-id="checkbox-non-necessary" checked='checked' /> <label for="wt-cli-checkbox-non-necessary" class="cli-slider" data-cli-enable="Enabled" data-cli-disable="Disabled"><span class="wt-cli-sr-only">Non-necessary</span></label> </div> </div> <div class="cli-tab-content"> <div class="cli-tab-pane cli-fade" data-id="non-necessary"> <div class="wt-cli-cookie-description"> Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website. 
</div> </div> </div> </div> </div> </div> </div> </div> <div class="cli-modal-footer"> <div class="wt-cli-element cli-container-fluid cli-tab-container"> <div class="cli-row"> <div class="cli-col-12 cli-align-items-stretch cli-px-0"> <div class="cli-tab-footer wt-cli-privacy-overview-actions"> <a id="wt-cli-privacy-save-btn" role="button" tabindex="0" data-cli-action="accept" class="wt-cli-privacy-btn cli_setting_save_button wt-cli-privacy-accept-btn cli-btn">SAVE & ACCEPT</a> </div> </div> </div> </div> </div> </div> </div> </div> <div class="cli-modal-backdrop cli-fade cli-settings-overlay"></div> <div class="cli-modal-backdrop cli-fade cli-popupbar-overlay"></div> <!--googleon: all--> <div id="fb-root"></div> <script async defer crossorigin="anonymous" src="https://connect.facebook.net/en_US/sdk.js#xfbml=1&version=v8.0&appId=683648729088349&autoLogAppEvents=1"> </script> <!--Start of Tawk.to Script (0.5.5)--> <script type="text/javascript"> var Tawk_API=Tawk_API||{}; var Tawk_LoadStart=new Date(); (function(){ var s1=document.createElement("script"),s0=document.getElementsByTagName("script")[0]; s1.async=true; s1.src='https://embed.tawk.to/5ec04a92967ae56c521a742a/default'; s1.charset='UTF-8'; s1.setAttribute('crossorigin','*'); s0.parentNode.insertBefore(s1,s0); })(); </script> <!--End of Tawk.to Script (0.5.5)--> <script type="text/javascript"> window.WPCOM_sharing_counts = {"https:\/\/machinelearningmastery.in\/2021\/05\/06\/cross-validation-using-oml4py-part-i\/":1539}; </script> <script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script> <div id="fb-root"></div> <script>(function(d, s, id) { var js, fjs = d.getElementsByTagName(s)[0]; if (d.getElementById(id)) return; js = d.createElement(s); js.id = id; js.src = 
'https://connect.facebook.net/en_US/sdk.js#xfbml=1&appId=249643311490&version=v2.3'; fjs.parentNode.insertBefore(js, fjs); }(document, 'script', 'facebook-jssdk'));</script> <script> document.body.addEventListener( 'is.post-load', function() { if ( 'undefined' !== typeof FB ) { FB.XFBML.parse(); } } ); </script> <script type="text/javascript"> ( function () { var currentScript = document.currentScript; // Helper function to load an external script. function loadScript( url, cb ) { var script = document.createElement( 'script' ); var prev = currentScript || document.getElementsByTagName( 'script' )[ 0 ]; script.setAttribute( 'async', true ); script.setAttribute( 'src', url ); prev.parentNode.insertBefore( script, prev ); script.addEventListener( 'load', cb ); } function init() { loadScript( 'https://platform.linkedin.com/in.js?async=true', function () { if ( typeof IN !== 'undefined' ) { IN.init(); } } ); } if ( document.readyState === 'loading' ) { document.addEventListener( 'DOMContentLoaded', init ); } else { init(); } document.body.addEventListener( 'is.post-load', function() { if ( typeof IN !== 'undefined' ) { IN.parse(); } } ); } )(); </script> <script id="tumblr-js" type="text/javascript" src="https://assets.tumblr.com/share-button.js"></script> <script type="text/javascript"> ( function () { // Pinterest shared resources var s = document.createElement( 'script' ); s.type = 'text/javascript'; s.async = true; s.setAttribute( 'data-pin-hover', true ); s.src = window.location.protocol + '//assets.pinterest.com/js/pinit.js'; var x = document.getElementsByTagName( 'script' )[ 0 ]; x.parentNode.insertBefore(s, x); // if 'Pin it' button has 'counts' make container wider function init() { var shares = document.querySelectorAll( 'li.share-pinterest' ); for ( var i = 0; i < shares.length; i++ ) { var share = shares[ i ]; if ( share.querySelector( 'a span:visible' ) ) { share.style.width = '80px'; } } } if ( document.readyState !== 'complete' ) { 
document.addEventListener( 'load', init ); } else { init(); } } )(); </script> <script> (function(r, d, s) { r.loadSkypeWebSdkAsync = r.loadSkypeWebSdkAsync || function(p) { var js, sjs = d.getElementsByTagName(s)[0]; if (d.getElementById(p.id)) { return; } js = d.createElement(s); js.id = p.id; js.src = p.scriptToLoad; js.onload = p.callback sjs.parentNode.insertBefore(js, sjs); }; var p = { scriptToLoad: 'https://swx.cdn.skype.com/shared/v/latest/skypewebsdk.js', id: 'skype_web_sdk' }; r.loadSkypeWebSdkAsync(p); })(window, document, 'script'); </script> <div id="sharing_email" style="display: none;"> <form action="/2021/05/06/cross-validation-using-oml4py-part-i/" method="post"> <label for="target_email">Send to Email Address</label> <input type="email" name="target_email" id="target_email" value="" /> <label for="source_name">Your Name</label> <input type="text" name="source_name" id="source_name" value="" /> <label for="source_email">Your Email Address</label> <input type="email" name="source_email" id="source_email" value="" /> <input type="text" id="jetpack-source_f_name" name="source_f_name" class="input" value="" size="25" autocomplete="off" title="This field is for validation and should not be changed" /> <img style="float: right; display: none" class="loading" src="https://machinelearningmastery.in/wp-content/plugins/jetpack/modules/sharedaddy/images/loading.gif" alt="loading" width="16" height="16" /> <input type="submit" value="Send Email" class="sharing_send" /> <a rel="nofollow" href="#cancel" class="sharing_cancel" role="button">Cancel</a> <div class="errors errors-1" style="display: none;"> Post was not sent - check your email addresses! </div> <div class="errors errors-2" style="display: none;"> Email check failed, please try again </div> <div class="errors errors-3" style="display: none;"> Sorry, your blog cannot share posts by email. 
</div> </form> </div> <script data-cfasync="false" type="text/javascript">if (window.addthis_product === undefined) { window.addthis_product = "wpp"; } if (window.wp_product_version === undefined) { window.wp_product_version = "wpp-6.2.6"; } if (window.addthis_share === undefined) { window.addthis_share = {}; } if (window.addthis_config === undefined) { window.addthis_config = {"data_track_clickback":true,"ui_atversion":"300"}; } if (window.addthis_plugin_info === undefined) { window.addthis_plugin_info = {"info_status":"enabled","cms_name":"WordPress","plugin_name":"Share Buttons by AddThis","plugin_version":"6.2.6","plugin_mode":"AddThis","anonymous_profile_id":"wp-2f16336e765908d13c2d341ff0393457","page_info":{"template":"posts","post_type":["post","page","e-landing-page"]},"sharing_enabled_on_post_via_metabox":false}; } (function() { var first_load_interval_id = setInterval(function () { if (typeof window.addthis !== 'undefined') { window.clearInterval(first_load_interval_id); if (typeof window.addthis_layers !== 'undefined' && Object.getOwnPropertyNames(window.addthis_layers).length > 0) { window.addthis.layers(window.addthis_layers); } if (Array.isArray(window.addthis_layers_tools)) { for (i = 0; i < window.addthis_layers_tools.length; i++) { window.addthis.layers(window.addthis_layers_tools[i]); } } } },1000) }()); </script><script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/_inc/build/photon/photon.min.js?ver=20191001' id='jetpack-photon-js'></script> <script src='https://machinelearningmastery.in/wp-includes/js/comment-reply.min.js?ver=5.6.5' id='comment-reply-js'></script> <script id='hoverIntent-js-extra'> var hootData = {"stickySidebar":"disable","contentblockhover":"enable","contentblockhovertext":"disable"}; </script> <script src='https://machinelearningmastery.in/wp-includes/js/hoverIntent.min.js?ver=1.8.1' id='hoverIntent-js'></script> <script 
src='https://machinelearningmastery.in/wp-content/themes/unos/js/jquery.superfish.min.js?ver=1.7.5' id='jquery-superfish-js'></script> <script src='https://machinelearningmastery.in/wp-content/themes/unos/js/jquery.fitvids.min.js?ver=1.1' id='jquery-fitvids-js'></script> <script src='https://machinelearningmastery.in/wp-content/themes/unos/js/jquery.parallax.min.js?ver=1.4.2' id='jquery-parallax-js'></script> <script id='ap-frontend-js-js-extra'> var ap_form_required_message = ["This field is required","accesspress-anonymous-post"]; var ap_captcha_error_message = ["Sum is not correct.","accesspress-anonymous-post"]; </script> <script src='https://machinelearningmastery.in/wp-content/plugins/accesspress-anonymous-post/js/frontend.js?ver=2.8.1' id='ap-frontend-js-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/hootkit/assets/jquery.lightSlider.min.js?ver=1.1.2' id='jquery-lightSlider-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/hootkit/assets/widgets.min.js?ver=2.0.7' id='hootkit-widgets-js'></script> <script id='hootkit-miscmods-js-extra'> var hootkitMiscmodsData = {"ajaxurl":"https:\/\/machinelearningmastery.in\/wp-admin\/admin-ajax.php"}; </script> <script src='https://machinelearningmastery.in/wp-content/plugins/hootkit/assets/miscmods.min.js?ver=2.0.7' id='hootkit-miscmods-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/page-links-to/dist/new-tab.js?ver=3.3.5' id='page-links-to-js'></script> <script src='https://s7.addthis.com/js/300/addthis_widget.js?ver=5.6.5#pubid=ra-5e0c443d44eeeb15' id='addthis_widget-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/vendor/automattic/jetpack-lazy-images/src/js/intersectionobserver-polyfill.min.js?ver=1.1.2' id='jetpack-lazy-images-polyfill-intersectionobserver-js'></script> <script id='jetpack-lazy-images-js-extra'> var jetpackLazyImagesL10n = {"loading_warning":"Images are still loading. 
Please cancel your print and try again."}; </script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/vendor/automattic/jetpack-lazy-images/src/js/lazy-images.min.js?ver=1.1.2' id='jetpack-lazy-images-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/_inc/build/postmessage.min.js?ver=9.8.1' id='postmessage-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/_inc/build/jquery.jetpack-resize.min.js?ver=9.8.1' id='jetpack_resize-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/_inc/build/likes/queuehandler.min.js?ver=9.8.1' id='jetpack_likes_queuehandler-js'></script> <script src='https://machinelearningmastery.in/wp-content/themes/unos/js/hoot.theme.min.js?ver=2.9.11' id='hoot-theme-js'></script> <script src='https://machinelearningmastery.in/wp-content/plugins/youtube-embed-plus/scripts/fitvids.min.js?ver=13.4.3' id='__ytprefsfitvids__-js'></script> <script id='wpgdprc.js-js-extra'> var wpgdprcData = {"ajaxURL":"https:\/\/machinelearningmastery.in\/wp-admin\/admin-ajax.php","ajaxSecurity":"dfe2f68189","isMultisite":"","path":"\/","blogId":""}; </script> <script src='https://machinelearningmastery.in/wp-content/plugins/wp-gdpr-compliance/dist/js/front.min.js?ver=1629244814' id='wpgdprc.js-js'></script> <script src='https://machinelearningmastery.in/wp-includes/js/wp-embed.min.js?ver=5.6.5' id='wp-embed-js'></script> <script id='jetpack_related-posts-js-extra'> var related_posts_js_options = {"post_heading":"h4"}; </script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/_inc/build/related-posts/related-posts.min.js?ver=20210604' id='jetpack_related-posts-js'></script> <script defer src='https://machinelearningmastery.in/wp-content/plugins/akismet/_inc/form.js?ver=4.1.12' id='akismet-form-js'></script> <script id='sharing-js-js-extra'> var sharing_js_options = 
{"lang":"en","counts":"1","is_stats_active":"1"}; </script> <script src='https://machinelearningmastery.in/wp-content/plugins/jetpack/_inc/build/sharedaddy/sharing.min.js?ver=9.8.1' id='sharing-js-js'></script> <script id='sharing-js-js-after'> var windowOpen; ( function () { function matches( el, sel ) { return !! ( el.matches && el.matches( sel ) || el.msMatchesSelector && el.msMatchesSelector( sel ) ); } document.body.addEventListener( 'click', function ( event ) { if ( ! event.target ) { return; } var el; if ( matches( event.target, 'a.share-facebook' ) ) { el = event.target; } else if ( event.target.parentNode && matches( event.target.parentNode, 'a.share-facebook' ) ) { el = event.target.parentNode; } if ( el ) { event.preventDefault(); // If there's another sharing window open, close it. if ( typeof windowOpen !== 'undefined' ) { windowOpen.close(); } windowOpen = window.open( el.getAttribute( 'href' ), 'wpcomfacebook', 'menubar=1,resizable=1,width=600,height=400' ); return false; } } ); } )(); var windowOpen; ( function () { function matches( el, sel ) { return !! ( el.matches && el.matches( sel ) || el.msMatchesSelector && el.msMatchesSelector( sel ) ); } document.body.addEventListener( 'click', function ( event ) { if ( ! event.target ) { return; } var el; if ( matches( event.target, 'a.share-telegram' ) ) { el = event.target; } else if ( event.target.parentNode && matches( event.target.parentNode, 'a.share-telegram' ) ) { el = event.target.parentNode; } if ( el ) { event.preventDefault(); // If there's another sharing window open, close it. 
if ( typeof windowOpen !== 'undefined' ) { windowOpen.close(); } windowOpen = window.open( el.getAttribute( 'href' ), 'wpcomtelegram', 'menubar=1,resizable=1,width=450,height=450' ); return false; } } ); } )(); </script> <iframe src='https://widgets.wp.com/likes/master.html?ver=202138#ver=202138' scrolling='no' id='likes-master' name='likes-master' style='display:none;'></iframe> <div id='likes-other-gravatars'><div class="likes-text"><span>%d</span> bloggers like this:</div><ul class="wpl-avatars sd-like-gravatars"></ul></div> <script>!function(){window.advanced_ads_ready_queue=window.advanced_ads_ready_queue||[],advanced_ads_ready_queue.push=window.advanced_ads_ready;for(var d=0,a=advanced_ads_ready_queue.length;d<a;d++)advanced_ads_ready(advanced_ads_ready_queue[d])}();</script><script src='https://stats.wp.com/e-202138.js' defer></script> <script> _stq = window._stq || []; _stq.push([ 'view', {v:'ext',j:'1:9.8.1',blog:'170785677',post:'1539',tz:'-5.5',srv:'machinelearningmastery.in'} ]); _stq.push([ 'clickTrackerInit', '170785677', '1539' ]); </script> </body> </html> <!-- Page generated by LiteSpeed Cache 4.4.1 on 2021-09-20 15:08:50 -->