Gradient boosting yields a better recall score but performs worse than logistic regression in terms of accuracy and precision.

Here we compare the performance of the gradient boosting classifier with logistic regression. We use the same dataset as in this post.

```python
from mltools import *

with HDFS("data.h5") as store:
    df = store.get("linear_model/heart")

X = df.drop('target', axis = 1)
y = df.target

train_idx, test_idx = train_test_split(range(len(y)))
y_train = y.iloc[train_idx]
X_train = X.iloc[train_idx,:]
y_test = y.iloc[test_idx]
X_test = X.iloc[test_idx,:]
```

We use cross-validation to find the best hyperparameters for the gradient boosting classifier.

```python
tr_mean = []
tr_std = []
te_mean = []
…
```
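The loop above is truncated in this preview; presumably it records train/test scores across candidate hyperparameter values. A minimal, self-contained sketch of that idea (assuming the swept hyperparameter is the learning rate, and using a synthetic dataset in place of the heart data, since `mltools` wraps the I/O details):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the heart dataset (13 features, binary target)
X, y = make_classification(n_samples=300, n_features=13, random_state=0)

lrs = np.logspace(-3, 0, 10)            # candidate learning rates
tr_mean, tr_std, te_mean, te_std = [], [], [], []
for lr in lrs:
    clf = GradientBoostingClassifier(learning_rate=lr, n_estimators=50,
                                     random_state=0)
    cv = cross_validate(clf, X, y, cv=5, return_train_score=True)
    tr_mean.append(cv["train_score"].mean())
    tr_std.append(cv["train_score"].std())
    te_mean.append(cv["test_score"].mean())
    te_std.append(cv["test_score"].std())

# Pick the learning rate with the best mean validation accuracy
best_lr = lrs[int(np.argmax(te_mean))]
```

The train/test mean-and-std lists can then be plotted against the hyperparameter to locate the over/underfitting regimes.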

In a previous post, we described logistic regression. This post discusses logistic regression with regularization.

Using the same dataset,

```python
from mltools import *

with HDFS("data.h5") as store:
    df = store.get("linear_model/heart")

X = df.drop('target', axis = 1)
y = df.target

train_idx, test_idx = train_test_split(range(len(y)))
y_train = y.iloc[train_idx]
X_train = X.iloc[train_idx,:]
y_test = y.iloc[test_idx]
X_test = X.iloc[test_idx,:]
```

The logistic regression with regularization is performed using the standard cross-validation procedure, varying the penalty parameter `C` in `LogisticRegression`.

```python
tr_mean = []
tr_std = []
te_mean = []
te_std = []

for lr in np.logspace(-4, 0, 30):
    pipe = Pipeline([
        ('scale', StandardScaler()),
        ('logit', LogisticRegression(C =…
```
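The loop is cut off in this preview. A complete, self-contained version of the same sweep (a sketch: it uses a synthetic dataset in place of the heart data, and assumes the truncated code averages cross-validation scores for each `C`):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the heart dataset
X, y = make_classification(n_samples=300, n_features=13, random_state=0)

Cs = np.logspace(-4, 0, 30)             # penalty strengths to try
tr_mean, tr_std, te_mean, te_std = [], [], [], []
for C in Cs:
    pipe = Pipeline([
        ('scale', StandardScaler()),    # scale features before the linear model
        ('logit', LogisticRegression(C=C, max_iter=1000)),
    ])
    cv = cross_validate(pipe, X, y, cv=5, return_train_score=True)
    tr_mean.append(cv["train_score"].mean())
    tr_std.append(cv["train_score"].std())
    te_mean.append(cv["test_score"].mean())
    te_std.append(cv["test_score"].std())

best_C = Cs[int(np.argmax(te_mean))]
```

Small `C` means strong regularization, so the left end of the sweep shows the underfitting regime and the right end approaches unregularized logistic regression.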

Logistic regression is the benchmark method for classification. It serves as a crucial step in the IEBAE (I/O, Exploration, Benchmark, Analysis, Evaluation) framework. In this note, we illustrate how to perform logistic regression using the heart attack dataset: https://www.kaggle.com/nareshbhat/health-care-data-set-on-heart-attack-possibility

**Step 1:** Import necessary packages

`from mltools import *`

**Step 2:** Data I/O; the data is already saved to an HDF5 file

```python
with HDFS("data.h5") as store:
    df = store.get("linear_model/heart")

X = df.drop('target', axis = 1)
y = df.target
```

**Step 3:** Explore the dataset. We use `kdeplot`

```python
fig, axs = subplots(ncols=len(X.columns), figsize = (100,5))
for (k,c) in enumerate(X.columns):
    kdeplot(X[c], ax =…
```

In a previous post, we discussed the IEBAE framework using a simple regression example. However, we did not tune the hyperparameters in the gradient boosting approach. In this post, we expand on the gradient boosting example using the standard KFold cross-validation approach.

We import all necessary packages (`mltools` can be downloaded here: https://github.com/kailaix/mltools):

```python
from mltools import *

train_idx, test_idx = train_test_split(range(df.shape[0]), test_size = 0.2)
train_idx = np.array(train_idx)
test_idx = np.array(test_idx)

kf = KFold(n_splits=10)
```

The metric is the root mean squared error (RMSE):

```python
def metric(y1, y2):
    return np.sqrt(np.mean((y1-y2)**2))
```
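For example, applied to two short arrays the metric reduces to the familiar RMSE:

```python
import numpy as np

def metric(y1, y2):
    # Root mean squared error between two arrays
    return np.sqrt(np.mean((y1 - y2) ** 2))

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])
err = metric(y_true, y_pred)   # sqrt((0 + 0 + 4) / 3)
```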

The score function in the cross-validation approach is

```python
def get_score(lr = 1.0):
    pipe = Pipeline([
        ('scale', StandardScaler())…
```
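The score function is truncated above. A self-contained sketch of what it presumably does, namely average the RMSE metric over the `KFold` splits for a pipeline whose gradient boosting learning rate is `lr` (the synthetic regression data here stands in for the real dataset):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the tutorial's dataset
X, y = make_regression(n_samples=200, n_features=7, noise=10.0, random_state=0)
kf = KFold(n_splits=10)

def metric(y1, y2):
    return np.sqrt(np.mean((y1 - y2) ** 2))

def get_score(lr=1.0):
    # Mean out-of-fold RMSE for a given learning rate
    scores = []
    for tr, te in kf.split(X):
        pipe = Pipeline([
            ('scale', StandardScaler()),
            ('gbr', GradientBoostingRegressor(learning_rate=lr, random_state=0)),
        ])
        pipe.fit(X[tr], y[tr])
        scores.append(metric(pipe.predict(X[te]), y[te]))
    return np.mean(scores)

score = get_score(0.1)
```

Calling `get_score` over a grid of learning rates then gives the cross-validated curve used to pick the hyperparameter.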

In this article, we illustrate the IEBAE (pronounced as “eBay”, I/O, Exploration, Benchmark, Analysis, Evaluation) framework with a linear model example.

We use the Graduate Admission 2 dataset, which can be obtained from https://www.kaggle.com/mohansacharya/graduate-admissions. Our ultimate goal is to predict “Chance of Admit” from other variables.

We import some useful packages:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import statsmodels.api as sm

from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, KFold
from sklearn.pipeline import Pipeline
from sklearn.linear_model import Ridge
from sklearn.ensemble import GradientBoostingRegressor
```

We can take a glimpse at the dataset **using a…**
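The sentence above is truncated; a quick pandas summary is the usual way to take that first look. A hypothetical sketch (the column names and values are synthetic stand-ins for the Graduate Admission data):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Graduate Admission 2 dataset
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "GRE Score": rng.integers(290, 341, 50),
    "CGPA": rng.uniform(6.8, 9.9, 50).round(2),
    "Chance of Admit": rng.uniform(0.3, 1.0, 50).round(2),
})

print(df.head())       # first few rows
print(df.describe())   # count / mean / std / quartiles per column
```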

This article is adapted from the documentation of ADCME, a physics-based machine learning library.

The inverse modeling problem can be mathematically formulated as finding an unknown parameter θ given the input X and output u of a forward model

*u* = *F*(*θ*, *X*)

Here *X* and *u* can be samples from a stochastic process. The scope of inverse problems that can be tackled with ADCME is:

- The forward model must be *differentiable*, i.e., ∂*F*/∂*X* and ∂*F*/∂*θ* exist. However, we do not require those gradients to be implemented by users; they can be computed with automatic differentiation in ADCME.
- The forward model must be a…
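Concretely, under the differentiability requirement the inverse problem is usually recast as a gradient-based minimization of a data mismatch (a standard formulation, paraphrased here rather than quoted from the ADCME documentation):

```latex
\min_{\theta} \; L(\theta) = \left\| F(\theta, X) - u \right\|^2,
\qquad
\theta \leftarrow \theta - \eta \, \nabla_{\theta} L(\theta)
```

where the gradient ∇*θ* *L* is supplied by automatic differentiation, so the user only needs to implement the forward model *F*.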

Inverse modeling is everywhere in scientific computing. TensorFlow and PyTorch are everywhere in deep learning. ADCME.jl is the bridge that brings the power of machine-learning techniques to inverse modeling in scientific computing: https://github.com/kailaix/ADCME.jl

Many scientific computing problems in physics, chemistry, biology and so on can be formulated as an inverse problem with partial differential equations (PDE). For example, in seismic imaging, researchers find the oil reservoirs underneath the ground from waveform images. People locate the centroid of earthquakes from data collected from seismic stations. Doctors monitor the growth of tumors by reading data from MRI. …

How do neural networks stack up against the parametrization methods that engineering has developed over the last half century? A tentative answer: universal approximation, implicit regularization, alleviation of the curse of dimensionality, and efficient computing frameworks.

In order to carry out quantitative analysis, researchers usually apply different mathematical models for their problems. For example, one can use time series to describe stock prices, use graph theory for social networks, or use Markov processes for events where the future is independent of the past, given the present. For physics, one of the predominant models is partial differential equations (PDEs).

One of the important…