EazyML Insights Template
Define Imports
In [ ]:
!pip install --upgrade eazyml-insight
!pip install --upgrade eazyml-automl
!pip install gdown python-dotenv
In [1]:
import os
from eazyml_insight import (
    ez_insight,
    ez_init,
    ez_validate,
)
from eazyml import ez_display_df
import gdown
import pandas as pd
from dotenv import load_dotenv
load_dotenv()
Out[1]:
True
1. Initialize EazyML
This notebook reads the access key from the EAZYML_ACCESS_KEY environment variable (load_dotenv can populate it from a .env file) and passes it to ez_init for authentication. If the variable is not set, ez_init defaults to a trial license.
In [2]:
ez_init(access_key=os.getenv('EAZYML_ACCESS_KEY'))
Out[2]:
{'success': True,
'message': 'Initialized successfully. You may revoke your consent to sharing usage stats anytime. You have exclusive paid access.'}
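Since the call returns a dictionary with a success flag, it can be worth checking that flag before moving on. The sketch below is not part of the EazyML API: describe_license and require_success are hypothetical helper names, and the code assumes only the success and message keys visible in the output above.

```python
import os


def describe_license(env_var="EAZYML_ACCESS_KEY"):
    """Return 'access key' if the variable is set and non-empty, else 'trial'."""
    return "access key" if os.getenv(env_var) else "trial"


def require_success(response):
    """Raise if a response dict does not report success (hypothetical helper)."""
    if not response.get("success"):
        raise RuntimeError(response.get("message", "EazyML call failed"))
    return response


# Usage with a dictionary shaped like the Out[2] cell above
require_success({"success": True, "message": "Initialized successfully."})
```

In the notebook itself, the same check could wrap the real call, for example require_success(ez_init(access_key=os.getenv('EAZYML_ACCESS_KEY'))).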
2. Define Dataset Files and Outcome Variable
In [ ]:
gdown.download_folder(id='1-RO9K9-YYGK7Wp__ioth0xPD8XqtgvKT')
In [3]:
# Paths of the files that will be used by EazyML APIs
train_file_path = os.path.join('data', 'IRIS_Train.csv')
test_file_path = os.path.join('data', 'IRIS_Test.csv')
# The column name for outcome of interest
outcome = "species"
3. Dataset Information
The dataset used in this notebook is the Iris Dataset, which is a well-known dataset in machine learning and statistics. It contains data about 150 iris flowers, with four features (sepal length, sepal width, petal length, and petal width) and the species of the flower (setosa, versicolor, or virginica).
More details and a downloadable copy of the dataset are available on Kaggle.
Columns in the Dataset:
- sepal_length: Sepal length of the flower (cm)
- sepal_width: Sepal width of the flower (cm)
- petal_length: Petal length of the flower (cm)
- petal_width: Petal width of the flower (cm)
- species: Species of the iris flower (setosa, versicolor, virginica)
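Before handing a file to the EazyML APIs, a quick schema check can catch a misnamed column or a wrong outcome name early. The following is a minimal sketch using only the standard library; summarize_csv and EXPECTED_COLUMNS are hypothetical names introduced here, with the column list copied from the description above.

```python
import csv
from collections import Counter

# Column names as listed in this notebook
EXPECTED_COLUMNS = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]


def summarize_csv(path, outcome="species"):
    """Check the header against EXPECTED_COLUMNS and count rows per outcome value."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        missing = [name for name in EXPECTED_COLUMNS if name not in header]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        # One counter entry per distinct value of the outcome column
        return Counter(row[outcome] for row in reader)
```

Calling summarize_csv(train_file_path) after the download step would return the number of rows per species, which is a simple way to confirm that all three classes are present.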
3.1 Display the Dataset
Below is a preview of the dataset:
In [4]:
# Load the dataset from the provided file
train = pd.read_csv(train_file_path)
# Display the first few rows of the dataset
ez_display_df(train.head())
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.100000 | 3.500000 | 1.400000 | 0.200000 | Iris-setosa |
| 1 | 4.900000 | 3.000000 | 1.400000 | 0.200000 | Iris-setosa |
| 2 | 4.700000 | 3.200000 | 1.300000 | 0.200000 | Iris-setosa |
| 3 | 4.600000 | 3.100000 | 1.500000 | 0.200000 | Iris-setosa |
| 4 | 5.000000 | 3.600000 | 1.400000 | 0.200000 | Iris-setosa |
4. EazyML Insights
4.1 Auto-derive Insights
4.1.1 Build Insight Model
In [5]:
response = ez_insight(train_file_path, outcome, options={})
4.1.2 Convert Response to DataFrame
In [6]:
insights_df = pd.DataFrame(response['insights']['data'], columns=response['insights']['columns'])
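The conversion above relies on the response holding a table as a list of column names plus a list of rows. As a rough illustration of that layout, the sketch below unpacks the same structure into plain dictionaries without pandas. insights_to_records is a hypothetical helper, and the mock payload simply reuses one row from the insights shown in this notebook.

```python
def insights_to_records(response, key="insights"):
    """Turn a {'columns': [...], 'data': [[...], ...]} table into a list of dicts."""
    if key not in response:
        raise KeyError(f"response has no '{key}' entry; keys: {sorted(response)}")
    table = response[key]
    return [dict(zip(table["columns"], row)) for row in table["data"]]


# Mock payload with the same layout as the response used above
mock = {
    "insights": {
        "columns": ["species", "Augmented Intelligence Insights", "Insight Scores"],
        "data": [["Iris-virginica", "petal_width is greater than 1.75", 0.8028]],
    }
}
records = insights_to_records(mock)
print(records[0]["Insight Scores"])  # 0.8028
```

The explicit key check gives a readable error if a call failed and the expected table is absent, instead of a bare KeyError from the one-line conversion.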
4.1.3 Display Augmented Insights
4.1.3.1 For Class Iris-virginica
In [7]:
insights_df1 = insights_df[insights_df[outcome] == 'Iris-virginica']
ez_display_df(insights_df1.head())
| | species | Augmented Intelligence Insights | Insight Scores |
|---|---|---|---|
| 0 | Iris-virginica | sepal_length is greater than 5.55, petal_width is greater than 1.75 | 0.836000 |
| 1 | Iris-virginica | petal_width is greater than 0.8, petal_length is greater than 4.75 | 0.833900 |
| 2 | Iris-virginica | petal_width is greater than 1.75 | 0.802800 |
| 3 | Iris-virginica | sepal_length is greater than 6.25, sepal_width is less than equal to 3.7, petal_length is greater than 5.05 | 0.752600 |
| 4 | Iris-virginica | sepal_length is greater than 6.25, sepal_width is less than equal to 3.7 | 0.583500 |
4.1.3.2 For Class Iris-versicolor
In [8]:
insights_df0 = insights_df[insights_df[outcome] == 'Iris-versicolor']
ez_display_df(insights_df0.head())
| | species | Augmented Intelligence Insights | Insight Scores |
|---|---|---|---|
| 20 | Iris-versicolor | petal_width is greater than 0.8, petal_length is less than equal to 4.75 | 0.862100 |
| 21 | Iris-versicolor | petal_width in ( 0.8, 1.75 ) | 0.843200 |
| 22 | Iris-versicolor | petal_width in ( 0.8, 1.75 ), petal_length is less than equal to 4.95, sepal_width is greater than 2.55 | 0.707500 |
| 23 | Iris-versicolor | sepal_length is greater than 5.55, petal_width in ( 0.7, 1.75 ), petal_length is less than equal to 4.95 | 0.707500 |
| 24 | Iris-versicolor | sepal_length in ( 5.55, 6.25 ), sepal_width in ( 2.65, 3.7 ), petal_width is less than equal to 1.7 | 0.698400 |
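Each insight is a comma-separated conjunction of threshold conditions on the feature columns. To apply such a rule to new rows outside EazyML, the text can be parsed into numeric bounds. The sketch below is an assumption-heavy illustration: the grammar is inferred only from the three phrasings visible in the tables above ("is greater than", "is less than equal to", "in ( a, b )"), intervals are assumed to be open on the left and closed on the right, and parse_insight and matches are hypothetical names, not EazyML functions.

```python
import re

_NUM = r"-?\d+(?:\.\d+)?"
# One condition: a column name followed by one of the three observed phrasings
CONDITION = re.compile(
    rf"(?P<col>\w+)\s+(?:"
    rf"is greater than (?P<gt>{_NUM})"
    rf"|is less than equal to (?P<le>{_NUM})"
    rf"|in \(\s*(?P<lo>{_NUM}),\s*(?P<hi>{_NUM})\s*\)"
    rf")"
)


def parse_insight(text):
    """Return a list of (column, lower_bound, upper_bound); a missing bound is None."""
    conditions = []
    for m in CONDITION.finditer(text):
        if m["gt"] is not None:
            conditions.append((m["col"], float(m["gt"]), None))
        elif m["le"] is not None:
            conditions.append((m["col"], None, float(m["le"])))
        else:
            conditions.append((m["col"], float(m["lo"]), float(m["hi"])))
    return conditions


def matches(row, conditions):
    """True if the row satisfies every condition (assumed: lower < value <= upper)."""
    for col, low, high in conditions:
        value = row[col]
        if low is not None and not value > low:
            return False
        if high is not None and not value <= high:
            return False
    return True


rule = "sepal_length is greater than 5.55, petal_width in ( 0.7, 1.75 )"
conditions = parse_insight(rule)
print(matches({"sepal_length": 6.0, "petal_width": 1.3}, conditions))  # True
```

A regular expression is used instead of splitting on commas because the interval form itself contains a comma.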
4.2 Validation of Insights
4.2.1 Validating Insights
In [9]:
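# Record numbers of the insights whose matching rows are returned in
# val_response['validation_filter'] (displayed in section 4.2.4)
record_number = [3, 5]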
options = {'record_number': record_number}
val_response = ez_validate(train_file_path, outcome, response['insights'], train_file_path, options=options)
4.2.2 Convert Response to DataFrame
In [10]:
validate_df = pd.DataFrame(val_response['validations']['data'], columns=val_response['validations']['columns'])
4.2.3 Display Validation Metrics
4.2.3.1 For Class Iris-virginica
In [11]:
validate_df1 = validate_df[validate_df[outcome] == 'Iris-virginica']
ez_display_df(validate_df1.head())
| | Test Data Point Number | species | Augmented Intelligence Insights | Insight Scores | Accuracy | Coverage | Population | Accuracy Count | Total Population |
|---|---|---|---|---|---|---|---|---|---|
| 28 | 29 | Iris-virginica | sepal_length is greater than 5.55, petal_width is greater than 1.75 | 0.836000 | 0.978300 | 0.306700 | 46 | 45 | 150 |
| 29 | 30 | Iris-virginica | petal_width is greater than 0.8, petal_length is greater than 4.75 | 0.833900 | 0.890900 | 0.366700 | 55 | 49 | 150 |
| 30 | 31 | Iris-virginica | petal_width is greater than 1.75 | 0.802800 | 0.978300 | 0.306700 | 46 | 45 | 150 |
| 31 | 32 | Iris-virginica | sepal_length is greater than 6.25, sepal_width is less than equal to 3.7, petal_length is greater than 5.05 | 0.752600 | 1.000000 | 0.220000 | 33 | 33 | 150 |
| 32 | 33 | Iris-virginica | sepal_length is greater than 6.25, sepal_width is less than equal to 3.7 | 0.583500 | 0.714300 | 0.326700 | 49 | 35 | 150 |
4.2.3.2 For Class Iris-versicolor
In [12]:
validate_df0 = validate_df[validate_df[outcome] == 'Iris-versicolor']
ez_display_df(validate_df0.head())
| | Test Data Point Number | species | Augmented Intelligence Insights | Insight Scores | Accuracy | Coverage | Population | Accuracy Count | Total Population |
|---|---|---|---|---|---|---|---|---|---|
| 7 | 8 | Iris-versicolor | petal_width is greater than 0.8, petal_length is less than equal to 4.75 | 0.862100 | 0.977800 | 0.300000 | 45 | 44 | 150 |
| 8 | 9 | Iris-versicolor | petal_width in ( 0.8, 1.75 ) | 0.843200 | 0.907400 | 0.360000 | 54 | 49 | 150 |
| 9 | 10 | Iris-versicolor | petal_width in ( 0.8, 1.75 ), petal_length is less than equal to 4.95, sepal_width is greater than 2.55 | 0.707500 | 1.000000 | 0.226700 | 34 | 34 | 150 |
| 10 | 11 | Iris-versicolor | sepal_length is greater than 5.55, petal_width in ( 0.7, 1.75 ), petal_length is less than equal to 4.95 | 0.707500 | 1.000000 | 0.240000 | 36 | 36 | 150 |
| 11 | 12 | Iris-versicolor | sepal_length in ( 5.55, 6.25 ), sepal_width in ( 2.65, 3.7 ), petal_width is less than equal to 1.7 | 0.698400 | 1.000000 | 0.126700 | 19 | 19 | 150 |
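The numbers in these tables are consistent with two simple ratios: Accuracy equals Accuracy Count divided by Population, and Coverage equals Population divided by Total Population. This is an inference from the displayed values rather than a documented definition. The sketch below, with the hypothetical helper rule_metrics, reproduces the first Iris-virginica row under that assumption.

```python
def rule_metrics(population, accuracy_count, total_population):
    """Return (accuracy, coverage) rounded to four decimals.

    Assumed relations, inferred from the tables above:
      accuracy = accuracy_count / population
      coverage = population / total_population
    """
    if population <= 0 or total_population <= 0:
        raise ValueError("population and total_population must be positive")
    accuracy = accuracy_count / population
    coverage = population / total_population
    return round(accuracy, 4), round(coverage, 4)


# First Iris-virginica row: Population 46, Accuracy Count 45, Total Population 150
print(rule_metrics(46, 45, 150))  # (0.9783, 0.3067)
```

Read this way, a rule with high accuracy but low coverage is precise yet applies to few rows, while the Insight Scores column appears to be a separate quantity that is not reproduced by these ratios.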
4.2.4 Display Filtered Data for Specific Record Numbers
In [13]:
for i in range(len(record_number)):
    # Print the insight text, then preview the rows that satisfy it
    entry = val_response['validation_filter'][i]
    print(entry['Augmented Intelligence Insights'])
    filter_df = pd.DataFrame(entry['filtered_data']['data'],
                             columns=entry['filtered_data']['columns'])
    ez_display_df(filter_df.head())
    print('\n')
sepal_length is less than equal to 5.55
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.100000 | 3.500000 | 1.400000 | 0.200000 | IRIS-SETOSA |
| 1 | 4.900000 | 3.000000 | 1.400000 | 0.200000 | IRIS-SETOSA |
| 2 | 4.700000 | 3.200000 | 1.300000 | 0.200000 | IRIS-SETOSA |
| 3 | 4.600000 | 3.100000 | 1.500000 | 0.200000 | IRIS-SETOSA |
| 4 | 5.000000 | 3.600000 | 1.400000 | 0.200000 | IRIS-SETOSA |
sepal_length in ( 5.55, 6.75 ), sepal_width is greater than 3.7
| | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.800000 | 4.000000 | 1.200000 | 0.200000 | IRIS-SETOSA |
| 1 | 5.700000 | 4.400000 | 1.500000 | 0.400000 | IRIS-SETOSA |
| 2 | 5.700000 | 3.800000 | 1.700000 | 0.300000 | IRIS-SETOSA |