
How to make a custom activation function in Keras?

Published Aug 25, 2021. Last updated Aug 26, 2021.

Hello, my name is Alex Polymath.

This is another post about hacking neural networks: this time, creating a custom activation function.

You can run the code in Google Colab or on your own computer.

Overview

In this tutorial we will use the MNIST dataset from Kaggle.

  1. First, prepare the data for training
  2. Second, define the activation function in Python (a sigmoid, but implemented by our own function)
  3. Compile the neural network
  4. Train the neural network
  5. Test whether it still gives good results

1. Download the data from Kaggle.

There will be 2 files:

  • train.csv.zip
  • test.csv.zip

I have no idea why, but the test file is of no use here, since it contains no labels.
https://www.kaggle.com/oddrationale/mnist-in-csv

If you are using Google Colab:

Drag and drop the train.csv.zip file into the Files panel, then unzip it:

!unzip train.csv.zip

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('ggplot')
import keras

from sklearn.preprocessing import LabelBinarizer
from sklearn.model_selection import train_test_split

First, we prepare the data for training.

train_df = pd.read_csv('/content/train.csv')  # the path may differ on your machine
train_labels = train_df['label']  # the Y values: the digit labels
train_labels = train_labels.to_numpy()  # convert to a NumPy array
del train_df['label']  # drop the label column so the remaining columns can serve as X
train_data = train_df.to_numpy()


# We don't use the raw digit labels 0, 1, 2, ..., 9 as Y.
# Instead we use one-hot vectors such as [1,0,0,0,0,0,0,0,0,0], [0,1,0,0,0,0,0,0,0,0], ...
y = LabelBinarizer().fit_transform(train_labels)


# Split into train and test data
X_train, X_test, y_train, y_test = train_test_split(train_data, y, test_size=0.1)
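
To make the one-hot step concrete, here is a minimal NumPy-only sketch of the kind of encoding LabelBinarizer produces for digit labels. The helper name one_hot and the sample labels are made up for illustration, and the sketch assumes the labels are integers from 0 to 9; the result should match LabelBinarizer only when all ten digits appear in the data.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=int)[labels]

# Hypothetical sample labels, not taken from the dataset.
sample = one_hot([0, 3, 9])
print(sample)
```

Each row has exactly one 1, in the column given by the label, which is the shape of target that the categorical_crossentropy loss expects.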

Create a custom activation function


from keras import backend as K
from keras.layers import Activation
from keras.utils import get_custom_objects

# Note: you cannot use arbitrary Python functions here. An activation function
# receives TensorFlow tensors as input and must return tensors. The Keras backend
# provides many helper functions for this.
def custom_activation(x):
    # Sigmoid: 1 / (1 + e^(-x))
    return 1 / (1 + K.exp(-x))

# Register the activation under a name so layers can refer to it by string
get_custom_objects().update({'custom_activation': Activation(custom_activation)})
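
A quick way to sanity-check the formula before wiring it into a model is to evaluate the same expression in plain NumPy. This is only a sketch: sigmoid_np is a hypothetical helper name, not part of Keras, and it simply mirrors the expression used in custom_activation.

```python
import numpy as np

def sigmoid_np(x):
    # Same expression as custom_activation, but applied to NumPy arrays.
    return 1.0 / (1.0 + np.exp(-x))

values = sigmoid_np(np.array([-2.0, 0.0, 2.0]))
print(values)  # the middle value is 0.5; all outputs lie strictly between 0 and 1
```

If the NumPy version behaves as expected, the backend version should too, since K.exp plays the same role on tensors that np.exp plays on arrays.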

Compile the neural network

# Define a sequential model

model = keras.Sequential()

# Hidden layers: only the first one needs input_shape, the rest infer it
model.add(keras.layers.Dense(128, activation="custom_activation", input_shape=(784,)))
model.add(keras.layers.Dense(128, activation="custom_activation"))
model.add(keras.layers.Dense(128, activation="custom_activation"))
model.add(keras.layers.Dense(128, activation="custom_activation"))

# Output layer: softmax over the 10 digit classes
model.add(keras.layers.Dense(10, activation='softmax'))

# Compile the model
model.compile('adam', loss='categorical_crossentropy', metrics=['accuracy'])

Train the neural network

# Fit on the training split only; the held-out split is used for validation
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test), callbacks=[], verbose=0)
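
Step 5 of the overview is checking the results. With the real model you would typically call model.evaluate(X_test, y_test) or model.predict(X_test). The sketch below shows only how accuracy is derived from softmax outputs and one-hot labels, using small made-up arrays in place of real predictions. The function name accuracy_from_probs is hypothetical.

```python
import numpy as np

def accuracy_from_probs(probs, y_true_one_hot):
    """Fraction of rows where the most probable class matches the label."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(y_true_one_hot, axis=1)
    return float(np.mean(predicted == actual))

# Made-up predictions for 3 samples over 3 classes (a stand-in for model.predict output).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.4, 0.3]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])
acc = accuracy_from_probs(probs, labels)
print(acc)  # two of the three rows are classified correctly
```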

Follow me on Twitter
@alexpolymath

Gulshan Negi
6 months ago

Thanks a lot for sharing it here with us.
I have also seen a simple program for making a custom activation function in Keras.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras import backend as K
import tensorflow as tf

# Define a custom activation function

def custom_activation(x):
    return K.square(x)

# Create a simple dataset for demonstration

X_train = np.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0],
                    [7.0, 8.0, 9.0]])
y_train = np.array([[1.0],
                    [0.0],
                    [1.0]])

# Build a Keras model using the custom activation function

model = Sequential()
model.add(Dense(64, activation=custom_activation, input_dim=X_train.shape[1]))
model.add(Dense(1, activation='sigmoid'))  # Output layer with sigmoid activation

# Compile the model

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train the model

model.fit(X_train, y_train, epochs=100, batch_size=1)

# Test the model

X_test = np.array([[2.0, 4.0, 6.0]])
y_pred = model.predict(X_test)
print("Prediction:", y_pred)

Thanks

Shaved Man
2 years ago

I find it difficult to figure out which K. functions you can use here. I found a page documenting some tf. functions, which seem to be the same ones you can use with this “backend as K” method. I wish they would make an official tutorial on writing your own functions and on where to find the backend functions I can use.

burbigo deen
3 years ago

Hey, thank you so much for this little tutorial.
