This site serves as my notebook and as a way to communicate with my students and collaborators. Every now and then, a post may be of interest to other researchers or teachers. Views on this blog are my own. All rights to research results and findings on this blog are reserved. See also http://youtube.com/c/hongqin @hongqin
Friday, September 27, 2019
softmax and probability modeling
softmax
HYAA -> image classification -> probability estimation -> guide to cell segmentation
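For reference, softmax turns a vector of raw scores into a probability distribution. A minimal NumPy sketch (the input values below are made up):
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

print(softmax([2.0, 1.0, 0.1]))  # ~[0.659, 0.242, 0.099]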
Thursday, September 26, 2019
semantic segmentation (pixel-wise classification)
https://arxiv.org/abs/1511.00561
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
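The encoder-decoder idea can be sketched in a few lines of Keras. This toy model is not the published SegNet (which uses a VGG16 encoder and reuses max-pooling indices for upsampling); it only shows the general shape, assuming 64x64 RGB inputs and 2 pixel classes:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64, 64, 3))
x = layers.Conv2D(16, 3, padding='same', activation='relu')(inputs)
x = layers.MaxPooling2D(2)(x)                    # encoder: downsample 64 -> 32
x = layers.Conv2D(32, 3, padding='same', activation='relu')(x)
x = layers.UpSampling2D(2)(x)                    # decoder: upsample 32 -> 64
outputs = layers.Conv2D(2, 1, activation='softmax')(x)  # per-pixel class probabilities
model = keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')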
Tuesday, September 24, 2019
Single-cell transcriptomic profiling of the aging mouse brain
https://www.nature.com/articles/s41593-019-0491-3
Sunday, September 22, 2019
tensor
tensor is a general term:
A 2D tensor is a matrix, a 1D tensor is a vector, and a 0D tensor is a scalar.
A tensor can have more than 3 dimensions.
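A quick TensorFlow illustration of tensor ranks:
import tensorflow as tf

scalar = tf.constant(3.0)                       # 0D tensor (rank 0)
vector = tf.constant([1.0, 2.0, 3.0])           # 1D tensor (rank 1)
matrix = tf.constant([[1.0, 2.0], [3.0, 4.0]])  # 2D tensor (rank 2)
images = tf.zeros([32, 28, 28, 3])              # 4D tensor, e.g. a batch of RGB images
print(scalar.shape, vector.shape, matrix.shape, images.shape)
# () (3,) (2, 2) (32, 28, 28, 3)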
Friday, September 20, 2019
regularization, L1 and L2 norm
What is the main purpose of using regularization?
Regularization helps choose a preferred model complexity, so that the model is better at predicting. Regularization simply adds a penalty term to the objective function and controls model complexity through that penalty. It can be used with many machine learning algorithms.
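In Keras, for example, an L1/L2 penalty can be attached to a layer's weights. A minimal sketch (the coefficients here are arbitrary, not tuned values):
from tensorflow import keras
from tensorflow.keras import regularizers

# L1 pushes weights toward exact zeros (sparsity);
# L2 (weight decay) shrinks all weights smoothly.
layer = keras.layers.Dense(
    16,
    activation='relu',
    kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4),
)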
Wednesday, September 18, 2019
ssh with key
ssh -i ~/.ssh/private_key username@server.com
openstack R tensorflow
qinlab3 instance
Inside RStudio:
Install devtools from the drop-down menu. This took some time.
ERROR: on the OpenStack Ubuntu instance, libcurl was not found.
See previous post for installing devtools in R:
https://hongqinlab.blogspot.com/2017/11/microsoft-r-client-on-ubuntu-virtual.html
Apparently, default Ubuntu lacks the system libraries needed for curl, openssl, xml2, etc.; these typically correspond to the libcurl4-openssl-dev, libssl-dev, and libxml2-dev packages.
1:45pm, finally was able to install keras on a VirtualBox VM:
Inside R 3.4.x (installed through apt-get):
devtools::install_github("rstudio/keras")
> library(keras)
> install_keras()
Creating virtual environment '~/.virtualenvs/r-reticulate' ...
Using python: /home/hqin/anaconda3/bin/python3.7
Collecting pip
Downloading https://files.pythonhosted.org/packages/30/db/9e38760b32e3e7f40cce46dd5fb107b8c73840df38f0046d8e6514e675a1/pip-19.2.3-py2.py3-none-any.whl (1.4MB)
100% |████████████████████████████████| 1.4MB 21.5MB/s
Collecting wheel
Downloading https://files.pythonhosted.org/packages/00/83/b4a77d044e78ad1a45610eb88f745be2fd2c6d658f9798a15e384b7d57c9/wheel-0.33.6-py2.py3-none-any.whl
Collecting setuptools
Downloading https://files.pythonhosted.org/packages/b2/86/095d2f7829badc207c893dd4ac767e871f6cd547145df797ea26baea4e2e/setuptools-41.2.0-py2.py3-none-any.whl (576kB)
100% |████████████████████████████████| 583kB 9.8MB/s
Red Hat OpenStack will generate a key pair, which saves a private key on the local computer and leaves a public key on the OpenStack server. The private key is in *.pem format.
Sunday, September 1, 2019
disk.frame R data larger than RAM
R dealing with dataset larger than RAM
https://www.youtube.com/watch?v=3XMTyi_H4q4
tensorflow, keras
from tensorflow import Variable, float32, keras

# Initialize x_1 and x_2
x_1 = Variable(6.0, dtype=float32)
x_2 = Variable(0.3, dtype=float32)
# Define the optimization operation
opt = keras.optimizers.SGD(learning_rate=0.01)
for j in range(100):
    # Perform minimization using the loss function and x_1
    opt.minimize(lambda: loss_function(x_1), var_list=[x_1])
    # Perform minimization using the loss function and x_2
    opt.minimize(lambda: loss_function(x_2), var_list=[x_2])
# Print x_1 and x_2 as numpy arrays
print(x_1.numpy(), x_2.numpy())
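The snippet above assumes loss_function is already defined (it comes from the exercise and is not shown here). A stand-in that makes it runnable, assuming a simple quadratic loss with its minimum at 1.0:
def loss_function(x):
    # Hypothetical loss; any differentiable function of x would do.
    return (x - 1.0) ** 2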
How to combine two models into a merged model.
# Define a sequential model (assumed setup; not shown in the original notes)
model = keras.Sequential()
# Define the first dense layer
model.add(keras.layers.Dense(16, activation='sigmoid', input_shape=(784,)))
# Apply dropout to the first layer's output
model.add(keras.layers.Dropout(0.25))
# Define the output layer
model.add(keras.layers.Dense(4, activation='softmax'))
# Compile the model
model.compile('adam', loss='categorical_crossentropy')
How to merge two models with the functional API
# For model 1, pass the input layer to layer 1 and layer 1 to layer 2
# (m1_inputs and m2_inputs are assumed to be keras.Input(shape=(784,)) layers)
m1_layer1 = keras.layers.Dense(12, activation='sigmoid')(m1_inputs)
m1_layer2 = keras.layers.Dense(4, activation='softmax')(m1_layer1)
# For model 2, pass the input layer to layer 1 and layer 1 to layer 2
m2_layer1 = keras.layers.Dense(12, activation='relu')(m2_inputs)
m2_layer2 = keras.layers.Dense(4, activation='softmax')(m2_layer1)
# Merge model outputs and define a functional model
merged = keras.layers.add([m1_layer2, m2_layer2])
model = keras.Model(inputs=[m1_inputs, m2_inputs], outputs=merged)
# Print a model summary
print(model.summary())
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 784)] 0
__________________________________________________________________________________________________
input_2 (InputLayer) [(None, 784)] 0
__________________________________________________________________________________________________
dense_5 (Dense) (None, 12) 9420 input_1[0][0]
__________________________________________________________________________________________________
dense_7 (Dense) (None, 12) 9420 input_2[0][0]
__________________________________________________________________________________________________
dense_6 (Dense) (None, 4) 52 dense_5[0][0]
__________________________________________________________________________________________________
dense_8 (Dense) (None, 4) 52 dense_7[0][0]
__________________________________________________________________________________________________
add_1 (Add) (None, 4) 0 dense_6[0][0]
dense_8[0][0]
==================================================================================================
Total params: 18,944
Trainable params: 18,944
Non-trainable params: 0
__________________________________________________________________________________________________
None
How to perform validation
Keras: A metric is a function that is used to judge the performance of your model. Metric functions are to be supplied in the metrics parameter when a model is compiled.
# Define sequential model
model = keras.Sequential()
# Define the first layer
model.add(keras.layers.Dense(32, activation='sigmoid', input_shape=(784,)))
# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))
# Set the optimizer, loss function, and metrics
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
# Add the number of epochs and the validation split
model.fit(sign_language_features, sign_language_labels, epochs=10, validation_split=0.1)
Train on 1799 samples, validate on 200 samples
Epoch 1/10
32/1799 [..............................] - ETA: 13s - loss: 1.5621 - accuracy: 0.1250
384/1799 [=====>........................] - ETA: 1s - loss: 1.3521 - accuracy: 0.3151
800/1799 [============>.................] - ETA: 0s - loss: 1.2557 - accuracy: 0.4425
1216/1799 [===================>..........] - ETA: 0s - loss: 1.1894 - accuracy: 0.5173
1792/1799 [============================>.] - ETA: 0s - loss: 1.1164 - accuracy: 0.5714
1799/1799 [==============================] - 1s 382us/sample - loss: 1.1150 - accuracy: 0.5725 - val_loss: 0.9990 - val_accuracy: 0.4700
Epoch 2/10
32/1799 [..............................] - ETA: 0s - loss: 0.8695 - accuracy: 0.6562
640/1799 [=========>....................] - ETA: 0s - loss: 0.8454 - accuracy: 0.7609
1184/1799 [==================>...........] - ETA: 0s - loss: 0.8061 - accuracy: 0.7753
1799/1799 [==============================] - 0s 97us/sample - loss: 0.7713 - accuracy: 0.7916 - val_loss: 0.6902 - val_accuracy: 0.7900
... ...
32/1799 [..............................] - ETA: 0s - loss: 0.2896 - accuracy: 0.8750
672/1799 [==========>...................] - ETA: 0s - loss: 0.2077 - accuracy: 0.9717
1312/1799 [====================>.........] - ETA: 0s - loss: 0.2016 - accuracy: 0.9748
1799/1799 [==============================] - 0s 91us/sample - loss: 0.1943 - accuracy: 0.9739 - val_loss: 0.1634 - val_accuracy: 0.9800
Epoch 9/10
32/1799 [..............................] - ETA: 0s - loss: 0.1352 - accuracy: 1.0000
672/1799 [==========>...................] - ETA: 0s - loss: 0.1700 - accuracy: 0.9747
1312/1799 [====================>.........] - ETA: 0s - loss: 0.1612 - accuracy: 0.9809
1799/1799 [==============================] - 0s 89us/sample - loss: 0.1596 - accuracy: 0.9822 - val_loss: 0.1303 - val_accuracy: 0.9950
Epoch 10/10
32/1799 [..............................] - ETA: 0s - loss: 0.1017 - accuracy: 1.0000
704/1799 [==========>...................] - ETA: 0s - loss: 0.1478 - accuracy: 0.9858
1344/1799 [=====================>........] - ETA: 0s - loss: 0.1387 - accuracy: 0.9829
1799/1799 [==============================] - 0s 88us/sample - loss: 0.1358 - accuracy: 0.9817 - val_loss: 0.1126 - val_accuracy: 0.9850
Out[1]: <tensorflow.python.keras.callbacks.History at 0x7f7aab02fc18>
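model.fit() returns the History object shown above, which keeps per-epoch metrics and is handy for spotting the overfitting discussed next. A sketch, using the variable names from the exercise:
history = model.fit(sign_language_features, sign_language_labels,
                    epochs=10, validation_split=0.1, verbose=0)
# If val_loss stops tracking loss and starts rising, suspect overfitting.
for epoch, (tr, va) in enumerate(zip(history.history['loss'],
                                     history.history['val_loss']), start=1):
    print('epoch %d: loss=%.3f val_loss=%.3f' % (epoch, tr, va))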
Overfitting
You will detect overfitting by checking whether the validation sample loss is substantially higher than the training sample loss and whether it increases with further training. With a small sample and a high learning rate, the model will struggle to converge on an optimum. You will set a low learning rate for the optimizer, which will make it easier to identify overfitting.
Excellent work! You may have noticed that the validation loss, val_loss, was substantially higher than the training loss, loss. Furthermore, if val_loss started to increase before the training process was terminated, then we may have overfitted. When this happens, you will want to try decreasing the number of epochs.
# Define a large first layer
model.add(keras.layers.Dense(1024, activation='relu', input_shape=(784,)))
# Add activation function to classifier
model.add(keras.layers.Dense(4, activation='softmax'))
# Finish the model compilation
model.compile(optimizer=keras.optimizers.Adam(lr=0.01),
              loss='categorical_crossentropy', metrics=['accuracy'])
# Complete the model fit operation
model.fit(sign_language_features, sign_language_labels, epochs=200, validation_split=0.5)
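Besides reducing the number of epochs by hand, a common remedy for overfitting is early stopping. A sketch using keras.callbacks.EarlyStopping (the patience value here is arbitrary):
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=5,
                                           restore_best_weights=True)
model.fit(sign_language_features, sign_language_labels,
          epochs=200, validation_split=0.5, callbacks=[early_stop])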
In [1]: # Evaluate the small model using the train data
small_train = small_model.evaluate(train_features, train_labels)
# Evaluate the small model using the test data
small_test = small_model.evaluate(test_features, test_labels)
# Evaluate the large model using the train data
large_train = large_model.evaluate(train_features, train_labels)
# Evaluate the large model using the test data
large_test = large_model.evaluate(test_features, test_labels)
# Print losses
print('\n Small - Train: {}, Test: {}'.format(small_train, small_test))
print('Large - Train: {}, Test: {}'.format(large_train, large_test))
32/100 [========>.....................] - ETA: 0s - loss: 0.9823
100/100 [==============================] - 0s 365us/sample - loss: 0.9452
32/100 [========>.....................] - ETA: 0s - loss: 0.9657
100/100 [==============================] - 0s 57us/sample - loss: 1.0131
32/100 [========>.....................] - ETA: 0s - loss: 0.0650
100/100 [==============================] - 0s 371us/sample - loss: 0.0487
32/100 [========>.....................] - ETA: 0s - loss: 0.1011
100/100 [==============================] - 0s 61us/sample - loss: 0.2201
Small - Train: 0.9452353072166443, Test: 1.0130866527557374
Large - Train: 0.04870099343359471, Test: 0.2201059103012085
The large model fits the training data far better, but its train-test gap is much wider, a sign of overfitting.
Estimators: high-level API
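A minimal sketch of the tf.estimator API (since deprecated in favor of Keras), with a made-up numeric feature and toy in-memory data:
import tensorflow as tf

# Hypothetical single numeric feature.
size = tf.feature_column.numeric_column('size')

def input_fn():
    # Tiny toy dataset: a features dict and integer labels.
    features = {'size': [1.0, 2.0, 3.0, 4.0]}
    labels = [0, 0, 1, 1]
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

model = tf.estimator.DNNClassifier(feature_columns=[size],
                                   hidden_units=[16, 8], n_classes=2)
model.train(input_fn, steps=10)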