TensorFlow Fundamentals
Notebook on fundamental concepts of TensorFlow
- Fundamental concepts of TensorFlow:
- Introduction to Tensors
- Creating tensors with tf.Variable:
- Creating random tensors:
- Shuffling the order of elements in a tensor
- Other ways to make tensors
- Turn NumPy arrays into tensors:
- Getting Information from tensors:
- Indexing tensors:
- Manipulating tensors (tensor operations):
- Matrix Multiplication in TensorFlow:
- Changing the datatype of a tensor
- Aggregating tensors
- Find the positional maximum and minimum
- Squeezing a tensor (removing all single dimensions)
- One hot encoding tensors
- More on math functions:
- Tensors and NumPy
- Using @tf.function
- Using GPUs:
- Solutions to the Exercises given in the tutorial Notebook:
Fundamental concepts of TensorFlow:
This notebook is an account of my work through the TensorFlow tutorial by Daniel Bourke on YouTube. The notebook covers key concepts of TensorFlow essential for deep learning. It also highlights key points of using the various methods of the TensorFlow library and notes the common errors we are likely to encounter while using TensorFlow. Possible fixes for these errors are also included in the notebook.
TensorFlow:
TensorFlow is Google's open-source, end-to-end machine learning library. Its basic units are tensors, which are generalizations of matrices to higher dimensions. TensorFlow speeds up tensor computations by accelerating them on GPUs/TPUs.
The other important library for scientific computing is NumPy, and TensorFlow works well with NumPy. The main difference is that TensorFlow provides higher-level functionality that lets us quickly implement even complex deep learning architectures, so we can spend more effort on experimenting and improving models rather than building neural networks from scratch. You can also wrap plain Python functions with TensorFlow (via @tf.function) to accelerate the function calls.
Concepts covered in this Notebook:
- Introduction to tensors
- Getting information from tensors
- Manipulating tensors
- Tensors and NumPy
- Using @tf.function (a way to speed up your regular Python functions)
- Using GPUs with TensorFlow (or TPUs)
- Solutions to Exercises given in the tutorial notebook.
import tensorflow as tf
print(tf.__version__)
scalar = tf.constant(7)
scalar
a_scalar_1 = tf.constant(3)
a_scalar_2 = tf.constant(4)
scalar.ndim
a_scalar_1.ndim
vector = tf.constant([10,101,11])
vector
vector.ndim
matrix = tf.constant([[2,3,4],[5,6,7],[8,9,0]])
matrix
matrix.ndim
another_matrix = tf.constant([[10.,7.,4.],[3.,2.,4.]], dtype =tf.float16)
another_matrix
another_matrix_1 = tf.constant([[10.,7.,4.],[3.,2.,4.]], dtype =tf.float32)
another_matrix_1
The difference between the two dtypes is precision: the higher the number after "float", the more precisely the values of the matrix are stored (at the cost of more memory per element).
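For instance, a quick illustrative sketch of the precision difference (the example value here is arbitrary):
pi_32 = tf.constant(3.14159265, dtype = tf.float32)
pi_16 = tf.cast(pi_32, dtype = tf.float16)
pi_32.numpy(), pi_16.numpy()  # float32 keeps roughly 7 significant decimal digits, float16 only about 3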
another_matrix.ndim
Even though another_matrix has shape (2, 3), ndim returns 2. This is because the number of entries in the shape gives the number of dimensions: here the shape has two entries, so ndim returns 2.
example_mat = tf.constant([[[1,2,3],[3,4,5]],
[[6,7,3],[3,2,4]],
[[3,2,1],[2,1,4]]])
example_mat
example_mat.ndim
So, we have created a tensor with shape (3, 2, 3). There are three entries in the shape, so ndim returns 3.
tensor = tf.constant([[[1.,0.3,0.5],
[0.2,0.5,0.9],
[3.,6.,7.]],
[[0.2,0.5,0.8],
[2.,3.5,6.7],
[4.,8.,0.]],
[[2.8,5.6,7.9],
[0.6,7.9,6.8],
[3.4,5.6,7.8]]], dtype = tf.float16)
tensor
tensor.ndim
So, now we have created a tensor with 3 dimensions.
What we have created so far:
- Scalar: a single number
- Vector: a number with direction
- Matrix: a two-dimensional array of numbers
- Tensor: an n-dimensional array of numbers
- A 0-dimensional tensor is a scalar
- A 1-dimensional tensor is a vector
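As a quick recap, a minimal sketch checking ndim for each kind (the example values are arbitrary):
recap_scalar = tf.constant(7)
recap_vector = tf.constant([10, 7])
recap_matrix = tf.constant([[1, 2], [3, 4]])
recap_tensor = tf.constant([[[1, 2], [3, 4]]])
recap_scalar.ndim, recap_vector.ndim, recap_matrix.ndim, recap_tensor.ndim  # (0, 1, 2, 3)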
changeable_tensor = tf.Variable([10,7])
unchangeable_tensor = tf.constant([10,7])
changeable_tensor, unchangeable_tensor
changeable_tensor[0] = 7 # error: item assignment isn't supported, use .assign() instead
changeable_tensor
changeable_tensor[0].assign(7)
changeable_tensor
unchangeable_tensor[0] = 7 # error: item assignment isn't supported on tensors
unchangeable_tensor[0].assign(7) # error: tf.constant tensors have no .assign() method
unchangeable_tensor
As you can see the difference between tf.Variable and tf.constant: the former is mutable, so you can change and manipulate its elements (via .assign()), while the latter creates an immutable object whose values cannot be changed.
Note: In practice you will rarely decide whether to use tf.constant or tf.Variable to create tensors, as TensorFlow does this for you. However, if in doubt, use tf.constant and change it later if needed.
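As a small sketch of what mutating a tf.Variable looks like (the values here are arbitrary):
v = tf.Variable([10, 7])
v.assign([1, 2])       # replace all the values
v[0].assign(99)        # replace a single element
v.assign_add([1, 1])   # element-wise in-place addition
v                      # ends up holding [100, 3]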
random_1 = tf.random.Generator.from_seed(42)
random_1 = random_1.normal(shape = (3,2))
another_random_1 = tf.random.Generator.from_seed(42)
another_random_1 = another_random_1.normal(shape = (3,2))
# Let's check if they are equal?
random_1, another_random_1, random_1 == another_random_1
So, the random tensors appear random, but they are in fact pseudo-random numbers. The seed acts like a starting trigger for the underlying random algorithm: specifying the seed helps us reproduce the same results, since the random generator produces the same values every time we use the same seed.
This helps when we are reproducing the same model anywhere. The parameters that a neural network learns at each step will differ if the weights get different initialization values. If we use the same seed value as a previously implemented model, we can generate the same initialization at the beginning and reproduce the exact same results.
random_2 = tf.random.Generator.from_seed(42) # seed is set for reproducing the same result
random_2 = random_2.normal(shape = (3,4))
random_2
random_3 = tf.random.Generator.from_seed(42)
random_3 = random_3.normal(shape = (4,4))
random_3
random_4 = tf.random.Generator.from_seed(21)
random_4 = random_4.normal(shape = (10,10))
random_4
random_5 = tf.random.Generator.from_seed(5)
random_5 = random_5.normal(shape = (3,3,3))
random_5
random_6 = tf.random.Generator.from_seed(6)
random_6 = random_6.normal(shape = (5,5,5))
random_6
not_shuffled = tf.constant([[10,7],
[3,4],
[2,3]])
# shuffle our non-shuffled tensor:
tf.random.shuffle(not_shuffled)
not_shuffled
tf.random.shuffle(not_shuffled)
tf.random.shuffle(not_shuffled, seed = 42)
tf.random.shuffle(not_shuffled, seed = 42)
# this kind of seed setting only works at the operation level
# we need to set a global seed as well to make this reproducible
Even though we set the same seed, the output keeps changing. Why is this happening?
Refer to the tf.random.set_seed documentation.
# Here we set the seed as global seed
tf.random.set_seed(42)
tf.random.shuffle(not_shuffled)
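A small sketch of how the global and operation-level seeds combine (as described in the tf.random.set_seed documentation) to make the shuffle fully reproducible:
tf.random.set_seed(42)                                  # global-level seed
shuffled_1 = tf.random.shuffle(not_shuffled, seed = 42) # operation-level seed
tf.random.set_seed(42)
shuffled_2 = tf.random.shuffle(not_shuffled, seed = 42)
tf.reduce_all(shuffled_1 == shuffled_2)                 # True: same order every run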
# we need to pass the arguments : shape, dtype etc.
# create a tensor of all ones
tf.ones([10,7])
tf.zeros([10,7])
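A few related creation ops, sketched with arbitrary shapes and values:
tf.ones(shape = (3, 2), dtype = tf.int32)  # all ones, as int32 instead of the default float32
tf.zeros(shape = (3, 2))                   # all zeros, default dtype float32
tf.fill(dims = (2, 2), value = 7)          # fill a tensor with any constant value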
import numpy as np
numpy_A = np.arange(1,25,dtype = np.int32)
numpy_A
# X = tf.constant(some_matrix) # capital for tensor or matrix
# y = tf.constant(vector) # non-capital for vector
A = tf.constant(numpy_A, shape = (2,3,4))
B = tf.constant(numpy_A)
A,B
A.ndim
The unmodified shape is the same as our NumPy vector's shape. If we want to change the shape when creating the tensor with tf.constant, we need to make sure the product of the dimension values equals the number of values in the original array.
So, anything we have in NumPy we can pass into a tensor.
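For example, a quick sketch of the "product of dimensions" rule re-using numpy_A (which has 24 values):
tf.constant(numpy_A, shape = (2, 3, 4))   # 2 * 3 * 4 = 24 matches the 24 values, so this works
# tf.constant(numpy_A, shape = (2, 3, 5)) # 2 * 3 * 5 = 30 != 24, so this raises an error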
numpy_C = np.arange(1,101,dtype = np.float16)
numpy_D = np.arange(1,37,dtype = np.float32)
numpy_C, numpy_D
C = tf.constant(numpy_C, shape = (10,10))
D = tf.constant(numpy_D,shape = (6,6))
C,D
C.ndim, D.ndim
Getting Information from tensors:
Attributes:
When dealing with tensors you probably want to be aware of the following attributes:
- Shape: the length of each of the dimensions of a tensor (code: tensor.shape)
- Rank: the number of tensor dimensions. A scalar has rank 0, a vector has rank 1, a matrix has rank 2, and a tensor has rank n (code: tensor.ndim)
- Axis or dimension: a particular dimension of a tensor (code: tensor[0], tensor[:,1], etc.)
- Size: the total number of items in the tensor (code: tf.size(tensor))
rank_4_tensor = tf.zeros(shape = [2,3,4,5])
rank_4_tensor
rank_4_tensor[0]
rank_4_tensor[:,1]
rank_4_tensor.shape, rank_4_tensor.ndim, tf.size(rank_4_tensor)
print("Datatype of every element", rank_4_tensor.dtype)
print("Number of dimensions (rank): ", rank_4_tensor.ndim)
print("Shape of tensor: ", rank_4_tensor.shape)
print("Elements along the 0 axis:", rank_4_tensor.shape[0])
print("Elements along the last axis:", rank_4_tensor.shape[-1])
print("Total number of elements in our tensor:",tf.size(rank_4_tensor).numpy() )
# we can put all the print statements in a function to reuse it whenever we want
def print_attributes_of_tensor(tensor_name):
    print("Datatype of every element:", tensor_name.dtype)
    print("Number of dimensions (rank):", tensor_name.ndim)
    print("Shape of tensor:", tensor_name.shape)
    print("Elements along the 0 axis:", tensor_name.shape[0])
    print("Elements along the last axis:", tensor_name.shape[-1])
    print("Total number of elements in our tensor:", tf.size(tensor_name).numpy())
print_attributes_of_tensor(rank_4_tensor)
Now, we can reuse the function to print the attributes of any tensor by passing the tensor as a function argument. We can add more print statements to display more attributes of the tensor.
some_list = [1,2,3,4]
some_list[:2]
some_list[:1]
rank_4_tensor[:2,:2,:2,:2]
rank_4_tensor.shape
rank_4_tensor[:,:1,:1,:]
rank_2_tensor = tf.constant([[10,7],
[3,4]])
rank_2_tensor
print_attributes_of_tensor(rank_2_tensor)
some_list, some_list[-1]
rank_2_tensor[:,-1]
rank_3_tensor = rank_2_tensor[..., tf.newaxis]
rank_3_tensor
We added a new dimension at the end. The "..." means "all the existing dimensions", and the new axis gets added at the end.
tf.expand_dims(rank_2_tensor, axis = -1) # "-1" means expand the final axis
# see the documentation for more details
tf.expand_dims(rank_2_tensor, axis = 0) # expand the 0-axis
tf.expand_dims(rank_2_tensor, axis = 1)
tensor = tf.constant([[10,7],
[3,4]])
tensor+10
tensor*15
tensor - 10
tensor /10
tensor
tf.math.multiply(tensor,3)
tf.add(tensor,tensor)
print(tensor)
tf.linalg.matmul(tensor, tensor) # or tf.matmul also works
tensor, tensor
tensor * tensor
tensor @ tensor
tensor.shape
X = tf.constant([[1,2],
[3,4],
[5,6]])
# create another (3,2)
Y = tf.constant([[7,8],
[9,10],
[11,12]])
X @ Y
''' This gives an error because X and Y don't satisfy
the criteria for matrix multiplication '''
tf.matmul(X,Y)
''' This gives an error because X and Y don't satisfy
the criteria for matrix multiplication '''
This fails because for two matrices to be multiplied, their dimensions must satisfy these two criteria:
- The inner dimensions must match
- The resulting matrix has the shape of the outer dimensions
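In shape terms, a quick sketch with arbitrary tensors of ones:
tf.matmul(tf.ones((3, 2)), tf.ones((2, 3))).shape  # inner dims (2, 2) match -> result shape (3, 3)
# tf.matmul(tf.ones((3, 2)), tf.ones((3, 2)))      # inner dims (2, 3) don't match -> error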
tf.reshape(Y, shape = (2,3))
tf.matmul(X,Y) # still errors: tf.reshape above returned a new tensor, it didn't change Y in place
tf.matmul(X, tf.reshape(Y, shape = (2,3)))
This works !!!
tf.matmul(tf.reshape(X, shape = (2,3)), Y)
X.shape, tf.reshape(Y, shape = (2,3)).shape
You can see that the inner dimensions now match, and the output of the matrix multiplication has the shape of the outer dimensions.
Note: Matrix Multiplication is also called the "Dot Product"
X, tf.transpose(X), tf.reshape(X, shape= (3,2))
Perform the dot product on X and Y (requires X or Y to be transposed)
tf.tensordot(tf.transpose(X), Y , axes =1 )
We can use either transpose or reshape:
tf.matmul(X, tf.transpose(Y))
tf.matmul(X, tf.reshape(Y, shape = (2,3)))
We are getting different values for the two cases above. That means tf.reshape() and tf.transpose() do not do the same thing: reshape just re-arranges the elements into the new shape, while transpose flips the axes. In some cases both may happen to produce the same output, but not always.
print("Normal Y:")
print(Y, "\n")
print("Y reshaped to (2,3):")
print(tf.reshape(Y, (2,3)),"\n")
print("Y transposed:")
print(tf.transpose(Y))
tf.matmul(X, tf.transpose(Y))
Generally, when performing matrix multiplication on two tensors, and one of the axes doesn't line up, you will transpose rather than reshape one of the tensors to satisfy the matrix multiplication rules.
B = tf.constant([1.7,7.4])
B.dtype
C = tf.constant([7,10])
C.dtype
D = tf.cast(B, dtype = tf.float16)
D, D.dtype
E = tf.cast(C, dtype = tf.float32)
E.dtype
E_float16 = tf.cast(E, dtype = tf.float16)
E_float16
D = tf.constant([-7,10])
D
tf.abs(D)
Let's go through the following forms of aggregation:
- Get the minimum
- Get the maximum
- Get the mean of a tensor
- Get the sum of a tensor
E = tf.constant(np.random.randint(0,100,size = 50))
E
tf.size(E), E.shape, E.ndim
tf.reduce_min(E)
tf.reduce_max(E)
tf.reduce_mean(E)
tf.reduce_sum(E)
Exercise: Find the variance and standard deviation of our E tensor using TensorFlow methods.
# Find the variance of our tensor
import tensorflow_probability as tfp
tfp.stats.variance(E)
tf.math.reduce_std(E)
# Error : The input must be either real or complex
# so cast it to float32
tf.math.reduce_std(tf.cast(E, dtype = tf.float32))
# The method works only if the tensor elements are either real or complex
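As an alternative sketch without tensorflow_probability, TensorFlow itself has tf.math.reduce_variance, which (like reduce_std) also needs a float input:
tf.math.reduce_variance(tf.cast(E, dtype = tf.float32))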
tf.random.set_seed(42)
F = tf.random.uniform(shape =[50])
F
tf.argmax(F)
np.argmax(F)
F[tf.argmax(F)]
assert F[tf.argmax(F)] == tf.reduce_max(F)
No error so we got it right!
F[tf.argmax(F)] == tf.reduce_max(F)
tf.argmin(F)
F[tf.argmin(F)]
tf.random.set_seed(42)
G = tf.constant(tf.random.uniform(shape = [50]), shape = (1,1,1,1,50))
G
G.shape
G_squeezed = tf.squeeze(G)
G_squeezed, G_squeezed.shape
some_list = [0,1,2,3] # could be red, green , blue , purple
# one hot encoding our list of indices
tf.one_hot(some_list, depth = 4)
tf.one_hot(some_list, depth = 4, on_value = "Yo I love deep learning", off_value = "I also like to write")
H = tf.range(1,10)
H
tf.square(H)
tf.sqrt(H)
We got an error here because tensors of dtype int32 are not allowed as arguments to the sqrt function, so we cast the tensor to a float datatype.
tf.sqrt(tf.cast(H, dtype = tf.float32))
tf.math.log(H)
We get the same error for tf.math.log too, so cast the argument tensor to one of the allowed dtypes.
tf.math.log(tf.cast(H, dtype = tf.float32))
J = tf.constant(np.array([3.,7.,10.]))
J
np.array(J), type(np.array(J))
J.numpy() , type(J.numpy())
J = tf.constant([3.])
J.numpy()[0]
numpy_J = tf.constant(np.array([3.,7.,10.]))
tensor_J = tf.constant([3.,7.,10.])
# Check the datatypes of each
numpy_J.dtype , tensor_J.dtype
We can see above that creating tensors directly in TensorFlow gives a default dtype of float32, but if we pass a NumPy array to tf.constant, the default dtype of the created tensor is float64.
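If you want float32 when starting from a NumPy array, a minimal sketch is to pass the dtype explicitly:
numpy_J_32 = tf.constant(np.array([3., 7., 10.]), dtype = tf.float32)
numpy_J_32.dtype  # tf.float32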
Using @tf.function
In your TensorFlow adventures, you might come across Python functions which have the decorator @tf.function.
In short, decorators modify a function in one way or another.
In the case of the @tf.function decorator, it turns a Python function into a callable TensorFlow graph. This is a fancy way of saying that if you've written your own Python function and decorate it with @tf.function, when you export your code (to potentially run on another device), TensorFlow will attempt to convert it into a faster version of itself (by making it part of a computation graph).
For more on this, read the Better performance with tf.function guide.
def function(x, y):
    return x ** 2 + y
x = tf.constant(np.arange(0, 10))
y = tf.constant(np.arange(10, 20))
function(x, y)
@tf.function
def tf_function(x, y):
    return x ** 2 + y
tf_function(x, y)
If you noticed no difference between the above two functions (the decorated one and the non-decorated one) you'd be right.
Much of the difference happens behind the scenes. One of the main ones being potential code speed-ups where possible.
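If you're curious, here's a rough, illustrative timing sketch (the exact numbers depend on your machine, and for a function this small the difference may be negligible):
import timeit
big_x = tf.constant(np.arange(0, 100000))
big_y = tf.constant(np.arange(100000, 200000))
print("eager:", timeit.timeit(lambda: function(big_x, big_y), number = 1000))
print("graph:", timeit.timeit(lambda: tf_function(big_x, big_y), number = 1000))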
print(tf.config.list_physical_devices('GPU'))
The PC I am working from has no GPU support.
import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))
If you've got access to a GPU, the cell above should output something like:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
You can also find information about your GPU using !nvidia-smi.
🔑 Note: If you have access to a GPU, TensorFlow will automatically use it whenever possible.
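A small sketch to check where a tensor actually gets placed (the output depends on your machine):
t = tf.constant([1.0, 2.0, 3.0])
t.device  # e.g. '/job:localhost/replica:0/task:0/device:GPU:0' if a GPU is available, otherwise a CPU device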
- Create a vector, scalar, matrix and tensor with values of your choosing using tf.constant().
A1 = tf.constant(3) # scalar
A2 = tf.constant([10, 7]) # vector
A3 = tf.constant([[10,7],
[3,4]]) # matrix
A4 = tf.constant([[[10,7,3],
[3,4,5]],
[[2,3,4],
[7,8,9]],
[[1,2,3],
[6,7,8]]]) # tensor of dimension 3
A1,A2,A3,A4
- Find the shape, rank and size of the tensors you created in 1.
tf.shape(A1), tf.size(A1), tf.rank(A1)
tf.shape(A2), tf.size(A2), tf.rank(A2)
tf.shape(A3), tf.size(A3), tf.rank(A3)
tf.shape(A4), tf.size(A4), tf.rank(A4)
- Create two tensors containing random values between 0 and 1 with shape [5, 300].
tf.random.set_seed(42)
B1 = tf.random.uniform([5,300], minval = 0, maxval = 1) # works even if we don't specify minval and maxval, since they default to 0 and 1
B2 = tf.random.uniform([5,300], minval = 0, maxval = 1)
B1,B2
- Multiply the two tensors you created in 3 using matrix multiplication.
tf.matmul(B1,tf.transpose(B2))
tf.matmul(tf.transpose(B1), B2)
- Multiply the two tensors you created in 3 using dot product.
tf.tensordot(B1,tf.transpose(B2), axes = 1)
- Create a tensor with random values between 0 and 1 with shape [224, 224, 3].
tf.random.set_seed(42)
randtensor = tf.random.uniform([224,224,3])
randtensor
- Find the min and max values of the tensor you created in 6.
tf.reduce_max(randtensor)
tf.reduce_min(randtensor)
- Create a tensor with random values of shape [1, 224, 224, 3], then squeeze it to change the shape to [224, 224, 3].
tf.random.set_seed(42)
for_squeeze = tf.random.uniform([1,224,224,3])
for_squeeze
G_squeezed = tf.squeeze(for_squeeze)
G_squeezed, G_squeezed.shape
- Create a tensor with shape [10] using your own choice of values, then find the index which has the maximum value.
tf.random.set_seed(42)
nine_ans = tf.random.uniform([10], maxval = 10,dtype = tf.int32)
nine_ans
tf.argmax(nine_ans)
- One-hot encode the tensor you created in 9.
tf.one_hot(tf.cast(nine_ans, dtype = tf.int32), depth = 10) # the cast is redundant here since nine_ans is already int32
Bibliography: