A Short Guide to NumPy Array Operations

Introducing Numpy Arrays:

NumPy is a Python library for creating, modifying, and working with arrays. It is a large and powerful library that is essential for data analysis, and its basic routines are easy to learn. NumPy is widely used for scientific computing because it handles large amounts of data efficiently in array form. Much of it is written in C, which makes its operations fast. It also offers a large collection of mathematical functions and random number generators. In this notebook, I demonstrate some standard NumPy operations for working with matrices. These are the five functions covered:

  • np.matmul
  • np.linalg.solve
  • np.column_stack
  • np.bmat
  • np.log

The demonstration covers the syntax of each function and how to pass arguments to it. It also shows the cases in which each function raises an error and how to fix those errors.

Let's begin by importing Numpy and listing out the functions covered in this notebook.

import numpy as np
function1 = np.matmul 
function2 = np.linalg.solve
function3 = np.column_stack
function4 = np.bmat
function5 = np.log

Function 1 - np.matmul

The np.matmul function performs matrix multiplication on arrays. The @ operator is shorthand for it.
The function has the following syntax:
np.matmul(a, b), where a and b are the arrays to be multiplied.

arr1 = [[15, 24], 
        [33, 4.]]

arr2 = [[5, 16, 37], 
        [18, 9, 20]]

mult=np.matmul(arr1,arr2)
print(mult)
[[ 507.  456. 1035.]
 [ 237.  564. 1301.]]

Explanation: np.matmul takes the two 2-D arrays arr1 and arr2 and multiplies them by matrix multiplication. For matrix multiplication to be defined, the second dimension of arr1 must equal the first dimension of arr2. Here arr1 has shape (2, 2) and arr2 has shape (2, 3), so the condition is satisfied and the result has shape (2, 3). If the dimensions do not satisfy this condition, we get a ValueError telling us there is a mismatch in dimensions.
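As a minimal sketch of the shape rule described above (the names product and same_product are illustrative and not part of the original notebook), the shapes can be inspected before multiplying, and the @ operator can be compared against np.matmul:

```python
import numpy as np

arr1 = np.array([[15, 24],
                 [33, 4.]])
arr2 = np.array([[5, 16, 37],
                 [18, 9, 20]])

# The column count of arr1 (2) must equal the row count of arr2 (2).
print(arr1.shape, arr2.shape)   # (2, 2) (2, 3)

product = np.matmul(arr1, arr2)
same_product = arr1 @ arr2      # @ is shorthand for np.matmul on arrays

print(product.shape)            # (2, 3)
print(np.array_equal(product, same_product))  # True
```

Converting the nested lists to arrays first is optional for np.matmul, but it makes the .shape attribute available for checking.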

np.matmul([2j, 3j], [2j, 3j])
(-13+0j)

When both arguments are 1-D vectors, np.matmul returns their inner (dot) product as a scalar. Note that it does not conjugate complex inputs. Here both vectors are purely imaginary, so each product is real: (2j)(2j) + (3j)(3j) = -4 - 9 = -13. The result still has a complex type, which is why the imaginary part is shown as 0j.

arr1 = [[15,24,32], 
        [3, 4.,12]]

arr2 = [[5, 6, 7], 
        [8, 9, 10]]

np.matmul(arr1,arr2)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_36/2247344183.py in <module>
      6         [8, 9, 10]]
      7 
----> 8 np.matmul(arr1,arr2)

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 2 is different from 3)

It breaks because both arr1 and arr2 have shape (2, 3). The second dimension of arr1 (3) does not equal the first dimension of arr2 (2), so the matrix product is undefined.
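One possible fix is sketched below. It assumes the intended product was arr1 times the transpose of arr2; the original notebook does not say which product was meant, so treat this as an illustration rather than the required correction.

```python
import numpy as np

arr1 = np.array([[15, 24, 32],
                 [3, 4., 12]])    # shape (2, 3)
arr2 = np.array([[5, 6, 7],
                 [8, 9, 10]])     # shape (2, 3)

try:
    np.matmul(arr1, arr2)         # inner sizes 3 and 2 differ
except ValueError as err:
    print("matmul failed:", err)

# Transposing arr2 gives shape (3, 2), so the inner sizes now agree.
result = np.matmul(arr1, arr2.T)
print(result.shape)               # (2, 2)
```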

This function is used to multiply 2-D arrays, to compute the inner product of two vectors, and to multiply stacks of matrices stored in arrays with three or more dimensions.

Function 2 - np.linalg.solve

This function solves a system of linear equations and returns the values of the unknowns. For example, if ax = b is a system of linear equations, the function solves it for x. In mathematical terms, a is the coefficient matrix, x is the vector of unknowns, and b holds the right-hand-side values of the equations.
The function has the following syntax:
np.linalg.solve(a, b), where a is the coefficient matrix and b is the array of right-hand-side values.

a = [[5,6],
     [7,8]]
b = [12,16]
x = np.linalg.solve(a,b)
print(x)
[0. 2.]

Here we found the solution of a system of linear equations, where a is the coefficient matrix and b holds the right-hand-side values. The system is 5m + 6n = 12 and 7m + 8n = 16. The matrix a is built from the coefficients of the equations, and the output is a vector x whose two elements are the values of m and n respectively.

Example 2 - working
a = [[2,3,4],[8,4,2],[5,3,4]]
b = [12,16,18]
x = np.linalg.solve(a,b)
print(x)
[ 2.  -1.6  3.2]

The above example solves three linear equations in three variables. np.linalg.solve handles any system of n equations in n unknowns, as long as the coefficient matrix is square and non-singular.
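A quick way to check a solution is to substitute it back into the system. The sketch below does this for the three-variable example above; the name residual_ok is illustrative.

```python
import numpy as np

a = np.array([[2, 3, 4],
              [8, 4, 2],
              [5, 3, 4]])
b = np.array([12, 16, 18])

x = np.linalg.solve(a, b)

# If x is a solution, a @ x reproduces b up to floating-point rounding.
residual_ok = np.allclose(a @ x, b)
print(residual_ok)
```

np.allclose is used instead of an exact comparison because the solver works in floating point.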

a = [[1,2,3],[4,5,6],[7,8,9]]
b = [12,16,18]
x = np.linalg.solve(a,b)
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
/tmp/ipykernel_36/3857377319.py in <module>
      2 a = [[1,2,3],[4,5,6],[7,8,9]]
      3 b = [12,16,18]
----> 4 x = np.linalg.solve(a,b)

<__array_function__ internals> in solve(*args, **kwargs)

/opt/conda/lib/python3.9/site-packages/numpy/linalg/linalg.py in solve(a, b)
    391     signature = 'DD->D' if isComplexType(t) else 'dd->d'
    392     extobj = get_linalg_error_extobj(_raise_linalgerror_singular)
--> 393     r = gufunc(a, b, signature=signature, extobj=extobj)
    394 
    395     return wrap(r.astype(result_t, copy=False))

/opt/conda/lib/python3.9/site-packages/numpy/linalg/linalg.py in _raise_linalgerror_singular(err, flag)
     86 
     87 def _raise_linalgerror_singular(err, flag):
---> 88     raise LinAlgError("Singular matrix")
     89 
     90 def _raise_linalgerror_nonposdef(err, flag):

LinAlgError: Singular matrix

It breaks because np.linalg.solve requires the coefficient matrix to be invertible. The matrix here is singular (not invertible): its third row equals twice the second row minus the first. A system with a singular coefficient matrix has either no solution or infinitely many solutions, never a unique one, so the function raises LinAlgError.
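A singular matrix can be detected before calling the solver. The sketch below uses np.linalg.matrix_rank for the check and, as one possible fallback that is not part of the original notebook, np.linalg.lstsq to get a least-squares answer. That answer is an approximation, not an exact solution of the system.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)
b = np.array([12, 16, 18], dtype=float)

# A square matrix is singular when its rank is less than its size.
rank = np.linalg.matrix_rank(a)
is_singular = rank < a.shape[0]
print(rank, is_singular)          # 2 True

if is_singular:
    # Fallback (assumption): accept a least-squares answer instead.
    x, residuals, lstsq_rank, singular_values = np.linalg.lstsq(a, b, rcond=None)
else:
    x = np.linalg.solve(a, b)
print(x)
```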

This function is very useful when we need to solve n linear equations in n variables. It has many uses in mathematics, since it removes the burden of solving linear systems by manual row reduction, which is tedious and error-prone.

Function 3 - np.column_stack

np.column_stack stacks 1-D arrays as the columns of a 2-D array. The arrays are passed together as a single sequence, usually a tuple. 2-D arrays can also be included and are stacked as they are.
The function has the following syntax:
np.column_stack(tup), where tup is a sequence of 1-D or 2-D arrays.

a = np.array((4,5,6))
b = np.array((12,14,17))
c = np.column_stack((a,b))
print(c)
[[ 4 12]
 [ 5 14]
 [ 6 17]]

The two arrays a and b are stacked side by side (column-wise). The output is a 2-D array whose first column is a and whose second column is b.

a = np.array((3,34,53,72,65,81))
b = np.array((2,5,43,76,83,92))
c = np.column_stack((a,b))
print(c)
[[ 3  2]
 [34  5]
 [53 43]
 [72 76]
 [65 83]
 [81 92]]

The above example passes two longer 1-D arrays as a tuple and stacks them side by side in the same way.

a = np.array((1,2,3,4))
b = np.array((12,13,14))
c = np.column_stack((a,b))
print(c)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_36/354580763.py in <module>
      2 a = np.array((1,2,3,4))
      3 b = np.array((12,13,14))
----> 4 c = np.column_stack((a,b))
      5 print(c)

<__array_function__ internals> in column_stack(*args, **kwargs)

/opt/conda/lib/python3.9/site-packages/numpy/lib/shape_base.py in column_stack(tup)
    654             arr = array(arr, copy=False, subok=True, ndmin=2).T
    655         arrays.append(arr)
--> 656     return _nx.concatenate(arrays, 1)
    657 
    658 

<__array_function__ internals> in concatenate(*args, **kwargs)

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 4 and the array at index 1 has size 3

The function breaks because the arrays have different lengths (4 and 3). Column stacking works only when all the arrays being stacked have the same length.
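One way to make the arrays compatible is sketched below. It assumes that dropping the extra trailing elements is acceptable for the data at hand; whether truncating or padding is the right choice depends on what the values mean.

```python
import numpy as np

a = np.array((1, 2, 3, 4))
b = np.array((12, 13, 14))

# Assumption: the extra trailing elements of the longer array can be dropped.
n = min(len(a), len(b))
c = np.column_stack((a[:n], b[:n]))
print(c.shape)   # (3, 2)
```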

The function is very useful when we need to combine 1-D sequences into a table. It is similar to appending, but it appends columns, so it can also be used to add a column to an existing 2-D array.

Function 4 - np.bmat

It builds a matrix object from a string, a nested sequence, or an array.
The function has the following syntax:
np.bmat(obj, ldict, gdict), where obj is a string or array-like, and ldict and gdict are optional dictionaries.

A = np.mat('1 1; 1 1')
B = np.mat('2 2; 2 2')
C = np.mat('3 4; 5 6')
D = np.mat('7 8; 9 0')
np.bmat([[A, B], [C, D]])
matrix([[1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 4, 7, 8],
        [5, 6, 9, 0]])

The above example takes four matrices A, B, C, and D and builds a larger matrix from them: A and B are placed side by side, C and D are placed side by side, and the first pair is stacked on top of the second.

A = np.mat('12 13; 14 15')
B = np.mat('23 28; 24 22')
C = np.mat('34 45; 57 69')
D = np.mat('7 8; 9 0')
np.bmat([[B, A], [D, C]])
matrix([[23, 28, 12, 13],
        [24, 22, 14, 15],
        [ 7,  8, 34, 45],
        [ 9,  0, 57, 69]])

The above example takes the matrices A, B, C, and D and builds the result by placing the blocks in the order given in the nested list passed to the function.

A = np.mat('12 13; 14 15 18')
B = np.mat('23 28; 24 22')
C = np.mat('34 45; 57 69')
D = np.mat('7 8; 9 0 12')
np.bmat([[B, A], [D, C]])
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/tmp/ipykernel_36/1923516388.py in <module>
      1 # Example 3 - breaking (to illustrate when it breaks)
----> 2 A = np.mat('12 13; 14 15 18')
      3 B = np.mat('23 28; 24 22')
      4 C = np.mat('34 45; 57 69')
      5 D = np.mat('7 8; 9 0 12')

/opt/conda/lib/python3.9/site-packages/numpy/matrixlib/defmatrix.py in asmatrix(data, dtype)
     67 
     68     """
---> 69     return matrix(data, dtype=dtype, copy=False)
     70 
     71 

/opt/conda/lib/python3.9/site-packages/numpy/matrixlib/defmatrix.py in __new__(subtype, data, dtype, copy)
    140 
    141         if isinstance(data, str):
--> 142             data = _convert_from_string(data)
    143 
    144         # now convert data to an array

/opt/conda/lib/python3.9/site-packages/numpy/matrixlib/defmatrix.py in _convert_from_string(data)
     28             Ncols = len(newrow)
     29         elif len(newrow) != Ncols:
---> 30             raise ValueError("Rows not the same size.")
     31         count += 1
     32         newdata.append(newrow)

ValueError: Rows not the same size.

It breaks because the rows in the strings for A and D have different lengths (2 and 3 elements). Note that the error is raised by np.mat while building A, before np.bmat is ever called. Every row of a matrix must have the same number of elements, and the blocks passed to np.bmat must have compatible shapes in order to be stacked.
To fix this, remove the extra elements or fill in the shorter rows so that all rows have the same length.

The function is useful when we have several arrays and want to assemble them into one matrix, so that we can perform operations on the resulting matrix and gain insights from it.
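The same block layout can also be built with plain arrays. The sketch below uses np.block, which is not covered in the original notebook and is shown only as an alternative. It returns an ordinary ndarray rather than a matrix object, which may be preferable since the NumPy documentation no longer recommends the matrix class.

```python
import numpy as np

A = np.array([[1, 1], [1, 1]])
B = np.array([[2, 2], [2, 2]])
C = np.array([[3, 4], [5, 6]])
D = np.array([[7, 8], [9, 0]])

# Same layout as np.bmat([[A, B], [C, D]]), but the result is an ndarray.
blocked = np.block([[A, B],
                    [C, D]])
print(blocked)
```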

Function 5 - np.log

The function computes the natural logarithm (base e) of each element of an array.
The syntax for the function is:
np.log(x), where x is array-like.

np.log([1, np.e, np.e**2])
array([0., 1., 2.])

In the above example, the function computes the natural logarithm of each element and returns the results as an array.

np.log([1, 0, np.e])
/tmp/ipykernel_36/1939558749.py:2: RuntimeWarning: divide by zero encountered in log
  np.log([1,0, np.e])
array([  0., -inf,   1.])

The above example shows that np.log accepts 0 and returns -inf for it. It also issues a RuntimeWarning, because computing log(0) is reported as a divide-by-zero condition.
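If the -inf result is expected and the warning is just noise, it can be silenced locally. The sketch below uses the np.errstate context manager; the names values and logs are illustrative.

```python
import numpy as np

values = np.array([1, 0, np.e])

# Temporarily silence the divide-by-zero warning inside this block only.
with np.errstate(divide="ignore"):
    logs = np.log(values)

print(logs)
```

Outside the with block, NumPy's warning behaviour is unchanged.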

np.log([1, np.e, a])
/tmp/ipykernel_36/1874325113.py:2: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  np.log([1, np.e, a])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'int' object has no attribute 'log'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_36/1874325113.py in <module>
      1 # Example 3 - breaking (to illustrate when it breaks)
----> 2 np.log([1, np.e, a])

TypeError: loop of ufunc does not support argument 0 of type int which has no callable log method

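The above example breaks, but not because of a string: a is a variable, not the character 'a'. As the VisibleDeprecationWarning indicates, a holds a sequence (apparently an array left over from an earlier cell), so the list mixes scalars with a sequence. NumPy turns such a ragged list into an array of Python objects, and np.log cannot be applied to those. To fix this, pass only numbers, or a regular array of numbers.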
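Judging by the warning about ragged nested sequences, a appears to be an array left over from an earlier cell. Assuming that, and assuming the goal was to take the log of all the values together, one way to repair the call is to flatten everything into a single 1-D array first. The definition of a below is a stand-in, not taken from the failing cell.

```python
import numpy as np

a = np.array((1, 2, 3, 4))   # stand-in for the array defined earlier

# Join the scalars and the array into one flat float array, then take the log.
values = np.concatenate(([1, np.e], a)).astype(float)
logs = np.log(values)
print(logs.shape)            # (6,)
```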

This function is useful for computing the logarithms of all elements of an array at once, so that we can carry out log-based operations on the array.

Conclusion

The notebook covers the following:
  • np.matmul: computes dot products and matrix products of arrays, provided the two arrays satisfy the shape condition for matrix multiplication.
  • np.linalg.solve: solves a system of linear equations, provided the coefficient matrix is square and non-singular.
  • np.column_stack: stacks 1-D arrays column-wise into a 2-D array.
  • np.bmat: builds a matrix from strings or arrays.
  • np.log: computes the natural logarithm of each element of an array.

These functions are very useful for manipulating matrices and computing with them.
