Foundation of Data Science: Unit IV: Python Libraries for Data Wrangling

Comparisons, Masks and Boolean Logic

Python Libraries for Data Wrangling

Masking means to extract, modify, count or otherwise manipulate values in an array based on some criterion.


• Boolean masking, also called Boolean indexing, is a NumPy feature for filtering the values of an array according to a condition. There are two main ways to carry out Boolean masking:

a) Method one: Returning the result array, that is, the values that satisfy the condition.

b) Method two: Returning a Boolean array that marks, element by element, whether the condition holds.
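As a minimal sketch of the two methods, the following uses a small sample array (the array and the condition x > 3 are chosen only for illustration). The comparison on its own gives the Boolean array, and indexing the original array with that Boolean array gives the result array.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# Method two: the comparison alone returns a Boolean array (the mask).
mask = x > 3

# Method one: indexing with the mask returns the result array,
# i.e. only the elements where the mask is True.
result = x[mask]

print(mask)
print(result)
```

Here the mask has the same shape as x, while the result contains only the selected values.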

Comparison operators as ufuncs

• NumPy implements the comparison operators as element-wise ufuncs, and all six of the standard comparison operations are available. The result of a comparison is always an array with a Boolean data type. For example, we might wish to count all values greater than a certain value, or remove all outliers that are above some threshold. In NumPy, Boolean masking is often the most efficient way to accomplish these types of tasks.

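import numpy as np

x = np.array([1, 2, 3, 4, 5])

print(x < 3)   # less than: [ True  True False False False]

print(x > 3)   # greater than: [False False False  True  True]

print(x <= 3)  # less than or equal: [ True  True  True False False]

print(x >= 3)  # greater than or equal: [False False  True  True  True]

print(x != 3)  # not equal: [ True  True False  True  True]

print(x == 3)  # equal: [False False  True False False]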
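The counting and outlier-removal tasks mentioned earlier might look roughly like this. The data, the variable names, and the threshold of 100 are assumptions made up for the example, not values from the course material.

```python
import numpy as np

# Hypothetical sample data; 250 plays the role of an outlier.
values = np.array([12, 7, 250, 9, 15, 3])
threshold = 100  # assumed cutoff, chosen only for illustration

# Count: count_nonzero counts the True entries of the Boolean mask.
n_large = np.count_nonzero(values > 10)

# Remove outliers: keep only the values at or below the threshold.
cleaned = values[values <= threshold]

print(n_large)
print(cleaned)
```

Both steps work on the whole array at once, so no explicit Python loop is needed.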

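• Comparison operators and their equivalent ufuncs:

Operator    Equivalent ufunc

==          np.equal

!=          np.not_equal

<           np.less

<=          np.less_equal

>           np.greater

>=          np.greater_equal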
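As a quick check of the operator and ufunc correspondence, the sketch below compares the result of the < operator with a direct call to np.less on a small sample array (the array is chosen only for illustration).

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

# The operator form and the ufunc form produce the same Boolean array.
via_operator = x < 3
via_ufunc = np.less(x, 3)
same = np.array_equal(via_operator, via_ufunc)

# The other comparison ufuncs are called the same way.
not_three = np.not_equal(x, 3)

print(same)
print(not_three)
```

The operator form is usually easier to read; the ufunc form is handy when a function object is needed.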

Boolean array:

• A Boolean array is a NumPy array whose elements are Boolean (True/False) values. Such an array can be obtained by applying a comparison operator to another NumPy array:

import numpy as np

a = np.reshape(np.arange(16), (4,4)) # create a 4x4 array of integers

print(a)

[[ 0  1  2  3]

 [ 4  5  6  7]

 [ 8  9 10 11]

 [12 13 14 15]]

large_values = (a > 10) # test which elements of a are greater than 10

print(large_values)

[[False False False False]

 [False False False False]

 [False False False  True]

 [ True  True  True  True]]

even_values = (a%2==0) # test which elements of a are even

print(even_values)

[[ True False  True False]

 [ True False  True False]

 [ True False  True False]

 [ True False  True False]]
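Once a Boolean array exists, it can be used to extract or count values. The sketch below rebuilds the same 4x4 array and mask so that it runs on its own; the names selected and count_per_row are introduced only for this illustration.

```python
import numpy as np

a = np.reshape(np.arange(16), (4, 4))
large_values = a > 10

# Indexing a 2-D array with a 2-D Boolean array returns a 1-D array
# of the selected elements, in row-major order.
selected = a[large_values]

# Summing a Boolean array treats True as 1 and False as 0, so
# axis=1 gives the number of matching elements in each row.
count_per_row = np.sum(large_values, axis=1)

print(selected)
print(count_per_row)
```

Note that the selection loses the 2-D shape, because the True entries need not form a rectangular block.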

Logical operations on boolean arrays

• Boolean arrays can be negated and combined using the element-wise logical operators ~ (NOT), & (AND), | (OR) and ^ (XOR):

b = ~(a%3 == 0) # test which elements of a are not divisible by 3

print('array a:\n{}\n'.format(a))

print('array b:\n{}'.format(b))

array a:

[[ 0  1  2  3]

 [ 4  5  6  7]

 [ 8  9 10 11]

 [12 13 14 15]]

array b:

[[False  True  True False]

 [ True  True False  True]

 [ True False  True  True]

 [False  True  True False]]
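The example above only negates a single condition. A minimal sketch of combining two conditions with & and | might look like this; the conditions and variable names are chosen only for illustration.

```python
import numpy as np

a = np.reshape(np.arange(16), (4, 4))

# AND: elements that are even and greater than 10.
# Each comparison needs parentheses, because & binds more tightly
# than the comparison operators.
even_and_large = (a % 2 == 0) & (a > 10)

# OR: elements that are divisible by 3 or less than 2.
div3_or_small = (a % 3 == 0) | (a < 2)

print(a[even_and_large])
print(a[div3_or_small])
```

The Python keywords and and or are not a substitute here: they try to evaluate the whole array as a single truth value, which raises a ValueError for arrays with more than one element.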
