Series Data Structure (Python Pandas - 1) | Data Handling Using Pandas | Class 12 Informatics Practices | CBSE

    

     Series Data Structure


  • Series is an important data structure of pandas.
  • It represents a one-dimensional array of indexed data.
  • A Series type object has two main components:

→ an array of actual data 

→ an associated array of indexes (Numeric index) or data labels (Labelled index). 

  • Both components are one-dimensional arrays of the same length.
  • The index is used to access individual data values.

Numeric index:

Index    Data
0        20
1        10
2        30
3        34

Labelled index:

Index    Data
Jan      31
Feb      23
Mar      38
Apr      98

String index:

Index    Data
"A"      97
"B"      65
"C"      43
"D"      78

NOTE: Index labels can be integers, strings, or single characters.
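The two components described above can be inspected directly. The following is a minimal sketch that reuses the month data from the labelled-index table; the variable name s is only illustrative.

```python
import pandas as pd

# Labelled index: month names as labels, numbers as data
s = pd.Series([31, 23, 38, 98], index=["Jan", "Feb", "Mar", "Apr"])

print(s.index.tolist())   # the index labels
print(s.values.tolist())  # the data values
print(s["Mar"])           # access one value through its label
```

Both components have the same length, and a label such as "Mar" selects the matching data value.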

1) Creating Series Objects

1.1 Create an empty Series object by calling Series() with no parameters

 This creates an empty object that has no values.

       <Series object> = pandas.Series()

NOTE: The S in Series is uppercase.

For Example:

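>>> import pandas as pd

>>> obj1 = pd.Series()

>>> obj1

Series([], dtype: float64)

NOTE: pandas 2.0 and later show dtype: object here, because the default dtype of an empty Series changed from float64 to object.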
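Since the dtype of an empty Series depends on the pandas version, one option is to pass dtype explicitly. This is a small sketch, assuming float64 is the type you want; the name empty is illustrative.

```python
import pandas as pd

# Passing dtype explicitly gives the same result on old and new pandas versions
empty = pd.Series(dtype="float64")

print(empty.empty)   # True, the Series has no values
print(len(empty))    # 0
print(empty.dtype)   # float64
```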

1.2 Creating Non-empty Series Objects

  To create a non-empty Series object, specify arguments for data and index as per the following syntax:

<Series object> = pd.Series(data, index=idx)

where data is the data part of the Series object and idx is a sequence of index labels (for example integers or strings).

NOTE: data can be any of the following:

→ A Python sequence

→ An ndarray

→ A Python dictionary

→ A scalar value

1.2.1 Specify data as Python Sequences

         <Series object> = pd.Series(<any Python sequence>)

Example 1: Sequences data - List

Program:

import pandas as pd 

s1=pd.Series([4,5,6,7])

print("Series Object is : ")

print(s1)

Output:

Series Object is : 

0    4

1    5

2    6

3    7

dtype: int64

Example 2: Sequences data - Tuple

Program:

import pandas as pd 

s1=pd.Series((40,85,69,27))

print("Series Object is : ")

print(s1)

Output:

Series Object is : 

0    40

1    85

2    69

3    27

dtype: int64

Example 3: Sequence data - Individual characters

Program:

import pandas as pd 

s1=pd.Series(['a','e','i','o','u'])

print("Series Object  : ")

print(s1)

Output:

Series Object  : 

0    a

1    e

2    i

3    o

4    u

dtype: object

Example 4: String data (a string is treated as a single value, so the Series has only one element)

Program:

import pandas as pd 

s1=pd.Series("Pythontyro")

print("Series Object  : ")

print(s1)

Output:

Series Object  : 

0    Pythontyro

dtype: object
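The difference between a string and a list of characters can be seen side by side. This is a minimal sketch; the names word, whole and chars are illustrative.

```python
import pandas as pd

word = "Pythontyro"

whole = pd.Series(word)        # one element: the whole string
chars = pd.Series(list(word))  # one element per character

print(len(whole))   # 1
print(len(chars))   # 10
```

Converting the string with list() first is what turns each character into a separate data value.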

Example 5: Series object using three different words:

Program:

import pandas as pd 

s1=pd.Series(["I","class12","IP"])

print("Series Object  : ")

print(s1)

Output:

Series Object  : 

0          I

1    class12

2         IP

dtype: object

1.2.2 Specify data as an ndarray

    The data argument can be an ndarray.

Example 1: Data as an ndarray

Program:

                import numpy as np

                import pandas as pd

                nd1=np.arange(2,30,4.5)

                print("ndarray is",nd1)

                s1=pd.Series(nd1)

                print("Series Object  : ")

                print(s1)

Output:

        ndarray is [ 2.   6.5 11.  15.5 20.  24.5 29. ]

        Series Object  : 

        0     2.0

        1     6.5

        2    11.0

        3    15.5

        4    20.0

        5    24.5

        6    29.0

        dtype: float64

Example 2: Data as an ndarray: linspace interval

Program:

        import numpy as np

        import pandas as pd

        s1=pd.Series(np.linspace(24,64,5))

        print("Series Object  : ")

        print(s1)

Output:

        Series Object  : 

        0    24.0

        1    34.0

        2    44.0

        3    54.0

        4    64.0

        dtype: float64

Example 3: Data as an ndarray: tiling a list

        #np.tile(A, reps): A is the input list or array, reps is the number of times it is repeated

Program:

        import numpy as np

        import pandas as pd

        s1=pd.Series(np.tile([3,5,4],2))

        print("Series Object  : ")

        print(s1)

Output:

        Series Object  : 

        0    3

        1    5

        2    4

        3    3

        4    5

        5    4

        dtype: int32
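The integer dtype shown for an ndarray (int32 above) can differ between platforms and NumPy versions. The sketch below passes dtype explicitly so that the result is the same everywhere; choosing int64 here is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

tiled = np.tile([3, 5, 4], 2)         # the list [3, 5, 4] repeated 2 times
s = pd.Series(tiled, dtype="int64")   # fix the integer dtype explicitly

print(s.tolist())
print(s.dtype)
```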

1.2.3 Specify data as a Python dictionary

Program:

        #Data - Dictionary

        #keys become the index, values become the data

        import pandas as pd

        s1=pd.Series({10:"one","python":"two"})

        print("Series Object  : ")

        print(s1)

Output:

        Series Object  : 

        10        one

        python    two

        dtype: object
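A dictionary can also be combined with the index argument. In that case only the listed keys are picked, and a label that is not a key of the dictionary gets NaN. The following is a small sketch with made-up data; marks and the label "Z" are illustrative.

```python
import pandas as pd

marks = {"A": 97, "B": 65, "C": 43}

# "A" and "C" are keys of the dictionary, "Z" is not
s = pd.Series(marks, index=["A", "C", "Z"])

print(s)
print(pd.isna(s["Z"]))   # True, because "Z" has no value in the dictionary
```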

1.2.4 Specify data as a scalar value

The data can be a single (scalar) value.

If data is a scalar value, provide the index argument to Series().

The scalar value is repeated once for each index label.

Example 1:

Program: Scalar value is 6

import pandas as pd

a=pd.Series(6,index=range(0,2))

print(a)

Output:

0    6

1    6

dtype: int64

Example 2: Scalar value - 600, index - list

Program:

import pandas as pd

a=pd.Series(600,index=['one','two','four'])

print(a)

Output:

one     600

two     600

four    600

dtype: int64

2. Creating Series Objects - Additional Functionality

2.1 Specifying/Adding NaN values in a Series Object

    You can represent missing data with NaN (Not a Number) values.

Example 1 :

Program:

import pandas as pd

import numpy as np

a=pd.Series(["2",np.nan,None,'5.25','a','program'])

print(a)

NOTE: Use np.nan or None to add missing data. The older spelling np.NaN was removed in NumPy 2.0.

Output:

0          2

1        NaN

2       None

3       5.25

4          a

5    program

dtype: object
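Both kinds of missing value can be detected in the same way. The sketch below rebuilds the Series from the example and uses isna(), which returns True for NaN as well as for None.

```python
import numpy as np
import pandas as pd

a = pd.Series(["2", np.nan, None, "5.25", "a", "program"])

missing = a.isna()          # True where the value is NaN or None
print(missing.tolist())
print(a.hasnans)            # True if at least one value is missing
```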


2.2 Specifying index(es) as well as data with Series()

Example 1: Both data and index are None

Program:

import pandas as pd

a=pd.Series(data=None,index=None)

print(a)

            (or)

import pandas as pd

a=pd.Series()

print(a)

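Output:

Series([], dtype: float64)

pandas 1.x versions also print this warning: The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning.

In pandas 2.0 and later there is no warning and the output is Series([], dtype: object).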

Example 2: You can skip the keywords data and index and pass both arguments by position

Program:

import pandas as pd

a=[10,4,59,100]

i=['x','python',3.24,700]

a=pd.Series(a,i)

print(a)

Output:

x          10

python      4

3.24       59

700       100

dtype: int64

Example 3: If you skip the index parameter, None is taken by default and the Series gets the default integer index (0, 1, 2, ...)

Program:

import pandas as pd

a=[10,4,59,100]

i=['x','python',3.24,700]

a=pd.Series(a)

print(a)

Output:

0     10

1      4

2     59

3    100

dtype: int64

Example 4: The list i is passed as data here, so its items become values with the default integer index

Program:

import pandas as pd

a=[10,4,59,100]

i=['x','python',3.24,700]

a=pd.Series(i)

print(a)

Output

0         x

1    python

2      3.24

3       700

dtype: object

Example 5: You may use a loop (list comprehension) to define the index sequence

Program : 

import pandas as pd

a=pd.Series(range(1,10,2),index=[x for x in 'xyzpq'])

print(a)

Output:

x    1

y    3

z    5

p    7

q    9

dtype: int64

NOTE: The length of the values (5) must match the length of the index (5).
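If the two lengths differ, pandas raises a ValueError. The following sketch triggers the error on purpose with 5 values and only 3 labels; the name mismatch_error is illustrative.

```python
import pandas as pd

mismatch_error = None
try:
    # 5 values (1, 3, 5, 7, 9) but only 3 index labels
    pd.Series(range(1, 10, 2), index=["x", "y", "z"])
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)
```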

2.3 Specify Data Type along with data and index

Example 1:

Program:

import pandas as pd

import numpy as np

a=pd.Series(data=[100,23,65,73],index=['a','b',30,40],dtype=np.float64)

print(a)

Output:

a     100.0

b      23.0

30     65.0

40     73.0

dtype: float64

2.4 Using a Mathematical Function/Expression to Create the Data Array in Series()

Example 1: Here a is a Python list, so a*2 repeats the list twice instead of doubling each value

Program:

import pandas as pd

a=[10,20,30,4]

a=pd.Series(data=a*2)

print(a)

Output: 

0    10

1    20

2    30

3     4

4    10

5    20

6    30

7     4

dtype: int64
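To apply the expression to each value instead of repeating the list, the data can first be converted to an ndarray. This is a minimal sketch of the difference; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = [10, 20, 30, 4]

repeated = pd.Series(values * 2)             # list * 2 repeats the list: 8 elements
doubled = pd.Series(np.array(values) * 2)    # ndarray * 2 doubles each element
squared = pd.Series(np.array(values) ** 2)   # ndarray ** 2 squares each element

print(len(repeated))
print(doubled.tolist())
print(squared.tolist())
```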

3. Series Object Attributes

Commonly covered attributes of a Series in pandas include:

  1. index: The labels for the Series.
  2. values: The data values in the Series.
  3. dtype: The data type of the Series.
  4. name: The name of the Series.
  5. shape: A tuple giving the number of elements, for example (4,).
  6. size: The number of elements in the Series.
  7. ndim: The number of dimensions (always 1 for Series).
  8. empty: Returns True if the Series is empty.
  9. hasnans: Returns True if the Series contains any NaN values.
  10. nbytes: Returns the total number of bytes consumed by the data in the Series.
  11. itemsize: The number of bytes consumed by each element. Recent pandas versions no longer provide it on a Series; use <Series>.values.itemsize instead.
  12. T: Returns the transpose, which for a one-dimensional Series is the Series itself.
  13. index.name: The name of the Series index (if assigned).

A) Retrieving the Index Array (index attribute) and Data Array (values attribute) of a Series object

          You can access the index array and the data values array of an existing object obj1:

          >>> obj1 = pd.Series([31, 28, 31], index=['jan', 'Feb', 'Mar'])

          >>> obj1.index

Index(['jan', 'Feb', 'Mar'], dtype='object')

          >>> obj1.values

array([31, 28, 31])

B) Setting the Index Name

By default the index has no name, but you can set one by assigning a string to the obj1.index.name attribute.

    Program:

    import pandas as pd

    obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])

    obj1.index.name="newID"

    print(obj1)

    Output:

    newID

    a     1

    b     5

    c     6

    d    90

    dtype: int64    

C) Setting Series Name

The <Series>.name attribute (lowercase n) can be used to get or set the name of a Series object.

    Program:

    import pandas as pd

    obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])

    obj1.name='My'

    print(obj1.name)

    Output:

    My
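The name can also be given when the Series is created, through the name argument of Series(). The sketch below additionally uses rename() with a single string, which returns a new Series with a different name; the name "Marks" is made up for illustration.

```python
import pandas as pd

# Set the name while creating the Series
obj1 = pd.Series([1, 5, 6, 90], index=["a", "b", "c", "d"], name="My")

# rename() with a string returns a new Series; obj1 keeps its own name
renamed = obj1.rename("Marks")

print(obj1.name)
print(renamed.name)
```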

D) Retrieving Data Type (dtype) and Item Size (itemsize)

    Syntax:

    <Series>.dtype

    <Series>.values.itemsize

    Program:

    import pandas as pd

    obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])

    print("Item Size:  ",obj1.values.itemsize)

    print("dtype:  ",obj1.dtype)

    print("type() function:   ",type(obj1))

    Output

Item Size:   8

dtype:   int64

type() function:    <class 'pandas.core.series.Series'>

E) Retrieving Shape

    Program:

    import pandas as pd

    obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])

    print("Shape:  ",obj1.shape)

    Output:

    Shape:   (4,)


F) Retrieving Dimension (number of axes: ndim attribute), Size (size attribute) and Number of Bytes (nbytes attribute)

Program:

import pandas as pd

obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])

print("NDIM:  ",obj1.ndim)

print("SIZE:  ",obj1.size)

print("NBYTES:  ",obj1.nbytes)

Output:

NDIM:   1

SIZE:   4

NBYTES:   32
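The three values are related: nbytes is the number of elements multiplied by the bytes used per element. The following sketch checks this for the same data, with dtype fixed to int64 (an assumption, so that each element takes 8 bytes on every platform).

```python
import pandas as pd

obj1 = pd.Series([1, 5, 6, 90], index=["a", "b", "c", "d"], dtype="int64")

item_size = obj1.values.itemsize      # bytes per element (8 for int64)
total = obj1.size * item_size         # 4 elements * 8 bytes

print(item_size)
print(total)
print(obj1.nbytes == total)
```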

G) Checking Emptiness (empty attribute) and Presence of NaNs (hasnans attribute)

        Program:

import pandas as pd

obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])

print("EMPTY:  ",obj1.empty)

print("HASNANS:  ",obj1.hasnans)

print("Count():  ",obj1.count())

print("len:  ",len(obj1))

Output

EMPTY:   False

HASNANS:   False

Count():   4

len:   4
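In the program above count() and len() give the same result because no value is missing. The sketch below uses a Series with one NaN to show the difference: count() skips NaN values, while len() counts every element. The name obj2 is illustrative.

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([1, np.nan, 6, 90])

print("EMPTY:  ", obj2.empty)       # False, the Series has elements
print("HASNANS:  ", obj2.hasnans)   # True, one value is NaN
print("Count():  ", obj2.count())   # 3, NaN is not counted
print("len:  ", len(obj2))          # 4, all elements are counted
```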

