Series Data Structure
- Series is an important data structure of pandas.
- It represents a one-dimensional array of indexed data.
- A Series object has two main components:
→ an array of actual data
→ an associated array of indexes (numeric index) or data labels (labelled index).
- Both components are one-dimensional arrays of the same length.
- The index is used to access individual data values.
Index | Data |
0 | 20 |
1 | 10 |
2 | 30 |
3 | 34 |
Index | Data |
Jan | 31 |
Feb | 23 |
Mar | 38 |
Apr | 98 |
Index | Data |
“A” | 97 |
“B” | 65 |
“C” | 43 |
“D” | 78 |
NOTE: Index labels can be integers, strings, or single characters.
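The two components described above can be inspected directly. The following is a minimal sketch that reuses the month values from the labelled-index table; the variable name s is chosen only for illustration.

```python
import pandas as pd

# Month data mirroring the labelled-index table above
s = pd.Series([31, 23, 38, 98], index=["Jan", "Feb", "Mar", "Apr"])

print(s.values)        # the data array
print(list(s.index))   # the index labels
print(s["Mar"])        # look up one value by its label
```

The data array and the index always have the same length, which is why a label such as "Mar" can be mapped to exactly one value.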
1) Creating Series Objects
1.1 Create an empty Series object by calling Series() with no parameters
This creates an empty object that has no values.
<Series object>=pandas.Series()
NOTE: The S in Series is uppercase.
For Example:
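>>> import pandas as pd
>>> obj1=pd.Series()
>>> obj1
Series([], dtype: float64)
NOTE: The dtype shown for an empty Series depends on the pandas version: older releases report float64, while pandas 2.0 and later report object.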
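Because the default dtype of an empty Series differs between pandas versions, it can help to state the dtype explicitly. This is a small sketch under that assumption; the name empty is illustrative only.

```python
import pandas as pd

# Passing dtype explicitly keeps the result the same across pandas versions
empty = pd.Series(dtype="float64")

print(empty.empty)   # True: there are no values
print(len(empty))    # 0
print(empty.dtype)   # float64
```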
1.2 Creating Non-empty Series Objects
To create a non-empty Series object, specify arguments for the data and the index using the following syntax:
<Series object>=pd.Series(data,index=idx)
where data is the data part of the Series object and idx is a sequence of index labels (for example integers or strings).
NOTE: data can be any of the following:
→ A Python sequence
→ An ndarray
→ A Python dictionary
→ A scalar value
1.2.1 Specify data as Python Sequences
<Series object> = pd.Series(<any Python sequence>)
Example 1: Sequences data - List
Program:
import pandas as pd
s1=pd.Series([4,5,6,7])
print("Series Object is : ")
print(s1)
Output:
Series Object is :
0 4
1 5
2 6
3 7
dtype: int64
Example 2: Sequences data - Tuple
Program:
import pandas as pd
s1=pd.Series((40,85,69,27))
print("Series Object is : ")
print(s1)
Output:
Series Object is :
0 40
1 85
2 69
3 27
dtype: int64
Example 3: Sequences data - Individual Characters
Program:
import pandas as pd
s1=pd.Series(['a','e','i','o','u'])
print("Series Object : ")
print(s1)
Output:
Series Object :
0 a
1 e
2 i
3 o
4 u
dtype: object
Example 4: Sequences data - String
Program:
import pandas as pd
s1=pd.Series("Pythontyro")
print("Series Object : ")
print(s1)
Output:
Series Object :
0 Pythontyro
dtype: object
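As the output above shows, a string passed directly to Series() is stored as a single value. If one element per character is wanted, the string can first be converted to a list. A minimal sketch (the names whole and chars are illustrative):

```python
import pandas as pd

whole = pd.Series("Pythontyro")        # one element holding the full string
chars = pd.Series(list("Pythontyro"))  # one element per character

print(len(whole))   # 1
print(len(chars))   # 10
print(chars[0])     # P
```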
Example 5: Series object using three different words:
Program:
import pandas as pd
s1=pd.Series(["I","class12","IP"])
print("Series Object : ")
print(s1)
Output:
Series Object :
0 I
1 class12
2 IP
dtype: object
1.2.2 Specify data as an ndarray
The data argument can be an ndarray.
Example 1: Data as an ndarray
Program:
import numpy as np
import pandas as pd
nd1=np.arange(2,30,4.5)
print("ndarray is:",nd1)
s1=pd.Series(nd1)
print("Series Object : ")
print(s1)
Output:
ndarray is: [ 2. 6.5 11. 15.5 20. 24.5 29. ]
Series Object :
0 2.0
1 6.5
2 11.0
3 15.5
4 20.0
5 24.5
6 29.0
dtype: float64
Example 2: Data as an ndarray - evenly spaced values using linspace()
Program:
import numpy as np
import pandas as pd
s1=pd.Series(np.linspace(24,64,5))
print("Series Object : ")
print(s1)
Output:
Series Object :
0 24.0
1 34.0
2 44.0
3 54.0
4 64.0
dtype: float64
Example 3: Data as an ndarray - tiling a list
#tile(A,reps): A is the input sequence and reps is the number of times it is repeated
Program:
import numpy as np
import pandas as pd
s1=pd.Series(np.tile([3,5,4],2))
print("Series Object : ")
print(s1)
Output:
Series Object :
0 3
1 5
2 4
3 3
4 5
5 4
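dtype: int32
NOTE: The default integer dtype of a NumPy array is platform dependent; the same program may report int64 on other systems.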
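The tiling example can be made independent of the platform by passing a dtype to Series(). The sketch below repeats the same list three times; the names base, tiled, and s are illustrative.

```python
import numpy as np
import pandas as pd

base = [3, 5, 4]
tiled = np.tile(base, 3)              # repeat the whole list three times
s = pd.Series(tiled, dtype="int64")   # explicit dtype, independent of platform

print(s.tolist())   # [3, 5, 4, 3, 5, 4, 3, 5, 4]
print(s.dtype)      # int64
```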
1.2.3 Specify data as a python Dictionary
Program:
#Data - Dictionary
#keys become the index and values become the data
import pandas as pd
s1=pd.Series({10:"one","python":"two"})
print("Series Object : ")
print(s1)
Output:
Series Object :
10 one
python two
dtype: object
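When a dictionary is combined with the index argument, pandas looks up each requested label among the dictionary keys. The sketch below uses a hypothetical marks dictionary to show selection, reordering, and what happens when a label is not a key.

```python
import pandas as pd

marks = {"A": 97, "B": 65, "C": 43}   # hypothetical data

s1 = pd.Series(marks)                          # keys become the index
s2 = pd.Series(marks, index=["C", "A", "Z"])   # select and reorder by key

print(list(s1.index))    # ['A', 'B', 'C']
print(s2["C"])           # 43.0
print(s2.isna()["Z"])    # True: "Z" is not a key in the dictionary
```

Note that s2 is stored as float64, because the missing entry for "Z" is represented by NaN.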
1.2.4 Specify data as a Scalar value Object
The data can be a single (scalar) value.
If data is a scalar value, provide the index argument so that pandas knows how many values to create; without it the Series holds just one value.
The scalar value is repeated to match the length of the index.
Example 1: Scalar value is 6
Program:
import pandas as pd
a=pd.Series(6,index=range(0,2))
print(a)
Output:
0 6
1 6
dtype: int64
Example 2: Scalar value - 600, index - list
Program:
import pandas as pd
a=pd.Series(600,index=['one','two','four'])
print(a)
Output:
one 600
two 600
four 600
dtype: int64
2. Creating Series Objects - Additional Functionality
2.1 Specifying/Adding NaN values in a Series Object
You can represent missing data with NaN (Not a Number) values.
Example 1 :
Program:
import pandas as pd
import numpy as np
a=pd.Series(["2",np.nan,None,'5.25','a','program'])
print(a)
NOTE: Use np.nan or None to add missing data (the older alias np.NaN was removed in NumPy 2.0)
Output:
0 2
1 NaN
2 None
3 5.25
4 a
5 program
dtype: object
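In the example above the data is of mixed type, so None is kept as-is. With purely numeric data, pandas converts None to NaN and stores the values as float64. A minimal sketch of detecting the gaps (the name s is illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([2, np.nan, None, 5.25])   # numeric data with two gaps

print(s.dtype)             # float64: None is converted to NaN
print(s.isna().tolist())   # [False, True, True, False]
print(s.count())           # 2 non-missing values
```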
2.2 Specifying index(es) as well as data with Series()
Example 1: Both data and index are None
Program:
import pandas as pd
a=pd.Series(data=None,index=None)
print(a)
(or)
import pandas as pd
a=pd.Series()
print(a)
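Output:
Series([], dtype: float64)
NOTE: Older pandas versions also show this warning: "The default dtype for empty Series will be 'object' instead of 'float64' in a future version. Specify a dtype explicitly to silence this warning." From pandas 2.0 onwards the default dtype is object and the warning no longer appears.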
Example 2: You can skip the keywords data and index and pass both arguments by position
Program:
import pandas as pd
a=[10,4,59,100]
i=['x','python',3.24,700]
a=pd.Series(a,i)
print(a)
Output:
x 10
python 4
3.24 59
700 100
dtype: int64
Example 3: If you skip the index parameter, None is taken by default and a numeric index (0, 1, 2, ...) is generated
Program:
import pandas as pd
a=[10,4,59,100]
i=['x','python',3.24,700]
a=pd.Series(a)
print(a)
Output:
0 10
1 4
2 59
3 100
dtype: int64
Example 4: The list i is passed as data, so its items become the values
Program:
import pandas as pd
a=[10,4,59,100]
i=['x','python',3.24,700]
a=pd.Series(i)
print(a)
Output:
0 x
1 python
2 3.24
3 700
dtype: object
Example 5: You may use a loop (list comprehension) to define the index sequence
Program :
import pandas as pd
a=pd.Series(range(1,10,2),index=[x for x in 'xyzpq'])
print(a)
Output:
x 1
y 3
z 5
p 7
q 9
dtype: int64
NOTE: The length of the values (5) must match the length of the index (5); otherwise pandas raises a ValueError.
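The length rule can be checked with a deliberately mismatched call. This sketch passes five values with only three labels and catches the resulting error; the name mismatch_error is introduced here only to keep the exception for inspection.

```python
import pandas as pd

mismatch_error = None
try:
    # five values but only three labels
    pd.Series(range(1, 10, 2), index=list("xyz"))
except ValueError as err:
    mismatch_error = err
    print("ValueError:", err)
```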
2.3 Specify Data Type along with data and index
Example 1:
Program:
import pandas as pd
import numpy as np
a=pd.Series(data=[100,23,65,73],index=['a','b',30,40],dtype=np.float64)
print(a)
Output:
a 100.0
b 23.0
30 65.0
40 73.0
dtype: float64
2.4 Using a Mathematical Function/Expression to Create the Data Array in Series()
Example 1:
Program:
import pandas as pd
a=[10,20,30,4]
a=pd.Series(data=a*2)
print(a)
Output:
0 10
1 20
2 30
3 4
4 10
5 20
6 30
7 4
dtype: int64
NOTE: Here a is a Python list, so a*2 repeats the list; it does not double each value.
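The result of an expression depends on the type of the data. The sketch below contrasts a Python list with a NumPy array built from the same values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = [10, 20, 30, 4]

repeated = pd.Series(values * 2)             # list * 2 repeats the list
doubled = pd.Series(np.array(values) * 2)    # ndarray * 2 doubles each element
squared = pd.Series(np.array(values) ** 2)   # any vectorised expression works

print(repeated.tolist())  # [10, 20, 30, 4, 10, 20, 30, 4]
print(doubled.tolist())   # [20, 40, 60, 8]
print(squared.tolist())   # [100, 400, 900, 16]
```

Converting the list to an ndarray first is the usual choice when element-wise arithmetic is intended.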
3. Series Object Attributes
A Series object has several attributes that describe its data and its index.
Commonly covered attributes of a Series in pandas include:
- index: The labels for the Series.
- values: The data values in the Series.
- dtype: The data type of the Series.
- name: The name of the Series.
- shape: The dimensions of the Series.
- size: The number of elements in the Series.
- ndim: The number of dimensions (always 1 for Series).
- empty: Returns True if the Series is empty.
- hasnans: Checks if the Series contains any NaN values.
- nbytes: Returns the total number of bytes consumed by the Series data in memory.
- itemsize: The number of bytes consumed by each element; in current pandas it is read through values.itemsize or dtype.itemsize rather than directly on the Series.
- T: Returns the transpose, which for a one-dimensional Series is the Series itself.
- index.name: The name of the Series index (if assigned).
A) Retrieving Index Array (index attribute) & Data Array (values attribute) of a Series object
You can access the index array and the data values array of an existing object obj1:
>>> obj1.index
Index(['jan', 'Feb', 'Mar'], dtype='object')
>>> obj1.values
array([31, 28, 31])
B) Setting Index Name
By default the index has no name, but you can set one by assigning a string to the obj1.index.name attribute.
Program:
import pandas as pd
obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])
obj1.index.name="newID"
print(obj1)
Output:
newID
a 1
b 5
c 6
d 90
dtype: int64
C) Setting Series Name
The <Series>.name attribute can be used to get or set the name of a Series object.
Program:
import pandas as pd
obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])
obj1.name='My'
print(obj1.name)
Output:
My
D) Retrieving Data Type (dtype) and Size of Each Element (itemsize)
Syntax:
<Series object>.dtype
<Series object>.values.itemsize
Program:
import pandas as pd
obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])
print("Item Size: ",obj1.values.itemsize)
print("dtype: ",obj1.dtype)
print("type() function: ",type(obj1))
Output:
Item Size: 8
dtype: int64
type() function: <class 'pandas.core.series.Series'>
E) Retrieving Shape (shape attribute)
Program:
import pandas as pd
obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])
print("Shape: ",obj1.shape)
Output:
Shape: (4,)
F) Retrieving Dimension (number of axes: ndim attribute), Size (size attribute) and Number of Bytes (nbytes attribute)
Program:
import pandas as pd
obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])
print("NDIM: ",obj1.ndim)
print("SIZE: ",obj1.size)
print("NBYTES: ",obj1.nbytes)
Output:
NDIM: 1
SIZE: 4
NBYTES: 32
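The three numbers are related: nbytes equals the number of elements multiplied by the size of one element. A small sketch, assuming an int64 Series (the dtype is passed explicitly so the arithmetic holds on any platform):

```python
import pandas as pd

obj1 = pd.Series([1, 5, 6, 90], index=["a", "b", "c", "d"], dtype="int64")

item_size = obj1.dtype.itemsize   # bytes per element (8 for int64)
print(item_size)                  # 8
print(obj1.size * item_size)      # 32
print(obj1.nbytes)                # 32
```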
G) Checking for an Empty Series and NaN Values (empty and hasnans attributes)
Program:
import pandas as pd
obj1=pd.Series([1,5,6,90],index=['a','b','c','d'])
print("EMPTY: ",obj1.empty)
print("HASNANS: ",obj1.hasnans)
print("Count(): ",obj1.count())
print("len: ",len(obj1))
Output:
EMPTY: False
HASNANS: False
Count(): 4
len: 4
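In the program above count() and len() agree because no value is missing. They differ once a NaN is present, as this sketch with a hypothetical obj2 shows.

```python
import numpy as np
import pandas as pd

obj2 = pd.Series([1, np.nan, 6, 90], index=["a", "b", "c", "d"])

print(obj2.hasnans)   # True
print(obj2.count())   # 3: NaN is not counted
print(len(obj2))      # 4: every position is counted
print(obj2.empty)     # False
```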