Pandas - Series

  • Importing Pandas library
  • Creating series
  • Converting a list to a pandas series
  • Checking type
  • Statistical Functions

    1. count(): number of observations, excluding missing values

    2. sum(): sum of the values

    3. mean(): mean of the values

    4. median(): median of the values

    5. mode(): most frequent value(s)

    6. std(): standard deviation of the values

    7. min(): minimum value

    8. max(): maximum value

    9. value_counts(): number of occurrences of each unique value

  • Unique Values
  • Descriptive statistics
  • Sorting values
  • Saving Changes

Importing the pandas Library

In [2]:
import pandas as pd

Creating Series

- A pandas Series is a one-dimensional, labeled column of data

- \n inside a string inserts a line break; it is used below to separate the printed results
In [3]:
from pandas import Series

my_salary_series = Series([80000, 50000, 45000], index=['john', 'nancy', 'henry'])

my_salary_series
Out[3]:
john     80000
nancy    50000
henry    45000
dtype: int64
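
The same labeled Series can also be built from a dictionary, which makes the link between index labels and values explicit. A minimal sketch; the names `salaries` and `salary_series_from_dict` are introduced here for illustration only:

```python
import pandas as pd

# Dictionary keys become the index labels, dictionary values become the data
salaries = {'john': 80000, 'nancy': 50000, 'henry': 45000}
salary_series_from_dict = pd.Series(salaries)

print(salary_series_from_dict)

# Values are looked up by label, exactly as with the list-plus-index form
print(salary_series_from_dict['nancy'])
```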
In [4]:
print("\n - John's Salary is:", my_salary_series['john'])

print("\n - Salaries less than 50000: ", my_salary_series[my_salary_series<50000])

print("\n - All Salaries: ", my_salary_series.values)

print("\n - All Indexes: ", my_salary_series.index)
 - John's Salary is: 80000

 - Salaries less than 50000:  henry    45000
dtype: int64

 - All Salaries:  [80000 50000 45000]

 - All Indexes:  Index(['john', 'nancy', 'henry'], dtype='object')
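
The filter `my_salary_series[my_salary_series<50000]` used above works in two steps: the comparison builds a boolean Series, and indexing with that Series keeps only the entries marked True. A small sketch of those two steps; the intermediate names `mask` and `low_salaries` are assumptions added here for illustration:

```python
import pandas as pd

my_salary_series = pd.Series([80000, 50000, 45000], index=['john', 'nancy', 'henry'])

# Step 1: the comparison returns a boolean Series with the same index
mask = my_salary_series < 50000
print(mask)

# Step 2: indexing with the mask keeps only the entries where it is True
low_salaries = my_salary_series[mask]
print(low_salaries)
```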

Converting List to Series

In [5]:
my_list = [10,44,30,75,62,67,30]

my_series = pd.Series(my_list)

my_series
Out[5]:
0    10
1    44
2    30
3    75
4    62
5    67
6    30
dtype: int64
In [6]:
print('\n - my series all values:\n ', my_series.values)

print('\n - my series all indexes:\n ', my_series.index)

print('\n - Observation with index 3 : ', my_series[3])

print('\n - Observation with index 5 : ', my_series[5])

print('\n - Observations less than 50: \n ', my_series[my_series<50])
 - my series all values:
  [10 44 30 75 62 67 30]

 - my series all indexes:
  RangeIndex(start=0, stop=7, step=1)

 - Observation with index 3 :  75

 - Observation with index 5 :  67

 - Observations less than 50: 
  0    10
1    44
2    30
6    30
dtype: int64
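
With the default RangeIndex, `my_series[3]` looks up the label 3, which happens to coincide with position 3. The two stop coinciding once the Series is reordered, so it may help to be explicit with `.loc` (label) and `.iloc` (position). A hedged sketch; the variable names other than `my_series` are illustrative:

```python
import pandas as pd

my_series = pd.Series([10, 44, 30, 75, 62, 67, 30])

# On the original Series, label 3 and position 3 point at the same value
by_label = my_series.loc[3]
by_position = my_series.iloc[3]
print(by_label, by_position)

# After sorting, labels travel with their values but positions do not
sorted_series = my_series.sort_values(ascending=False)
label_after_sort = sorted_series.loc[3]      # still the value labeled 3
position_after_sort = sorted_series.iloc[3]  # the fourth-largest value
print(label_after_sort, position_after_sort)
```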

Checking Type

In [7]:
print("Type is: ", type(my_series))

print("Type is: ", type(my_salary_series))
Type is:  <class 'pandas.core.series.Series'>
Type is:  <class 'pandas.core.series.Series'>

Statistical Functions

1.  count(): number of observations, excluding missing values

2.  sum(): sum of the values

3.  mean(): mean of the values

4.  median(): median of the values

5.  mode(): most frequent value(s)

6.  std(): standard deviation of the values

7.  min(): minimum value

8.  max(): maximum value

9.  value_counts(): number of occurrences of each unique value

Finding Maximum

In [14]:
my_series.max()
Out[14]:
75

Finding Minimum

In [9]:
my_series.min()
Out[9]:
10

Finding Mean

In [10]:
my_series.mean()
Out[10]:
45.42857142857143

Finding Median

In [11]:
my_series.median()
Out[11]:
44.0

Finding Mode

In [12]:
my_series.mode()
Out[12]:
0    30
dtype: int64
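
mode() returns a Series rather than a single number because a dataset can have more than one most-frequent value. A minimal sketch using a made-up Series (`two_modes` is not part of the original data) alongside `my_series`:

```python
import pandas as pd

my_series = pd.Series([10, 44, 30, 75, 62, 67, 30])

# Hypothetical data in which 10 and 30 both appear twice
two_modes = pd.Series([10, 10, 30, 30, 75])
modes = two_modes.mode()
print(modes)

# When only one mode is expected, take the first element of the result
single_mode = my_series.mode().iloc[0]
print(single_mode)
```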

Finding Sum of Observations

In [95]:
my_series.sum()
Out[95]:
318

Finding Total Number of Observations

- excluding null observations (missing values)
In [13]:
my_series.count()
Out[13]:
7
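
Since `my_series` has no missing values, count() and the length of the Series agree here. The difference only shows up when a value is missing. A small sketch with a hypothetical Series, `series_with_gap`, that contains one NaN:

```python
import pandas as pd

# Hypothetical data with one missing value (NaN)
series_with_gap = pd.Series([10, 44, float('nan'), 75])

total_length = len(series_with_gap)              # counts every entry
non_missing = series_with_gap.count()            # skips the NaN
missing = series_with_gap.isna().sum()           # number of missing entries

print(total_length, non_missing, missing)
```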

Counting Occurrences of Each Unique Value

In [96]:
my_series.value_counts()
Out[96]:
30    2
67    1
62    1
44    1
75    1
10    1
dtype: int64
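
value_counts() reports how often each distinct value occurs; it does not return the number of distinct values as a single figure. For that, nunique() can be used. A short sketch contrasting the two (the name `counts` is illustrative):

```python
import pandas as pd

my_series = pd.Series([10, 44, 30, 75, 62, 67, 30])

# One row per distinct value, holding how many times it occurs
counts = my_series.value_counts()
print(counts.loc[30])

# A single number: how many distinct values there are
distinct_values = my_series.nunique()
print(distinct_values)
```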

Finding Unique Values

In [16]:
my_series.unique()
Out[16]:
array([10, 44, 30, 75, 62, 67])

Descriptive Statistics

In [15]:
my_series.describe()
Out[15]:
count     7.000000
mean     45.428571
std      23.621016
min      10.000000
25%      30.000000
50%      44.000000
75%      64.500000
max      75.000000
dtype: float64
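
Each row of describe() can be reproduced with an individual method call, which is a useful way to see what the summary contains. Note that std() computes the sample standard deviation (ddof=1) by default, and the 25%/50%/75% rows are quantiles. A sketch, with illustrative variable names:

```python
import pandas as pd

my_series = pd.Series([10, 44, 30, 75, 62, 67, 30])
summary = my_series.describe()

# The 'std' row matches std() with its default ddof=1 (sample std)
sample_std = my_series.std()
population_std = my_series.std(ddof=0)  # divides by n instead of n - 1

# The percentile rows match quantile()
q25 = my_series.quantile(0.25)
q75 = my_series.quantile(0.75)

print(summary['std'], sample_std, population_std)
print(q25, q75)
```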

Sorting my_series (default: ascending)

In [97]:
my_series.sort_values()
Out[97]:
0    10
2    30
6    30
1    44
4    62
5    67
3    75
dtype: int64

Sorting my_series (descending)

In [98]:
my_series.sort_values(ascending=False)
Out[98]:
3    75
5    67
4    62
1    44
6    30
2    30
0    10
dtype: int64

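Saving Changes (two ways)

- sort_values() returns a new, sorted Series and leaves my_series unchanged. To keep the sorted result, use one of the following:

- 1. my_series.sort_values(ascending=False, inplace=True)

- 2. my_series = my_series.sort_values(ascending=False)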

First way

In [99]:
my_series.sort_values(ascending=False, inplace=True)

my_series
Out[99]:
3    75
5    67
4    62
1    44
6    30
2    30
0    10
dtype: int64

Second way

In [100]:
my_series = my_series.sort_values(ascending=False)

my_series
Out[100]:
3    75
5    67
4    62
1    44
2    30
6    30
0    10
dtype: int64
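
To see why one of the two approaches is needed, the sketch below calls sort_values() without saving the result and checks that the original Series is untouched. It also shows that sort_index() can restore the original order afterwards, since the labels travel with the values. The names `unchanged`, `sorted_values` and `restored` are illustrative:

```python
import pandas as pd

my_series = pd.Series([10, 44, 30, 75, 62, 67, 30])

# The sorted result is discarded here, so my_series keeps its original order
my_series.sort_values(ascending=False)
unchanged = my_series.tolist()
print(unchanged)

# Reassigning keeps the sorted result
my_series = my_series.sort_values(ascending=False)
sorted_values = my_series.tolist()
print(sorted_values)

# Sorting by index label brings back the original order
restored = my_series.sort_index().tolist()
print(restored)
```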