Pandas is a popular Python library for data analysis and manipulation. Its core data structures, Series and DataFrame, store and manipulate large datasets efficiently, with built-in support for handling missing data, merging datasets, and reshaping data. It offers a wide range of data manipulation functions, including filtering, grouping, and aggregation, and it can read and write data in formats such as CSV, Excel, and SQL.
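As a small illustration of the reading and writing support mentioned above, the sketch below round-trips a DataFrame through CSV. It writes to an in-memory buffer rather than a file on disk, which is a choice made here only to keep the example self-contained; the column values are made up for illustration.

```python
import io

import pandas as pd

# Illustrative data (not from any real dataset)
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
})

# Write the dataframe as CSV text into an in-memory buffer.
# index=False leaves the row index out of the output.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and parse the CSV text back into a dataframe.
buffer.seek(0)
restored = pd.read_csv(buffer)

print(buffer.getvalue())
print(restored)
```

Passing a file path instead of the buffer to `to_csv` and `read_csv` would write to and read from disk in the same way.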
The following code snippet demonstrates three basic data manipulation operations in Pandas: selecting columns, filtering rows, and grouping with aggregation:
import pandas as pd
# create a dataframe
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'gender': ['F', 'M', 'M'],
    'salary': [50000, 60000, 70000]
})
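# select columns: passing a list of names returns a new dataframe
print(df[['name', 'age']])
# filter rows: keep only rows where the boolean condition is True
print(df[df['age'] > 30])
# group and aggregate: mean salary for each gender
print(df.groupby('gender')['salary'].mean())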
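The missing-data, merging, and reshaping features mentioned in the introduction can be sketched in the same style. This is a minimal illustration, not a recommended workflow: the missing salary, the `departments` table, and the choice to fill the gap with the column mean are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Illustrative data; Bob's salary is deliberately missing (assumption)
people = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'gender': ['F', 'M', 'M'],
    'salary': [50000, np.nan, 70000],
})

# Hypothetical lookup table used to demonstrate merging
departments = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'department': ['Engineering', 'Sales', 'Engineering'],
})

# Missing data: replace NaN with the mean of the known salaries.
# mean() skips NaN values by default.
people['salary'] = people['salary'].fillna(people['salary'].mean())

# Merging: attach each person's department by matching on 'name'
merged = people.merge(departments, on='name', how='left')

# Reshaping: one row per department, one column per gender
pivot = merged.pivot_table(
    index='department',
    columns='gender',
    values='salary',
    aggfunc='mean',
)

print(merged)
print(pivot)
```

Filling with the mean is only one option; `dropna()` would instead discard rows with missing values, and which is appropriate depends on the data.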