Python/Pandas
Pandas: A Short Guide for Advanced Users
MultiIndex (Hierarchical Indexing)
import pandas as pd
# Create a DataFrame with a MultiIndex
arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)
# Access data through the MultiIndex (pass the full key as a tuple)
df.loc[('A', 'one')]
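Selecting by a full key is only one way to address a MultiIndex. The sketch below shows partial selection on the outer level, a cross-section on the inner level, and a single-cell lookup. The name `df_mi` is chosen here for illustration and does not come from the original text.

```python
import pandas as pd

arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=('first', 'second'))
df_mi = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)

# All rows under the outer label 'A'; the 'first' level is dropped from the result
rows_a = df_mi.loc['A']

# Cross-section on the inner level: every row whose 'second' label is 'one'
ones = df_mi.xs('one', level='second')

# A single cell, addressed by the full index tuple plus the column name
cell = df_mi.loc[('B', 'two'), 'value']
```

`xs` is convenient when the level of interest is not the outermost one, since plain `loc` matches labels from the outer level inward.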
Reshaping with `stack` and `unstack`
# Unstack the MultiIndex DataFrame: move the 'second' level into the columns
unstacked = df.unstack('second')
# Stack it back: the 'second' column level returns to the row index
unstacked.stack()
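To make the round trip concrete, the following minimal sketch builds a small long-format frame, unstacks it to wide format, and stacks it back. The names `long_df`, `wide`, and `restored` are illustrative assumptions. Some pandas 2.x releases may print a `FutureWarning` about the `stack` implementation; the result below is unaffected.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['A', 'B'], ['one', 'two']], names=('first', 'second')
)
long_df = pd.DataFrame({'value': [10, 20, 30, 40]}, index=index)

# Wide format: one row per 'first' label, one column per 'second' label
wide = long_df.unstack('second')

# Back to long format: the 'second' level moves from the columns to the index
restored = wide.stack('second')
```

After `unstack`, the columns form a two-level index such as `('value', 'one')`, which is why a single cell is addressed with a tuple.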
Time Series Analysis
import numpy as np
# Time series DataFrame
date_rng = pd.date_range(start='2024-01-01', periods=5, freq='D')
df = pd.DataFrame(date_rng, columns=['date']).set_index('date')
df['data'] = np.random.randint(0, 100, size=len(date_rng))
# Resample to weekly totals and compute a 3-day rolling mean
df.resample('W').sum()
df['rolling_mean'] = df['data'].rolling(window=3).mean()
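Because the snippet above uses random data, its output changes on every run. The sketch below repeats the same two operations on fixed values so the results can be checked by hand. The names `ts`, `weekly`, and `rolling` are illustrative.

```python
import pandas as pd

# Ten consecutive days starting on Monday 2024-01-01, with values 1..10
dates = pd.date_range(start='2024-01-01', periods=10, freq='D')
ts = pd.DataFrame({'data': range(1, 11)}, index=dates)

# 'W' bins end on Sunday, so the first bin covers 1 to 7 January
weekly = ts['data'].resample('W').sum()

# The first two positions have fewer than three observations and are NaN
rolling = ts['data'].rolling(window=3).mean()
```

Here `weekly` holds 28 for the week ending 2024-01-07 and 27 for the partial week that follows.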
Categorical Data
# Create and use categorical data
df['category'] = pd.Categorical(['A', 'B', 'A', 'C', 'B'], categories=['A', 'B', 'C'])
df['category'].cat.codes
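Categoricals can also carry an order, which enables comparisons between labels. The following is a small sketch with made-up size labels; none of these names or values appear in the original.

```python
import pandas as pd

sizes = pd.Series(pd.Categorical(
    ['small', 'large', 'medium', 'small'],
    categories=['small', 'medium', 'large'],
    ordered=True,
))

# Integer codes follow the position of each label in `categories`
codes = sizes.cat.codes.tolist()

# Ordered categoricals support comparisons against a category label
at_least_medium = sizes[sizes >= 'medium'].tolist()

# Counts per category
counts = sizes.value_counts()
```

Storing repeated labels as small integer codes is also what makes categoricals memory-efficient for columns with few distinct values.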
Method Chaining with `pipe`
# Custom functions and method chaining
def add_ten(df):
    # Return a modified copy so the caller's DataFrame is not mutated
    return df.assign(value=df['value'] + 10)
df.pipe(add_ten).pipe(lambda df: df[df['value'] > 12])
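`pipe` also forwards extra arguments to the function, which avoids a lambda when a step needs parameters. A minimal sketch, where `keep_above` and the `column` parameter are hypothetical helpers introduced for illustration:

```python
import pandas as pd

def add_ten(frame, column='value'):
    # assign() returns a new DataFrame and leaves `frame` untouched
    return frame.assign(**{column: frame[column] + 10})

def keep_above(frame, threshold, column='value'):
    # Keep only rows whose value exceeds the threshold
    return frame[frame[column] > threshold]

raw = pd.DataFrame({'value': [1, 2, 3, 4]})

# Keyword arguments after the function are passed through by pipe
result = raw.pipe(add_ten).pipe(keep_above, threshold=12)
```

Keeping each step free of side effects means `raw` can be reused or re-piped without surprises.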
Querying with `query` and `eval`
# Filter rows and evaluate expressions written as strings
df.query('A > 2 & B < 14')
df['D'] = df.eval('A + B + C')
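The snippet above assumes a DataFrame with columns `A`, `B`, and `C`. The sketch below supplies such a frame with invented values and also shows how a local Python variable can be referenced inside `query` with the `@` prefix.

```python
import pandas as pd

frame = pd.DataFrame({
    'A': [1, 3, 5, 7],
    'B': [10, 12, 14, 16],
    'C': [100, 200, 300, 400],
})

limit = 14

# '@limit' refers to the local variable, not to a column
selected = frame.query('A > 2 and B < @limit')

# eval computes a column-wise expression from a string
frame['D'] = frame.eval('A + B + C')
```

Only the row with `A == 3` satisfies both conditions, so `selected` contains a single row.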
Vectorized String Operations
# String operations through the `.str` accessor
df['text_upper'] = df['text'].str.upper()
df['text_split'] = df['text'].str.split()
df['contains_pandas'] = df['text'].str.contains('Pandas')
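Real text columns often contain missing values and inconsistent casing. The sketch below, built on invented sample strings, shows how the `.str` methods propagate missing values and how `case` and `na` control `str.contains`.

```python
import pandas as pd

texts = pd.DataFrame({'text': ['Pandas is fast', 'numpy arrays', None]})

# Missing entries stay missing instead of raising an error
texts['text_upper'] = texts['text'].str.upper()

# Split on whitespace, then count the resulting words
texts['word_count'] = texts['text'].str.split().str.len()

# Case-insensitive match; na=False treats missing text as "no match"
texts['contains_pandas'] = texts['text'].str.contains('pandas', case=False, na=False)
```

Without `na=False`, the missing row would yield a missing value rather than `False`, which can break later boolean filtering.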
Handling Large DataFrames with `Dask`
import dask.dataframe as dd
# Build a Dask DataFrame from a CSV file
ddf = dd.read_csv('large_dataset.csv')
ddf.groupby('column_name').sum().compute()
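Dask splits the file into partitions, aggregates each one, and combines the partial results when `compute()` is called. The same split-then-combine idea can be sketched with plain pandas and `chunksize`. This is not Dask and runs sequentially; the `amount` column and the inline CSV text are assumptions made for illustration.

```python
import io
import pandas as pd

# Stand-in for a large file on disk
csv_text = "column_name,amount\na,1\nb,2\na,3\nb,4\na,5\n"

partials = []
# Read two rows at a time and aggregate each chunk separately
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    partials.append(chunk.groupby('column_name')['amount'].sum())

# Combine the per-chunk sums into final totals per group
totals = pd.concat(partials).groupby(level=0).sum()
```

This only works directly for aggregations that can be merged from partial results, such as sums and counts; a mean needs the sum and count carried separately.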
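Efficient Data I/O
# Write the DataFrame to files in different formats
df.to_csv('data.csv')
# Writing .xlsx requires an Excel engine such as openpyxl
df.to_excel('data.xlsx')
# sqlalchemy_engine is assumed to be an existing SQLAlchemy engine
df.to_sql('table_name', con=sqlalchemy_engine)
# Parquet requires pyarrow or fastparquet
df.to_parquet('data.parquet')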
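A quick way to check that a format preserves the data is to write it and read it back. The sketch below does this for CSV and SQL using only the standard library alongside pandas: an in-memory text buffer and an in-memory SQLite connection stand in for a real file and a real database engine.

```python
import io
import sqlite3
import pandas as pd

frame = pd.DataFrame({'id': [1, 2, 3], 'name': ['a', 'b', 'c']})

# CSV round trip through an in-memory buffer; index=False avoids an extra column
buffer = io.StringIO()
frame.to_csv(buffer, index=False)
buffer.seek(0)
from_csv = pd.read_csv(buffer)

# SQL round trip through an in-memory SQLite database
conn = sqlite3.connect(':memory:')
frame.to_sql('table_name', con=conn, index=False)
from_sql = pd.read_sql('SELECT * FROM table_name', conn)
conn.close()
```

Omitting `index=False` writes the row index as an additional column, which then reappears as data on the way back in.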
Sparse DataFrames
# Create a sparse DataFrame
df = pd.DataFrame({'A': [0, 1, 0, 0, 5], 'B': [0, 0, 3, 0, 0]}).astype(pd.SparseDtype(int, fill_value=0))
df.memory_usage(deep=True)
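With only five rows the saving from a sparse dtype is hard to see. The sketch below uses a larger, mostly-zero column (sizes chosen arbitrarily for illustration) and compares the memory footprint of the dense and sparse versions.

```python
import numpy as np
import pandas as pd

# 1000 values, of which only every 100th is non-zero
values = np.zeros(1000, dtype='int64')
values[::100] = 1
dense = pd.DataFrame({'A': values})

# Store only the non-zero entries; zeros are implied by fill_value
sparse = dense.astype(pd.SparseDtype('int64', fill_value=0))

dense_bytes = dense.memory_usage(deep=True).sum()
sparse_bytes = sparse.memory_usage(deep=True).sum()

# Fraction of entries that are actually stored
density = sparse['A'].sparse.density
```

The lower the density, the larger the saving; for columns where most values differ from the fill value, a sparse dtype can cost more than the dense one.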
Custom Transformations with `apply`
# Apply a custom function to each row
def custom_transformation(row): return row['A'] * row['B']
df['new_column'] = df.apply(custom_transformation, axis=1)
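Row-wise `apply` calls the Python function once per row, so it is worth checking whether the same result can be expressed on whole columns. A small sketch with invented values, comparing both forms:

```python
import pandas as pd

frame = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

def custom_transformation(row):
    # Called once for every row when axis=1
    return row['A'] * row['B']

by_apply = frame.apply(custom_transformation, axis=1)

# Same result computed on whole columns at once
vectorized = frame['A'] * frame['B']
```

For simple arithmetic the column-wise form is usually much faster; `apply` remains useful when the logic cannot be written as column operations.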
Data Profiling with `pandas_profiling` (now `ydata_profiling`)
# The pandas_profiling package has been renamed to ydata_profiling
from ydata_profiling import ProfileReport
# Create a profiling report
profile = ProfileReport(df, title="Pandas Profiling Report")
profile.to_file("report.html")
Multiple Aggregations with `GroupBy`
# Group by a column and apply several aggregations per column
df.groupby('category').agg({'value1': ['sum', 'mean'], 'value2': ['max', 'min']})
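The dictionary form above produces two-level column names. Named aggregation is an alternative that yields flat, self-describing columns. The sketch below uses invented data with the same column names as the snippet above.

```python
import pandas as pd

frame = pd.DataFrame({
    'category': ['x', 'x', 'y'],
    'value1': [1, 2, 3],
    'value2': [10, 20, 30],
})

# Each keyword becomes an output column: name=(source column, function)
summary = frame.groupby('category').agg(
    value1_sum=('value1', 'sum'),
    value1_mean=('value1', 'mean'),
    value2_max=('value2', 'max'),
    value2_min=('value2', 'min'),
)
```

Flat column names are easier to reference later, for example `summary['value1_mean']`, than tuples such as `('value1', 'mean')`.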
Advanced Plotting with `plot`
# Create a bar chart and a histogram (requires matplotlib)
df.plot(kind='bar', x='category', y='values')
df['values'].plot(kind='hist', bins=5)
Custom Rolling Window Functions
# Apply a custom function to each rolling window
def custom_rolling_func(x): return np.sum(x) * 0.5
df['custom_rolling'] = df['value'].rolling(window=3).apply(custom_rolling_func)
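The sketch below runs the same kind of custom window function on fixed values and compares it with the equivalent built-in computation. Passing `raw=True` hands each window to the function as a NumPy array instead of a Series, which is typically faster. The names `half_sum`, `custom`, and `builtin` are illustrative.

```python
import numpy as np
import pandas as pd

values = pd.Series([2.0, 4.0, 6.0, 8.0])

def half_sum(window):
    # `window` is a NumPy array because raw=True is used below
    return np.sum(window) * 0.5

custom = values.rolling(window=3).apply(half_sum, raw=True)

# Same result using the built-in rolling sum
builtin = values.rolling(window=3).sum() * 0.5
```

When a built-in rolling method can express the calculation, it is generally preferable; `apply` is the fallback for logic that has no built-in equivalent.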
Vectorized Operations with `NumPy`
import numpy as np
# Element-wise operations with NumPy
df['C'] = np.multiply(df['A'], df['B'])
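NumPy functions accept pandas columns directly, which also covers conditional logic. A minimal sketch with invented values, where the `label` column and the threshold of 9 are assumptions for illustration:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Element-wise product of two columns
frame['C'] = np.multiply(frame['A'], frame['B'])

# Vectorized if/else: choose a label per row without a Python loop
frame['label'] = np.where(frame['C'] > 9, 'high', 'low')
```

`np.where` evaluates the condition for the whole column at once, so it replaces a row-wise `apply` for simple two-way branching.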