Python Snippets

By | 09/07/2017

Create new data frame including only certain columns

import pandas as pd

data = pd.read_csv('data.csv')
#Select the first 22 columns and create a new dataframe
df = data[data.columns[:22]]
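
The same selection can also be written by position with iloc, or by column name. A minimal sketch, assuming a small made-up frame (the column names here are illustrative and not from any real data.csv):

```python
import pandas as pd

# Hypothetical sample frame; column names are illustrative only
data = pd.DataFrame({
    "id": [1, 2],
    "name": ["a", "b"],
    "score": [0.5, 0.7],
    "note": ["x", "y"],
})

# First two columns by position
first_two = data.iloc[:, :2]

# Specific columns by name; .copy() gives an independent frame,
# so adding columns to it later does not touch the original
by_name = data[["id", "score"]].copy()

print(list(first_two.columns))
print(list(by_name.columns))
```

Taking a copy is useful when the new dataframe will get extra columns later, since pandas may otherwise warn about assigning to a slice of another frame.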

Add new column with the difference between two dates

from datetime import datetime

#datetime.now() gives the current date and time
df['new_date_column'] = datetime.now() - pd.to_datetime(df['date_to_be_calculatd_from'])
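
The subtraction produces a timedelta column, which is often easier to use once converted to whole days. A minimal sketch, assuming a hypothetical start_date column and a fixed reference date in place of the current time so the result is repeatable:

```python
import pandas as pd

# Hypothetical data; 'start_date' is an illustrative column name
df = pd.DataFrame({"start_date": ["2017-01-01", "2017-06-30"]})

# Fixed reference timestamp (an assumption, used instead of the current time)
reference = pd.Timestamp("2017-07-09")

# Difference between the reference and each parsed date (timedelta64 column)
df["elapsed"] = reference - pd.to_datetime(df["start_date"])

# Whole days as plain integers
df["elapsed_days"] = df["elapsed"].dt.days

print(df)
```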

Reordering columns

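cols = list(df.columns.values)
cols  #inspect the current column order
#List every column name in the order you want ("..." stands for the names in between)
reindexed = ['id', ..., 'last_column_name']
reindexed_df = df.reindex(columns=reindexed)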
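
When only one column needs to move, the new order can be built from the existing columns instead of typing every name. A minimal sketch, assuming a made-up frame with an id column that should come first:

```python
import pandas as pd

# Hypothetical frame; column names are illustrative only
df = pd.DataFrame({"name": ["a"], "score": [1.0], "id": [7]})

# Put 'id' first and keep the remaining columns in their existing order
new_order = ["id"] + [c for c in df.columns if c != "id"]
reordered_df = df[new_order]

# reindex does not fail on a name that is absent from the frame;
# it adds that column filled with NaN
with_missing = df.reindex(columns=["id", "not_a_column"])

print(list(reordered_df.columns))
print(with_missing)
```

Selecting with df[new_order] raises a KeyError on a misspelled name, whereas reindex silently creates an empty column, so the bracket form can be the safer choice when the list is typed by hand.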

Create Dictionary of stats info from data frame

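import numpy as np

def createStats(data):
    #Return the mean, median and standard deviation of data as a dictionary
    stats_values = {}
    stats_values['mean_val'] = np.mean(data)
    stats_values['median'] = np.median(data)
    stats_values['standard_dev'] = np.std(data)
    return stats_values

#y is the column or array of values to summarise
stats_values = createStats(y)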
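
One detail worth checking when collecting these numbers: np.std divides by n by default (population standard deviation), while the pandas Series.std method divides by n - 1 (sample standard deviation). A minimal sketch, assuming a small made-up series y:

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only for illustration
y = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# NumPy default: ddof=0 (population standard deviation)
population_sd = np.std(y.to_numpy())

# pandas default: ddof=1 (sample standard deviation)
sample_sd = y.std()

# Passing ddof=1 to NumPy matches the pandas default
numpy_sample_sd = np.std(y.to_numpy(), ddof=1)

print(population_sd, sample_sd, numpy_sample_sd)
```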