margot.data.column

Module contents

class margot.data.column.BaseColumn(time_series: str, *args, **kwargs)

BaseColumn is the superclass for all Column implementations.

Generally, you will only need to extend BaseColumn if you are implementing a new data provider.

A Column represents a single time series of a symbol.

Examples of commonly used time-series are adjusted_close, open, high, low, close, and volume. However, columns can also be used to represent fundamental time-series, or time-series from alternative sources.

To implement a new type of Column, you must implement the ‘fetch’ method.

Example:

class MyDataProvider(BaseColumn):

    def fetch(self, symbol):
        # get_my_dataframe is a placeholder for your provider's own loader.
        df = get_my_dataframe(symbol)
        return self.clean(df)

You may also need to perform additional cleaning of the data. You can do this by extending the clean() method. Don’t forget to call super().clean(df).

Example:

class MyDataProvider(BaseColumn):

    def clean(self, df):
        df = df.rename(mapper={
            'Open': 'open',
            'High': 'high',
            'Low': 'low',
            'Close': 'close',
        }, axis='columns')

        return super().clean(df)

When using a subclass of BaseColumn, users are expected to specify at least the time_series that they want to access.

Parameters

time_series (str) – the name of the time_series that will be returned.

get_label()

Return the label for this column.

clone()

Return a new instance of oneself.

setup(symbol: str, env: dict = {})

Set up the column.

Called by the Symbol so that the symbol name can be passed.

clean(df)

Clean the data.

load_or_fetch_series(symbol: str)

Load or fetch the DataFrame, then return the series.

In order to return the time-series, first determine if we have it and can return it, or if we need to fetch it.

TODO: Test for up-to-dateness (or maybe that happens in Symbol)?

Parameters

symbol (str) – the name of the symbol to fetch.

Returns

time-series of the column

Return type

pd.Series
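The load-or-fetch decision described above can be sketched roughly as follows. This is only an illustration of the pattern, not margot's implementation: CachingColumn, its in-memory _store dict, and the fetch_calls counter are hypothetical names, and plain lists stand in for pandas objects so that the example runs without dependencies.

```python
from typing import Dict, List


class CachingColumn:
    """Hypothetical stand-in illustrating load-or-fetch; not margot's BaseColumn."""

    def __init__(self, time_series: str):
        self.time_series = time_series
        # Assumption: an in-memory dict replaces whatever storage load()/save() use.
        self._store: Dict[str, Dict[str, List[float]]] = {}
        self.fetch_calls = 0

    def fetch(self, symbol: str) -> Dict[str, List[float]]:
        # Stand-in for a slow call to a data provider.
        self.fetch_calls += 1
        return {"open": [10.0, 10.5], "close": [10.4, 11.0]}

    def load(self, symbol: str) -> Dict[str, List[float]]:
        # Raises KeyError when nothing has been saved for this symbol.
        return self._store[symbol]

    def save(self, data: Dict[str, List[float]], symbol: str) -> None:
        self._store[symbol] = data

    def load_or_fetch_series(self, symbol: str) -> List[float]:
        try:
            data = self.load(symbol)   # use the stored copy if we have one
        except KeyError:
            data = self.fetch(symbol)  # otherwise go to the source...
            self.save(data, symbol)    # ...and keep the result for next time
        return data[self.time_series]


column = CachingColumn("close")
first = column.load_or_fetch_series("SPY")   # triggers a fetch
second = column.load_or_fetch_series("SPY")  # served from the store
```

In this sketch the second call never reaches fetch(), which is the point of the pattern. It does not check whether stored data has gone stale.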

refresh()

Refresh the data from the source.

Returns

the whole dataframe (cleaned)

Return type

pd.DataFrame

load(symbol: str)

Load the previously saved data for the given symbol.

save(df, symbol)

Save the DataFrame for the given symbol.

property series

Get the data series as a pandas Series.

Returns

time series of the field

Return type

pd.Series

simulate(when)

Simulate a time in history.

Parameters

when (datetime) – (optional) used when simulating historical data, typically using margot.backtest.

property latest

Return the latest value in this series.
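One way to picture how simulate() and latest might interact is a point-in-time view that hides observations after the simulated moment. The sketch below is an assumption-laden illustration, not margot's code: PointInTimeSeries is a hypothetical class, plain lists stand in for a pandas Series, and the choice to include the observation at exactly `when` is a guess that the documentation does not confirm.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Optional


class PointInTimeSeries:
    """Hypothetical illustration of a simulated view; not margot's implementation."""

    def __init__(self, timestamps: List[datetime], values: List[float]):
        # Assumption: timestamps are sorted ascending and aligned with values.
        self._timestamps = timestamps
        self._values = values
        self._when: Optional[datetime] = None

    def simulate(self, when: Optional[datetime] = None) -> None:
        # Passing None returns to the full, un-simulated history.
        self._when = when

    @property
    def series(self) -> List[float]:
        if self._when is None:
            return list(self._values)
        # Keep only observations at or before the simulated time.
        cutoff = bisect_right(self._timestamps, self._when)
        return self._values[:cutoff]

    @property
    def latest(self) -> float:
        # Last visible value; raises IndexError if nothing is visible yet.
        return self.series[-1]


closes = PointInTimeSeries(
    [datetime(2020, 1, 2), datetime(2020, 1, 3), datetime(2020, 1, 6)],
    [10.4, 11.0, 11.3],
)
live_latest = closes.latest             # full history is visible
closes.simulate(datetime(2020, 1, 3))   # pretend it is 3 January 2020
simulated_latest = closes.latest        # later observations are hidden
```

Hiding future observations this way is what keeps a backtest from peeking at data it could not have seen at the simulated time.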