AlphaReport

class sigtech.framework.experimental.analytics.alpha_report.alpha_report.AlphaReport

Generate a suite of plots to analyse the utility of feature(s) in contributing to a signal on target variable(s).

Example usage:

ar = sig.AlphaReport(x_df=df_features, y_df=df_targets)

# Requires an interactive Jupyter Lab/Notebook environment.
ar.interactive_report()

# Individual plots can be tailored and presented in a single, continuous cell output.
ar.configuration = [
    ('Metric Grid', {'metric': 'Correlation', 'window_min': 0, 'window_max': 1, 'shift': None,
     'relationship': 'None', 'measure': 'mean'}),
    ('Marginal', {'series': 'Feature - Good0', 'histogram': True, 'hist_bins': 20, 'kde': True, 'kde_bw': 1,
     'rugplot': True}),
]
ar.report()

# If required, the plot data can be accessed for use in your own plots/analysis.
data = ar.get_plot_data(
    'Metric Grid',
    **{'metric': 'Correlation', 'window_min': 0, 'window_max': 1, 'shift': None, 'relationship':'None',
       'measure': 'mean'}
)
Parameters
  • x_df – Series or DataFrame of feature variables that may be helpful in predicting y_df.

  • y_df – Series or DataFrame of target variables to regress against (optional).

  • configuration – Optional list of config dictionaries for each requested plot.

  • highlight – Optional list of feature columns to highlight or dict of feature columns to matplotlib colours.

report(plots=None)

Generate and display static plots.

Either pass a string (or list of strings) naming the plot(s) to draw, or specify a configuration.

  • If plots is set, a default configuration will be used.

  • The configuration can be passed to the AlphaReport constructor or set manually via the configuration member. It is expected to be a list of tuples: the first element of each tuple is a str specifying the plot name, and the second is a dict of arguments for the plotting method.

Available plot names are listed by AlphaReport.available_plots(). If required plotting arguments are missing they will be requested; they are also listed in the plots.py file.
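
The list-of-tuples shape described above can be checked before calling report(). The following is a minimal sketch in plain Python. Note that validate_configuration is a hypothetical helper and not part of AlphaReport; it checks structure only and has no knowledge of which plot names or arguments are accepted.

```python
from typing import Any, Dict, List, Tuple

PlotConfig = List[Tuple[str, Dict[str, Any]]]


def validate_configuration(configuration: Any) -> PlotConfig:
    """Check that configuration is a list of (plot_name, arguments) tuples.

    Hypothetical helper: it verifies structure only and does not know
    which plot names or arguments AlphaReport accepts.
    """
    if not isinstance(configuration, list):
        raise TypeError("configuration must be a list")
    for position, entry in enumerate(configuration):
        if not (isinstance(entry, tuple) and len(entry) == 2):
            raise TypeError(f"entry {position} must be a (plot_name, arguments) tuple")
        plot_name, arguments = entry
        if not isinstance(plot_name, str):
            raise TypeError(f"entry {position}: plot name must be a str")
        if not isinstance(arguments, dict):
            raise TypeError(f"entry {position}: arguments must be a dict")
    return configuration


# Structure check on a configuration entry taken from the example in this section.
configuration = validate_configuration([
    ('Dendrogram', {'linkage': 'ward', 'distance': 'correlation', 'variables': 'All', 'y_log': True}),
])
```

The validated list could then be assigned to ar.configuration as in the usage example.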

Example usage:

ar = sig.AlphaReport(x_df=df_features, y_df=df_targets)

# Individual plots can be tailored and presented in a single, continuous cell output.
ar.configuration = [
    ('Metric Grid', {'metric': 'Correlation', 'window': None, 'measure': None, 'shift': None,
     'relationship': 'None', 'window_min': 1, 'window_max': 10}),
    ('Dendrogram', {'linkage': 'ward', 'distance': 'correlation', 'variables': 'All', 'y_log': True}),
    ('ROC Curves', {'y_col': 'Returns', 'y_threshold': '0.5'})
]
ar.report()

# Alternatively, a plot or list of plots with default configurations can be provided.
ar.report('Dendrogram')
Parameters

plots – str or list of str representing individual plots.

classmethod available_plots() → pandas.core.frame.DataFrame

Return a DataFrame containing information about the available plots.

interactive_report()

Generate the full suite of plots and display in ipywidgets to allow user browsing and parameter choices.

Note

The method must be called from a Jupyter Lab or Notebook session in order to display the widgets.

add_metric(metric_fun: callable, name: Optional[str] = None, short_name: Optional[str] = None, min_value: Optional[float] = None, max_value: Optional[float] = None)

Add a user-defined metric to AlphaReport.

Example usage:

def my_metric(x: pd.Series, y: pd.Series, **kwargs):
    return x.min() * y.max()

ar = sig.AlphaReport(x_df=df_features, y_df=df_targets)
ar.add_metric(my_metric, 'My Metric')
ar.interactive_report()
Parameters
  • metric_fun – Metric implementation callable accepting as input one or two pandas.Series.

  • name – Display name for the metric (optional, derived from the function name if not provided).

  • short_name – Short display name for the metric (optional).

  • min_value – Minimum value accepted for the metric (optional).

  • max_value – Maximum value accepted for the metric (optional).
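
As a further illustration of a bounded metric, the sketch below computes a sign hit rate: the fraction of observations in which the feature and the target share the same sign. It iterates over values only, so it assumes x and y are already aligned element by element. The name hit_rate and the choice of bounds are illustrative, and the commented registration line assumes that min_value and max_value describe the metric's range.

```python
import math


def hit_rate(x, y, **kwargs):
    """Fraction of paired observations where x and y share the same sign.

    Assumes x and y are already aligned element by element; pairs that
    contain a NaN or a zero are skipped.
    """
    hits = 0
    total = 0
    for x_value, y_value in zip(x, y):
        if math.isnan(x_value) or math.isnan(y_value):
            continue
        if x_value == 0 or y_value == 0:
            continue
        total += 1
        if (x_value > 0) == (y_value > 0):
            hits += 1
    # With no usable pairs the metric is undefined, so return NaN.
    return hits / total if total else float('nan')


# Registration would follow the add_metric signature documented above:
# ar.add_metric(hit_rate, name='Hit Rate', short_name='HR', min_value=0.0, max_value=1.0)
```

Because the function relies only on iteration, it can be tried on plain lists before being registered.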

remove_metric(name: str)

Remove a user-defined metric from AlphaReport.

Parameters

name – Display name for the metric.

get_plot_data(plot_name: str, **kwargs)

Return the data used in the specified plot.

Keyword arguments must be passed for all selections available in the plotting tab even if they are only used for visualisation.

Example usage:

ar.get_plot_data('ROC Curves', y_col='Returns')
Parameters

plot_name – Plot name shown in the tab title.

Returns

Plotting data; the format differs for each plot.