Caching #59

Closed · wants to merge 3 commits
Binary file modified docs/Python/_images/articles_AGBModelVisualisation_31_0.png
Binary file modified docs/Python/_images/articles_graph_gallery_5_2.png
11 changes: 11 additions & 0 deletions docs/Python/_sources/autoapi/pdstools/ADMDatamart/index.rst.txt
@@ -249,6 +249,17 @@ Classes
:rtype: pd.DataFrame


.. py:method:: save_data(path: str = '.') -> Tuple[os.PathLike, os.PathLike]

Cache modelData and predictorData to files.

:param path: Where to place the files
:type path: str

:returns: The paths to the model and predictor data files
:rtype: (os.PathLike, os.PathLike)

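The caching contract documented above can be sketched as follows. This is an illustrative stand-in, not the pdstools implementation: the real method writes feather (Arrow) files, while this sketch writes CSV so it stays dependency-free, and the `model_rows`/`predictor_rows` inputs are hypothetical placeholders for the two DataFrames.

```python
import csv
import os
from typing import List, Tuple

def save_data_sketch(model_rows: List[list], predictor_rows: List[list],
                     path: str = ".") -> Tuple[str, str]:
    """Illustrative stand-in for ADMDatamart.save_data: cache both
    datasets under `path` and return the two file paths. The real
    method writes feather (Arrow) files; CSV keeps this sketch
    free of third-party dependencies."""
    os.makedirs(path, exist_ok=True)
    out_paths = []
    for name, rows in (("modelData", model_rows),
                       ("predictorData", predictor_rows)):
        target = os.path.join(path, f"{name}.csv")
        with open(target, "w", newline="") as f:
            csv.writer(f).writerows(rows)
        out_paths.append(target)
    return tuple(out_paths)
```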

.. py:method:: fix_pdc(df: pandas.DataFrame) -> pandas.DataFrame
:staticmethod:

49 changes: 39 additions & 10 deletions docs/Python/_sources/autoapi/pdstools/cdh_utils/index.rst.txt
@@ -26,6 +26,7 @@ Functions
pdstools.cdh_utils.readZippedFile
pdstools.cdh_utils.get_latest_file
pdstools.cdh_utils.getMatches
pdstools.cdh_utils.cache_to_file
pdstools.cdh_utils.safe_range_auc
pdstools.cdh_utils.auc_from_probs
pdstools.cdh_utils.auc_from_bincounts
@@ -40,13 +41,14 @@ Functions



.. py:function:: readDSExport(filename: Union[pandas.DataFrame, str], path: str = '.', verbose: bool = True, **kwargs) -> Union[pandas.DataFrame, polars.DataFrame]

Read a Pega dataset export file.
Can accept either a Pandas DataFrame or one of the following formats:
- .csv
- .json
- .zip (zipped json or CSV)
- .feather

It automatically infers the default file names for both model data as well as predictor data.
If you supply either 'modelData' or 'predictorData' as the 'file' argument, it will search for them.
@@ -61,25 +63,35 @@ Functions
:type path: str, default = '.'
:param verbose: Whether to print out which file will be imported
:type verbose: bool, default = True
:keyword return_pl: Whether to return a Polars DataFrame.
If False, the result is converted to Pandas.
:kwtype return_pl: bool
:keyword kwargs: Any arguments to pass to the CSV or JSON reader, from either Polars or Pandas.

:returns: The read data from the given file
:rtype: Union[pd.DataFrame, pl.DataFrame]

.. rubric:: Examples

>>> df = readDSExport(filename = 'modelData', path = './datamart')
>>> df = readDSExport(filename = 'ModelSnapshot.json', path = 'data/ADMData')

>>> df = pd.read_csv('file.csv')
>>> df = readDSExport(filename = df)

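The file-name inference described for `readDSExport` can be sketched as follows. This is a hedged illustration of the documented behavior, not pdstools' actual internals, and `find_default_export` is a hypothetical helper name.

```python
import os
from typing import Optional

def find_default_export(filename: str, path: str = ".") -> Optional[str]:
    """Sketch of the documented name inference: when `filename` is
    'modelData' or 'predictorData', search `path` for a matching
    export in one of the supported formats; otherwise treat it as
    a literal file name."""
    supported = (".csv", ".json", ".zip", ".feather")
    if filename in ("modelData", "predictorData"):
        for entry in sorted(os.listdir(path)):
            if filename.lower() in entry.lower() and entry.endswith(supported):
                return os.path.join(path, entry)
        return None  # nothing matched
    return os.path.join(path, filename)
```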
.. py:function:: import_file(file, extension, **kwargs) -> Union[polars.DataFrame, pandas.DataFrame]

Imports a file using Polars.


.. py:function:: readZippedFile(file: str, verbose: bool = False, **kwargs) -> polars.DataFrame

Read a zipped file.
Reads a dataset export file as exported and downloaded from Pega. The export
file is formatted as a zipped multi-line JSON file or CSV file
and the data is read into a Polars dataframe.

:param file: The full path to the file
:type file: str
@@ -113,6 +125,23 @@ Functions
.. py:function:: getMatches(files_dir, target)


.. py:function:: cache_to_file(df: Union[polars.DataFrame, pandas.DataFrame], path: os.PathLike, name: str) -> os.PathLike

Very simple convenience function to cache data.

Caches in feather (arrow) format for very fast reading.

:param df: The dataframe to cache
:type df: Union[pl.DataFrame, pd.DataFrame]
:param path: The location to cache the data
:type path: os.PathLike
:param name: The name to give to the file
:type name: str

:returns: The filepath to the cached file
:rtype: os.PathLike

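A minimal sketch of this convenience function, assuming only the signature documented above; the real helper serializes a DataFrame to feather (Arrow) for fast re-reads, whereas this stand-in takes a list of rows and writes CSV to avoid the pyarrow dependency.

```python
import csv
import os

def cache_to_file_sketch(rows, path, name):
    """Stand-in for cache_to_file(df, path, name) -> filepath.
    Writes `rows` under `path` with the given `name` and returns
    the resulting file path, mirroring the documented contract."""
    os.makedirs(path, exist_ok=True)
    filepath = os.path.join(path, f"{name}.csv")
    with open(filepath, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return filepath
```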

.. py:function:: safe_range_auc(auc: float) -> float

Internal helper to keep the AUC a safe number between 0.5 and 1.0.
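The guarantee above can be sketched as follows. Only the documented "safe range" contract comes from the source; the mirroring of sub-0.5 values around 0.5 and the NaN fallback are assumptions about the internals (a common convention for symmetric AUC reporting), so the real implementation may differ.

```python
import math

def safe_range_auc_sketch(auc: float) -> float:
    """Hedged sketch of safe_range_auc: the result always lies in
    [0.5, 1.0]. Assumes AUCs below 0.5 are mirrored around 0.5 and
    NaN falls back to 0.5 (assumptions, not documented behavior)."""
    if math.isnan(auc):
        return 0.5
    return 0.5 + abs(0.5 - auc)
```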
34 changes: 27 additions & 7 deletions docs/Python/_sources/autoapi/pdstools/index.rst.txt
@@ -288,6 +288,17 @@ Functions
:rtype: pd.DataFrame


.. py:method:: save_data(path: str = '.') -> Tuple[os.PathLike, os.PathLike]

Cache modelData and predictorData to files.

:param path: Where to place the files
:type path: str

:returns: The paths to the model and predictor data files
:rtype: (os.PathLike, os.PathLike)


.. py:method:: fix_pdc(df: pandas.DataFrame) -> pandas.DataFrame
:staticmethod:

@@ -437,13 +448,14 @@ Functions



.. py:function:: readDSExport(filename: Union[pandas.DataFrame, str], path: str = '.', verbose: bool = True, **kwargs) -> Union[pandas.DataFrame, polars.DataFrame]

Read a Pega dataset export file.
Can accept either a Pandas DataFrame or one of the following formats:
- .csv
- .json
- .zip (zipped json or CSV)
- .feather

It automatically infers the default file names for both model data as well as predictor data.
If you supply either 'modelData' or 'predictorData' as the 'file' argument, it will search for them.
@@ -458,14 +470,22 @@ Functions
:type path: str, default = '.'
:param verbose: Whether to print out which file will be imported
:type verbose: bool, default = True
:keyword return_pl: Whether to return a Polars DataFrame.
If False, the result is converted to Pandas.
:kwtype return_pl: bool
:keyword kwargs: Any arguments to pass to the CSV or JSON reader, from either Polars or Pandas.

:returns: The read data from the given file
:rtype: Union[pd.DataFrame, pl.DataFrame]

.. rubric:: Examples

>>> df = readDSExport(filename = 'modelData', path = './datamart')
>>> df = readDSExport(filename = 'ModelSnapshot.json', path = 'data/ADMData')

>>> df = pd.read_csv('file.csv')
>>> df = readDSExport(filename = df)


.. py:function:: CDHSample(plotting_engine='plotly', query=None)
395 changes: 197 additions & 198 deletions docs/Python/articles/AGBModelVisualisation.html

Large diffs are not rendered by default.
