DatabaseManager

DatabaseManager is the base class providing the methods to interact with databases.

class pandemy.DatabaseManager(container=None, engine_config=None, **kwargs)[source]

Bases: abc.ABC

Base class with functionality for managing a database.

Each database type will subclass from DatabaseManager and implement the initializer which is specific to each database type. DatabaseManager is never used on its own, but merely provides the methods to interact with the database to its subclasses.

Initialization of a DatabaseManager creates the connection string (conn_str) and the database engine, which are used to connect to and interact with the database. These are available as attributes on the instance. The initializer can contain any number of parameters needed to connect to the database and should always support container, engine_config and **kwargs.

Parameters
Attributes
  • conn_str (str) – The connection string for connecting to the database.

  • engine (sqlalchemy.engine.Engine) – The engine for interacting with the database.

delete_all_records_from_table(table, conn)[source]

Delete all records from the specified table.

Parameters
Raises
execute(sql, conn, params=None)[source]

Execute a SQL statement.

Parameters
  • sql (str or sqlalchemy.sql.elements.TextClause) – The SQL statement to execute. A string value is automatically converted to a sqlalchemy.sql.elements.TextClause with the sqlalchemy.sql.expression.text() function.

  • conn (sqlalchemy.engine.base.Connection) – An open connection to the database.

  • params (dict or list of dict or None, default None) –

    Parameters to bind to the SQL statement sql. Parameters in the SQL statement should be prefixed by a colon (:) e.g. :myparameter. Parameters in params should not contain the colon ({'myparameter': 'myvalue'}).

    Supply a list of parameter dictionaries to execute multiple parametrized statements in the same method call, e.g. [{'parameter1': 'a string'}, {'parameter2': 100}]. This is useful for INSERT, UPDATE and DELETE statements.

Returns

A result object from the executed statement.

Return type

sqlalchemy.engine.CursorResult

Raises

See also

Examples

To process the result from the method the database connection must remain open after the method is executed i.e. the context manager cannot be closed before processing the result:

import pandemy

db = SQLiteDb(file='mydb.db')

with db.engine.connect() as conn:
    result = db.execute('SELECT * FROM MyTable;', conn=conn)

     for row in result:
         print(row)  # process the result
         ...
load_table(sql, conn, params=None, index_col=None, columns=None, parse_dates=None, localize_tz=None, target_tz=None, dtypes=None, chunksize=None, coerce_float=True)[source]

Load a SQL table into a pandas.DataFrame.

Specify a table name or a SQL query to load the pandas.DataFrame from. Uses pandas.read_sql() function to read from the database.

Parameters
  • sql (str or sqlalchemy.sql.elements.TextClause) – The table name or SQL query.

  • conn (sqlalchemy.engine.base.Connection) – An open connection to the database to use for the query.

  • params (dict of str or None, default None) – Parameters to bind to the SQL query sql. Parameters in the SQL query should be prefixed by a colon (:) e.g. :myparameter. Parameters in params should not contain the colon ({'myparameter': 'myvalue'}).

  • index_col (str or sequence of str or None, default None) – The column(s) to set as the index of the pandas.DataFrame.

  • columns (list of str or None, default None) – List of column names to select from the SQL table (only used when sql is a table name).

  • parse_dates (list or dict or None, default None) –

    • List of column names to parse as dates.

    • Dict of {column_name: format string} where format string is strftime compatible in case of parsing string times, or is one of (D, s, ns, ms, us) in case of parsing integer timestamps.

    • Dict of {column_name: arg dict}, where the arg dict corresponds to the keyword arguments of pandas.to_datetime(). Especially useful with databases without native datetime support, such as SQLite.

  • localize_tz (str or None, default None) – Localize naive datetime columns of the returned pandas.DataFrame to specified timezone. If None no localization is performed.

  • target_tz (str or None, default None) – The timezone to convert the datetime columns of the returned pandas.DataFrame into after they have been localized. If None no conversion is performed.

  • dtypes (dict or None, default None) – Desired data types for specified columns {‘column_name’: data type}. Use pandas or numpy data types or string names of those. If None no data type conversion is performed.

  • chunksize (int or None, default None) – If chunksize is specified an iterator of DataFrames will be returned where chunksize is the number of rows in each pandas.DataFrame. If chunksize is supplied timezone localization and conversion as well as dtype conversion cannot be performed i.e. localize_tz, target_tz and dtypes have no effect.

  • coerce_float (bool, default True) – Attempts to convert values of non-string, non-numeric objects (like decimal.Decimal) to floating point, useful for SQL result sets.

Returns

dfpandas.DataFrame with the result of the query or an iterator of DataFrames if chunksize is specified.

Return type

pandas.DataFrame or Iterator[pandas.DataFrame]

Raises

See also

Examples

When specifying the chunksize parameter the database connection must remain open to be able to process the DataFrames from the iterator. The processing must occur within the context manager:

import pandemy

db = pandemy.SQLiteDb(file='mydb.db')

with db.engine.connect() as conn:
    df_gen = db.load_table(sql='MyTableName', conn=conn, chunksize=3)

    for df in df_gen:
        print(df)  # Process your DataFrames
        ...
manage_foreign_keys(conn, action)[source]

Manage how the database handles foreign key constraints.

Should be implemented by DatabaseManagers whose SQL dialect supports enabling/disabling checking foreign key constraints. E.g. SQLite.

Parameters
Raises
save_df(df, table, conn, if_exists='append', index=True, index_label=None, chunksize=None, schema=None, dtype=None, method=None)[source]

Save the pandas.DataFrame df to specified table in the database.

If the table does not exist it will be created. If the table already exists the column names of the pandas.DataFrame df must match the table column definitions. Uses pandas.DataFrame.to_sql() method to write the pandas.DataFrame to the database.

Parameters
  • df (pandas.DataFrame) – The DataFrame to save to the database.

  • table (str) – The name of the table where to save the pandas.DataFrame.

  • conn (sqlalchemy.engine.base.Connection) – An open connection to the database.

  • if_exists (str, {'append', 'replace', 'fail'}) –

    How to update an existing table in the database:

  • index (bool, default True) – Write pandas.DataFrame index as a column. Uses the name of the index as the column name for the table.

  • index_label (str or sequence of str or None, default None) – Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the pandas.DataFrame uses a pandas.MultiIndex.

  • chunksize (int or None, default None) – The number of rows in each batch to be written at a time. If None, all rows will be written at once.

  • schema (str, None, default None) – Specify the schema (if database flavor supports this). If None, use default schema.

  • dtype (dict or scalar, default None) – Specifying the data type for columns. If a dictionary is used, the keys should be the column names and the values should be the SQLAlchemy types or strings for the sqlite3 legacy mode. If a scalar is provided, it will be applied to all columns.

  • method (None, 'multi', callable, default None) –

    Controls the SQL insertion clause used:

    • None:

      Uses standard SQL INSERT clause (one per row).

    • ’multi’:

      Pass multiple values in a single INSERT clause. It uses a special SQL syntax not supported by all backends. This usually provides better performance for analytic databases like Presto and Redshift, but has worse performance for traditional SQL backend if the table contains many columns. For more information check the SQLAlchemy documentation.

    • callable with signature (pd_table, conn, keys, data_iter):

      This can be used to implement a more performant insertion method based on specific backend dialect features. See: pandas SQL insertion method.

Raises

See also

SQLiteDb

SQLiteDb is a DatabaseManager for the flexible file based database SQLite.

class pandemy.SQLiteDb(file=':memory:', must_exist=False, container=None, engine_config=None, **kwargs)[source]

Bases: pandemy.databasemanager.DatabaseManager

A SQLite DatabaseManager.

Parameters
  • file (str or pathlib.Path, default ':memory:') – The file (with path) to the SQLite database. The default creates an in memory database.

  • must_exist (bool, default False) – If True validate that file exists unless file=':memory:'. If it does not exist pandemy.DatabaseFileNotFoundError is raised. If False the validation is omitted.

  • container (SQLContainer or None, default None) – A container of database statements that the SQLite DatabaseManager can use.

  • engine_config (dict or None) – Additional keyword arguments passed to the sqlalchemy.create_engine() function.

  • **kwargs (dict) – Additional keyword arguments that are not used by SQLiteDb.

Raises

See also

manage_foreign_keys(conn, action='ON')[source]

Manage how the database handles foreign key constraints.

In SQLite the check of foreign key constraints is not enabled by default.

Parameters
  • conn (sqlalchemy.engine.base.Connection) – An open connection to the database.

  • action ({'ON', 'OFF'}) – Enable (‘ON’) or disable (‘OFF’) the check of foreign key constraints.

Raises

OracleDb

OracleDb is an Oracle DatabaseManager

class pandemy.OracleDb(username, password, host, port=None, service_name=None, sid=None, connect_args=None, container=None, engine_config=None, url=None, engine=None, **kwargs)[source]

Bases: pandemy.databasemanager.DatabaseManager

An Oracle DatabaseManager.

Requires the cx_Oracle package to be able to create a connection to the database.

To use a DSN connection string specified in the Oracle connection config file, tnsnames.ora, set host to the desired network service name in tnsnames.ora and leave port, service_name and sid as None. Using a tnsnames.ora file is needed to connect to Oracle Cloud Autononmous Databases.

New in version 1.1.0.

Parameters
  • username (str) – The username of the database account.

  • password (str) – The password of the database account.

  • host (str) – The host name or server IP-address where the database is located.

  • port (int or str or None, default None) – The port the host is listening on. The default port of Oracle databases is 1521.

  • service_name (str or None, default None) – The name of the service used for the database connection.

  • sid (str or None, default None) – The SID used for the connection. SID is the name of the database instance on the host. Note that sid and service_name should not be specified at the same time.

  • container (SQLContainer or None, default None) – A container of database statements that OracleDb can use.

  • connect_args (dict or None, default None) – Additional arguments sent to the driver upon connection that further customizes the connection.

  • engine_config (dict or None, default None) – Additional keyword arguments passed to the sqlalchemy.create_engine() function.

  • **kwargs (dict) – Additional keyword arguments that are not used by OracleDb.

Other Parameters
Raises

See also

Examples

Create an instance of OracleDb and connect using a service:

>>> db = pandemy.OracleDb(
... username='Fred_the_Farmer',
... password='Penguins-sheep-are-not',
... host='fred.farmer.rs',
... port=1234,
... service_name='woollysheep'
... )
>>> str(db)
'OracleDb(oracle+cx_oracle://Fred_the_Farmer:***@fred.farmer.rs:1234?service_name=woollysheep)'
>>> db.username
'Fred_the_Farmer'
>>> db.password
'Penguins-sheep-are-not'
>>> db.host
'fred.farmer.rs'
>>> db.port
1234
>>> db.service_name
'woollysheep'
>>> db.url
oracle+cx_oracle://Fred_the_Farmer:***@fred.farmer.rs:1234?service_name=woollysheep
>>> db.engine
Engine(oracle+cx_oracle://Fred_the_Farmer:***@fred.farmer.rs:1234?service_name=woollysheep)

Connect with a DSN connection string using a net service name from a tnsnames.ora config file and supply additional connection arguments and engine configuration:

>>> import cx_Oracle
>>> connect_args = {
... 'encoding': 'UTF-8',
... 'nencoding': 'UTF-8',
... 'mode': cx_Oracle.SYSDBA,
... 'events': True
... }
>>> engine_config = {
... 'coerce_to_unicode': False,
... 'arraysize': 40,
... 'auto_convert_lobs': False
... }
>>> db = pandemy.OracleDb(
... username='Fred_the_Farmer',
... password='Penguins-sheep-are-not',
... host='my_dsn_name',
... connect_args=connect_args,
... engine_config=engine_config
... )
>>> db
OracleDb(
    username='Fred_the_Farmer',
    password='***',
    host='my_dsn_name',
    port=None,
    service_name=None,
    sid=None,
    container=None,
    connect_args={'encoding': 'UTF-8', 'nencoding': 'UTF-8', 'mode': 2, 'events': True},
    engine_config={'coerce_to_unicode': False, 'arraysize': 40, 'auto_convert_lobs': False},
    url=oracle+cx_oracle://Fred_the_Farmer:***@my_dsn_name,
    engine=Engine(oracle+cx_oracle://Fred_the_Farmer:***@my_dsn_name)
)
classmethod from_url(url, container=None, engine_config=None)[source]

Create an instance of OracleDb from a SQLAlchemy URL.

Parameters
Raises

pandemy.CreateConnectionURLError – If url is invalid.

Examples

If you are familiar with the connection URL syntax of SQLAlchemy you can create an instance of OracleDb directly from a URL:

>>> url = 'oracle+cx_oracle://Fred_the_Farmer:Penguins-sheep-are-not@my_dsn_name'
>>> db = pandemy.OracleDb.from_url(url)
>>> db
OracleDb(
    username='Fred_the_Farmer',
    password='***',
    host='my_dsn_name',
    port=None,
    service_name=None,
    sid=None,
    container=None,
    connect_args={},
    engine_config={},
    url=oracle+cx_oracle://Fred_the_Farmer:***@my_dsn_name,
    engine=Engine(oracle+cx_oracle://Fred_the_Farmer:***@my_dsn_name)
)
classmethod from_engine(engine, container=None)[source]

Create an instance of OracleDb from a SQLAlchemy Engine.

Parameters

Examples

If you already have a database engine and would like to use it with OracleDb simply create the instance like this:

>>> from sqlalchemy import create_engine
>>> url = 'oracle+cx_oracle://Fred_the_Farmer:Penguins-sheep-are-not@fred.farmer.rs:1234/shears'
>>> engine = create_engine(url, coerce_to_unicode=False)
>>> db = pandemy.OracleDb.from_engine(engine)
>>> db
OracleDb(
    username='Fred_the_Farmer',
    password='***',
    host='fred.farmer.rs',
    port=1234,
    service_name=None,
    sid='shears',
    container=None,
    connect_args={},
    engine_config={},
    url=oracle+cx_oracle://Fred_the_Farmer:***@fred.farmer.rs:1234/shears,
    engine=Engine(oracle+cx_oracle://Fred_the_Farmer:***@fred.farmer.rs:1234/shears)
)

This is useful if you have special needs for creating the engine that cannot be accomplished with the default constructor of OracleDb. See for instance the cx_Oracle SessionPool example in the SQLAlchemy docs.