Dataset reference in Python by identifier instead of name

user17188 Member
edited February 20 in Magic ETL Ideas

It would be very useful to be able to reference an input dataset by its identifier instead of its name when working with Python modules. Right now you need to update the source code whenever a dataset's name changes.

2 votes


  • timehat Contributor

    Is this within the Python tile in Magic ETL, or when using Jupyter (or some other Python integration)?

  • Python tile.

  • Whether you can reference an input dataset by identifier instead of name depends on what the Python integration in your ETL tool or data platform supports.

    • In Magic ETL: If you are working within the Python tile, check whether it lets you reference input datasets by an identifier (such as a unique ID or key) instead of by name. The tool's documentation or the tile's interface should show whether datasets can be referenced by any property other than their names.
    • In Jupyter or other Python integrations: You typically have more flexibility here. A script can accept parameters or read a configuration file, so you can change input datasets without modifying the source code. The dataset identifiers are then passed as arguments or read from the external configuration at run time.

    In both cases, consult the documentation of your ETL tool or data platform for its recommended way of handling dataset references. If the tool supports API access or external configuration files, you can use them to make your ETL processes more dynamic and less dependent on hard-coded values.
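
As a minimal sketch of the configuration approach described above, the snippet below keeps a mapping from a stable identifier to the current input name, so a rename only touches the config and not the script. The identifiers, the dataset names, and the helpers `load_dataset_map` and `resolve_input` are all hypothetical. How the resolved name is handed to the platform's own read call depends on the integration and is not shown.

```python
import json


def load_dataset_map(config_text):
    """Parse a JSON object mapping stable identifiers to current input names."""
    mapping = json.loads(config_text)
    if not isinstance(mapping, dict):
        raise ValueError("config must be a JSON object of identifier -> name")
    return mapping


def resolve_input(dataset_id, mapping):
    """Return the current input name for a stable identifier."""
    try:
        return mapping[dataset_id]
    except KeyError:
        raise KeyError(f"no input configured for identifier {dataset_id!r}") from None


# Hypothetical config; in practice this could be read from a file
# or passed to the script as an argument.
CONFIG = '{"sales_2024": "Sales (renamed)", "customers": "Customer Master"}'

dataset_map = load_dataset_map(CONFIG)
input_name = resolve_input("sales_2024", dataset_map)
print(input_name)  # prints "Sales (renamed)"
```

With this layout the script only ever refers to `"sales_2024"`, and renaming the dataset means editing one value in the config. It does not replace native identifier support in the Python tile, which is what the original idea asks for.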